Hardcoded GitHub Personal Access Tokens Leak 159 Private Repositories

A repository stores all of your project’s files, as well as the revision history for each one. Repositories can hold anything your project needs, including folders and files, photos, videos, spreadsheets, and data sets. You can restrict who has access to a repository by choosing its visibility: public or private. Everyone on the internet can access public repositories, whereas private repositories are accessible only to the repository owner, the people they explicitly share access with, and, in the case of organization repositories, specific organization members. Using BeVigil, a security search engine, our researchers found 159 private GitHub repositories containing the source code of 10 organizations. We were able to reach these private repositories for one reason only: the source code of the organizations’ Android apps contained hardcoded GitHub Personal Access Tokens.

Note
We presented the above-mentioned findings at BSides Munich’22, an infosec conference, and you can watch it here for more insights.

BSidesMunich 22 Talk – Large scale research

What is a GitHub Personal Access Token?

When accessing your account on the Git repository hosting service, you can use a Personal Access Token instead of a password. You can create any number of access tokens, so you can issue a distinct token for each client that logs into your account. Using a separate token per client lets you control the level of access each one has, such as read-only access or the ability to change repositories. A token can be revoked for a specific client without affecting other clients. Although you can limit what actions a given client can perform, it will still be able to see all of the repositories to which you have access as a user.

How Did We Find The GitHub Personal Access Token Key?

Whenever a user submits an Android application for scanning, that application is indexed in the BeVigil search section, which contains all the popular apps submitted by users. BeVigil uses a set of regexes to find secrets in Android applications. Using the regex for GitHub access tokens, our security research team found tokens hardcoded into the applications. This means developers embedded these keys directly in the source code, leaving them exposed to attackers. As a result, all of the organizations’ source code in private repositories, which should not be visible to anyone outside, was exposed after the BeVigil scan. Unfortunately, this vulnerability is not uncommon: it is another instance of weak API security found by the BeVigil team (see our recent Razorpay disclosure).
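The kind of regex-based detection described above can be sketched in a few lines of Python. This is only an illustration, not the pattern BeVigil uses: the regex assumes the prefixed GitHub token format (ghp_, gho_, ghu_, ghs_ or ghr_ followed by 36 alphanumeric characters), and the function names are hypothetical.

```python
import re

# Assumed pattern: prefixed GitHub tokens (ghp_, gho_, ghu_, ghs_, ghr_)
# followed by 36 alphanumeric characters. Older 40-character hex tokens
# and fine-grained tokens are not covered by this sketch.
GITHUB_TOKEN_RE = re.compile(r"\bgh[pousr]_[A-Za-z0-9]{36}\b")


def find_github_tokens(text: str) -> list[str]:
    """Return every substring of `text` that looks like a GitHub token."""
    return GITHUB_TOKEN_RE.findall(text)


def redact(token: str) -> str:
    """Keep the prefix and the last four characters so a report does not re-leak the token."""
    return token[:4] + "*" * (len(token) - 8) + token[-4:]


if __name__ == "__main__":
    # A made-up value with the right shape, as it might appear in decompiled app code.
    sample = 'String TOKEN = "ghp_' + "a1B2" * 9 + '";'
    for hit in find_github_tokens(sample):
        print("possible GitHub token:", redact(hit))
```

Running the same check over your own decompiled APK or source tree before release is a cheap way to catch a token before a scanner or an attacker does.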

What is the impact?

Some of the leaked tokens carried scopes that give an attacker full access to repositories, including private ones. That includes read/write access to code, commit statuses, repository and organization projects, invitations, collaborators, team memberships, deployment statuses, and repository webhooks for repositories and organizations. These scopes also grant the ability to manage user projects. Anyone holding a token with the relevant scope can also delete packages from GitHub Packages.

Identifying scopes permitted on the leaked GitHub PAT
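One way to see which scopes a classic token carries is to read the X-OAuth-Scopes response header that the GitHub REST API returns. The sketch below is a minimal illustration for auditing a token you own or are authorized to test; the set of scopes treated as high risk and all function names are assumptions made for this example.

```python
import urllib.request

# Scopes treated as high risk in this sketch (an illustrative choice, not an official list).
HIGH_RISK_SCOPES = {"repo", "admin:org", "delete_repo", "delete:packages", "write:packages"}


def parse_scopes(header_value: str) -> set[str]:
    """Split an X-OAuth-Scopes header value such as 'repo, read:org' into a set."""
    return {scope.strip() for scope in header_value.split(",") if scope.strip()}


def risky_scopes(scopes: set[str]) -> set[str]:
    """Return the subset of `scopes` that this sketch considers high risk."""
    return scopes & HIGH_RISK_SCOPES


def fetch_token_scopes(token: str) -> set[str]:
    """Ask the GitHub REST API which scopes a classic token carries (makes a network call)."""
    request = urllib.request.Request(
        "https://api.github.com/user",
        headers={"Authorization": f"token {token}", "User-Agent": "scope-check-sketch"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_scopes(response.headers.get("X-OAuth-Scopes", ""))
```

For example, risky_scopes(parse_scopes("repo, read:org")) returns {"repo"}, which signals that the token can read and write private repositories.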

The number of leaking private repositories we discovered in our analysis, broken down by app category and install count, can be found below:

Category	Installs	Total Private Repos Leaking
Food and Drink	10,000,000+	26
Lifestyle	100,000	13
Food and Drink	50,000	6
Health and Fitness	10,000	30
Food and Drink	10,000	6
Shopping	5,000	16
Social	100,000	25
Travel and Local	10,000	0
Maps & Navigation	100,000	0
Business	10,000	10
Education	10,000	6
Business	500	10
Entertainment	50,000	3
Food and Drink	100	6
Finance	10,000	14

Remediation

The main takeaway from this article is simple: do not embed tokens or other sensitive information directly in your code. Listed below are actions developers can take to secure their keys.

Standardized Review Procedures: The first step in protecting your keys is to put proper version-control processes in place. Code pushes are frequently not subjected to a thorough examination. The codebase should be examined, reviewed, and approved before it is committed and published. Key exposure is less likely with standardized procedures. SonarQube is a good automated tool for checking code quality, application security, and technical debt.

Hiding Keys: Moving your tokens out of the source tree is another smart way to protect them. Instead of hardcoding a token, refer to it through an environment variable. An environment variable makes it much easier to refer to the same token in different locations, saving time and enhancing security. Also, make sure the file containing your environment variables is not committed with your source code. To know more about .env please refer to this blog.
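As a minimal sketch of the environment-variable approach, the snippet below reads a token at runtime and fails loudly when it is missing. The variable name GITHUB_TOKEN, the exception class, and the function name are assumptions chosen for illustration.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret is not present in the environment."""


def load_github_token(var_name: str = "GITHUB_TOKEN") -> str:
    """Read the token from the environment instead of hardcoding it in source."""
    token = os.environ.get(var_name, "").strip()
    if not token:
        # Fail early rather than falling back to a default baked into the code.
        raise MissingSecretError(f"environment variable {var_name} is not set")
    return token
```

This pattern suits server-side code and build tooling. A value that is copied into an APK at build time is still recoverable from the app, so tokens like these are better kept on a backend that the app calls.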

Rotate tokens: A large proportion of hardcoded tokens come from old codebases, so routinely rotating tokens helps mitigate the risk of a leak: a token that has already been rotated out is invalid and cannot cause any real damage. HashiCorp Vault is a tool for secrets management, encryption as a service, and privileged access management. It can help you manage secrets and protect sensitive data. Another such tool is AWS KMS, which has a feature to automatically rotate keys every year.
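A rotation policy can be enforced with a simple age check. The sketch below assumes you record each token's creation time yourself; the 90-day limit and the function name are hypothetical choices, not a GitHub or Vault default.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical policy: rotate any token older than 90 days.
DEFAULT_MAX_AGE = timedelta(days=90)


def needs_rotation(
    created_at: datetime,
    max_age: timedelta = DEFAULT_MAX_AGE,
    now: Optional[datetime] = None,
) -> bool:
    """Return True when a token created at `created_at` has reached `max_age`."""
    current = now if now is not None else datetime.now(timezone.utc)
    return current - created_at >= max_age
```

A scheduled job could run this check over an inventory of tokens and open a ticket, or trigger rotation in your secrets manager, for each one that is due.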

Security Keys and Secrets: Start hashing and encrypting your keys, both in transit and at rest. Done correctly, this adds very little overhead to your interaction times, but it ensures that man-in-the-middle attacks and other breaches are difficult to leverage into larger losses.

The Way Forward

Personal access tokens that haven’t been used in a year are automatically removed by GitHub as a security measure. It is strongly advised to set an expiration date on your personal access tokens for added security. Limiting the scopes a token can authorize can help deter some attacks. Many flooding attacks can be avoided by establishing a hard limit on how much can be done in a short amount of time; the same limit also helps curb data exfiltration, abusive API usage, and excessive concurrent connections. Before pushing your code to GitHub, ensure that it undergoes rigorous security checks so that no hardcoded secrets are leaked. Use environment variables whenever you are dealing with sensitive data. Prevention is always better than having to deploy a fix later.
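The hard limit on activity within a short time span mentioned above can be illustrated with a small sliding-window rate limiter. This is a generic sketch under assumed parameters, not a description of how GitHub or any particular API gateway implements rate limiting; the class name is hypothetical.

```python
import time
from collections import deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls within any `window_seconds` span."""

    def __init__(self, max_calls: int, window_seconds: float) -> None:
        self.max_calls = max_calls
        self.window = window_seconds
        self._stamps = deque()  # timestamps of the calls that were allowed

    def allow(self, now: Optional[float] = None) -> bool:
        """Record and allow the call if the limit is not yet reached, else refuse it."""
        current = time.monotonic() if now is None else now
        # Drop timestamps that have aged out of the window.
        while self._stamps and current - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_calls:
            return False
        self._stamps.append(current)
        return True
```

Keeping one limiter per token or per client means a leaked credential can only pull a bounded amount of data before the limit starts refusing requests.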

Appendix

CloudSEK’s Responsible Disclosure Policy

We ensure that we do not cause any damage while the detected vulnerability is being investigated. Our investigation must not, in any event, lead to an interruption of services or lead to any details being made public by either the asset manager or its clients.
We do not place a backdoor in an information system in order to then demonstrate the vulnerability, as this can lead to further damage and involves unnecessary security risks.
We do not edit or delete any data from the system and do not introduce any system changes.
We do not try to repeatedly access the system and do not share the access obtained with others.
We do not perform physical testing on any vulnerable devices that we identify.
All the vulnerabilities found during this research were reported to the respective countries’ CERTs (Computer Emergency Response Teams) via proper channels. We also provided significant time for them to respond to these findings before publishing this paper.