Massive GitLab scan finds 17,000+ valid secrets in public repositories

Security engineer Luke Marshall scanned over 5.6 million public GitLab repositories using TruffleHog and uncovered 17,430 verified live secrets, including API keys, cloud credentials, and access tokens.

The research is the second part of Marshall's investigation into secret exposure across major Git platforms. His earlier Bitbucket study scanned 2.6 million repositories and uncovered 6,212 valid secrets. By comparison, GitLab hosted more than twice as many public repositories and showed a 35% higher density of leaked secrets per repository.

Founded in 2011, GitLab is a popular Git-based DevOps platform that competes directly with GitHub and Bitbucket. Although it was the last of the major platforms to launch, GitLab now hosts more than twice as many public repositories as Bitbucket, due in part to its wide adoption in both open-source and enterprise development environments.

To execute the scan, Marshall leveraged GitLab's public API to enumerate all accessible repositories as of October 9, 2025. He then built an automated scanning pipeline using AWS Lambda and SQS. The system avoided redundant scans, supported error recovery, and completed the full scan in just over 24 hours at a cost of $770. Each Lambda invocation executed a custom-built container running TruffleHog to identify only verified secrets, minimizing false positives.
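The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not Marshall's actual code: the message shape (`{"repo_url": ...}`), the in-memory `seen` set standing in for a persistent deduplication store, the `publish` callback, and the 840-second timeout are all assumptions made for the example. It relies on TruffleHog v3's `git` subcommand with the `--only-verified` and `--json` flags, and on Lambda's partial batch response format for SQS, which lets failed messages be retried without rescanning the ones that succeeded.

```python
import json
import subprocess


def build_trufflehog_command(repo_url):
    """Build a TruffleHog v3 command that emits only verified findings as JSON lines."""
    return ["trufflehog", "git", repo_url, "--only-verified", "--json"]


def parse_findings(stdout):
    """Parse JSON-lines output, skipping blank lines and anything that is not a JSON object."""
    findings = []
    for line in stdout.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            item = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(item, dict):
            findings.append(item)
    return findings


def scan_repo(repo_url):
    """Run TruffleHog on one repository; assumes the binary is baked into the container image."""
    result = subprocess.run(
        build_trufflehog_command(repo_url),
        capture_output=True, text=True, check=True,
        timeout=840,  # assumption: stay under Lambda's 15-minute limit
    )
    return parse_findings(result.stdout)


def handle_batch(event, scan=scan_repo, publish=print, seen=None):
    """Process one SQS batch and report per-message failures so only those are retried."""
    seen = set() if seen is None else seen  # hypothetical stand-in for a persistent store
    failures = []
    for record in event.get("Records", []):
        try:
            repo_url = json.loads(record["body"])["repo_url"]  # hypothetical message shape
            if repo_url in seen:
                continue  # avoid redundant scans
            publish(repo_url, scan(repo_url))
            seen.add(repo_url)
        except Exception:
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}


def handler(event, context):
    """Lambda entry point."""
    return handle_batch(event)
```

Returning `batchItemFailures` only takes effect when the SQS event source mapping has `ReportBatchItemFailures` enabled; without it, any raised exception would cause the whole batch to be retried.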

Because GitLab is built on Git version control, a secret committed once stays in a project's history even after it is deleted from the current files. Unless these secrets are deliberately rotated or purged from history, they can linger unnoticed for years. Indeed, Marshall discovered one valid secret dating back to December 2009, nearly two years before GitLab even launched, likely because the repository's history was imported from elsewhere.

Google Cloud Platform credentials were the most frequently exposed, appearing in about 1 in every 1,060 GitLab repositories. The scan also uncovered 406 valid GitLab tokens within GitLab-hosted projects, highlighting a clear pattern of “platform locality,” where developers leak credentials on the same platform they use. Overall, the exposed secrets affected more than 2,800 unique organizations. Marshall used automated tools and custom scripts to contact over 120 of them, directly coordinating with 30+ SaaS vendors to remediate the leaks.

A standout moment from the study involved tracing a Slack token, committed under a personal email address, to a corporate Slack instance. Using metadata from TruffleHog's analysis mode, Marshall identified the organization via a Slack login page linked to its Okta SSO system. That report was classified as a critical (P1) issue and earned a $2,100 bounty.

Another concerning trend is the persistence of long-lived “zombie secrets.” Some leaked credentials were still valid despite being committed over a decade ago. The study underscores that secrets, once exposed, do not expire automatically; they must be proactively rotated or revoked.
