Source: cirosantilli/all-github-commit-emails

= All GitHub Commit Emails
{c}
{tag=Open-source intelligence}
{tag=Ciro Santilli's data projects}

https://github.com/cirosantilli/all-github-commit-emails

In this project <Ciro Santilli> extracted (almost) all Git commit emails from <GitHub> with <Google BigQuery>! The repo was later taken down by <GitHub>. Newbs, censoring publicly available data!

Ciro also created a beautifully named variant with one email per commit: https://github.com/cirosantilli/imagine-all-the-people[]. True art. It also had the effect of breaking this "what's my first commit" tracker: https://twitter.com/NachoSoto/status/1761873362706698469

\Image[https://raw.githubusercontent.com/cirosantilli/media/master/GitHub_Archive_Google_bigquery_PushEvent_email_highlight.png]
{title=<#GitHub Archive> query showing hashed emails}
{description=It was <Ciro Santilli> who made them hash the emails. They weren't hashed before he published them.}
{height=810}

\Image[https://raw.githubusercontent.com/cirosantilli/media/master/All_GitHub_commit_emails_repo_screenshot_before_takedown_archive_is.png]
{title=<All GitHub Commit Emails> repo before takedown}
{description=Screenshot from <archive.is>.}
{height=768}