Some are named after the encoded protein. Others that are not as clean are just orfXXX for open reading frame XXX.
D'oh.
But to be serious: the Wayback Machine contains a very large proportion of all sites. It does sometimes happen that a Wayback Machine archive is missing or broken and cqcounter has the screenshot. But the Wayback Machine is still the most complete database we have found so far. Some archives are very broken, but those are rare.
The only problem with the Wayback Machine is that there is no known efficient way to query its archives across domains. You have to have a domain in hand for CDX queries: Wayback Machine CDX scanning.
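As a rough illustration of what a per-domain CDX query looks like, here is a minimal Python sketch. It assumes the public CDX server endpoint at web.archive.org/cdx/search/cdx and its url, matchType, output, fl and limit parameters. The helper names cdx_query_url, parse_cdx_json and fetch_cdx are made up for this example and are not part of this project's scripts.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed public CDX server endpoint of the Wayback Machine.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(domain, limit=10):
    """Build a CDX query URL listing captures for one known domain."""
    params = {
        "url": domain,
        "matchType": "domain",  # also match subdomains of the domain
        "output": "json",
        "fl": "timestamp,original,statuscode",
        "limit": limit,
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def parse_cdx_json(rows):
    """Turn CDX JSON output (first row is the header) into a list of dicts."""
    if not rows:
        return []
    header, *records = rows
    return [dict(zip(header, record)) for record in records]


def fetch_cdx(domain, limit=10):
    """Run the query over the network and return the parsed captures."""
    with urlopen(cdx_query_url(domain, limit)) as response:
        return parse_cdx_json(json.load(response))
```

Note that every call needs a domain as input, which is exactly the limitation described above: there is no equivalent call that searches across all domains.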
The Common Crawl project attempts in part to address this lack of queryability, but we haven't managed to extract any hits from it.
CDX + 2013 DNS Census + heuristics has been fruitful however.
We have dumped all Wayback Machine archives of known websites to: github.com/cirosantilli/cia-2010-websites-dump using ../cia-2010-covert-communication-websites/download-websites.sh. This allows for better grepping and serves as a backup in case they ever go down.
Their historic DNS and reverse DNS info was very valuable, and served as Ciro's initial entry point to finding hits in the IP ranges given by Reuters.
Generic information about the website not specific to this project will be stored at: Section "viewdns.info".
Since this source is so scarce and valuable, we have been quite careful to note down all the domain and IP ranges that have been explored.
At news.ycombinator.com/item?id=38496244, the creator of viewdns.info, "Hughesey", also stated that he'd be able to give some free credits for public research projects such as this one. This would have saved going to quite a few Cafes to get those sweet extra IPs! But it was more fun in hardmode, no doubt.
We do API access to IP ranges with this simple helper: ../cia-2010-covert-communication-websites/viewdns-info.sh, usage:
./viewdns-info.sh <apikey> <start-ipv-address> <end-ipv-address>
e.g.:
./viewdns-info.sh 8b890b00b17ed2d66bbed878d51200b58d43d014 66.45.179.187 66.45.179.210
For domain to IP queries from the API you should use "iphistory" viewdns.info/api/docs/ip-history.php:
curl "https://api.viewdns.info/iphistory/?domain=todaysengineering.com&apikey=$APIKEY&output=json"
Note the double quotes: with single quotes the shell would not expand $APIKEY.
Just beware of the viewdns.info reverse IP bug, that really sucks and led to us missing a ton of domains.
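The two building blocks involved here can be sketched in Python: walking an inclusive IP range like the one passed to the helper, and building the "iphistory" URL shown above. This is only an illustration; it does not reproduce what viewdns-info.sh actually does internally, and the function names ip_range and iphistory_url are hypothetical.

```python
import ipaddress
from urllib.parse import urlencode


def ip_range(start, end):
    """Yield every IPv4 address from start to end, both inclusive."""
    first = int(ipaddress.IPv4Address(start))
    last = int(ipaddress.IPv4Address(end))
    if last < first:
        raise ValueError("end address is before start address")
    for number in range(first, last + 1):
        yield str(ipaddress.IPv4Address(number))


def iphistory_url(domain, apikey):
    """Build the domain to IP history query URL for the viewdns.info API."""
    query = urlencode({"domain": domain, "apikey": apikey, "output": "json"})
    return "https://api.viewdns.info/iphistory/?" + query


if __name__ == "__main__":
    # Same range as the usage example; "APIKEY" is a placeholder.
    for ip in ip_range("66.45.179.187", "66.45.179.210"):
        print(ip)
    print(iphistory_url("todaysengineering.com", "APIKEY"))
```

Since each IP costs API credits, enumerating the range explicitly like this also makes it easy to note down exactly which addresses have already been explored.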
14 million images with more than 20k categories, typically denoting prominent objects in the image, either common daily objects, or a wide range of animals. About 1 million of them also have bounding boxes for the objects. The images have different sizes: they are not all standardized to a single size like MNIST[ref].
Each image appears to have a single label associated to it. Care must have been taken somehow with categories, since some images contain several possible objects, e.g. a person and some object.
Official project page: www.image-net.org/
The data license is restrictive and forbids commercial usage: www.image-net.org/download.php. Also as a result you have to log in to download the dataset. Super annoying.
How to visualize: datascience.stackexchange.com/questions/111756/where-can-i-view-the-imagenet-classes-as-a-hierarchy-on-wordnet
Cool data embedded in the Bitcoin blockchain: Other blockchains, by Ciro Santilli. Updated 2025-07-16
- Namecoin
- nmc.vision/ by x.com/punk3606 is basically the same as this project but for Namecoin: the dude is trying to make a database with all Namecoin inscriptions ever.
- "Quantum" is an image created by artists Jennifer and Kevin McCoy which Kevin embedded on Namecoin in 2014. As such, it is a relatively early example of inscription. In June 2021 it sold for more than one million dollars at an auction at Sotheby's to NFT collector sillytuna. Bibliography:
- Ethereum
- reidjs.medium.com/top-6-weird-innovative-and-hilarious-findings-in-the-ethereum-blockchain-83dbbca461ca Top 6 Weird, Innovative, and Hilarious findings in the Ethereum Blockchain by Reid Sherman (2018)
- Monero: as of January 2024, Ciro downloaded the blockchain and strings -n20 -s didn't seem to turn up even a single ASCII art, it is quite sad. Bibliography:
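For reference, the kind of scan that strings -n20 performs on the Monero blockchain above can be approximated in a few lines of Python. This is a simplified sketch, not the tool itself: it assumes "printable" means ASCII 0x20 to 0x7e plus tab, and the function name ascii_runs is made up for this example.

```python
import re


def ascii_runs(data, min_len=20):
    """Return all runs of printable ASCII in data that are at least min_len long."""
    # Printable ASCII (space through tilde) plus tab, repeated min_len or more times.
    pattern = re.compile(rb"[\t\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


if __name__ == "__main__":
    # Tiny made-up input: binary noise around one long and one short text run.
    sample = b"\x00\x01" + b"A" * 25 + b"\xff" + b"short" + b"\x00"
    for run in ascii_runs(sample):
        print(run)
```

A real blockchain is far too large to load in one go, so in practice the file would have to be read in chunks, taking care not to split a run at a chunk boundary.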
By Ciro Santilli:
- 2021-04-13 twitter.com/cirosantilli/status/1382067162492366854: main initial announcement on Twitter. twitter.com/mikko, who has 209.9K followers and a Wikipedia page: Mikko Hypponen hearted the tweet s2
- 2024-01-21 twitter.com/cirosantilli/status/1749172304259535063: improvements to the Prayer wars
- 2024-02-07 twitter.com/cirosantilli/status/1755378931446739373: large-ish update with new items and improved organization
- 2024-03-31 twitter.com/cirosantilli/status/1774531934305071295: binwalk discoveries, start poking a bit into ordinal ruleset inscriptions
- 2024-04-04 twitter.com/cirosantilli/status/1775805941885108392: largest text ordinal inscription
By others:
- 2021-04-15 news.ycombinator.com/item?id=26801067 (96 points) on Hacker News. Reached position 16 at one point: archive.ph/L0Fte and led to about 5k views total. Ah, Ciro could watch that Google Analytics realtime view go bling all day long. Narcissism is a bitch.
- 2021 cryptonewmedia.press/tankman-image-on-bitcoin-blockchain/ by user igadjeed
- 2022-01-23 news.ycombinator.com/item?id=30050479 "Abuse and Harassment on the Blockchain", comment mid-thread
- 2022-01-24 www.reddit.com/r/Buttcoin/comments/sbw0se/when_i_heard_about_nfts_i_thought_they_were/hu2uk8g "When I heard about NFTs, I thought they were stupid, but then I watched a video explaining how they work, it really changed my perspective", comment mid-thread
- 2023-02 lots of Twitter backlinks as a result of ordinal ruleset inscriptions:
- 2023-02-03
Video 1. Source. Features Marijuana plant and Rickrolling sections. He seems to be a finance guru.
- 2023-02-07 twitter.com/privateid_ntity/status/1622814063331004421
- 2024-01-18 twitter.com/pete_rizzo_/status/1748049913286447355 by Rizzo, The Bitcoin Historian (81k followers, mid-thread)
- 2024-12-29: x.com/lopp/status/1873453363523932630 by Jameson Lopp (492k subscribers)
- ? cloudhiker.net/ A hand curated and categorized list of interesting links by Kevin Woblick. Only allows users to visit a random one per category, so we can't get proof of backlink, this was noticed through Google Analytics.
- 2025-03-18 Bitcoin Burn Addresses: Unveiling the Permanent Losses and Their Underlying Causes
The mention of this project is brief:
"Ciro Santilli maintains a Web page listing arbitrary data embedded in the Bitcoin blockchain. This is the most complete and up-to-date list of arbitrary data we are aware of. However, he does not specifically focus on burn addresses, but on the stored contents."
Announced at:
- 2025-03-28 x.com/punk3606/status/1905295370227155344 quick "this was already discovered" mention on thread x.com/I____felix____I/status/1905291048798106061 where a dude rediscovers Figure "Warren Buffet"
- 2025-07-20 www.youtube.com/watch?v=oFrK2tpat8c Uncovering Hidden Messages in Bitcoin by Th3M0rn1ng5h0w






