Ciro Santilli invented this term, derived from "hardware in the loop" to refer to simulations in which both the brain and the body and physical world of organism models are modelled.
E.g. just imagine running:
Likely implies whole brain emulation and therefore AGI.
It's like the bootloader stage of biology! It's weird and magic and important: Section "Molecular biology feels like systems programming".
CIA 2010 covert communication websites by
Ciro Santilli 35 Updated 2025-04-18 +Created 1970-01-01
This article is about covert agent communication channel websites used by the CIA in many countries from the late 2000s until the early 2010s, when they were uncovered by counter intelligence of the targeted countries circa 2011-2013. This discovery led to the imprisonment and execution of several assets in Iran and China, and subsequent shutdown of the channel.
The existence of such websites was first reported in November 2018 by Yahoo News: www.yahoo.com/video/cias-communications-suffered-catastrophic-compromise-started-iran-090018710.html.
Previous whispers had been heard in 2017 but without clear mention of websites: www.nytimes.com/2017/05/20/world/asia/china-cia-spies-espionage.html:
Most notably, starting in 2008, CIA contractor John Reidy started raising concerns about the security of the communication systems used, but he was silenced and ignored, leading to catastrophe.[ref][ref]
Then in September 2022 a few specific websites were finally reported by Reuters: www.reuters.com/investigates/special-report/usa-spies-iran/, henceforth known only as "the Reuters article" in this article.
Banner of the Reuters article
. Source. Ciro Santilli heard about the 2018 article at around 2020 while studying for his China campaign because the websites had been used to take down the Chinese CIA network in China. He even asked on Quora: www.quora.com/What-were-some-examples-of-the-websites-that-the-CIA-used-around-2010-as-a-communication-mechanism-for-its-spies-in-China-and-Iran-but-were-later-found-and-used-to-take-down-their-spy-networks but there were no publicly known domains at the time to serve as a starting point. Chris, Electrical Engineer and former Avionics Tech in the US Navy, even replied suggesting that obviously the CIA is so competent that it would never ever have its sites leaked like that:
Seriously a dumb question.
So when Ciro Santilli heard about the 2022 article almost a year after publication, and being a half-arsed web developer himself, he knew he had to try and find some of the domains himself using the newly available information! It was an irresistible real-life capture the flag. The thing is, everyone who has ever developed a website knows that its attack surface is about the size of Texas, and the potential for fingerprinting is off the charts with so many bits and pieces sticking out. Chris, get fucked.
In particular, it is fun to have such a clear and visible to anyone examples of the USA spying on its own allies in the form of Wayback Machine archives.
Given that it was reported that there were "more than 350" such websites, it would be really cool if we could uncover more of those websites ourselves beyond the 9 domains reported by Reuters!
This article documents the list of extremely likely candidates Ciro has found so far, mostly using:more details on methods also follow. It is still far from the 885 websites reported by citizenlabs, so there must be key techniques missing. But the fact that there are no Google Search hits for the domains or IPs (except in bulk e.g. in expired domain trackers) indicates that these might not have been previously clearly publicly disclosed.
- rudimentary IP range search on viewdns.info starting from the websites reported by Reuters
- heuristic search for keywords in domains of the 2013 DNS Census plus Wayback Machine CDX scanning
If anyone can find others, or has better techniques: Section "How to contact Ciro Santilli". The techniques used so far have been very heuristic, and that added to the limited amount of data makes it almost certain that several IP ranges have been missed. There are two types of contributions that would be possible:Perhaps the current heuristically obtained data can serve as a good starting for a more data-oriented search that will eventually find a valuable fingerprint which brings the entire network out.
- finding new IP ranges: harder more exiting, and potentially requires more intelligence
- better IP to domain name databases to fill in known gaps in existing IP ranges
Disclaimer: the network fell in 2013, followed by fully public disclosures in 2018 and 2022, so we believe it is now more than safe for the public to know what can still be uncovered about the events that took place. The main author's political bias is strongly pro-democracy and anti-dictatorship.
May this list serve as a tribute to those who spent their days making, using, and uncovering these websites under the shadows.
If you want to go into one of the best OSINT CTFs of your life, stop reading now and see how many Web Archives you can find starting only from the Reuters article as Ciro did. Some guidelines:
- there was no ultra-clean fingerprint found yet. Some intuitive and somewhat guessy data analysis was needed. But when you clean the data correctly and make good guesses, many hits follow, it feels so good
- nothing was paid for data. But using cybercafe Wifi's for a few extra IPs may help.
viewdns.info
. Source. activegameinfo.com
domain to IPviewdns.info
. Source. aroundthemiddleeast.com
IP to domainamazon.com,2012-02-01T21:33:36,72.21.194.1
amazon.com,2012-02-01T21:33:36,72.21.211.176
amazon.com,2013-10-02T19:03:39,72.21.194.212
amazon.com,2013-10-02T19:03:39,72.21.215.232
amazon.com.au,2012-02-10T08:03:38,207.171.166.22
amazon.com.au,2012-02-10T08:03:38,72.21.206.80
google.com,2012-01-28T05:33:40,74.125.159.103
google.com,2012-01-28T05:33:40,74.125.159.104
google.com,2013-10-02T19:02:35,74.125.239.41
google.com,2013-10-02T19:02:35,74.125.239.46
The four communication mechanisms used by the CIA websites
. Java Applets, Adobe Flash, JavaScript and HTTPSYou can never have enough Wayback Machine tabs open
. This is how the end of the fingerprint pipeline looks like: as many tabs as you have the patience to go through one by one!Expired domain names by day 2011
. Source. The scraping of expired domain trackers to Github was one of the positive outcomes of this project.Compromised Comms by Darknet Diaries (2023)
Source. It was the YouTube suggestion for this video that made Ciro Santilli aware of the Reuters article almost one year after its publication, which kickstarted his research on the topic.
Full podcast transcript: darknetdiaries.com/transcript/75/
Ciro Santilli pinged the Podcast's host Jack Rhysider on Twitter and he ACK'ed which is cool, though he was skeptical about the strength of the fingerprints found, and didn't reply when clarification was offered. Perhaps the material is just not impactful enough for him to produce any new content based on it. Or also perhaps it comes too close to sources and methods for his own good as a presumably American citizen.
CLI program implementing Universal Chess Interface: www.reddit.com/r/ComputerChess/comments/b6rdez/commandline_options_for_stockfish/
How to actually play against it: chess.stackexchange.com/questions/4353/how-to-install-stockfish-on-ubuntu So hard!
The user friendly Chess UI! Exactly what you would expect from a GNOME Project package. But also packs some punch via the Universal Chess Interface, e.g. Stockfish just works.
Both chess engine and a CLI chess UI. As an engine it is likely irrelevant compared to Stockfish as of 2020. TODO: does the UI support Universal Chess Interface?
Cool project history though. Started before the GNU Project itself, and became one of the first packages.
Advanced. Not beginner friendly, very clunky.
To be fair, this is one of the least worse ones.
The cool thing about this notation is that is showed to Ciro Santilli that there is more state to a chess game than just the board itself! Notably:plus some other boring draw rules counters.
- whose move it is next
- castling availability
- en passant availability
70,000 28x28 grayscale (1 byte per pixel) images of hand-written digits 0-9, i.e. 10 categories. 60k are considered training data, 10k are considered for test data.
This is THE "OG" computer vision dataset.
Playing with it is the de-facto computer vision hello world.
It was on this dataset that Yann LeCun made great progress with the LeNet model. Running LeNet on MNIST has to be the most classic computer vision thing ever. See e.g. activatedgeek/LeNet-5 for a minimal and modern PyTorch educational implementation.
But it is important to note that as of the 2010's, the benchmark had become too easy for many applications. It is perhaps fair to say that the next big dataset revolution of the same importance was with ImageNet.
The dataset could be downloaded from yann.lecun.com/exdb/mnist/ but as of March 2025 it was down and seems to have broken from time to time randomly, so Wayback Machine to the rescue:but doing so is kind of pointless as both files use some crazy single-file custom binary format to store all images and labels. OMG!
wget \
https://web.archive.org/web/20120828222752/http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz \
https://web.archive.org/web/20120828182504/http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz \
https://web.archive.org/web/20240323235739/http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz \
https://web.archive.org/web/20240328174015/http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
OK-ish data explorer: knowyourdata-tfds.withgoogle.com/#tab=STATS&dataset=mnist
TODO where to find it: www.kaggle.com/general/50987
14 million images with more than 20k categories, typically denoting prominent objects in the image, either common daily objects, or a wild range of animals. About 1 million of them also have bounding boxes for the objects. The images have different sizes, they are not all standardized to a single size like MNIST[ref].
Each image appears to have a single label associated to it. Care must have been taken somehow with categories, since some images contain severl possible objects, e.g. a person and some object.
Official project page: www.image-net.org/
The data license is restrictive and forbids commercial usage: www.image-net.org/download.php. Also as a result you have to login to download the dataset. Super annoying.
How to visualize: datascience.stackexchange.com/questions/111756/where-can-i-view-the-imagenet-classes-as-a-hierarchy-on-wordnet
There are unlisted articles, also show them or only show them.