CIA 2010 covert communication websites by Ciro Santilli 37 Updated +Created
This article is about covert agent communication channel websites used by the CIA in many countries from the late 2000s until the early 2010s, when they were uncovered by counter intelligence of the targeted countries circa 2010-2013.
This article uses publicly available information to publicly disclose for the first time a few hundred of what we feel are extremely likely candidate sites of the network. The starting point for this research was the September 2022 Reuters article "America’s Throwaway Spies" which for the first time gave nine example websites, and their analyst from Citizenlabs claims to have found 885 websites in total, but did not publicly disclose them. Starting from only the nine disclosed websites, we were then able to find a few hundred websites that share so many similarities with them, i.e. a common fingerprint, that we believe makes them beyond reasonable doubt part of the same network.
If you enjoy this article, consider dropping some Monero at: 4A1KK4uyLQX7EBgN7uFgUeGt6PPksi91e87xobNq7bT2j4V6LqZHKnkGJTUuCC7TjDNnKpxDd8b9DeNBpSxim8wpSczQvzf. Other sponsorship methods: Section "Sponsor Ciro Santilli's work on OurBigBook.com".
https://raw.githubusercontent.com/cirosantilli/media/master/CIA_Star_Wars_website_promo.jpg
Video 1.
How I found a Star Wars website made by the CIA by Ciro Santilli
. Source. Slightly edited VOD of the talk Aratu Week 2024 Talk by Ciro Santilli: My Best Random Projects.
The discovery of these websites by Iranian and Chinese counterintelligence led to the imprisonment and execution of several assets in those countries, and subsequent shutdown of the channel by the CIA when they noticed that things had gone wrong. This is likely a Wikipedia page that talks about the disastrous outcome of the websites being found out: 2010–2012 killing of CIA sources in China, although it contained no mention of websites before Ciro Santilli edited it in.
Of particular interest is that based on their language and content, certain of the websites seem to have targeted other democracies such as Germany, France, Spain and Brazil.
If anyone can find others websites, or has better techniques feel free to contact Ciro Santilli at: Section "How to contact Ciro Santilli". Contributions will be clearly attributed if desired. Some of the techniques used so far have been very heuristic, and that added to the limited amount of data makes it almost certain that some websites have been missed. Broadly speaking, there are two types of contributions that would be possible:
The fact that citizenlabs reported exactly 885 websites being found makes it feel like they might have found find a better fingerprint which we have not managed to find yet. We have not yet had to pay for our data.
Disclaimers:
May this article serve as a tribute to those who spent their days making, using, and uncovering these websites under the shadows.
CPU architecture by Ciro Santilli 37 Updated +Created
Leela Chess Zero by Ciro Santilli 37 Updated +Created
Deep learning implementation, a bit analogous to AlphaZero, but for chess only.
GNOME Chess by Ciro Santilli 37 Updated +Created
The user friendly Chess UI! Exactly what you would expect from a GNOME Project package. But also packs some punch via the Universal Chess Interface, e.g. Stockfish just works.
GNU Chess by Ciro Santilli 37 Updated +Created
Both chess engine and a CLI chess UI. As an engine it is likely irrelevant compared to Stockfish as of 2020. TODO: does the UI support Universal Chess Interface?
Cool project history though. Started before the GNU Project itself, and became one of the first packages.
Shane's Chess Information Database by Ciro Santilli 37 Updated +Created
Advanced. Not beginner friendly, very clunky.
Protestantism by Ciro Santilli 37 Updated +Created
Kaggle by Ciro Santilli 37 Updated +Created
To be fair, this is one of the least worse ones.
Forsyth-Edwards Notation by Ciro Santilli 37 Updated +Created
The cool thing about this notation is that is showed to Ciro Santilli that there is more state to a chess game than just the board itself! Notably:
  • whose move it is next
  • castling availability
  • en passant availability
plus some other boring draw rules counters.
Computerphile by Ciro Santilli 37 Updated +Created
MNIST database by Ciro Santilli 37 Updated +Created
70,000 28x28 grayscale (1 byte per pixel) images of hand-written digits 0-9, i.e. 10 categories. 60k are considered training data, 10k are considered for test data.
Playing with it is the de-facto computer vision hello world.
It was on this dataset that Yann LeCun made great progress with the LeNet model. Running LeNet on MNIST has to be the most classic computer vision thing ever. See e.g. activatedgeek/LeNet-5 for a minimal and modern PyTorch educational implementation.
But it is important to note that as of the 2010's, the benchmark had become too easy for many applications. It is perhaps fair to say that the next big dataset revolution of the same importance was with ImageNet.
The dataset could be downloaded from yann.lecun.com/exdb/mnist/ but as of March 2025 it was down and seems to have broken from time to time randomly, so Wayback Machine to the rescue:
wget \
 https://web.archive.org/web/20120828222752/http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz \
 https://web.archive.org/web/20120828182504/http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz \
 https://web.archive.org/web/20240323235739/http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz \
 https://web.archive.org/web/20240328174015/http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
but doing so is kind of pointless as both files use some crazy single-file custom binary format to store all images and labels. OMG!
Figure 1.
MNIST image 1 of a '0'
.
Figure 2.
MNIST image 21 of a '0'
.
Figure 3.
MNIST image 3 of a '1'
.
CIFAR-10 by Ciro Santilli 37 Updated +Created
60,000 tiny 32x32 color images in 10 different classes: airplanes, cars, birds, cats, deer, dogs, frogs, horses, ships, and trucks.
TODO release date.
This dataset can be thought of as an intermediate between the simplicity of MNIST, and a more full blown ImageNet.
https://web.archive.org/web/20250517192041im_/https://www.cs.toronto.edu/~kriz/cifar-10-sample/airplane1.png
https://web.archive.org/web/20250517192041im_/https://www.cs.toronto.edu/~kriz/cifar-10-sample/automobile1.png
https://web.archive.org/web/20250517192041im_/https://www.cs.toronto.edu/~kriz/cifar-10-sample/bird1.png
https://web.archive.org/web/20250517192041im_/https://www.cs.toronto.edu/~kriz/cifar-10-sample/cat1.png
ImageNet by Ciro Santilli 37 Updated +Created
14 million images with more than 20k categories, typically denoting prominent objects in the image, either common daily objects, or a wild range of animals. About 1 million of them also have bounding boxes for the objects. The images have different sizes, they are not all standardized to a single size like MNIST[ref].
Each image appears to have a single label associated to it. Care must have been taken somehow with categories, since some images contain severl possible objects, e.g. a person and some object.
In practice, the ILSVRC subset of ImageNet is the most commonly used dataset.
Official project page: www.image-net.org/
The data license is restrictive and forbids commercial usage: www.image-net.org/download.php. Also as a result you have to login to download the dataset. Super annoying.
The categories are all part of WordNet, which means that there are several parent/child categories such as dog vs type of dog available. ImageNet1k only appears to have leaf nodes however (i.e. no "dog" label, just specific types of dog).
A major model that performed well on ImageNet starting on 2012 and became notable is AlexNet.
COCO dataset by Ciro Santilli 37 Updated +Created
From cocodataset.org/:
  • 330K images (>200K labeled)
  • 1.5 million object instances
  • 80 object categories
  • 91 stuff categories
  • 5 captions per image. A caption is a short textual description of the image.
So they have relatively few object labels, but their focus seems to be putting a bunch of objects on the same image. E.g. they have 13 cat plus pizza photos. Searching for such weird combinations is kind of fun.
Their official dataset explorer is actually good: cocodataset.org/#explore
And the objects don't just have bounding boxes, but detailed polygons.
Also, images have captions describing the relation between objects:
a black and white cat standing on a table next to a pizza.
Epic.
This dataset is kind of cool.
Open Images dataset by Ciro Santilli 37 Updated +Created
As of v7:
The images and annotations are both under CC BY, with Google as the copyright holder.
Allen brain atlas by Ciro Santilli 37 Updated +Created
Connectome scale by Ciro Santilli 37 Updated +Created
A Drosophila melanogaster has about 135k neurons, and we only managed to reconstruct its connectome in 2023.
The human brain has 86 billion neurons, about 1 million times more. Therefore, it is obvious that we are very very far away from a full connectome.
Instead however, we could look at larger scales of connectome, and then try from that to extract modules, and then reverse engineer things module by module.
This is likely how we are going to "understand how the human brain works".

Unlisted articles are being shown, click here to show only listed articles.