- owns the entire stack and creates high-quality, highly optimized systems
- creates closed lock-in systems without interoperability and actively fights users owning their devices
- do they give back enough to open source, or do they leech mostly?
Prelude: initial reports without specific websites (2018-) Updated 2024-12-15 +Created 2024-09-26
This article is about covert agent communication channel websites used by the CIA in many countries from the late 2000s until the early 2010s, when they were uncovered by the counterintelligence services of the targeted countries circa 2011-2013. This discovery led to the imprisonment and execution of several assets in Iran and China, and to the subsequent shutdown of the channel.
The existence of such websites was first reported in November 2018 by Yahoo News: www.yahoo.com/video/cias-communications-suffered-catastrophic-compromise-started-iran-090018710.html.
Previous whispers had been heard in 2017 but without clear mention of websites: www.nytimes.com/2017/05/20/world/asia/china-cia-spies-espionage.html:
Some were convinced that a mole within the C.I.A. had betrayed the United States. Others believed that the Chinese had hacked the covert system the C.I.A. used to communicate with its foreign sources. Years later, that debate remains unresolved.[...]From the final weeks of 2010 through the end of 2012, [...] the Chinese killed at least a dozen of the C.I.A.’s sources. [...] One was shot in front of his colleagues in the courtyard of a government building — a message to others who might have been working for the C.I.A.
Then in September 2022 a few specific websites were finally reported by Reuters: www.reuters.com/investigates/special-report/usa-spies-iran/, henceforth referred to simply as "the Reuters article".
Ciro Santilli heard about the 2018 article around 2020 while studying for his China campaign, because the websites had been used to take down the CIA network in China. He even asked on Quora: www.quora.com/What-were-some-examples-of-the-websites-that-the-CIA-used-around-2010-as-a-communication-mechanism-for-its-spies-in-China-and-Iran-but-were-later-found-and-used-to-take-down-their-spy-networks but there were no publicly known domains at the time to serve as a starting point. Chris, Electrical Engineer and former Avionics Tech in the US Navy, even replied suggesting that obviously the CIA is so competent that it would never ever have its sites leaked like that:
Seriously a dumb question.
So when Ciro Santilli heard about the 2022 article almost a year after publication, and being a half-arsed web developer himself, he knew he had to try and find some of the domains himself using the newly available information! It was an irresistible real-life capture the flag. The thing is, everyone who has ever developed a website knows that its attack surface is about the size of Texas, and the potential for fingerprinting is off the charts with so many bits and pieces sticking out. Chris, get fucked.
In particular, it is fun to have such clear examples, visible to anyone in the form of Wayback Machine archives, of the USA spying on its own allies.
Given that it was reported that there were "more than 350" such websites, it would be really cool if we could uncover more of those websites ourselves beyond the 9 domains reported by Reuters!
This article documents the list of extremely likely candidates Ciro has found so far, mostly using:
- rudimentary IP range search on viewdns.info starting from the websites reported by Reuters
- heuristic search for keywords in domains of the 2013 DNS Census plus Wayback Machine CDX scanning
More details on methods also follow. It is still far from the 885 websites reported by Citizen Lab, so there must be key techniques missing. But the fact that there are no Google Search hits for the domains or IPs (except in bulk e.g. in expired domain trackers) indicates that these might not have been previously clearly publicly disclosed.
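The Wayback Machine CDX scanning mentioned above can be sketched roughly as follows. This is only an illustration, not the exact tooling used for this article: the helper names `cdx_query_url` and `parse_cdx_json` are hypothetical, and the date window and status filter are assumptions chosen to match the period when the sites were live. The code only builds the query URL and parses a JSON response, so it can be tried without any network access.

```python
import json
from urllib.parse import urlencode

# Public CDX endpoint of the Wayback Machine.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(domain, year_from="2008", year_to="2013", limit=50):
    """Build a CDX query URL listing captures of `domain` in a date window.

    The default window is an assumption for illustration only.
    """
    params = {
        "url": domain,
        "output": "json",
        "from": year_from,
        "to": year_to,
        "limit": limit,
        "filter": "statuscode:200",  # keep only successful captures
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def parse_cdx_json(text):
    """Turn a CDX JSON response (first row is the header) into a list of dicts."""
    rows = json.loads(text)
    if not rows:
        return []
    header, *records = rows
    return [dict(zip(header, record)) for record in records]
```

Fetching `cdx_query_url(candidate)` for each candidate domain from a keyword search and keeping those with captures in the window is one way to narrow a long list before looking at the archived pages by hand.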
If anyone can find others, or has better techniques: Section "How to contact Ciro Santilli". The techniques used so far have been very heuristic, and that, added to the limited amount of data, makes it almost certain that several IP ranges have been missed. There are two types of contributions that would be possible:
- finding new IP ranges: harder, more exciting, and potentially requires more intelligence
- better IP to domain name databases to fill in known gaps in existing IP ranges
Perhaps the current heuristically obtained data can serve as a good starting point for a more data-oriented search that will eventually find a valuable fingerprint which brings the entire network out.
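To make "known gaps in existing IP ranges" concrete, here is a minimal sketch that lists the addresses lying between the lowest and highest known hit of a range but with no known domain yet. The function name `gaps_in_range` is hypothetical, and the addresses in the usage note come from the 192.0.2.0/24 documentation range, not from the actual data.

```python
import ipaddress


def gaps_in_range(known_ips):
    """Return the IPs between the lowest and highest known hit that are not yet known.

    `known_ips` is an iterable of dotted-quad strings believed to belong
    to the same contiguous range.
    """
    addrs = sorted(ipaddress.ip_address(ip) for ip in known_ips)
    if not addrs:
        return []
    known = set(addrs)
    lowest, highest = int(addrs[0]), int(addrs[-1])
    return [
        str(ipaddress.ip_address(n))
        for n in range(lowest, highest + 1)
        if ipaddress.ip_address(n) not in known
    ]
```

For example, `gaps_in_range(["192.0.2.10", "192.0.2.11", "192.0.2.13"])` returns `["192.0.2.12"]`, which would then be the address to look up in a better IP to domain name database.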
Disclaimer: the network fell in 2013, followed by fully public disclosures in 2018 and 2022, so we believe it is now more than safe for the public to know what can still be uncovered about the events that took place. The main author's political bias is strongly pro-democracy and anti-dictatorship.
May this list serve as a tribute to those who spent their days making, using, and uncovering these websites under the shadows.
If you want to go into one of the best OSINT CTFs of your life, stop reading now and see how many Web Archives you can find starting only from the Reuters article as Ciro did. Some guidelines:
- no ultra-clean fingerprint has been found yet. Some intuitive and somewhat guessy data analysis was needed. But when you clean the data correctly and make good guesses, many hits follow, it feels so good
- no data was paid for. But using cybercafe Wi-Fi for a few extra IPs may help.
Ciro Santilli has a bad memory for events that happened a medium time ago, for example on the order of months or years. Especially if they are one-off things that have no relation to anything else.
For example, Ciro never remembers which places he travelled to just once, and who was on each trip! He has images of several places he travelled to in his head, and would recognize them, but he just doesn't know where they were!
Another example: Ciro was looking at the carpet at their house, and asked where it came from. His wife replied immediately: from Bercy shopping quarter in Paris about 10 years ago, and you took it on your back for a long walk until we could find the bus back home because we were concerned it wouldn't fit in the train!
The same goes for scenes from movies and passages from music, which explains why Ciro's art consumption focuses on innovative discrete "what happened" and "general gist" ideas, rather than analog details such as colors and shapes.
Going back even further in time, Ciro starts to forget the less close friends he had, because the events start to fade away.
Paradoxically however, Ciro believes that this bad memory is one of his greatest strengths and key defining characteristics, because it leads Ciro to want to write down every interesting thing he learns, which motivated OurBigBook.com, his Stack Overflow contributions and the related Ciro Santilli's documentation superpowers.
It also somewhat leads Ciro to like physics and mathematics, because in these fields you "can deduce everything" from very few base principles, so if you forget them, it does not matter that much as you can re-deduce stuff over and over. Which is somewhat where the high flying bird attitude comes from. It is hard to go deep when you have to re-prove everything every time. But the upside is that anything that sticks, does so because it has a broad net to stick to, and therefore allows Ciro to make unusual and unexpected connections that others might not.
Ciro believes that there are two types of people, and most notably software engineers, which are basically data wranglers: those with bad memory and those with good memory.
Those with bad memory tend to focus on automating and improving their processes a lot. They take much longer to do one-off specific deep knowledge tasks however.
The downside of the good memory ones is that sooner or later they will find tasks that no matter how much memory they have, they cannot solve without automation, and they will fail at those.
Also, good memory people don't enable others to join the project efficiently as much.
This dichotomy also explains why Ciro sucks at code reviews, but is rather the person who runs the interesting patches by himself and finds some critical problems that the more theoretical code reviewers missed.
If Ciro had become a scientist, he would without doubt be an experimentalist, just like in this reality he is a GDB/runtime person rather than a "static source analysis" person. Those who have bad memory prefer to just run experiments over and over and observe system state at runtime.
Other effects of having a bad memory include:
- code duplication, or a constant fear of it at least, because Ciro forgets that some functionality exists already
- meeting aversion, because everything that is not recorded will fade away
- passion for backward design, because by the time a piece of knowledge learnt in school might be useful (and 99.99% won't), it will have been long forgotten
Related: jakobschwichtenberg.com/about/ from Jakob Schwichtenberg:
I'm a physicist and I try to write down things during my own learning process. In some sense, one of the biggest benefits I have over other people in physics is that I'm certainly not the smartest guy! I usually can't grasp complex issues very easily. So I have to break down complex ideas into smaller chunks to understand it myself. This means, whenever I describe something to others, everyone understands, because it's broken down into such simple terms.
On C2 wiki, therefore it cannot be wrong: wiki.c2.com/?QuasiGreatTeacher:
Some people have learning disabilities, [... bullshit ...]. A lot of classic spiritual texts have been produced this way. Basically, the stupidest but most dogged disciple, if he has a neurotic habit of writing things down, will make the best teacher for the third and subsequent generations.
This is a good thing. It basically contains an entire website, with HTML and assets inside a single ZIP, and a little bit of metadata.
It is incomprehensible why browsers don't just implement it as they already have all the web part, and also ZIP stuff:
The situation is so sad. Ubuntu 21.04 doesn't come with a reader installed by default:
Reverse engineering suggests that FFmpeg is likely the backend of YouTube: streaminglearningcenter.com/blogs/youtube-uses-ffmpeg-for-encoding.html (archive)
The Quora question: www.quora.com/Are-there-any-PhD-programs-in-training-an-AI-system-to-play-computer-games-Like-the-work-DeepMind-do-combining-Reinforcement-Learning-with-Deep-Learning-so-the-AI-can-play-Atari-games
A good way to find labs is to go down the issues section of related projects, and then stalk the people posting there to see where they are doing their PhDs.
A computer is a highly layered system, and so you have to decide which layers you are the most interested in studying.
Although the layers are somewhat independent, they also sometimes interact, and when that happens it usually hurts your brain. E.g., if compilers were perfect, no one optimizing software would have to know anything about microarchitecture. But if you want to go hardcore enough, you might have to learn some lower layer.
It must also be said that, like in any industry, certain layers are hidden behind commercial secrecy, making it harder to actually learn them. In computing, the lower level you go, the more closed source things tend to become.
But as you climb down into the abyss of low level hardcoreness, don't forget that being useful is more important than being hardcore: Figure 1. "xkcd 378: Real Programmers".
First, the most important thing you should know about this subject: cirosantilli.com/linux-kernel-module-cheat/should-you-waste-your-life-with-systems-programming
Here's a summary from low-level to high-level:
- semiconductor physical implementation: this level is of course the most closed, but it is fun to try and peek into it from any openings given by industry and academia:
- photolithography, and notably photomask design
- register transfer level
- interactive Verilator fun: Is it possible to do interactive user input and output simulation in VHDL or Verilog?
- more importantly, and much harder/maybe impossible with open source, would be to try and set up an open source standard cell library and supporting software to obtain power, performance and area estimates
- Are there good open source standard cell libraries to learn IC synthesis with EDA tools? on Quora
- the most open source ones are some initiatives targeting FPGAs, e.g. symbiflow.github.io/, www.clifford.at/icestorm/
- qflow is an initiative targeting actual integrated circuits
- microarchitecture: a good way to play with this is to try and run some minimal userland examples on gem5 userland simulation with logging, e.g. as shown in the Linux Kernel Module Cheat. This should be done at the same time as reading books/websites/courses that explain the microarchitecture basics. This is the level of abstraction that Ciro Santilli finds the most interesting of the hardware stack. Learning it for actual CPUs (which as of 2020 is only partially documented by vendors) could actually be useful in hardcore software optimization use cases.
- instruction set architecture: a good approach to learn this is to manually write some userland assembly with assertions as done in the Linux Kernel Module Cheat e.g. at:
- github.com/cirosantilli/linux-kernel-module-cheat/blob/9b6552ab6c66cb14d531eff903c4e78f3561e9ca/userland/arch/x86_64/add.S
- cirosantilli.com/linux-kernel-module-cheat/x86-userland-assembly
- learn a bit about calling conventions, e.g. by calling C standard library functions from assembly:
- you can also try and understand what some simple C programs compile to. Things can get a bit hard though when -O3 is used. Some cute examples:
- executable file format, notably Executable and Linkable Format. Particularly important is to understand the basics of:
- address relocation: How do linkers and address relocation work?
- position independent code: What is the -fPIE option for position-independent executables in GCC and ld?
- how to observe which symbols are present in object files, e.g.:
- how C++ uses name mangling: What is the effect of extern "C" in C++?
- how C++ template instantiation can help reduce link time and size: Explicit template instantiation - when is it used?
- operating system. There are two ways to approach this:
- learn about the Linux kernel. A good starting point is to learn about its main interfaces. This is well shown at Linux Kernel Module Cheat:
- system calls
- write some system calls in
- pure assembly:
- C GCC inline assembly:
- learn about kernel modules and their interfaces. Notably, learn to demystify special files such as /dev/random and so on:
- learn how to do a minimal Linux kernel disk image/boot to userland hello world: What is the smallest possible Linux implementation?
- learn how to GDB step debug the Linux kernel itself. Once you know this, you will feel that "given enough patience, I could understand anything that I wanted about the kernel", and you can then proceed to not learn almost anything about it and carry on with your life
- write your own (mini-) OS, or study a minimal educational OS, e.g. as in:
- programming language
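For the executable file format layer in the summary above, a quick way to get a feel for ELF is to decode the first few header fields by hand. The following is a minimal sketch, not a full parser: the function name `parse_elf_ident` is hypothetical, and only the identification bytes plus the `e_type` and `e_machine` fields (at byte offsets 16 and 18 in both ELF32 and ELF64) are read.

```python
import struct

ELF_MAGIC = b"\x7fELF"
ELF_CLASS = {1: "ELF32", 2: "ELF64"}
ELF_DATA = {1: "little-endian", 2: "big-endian"}
ELF_TYPE = {
    1: "ET_REL (relocatable object)",
    2: "ET_EXEC (fixed-address executable)",
    3: "ET_DYN (shared object or position independent executable)",
    4: "ET_CORE (core dump)",
}


def parse_elf_ident(header):
    """Decode class, endianness, type and machine from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    elf_class = header[4]
    elf_data = header[5]
    # Multi-byte fields use the endianness declared in the identification bytes.
    endian = "<" if elf_data == 1 else ">"
    e_type, e_machine = struct.unpack_from(endian + "HH", header, 16)
    return {
        "class": ELF_CLASS.get(elf_class, "unknown"),
        "data": ELF_DATA.get(elf_data, "unknown"),
        "type": ELF_TYPE.get(e_type, "other"),
        "machine": e_machine,  # raw number, e.g. 62 is x86-64
    }
```

On a Linux system, calling it on `open("/bin/ls", "rb").read(20)` and comparing the result with the output of `readelf -h /bin/ls` is a reasonable sanity check. An `ET_DYN` type on a regular program is what a position independent executable looks like at this level.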
Local symmetries appear to be a synonym of internal symmetry, see description at: Section "Internal and spacetime symmetries".
As mentioned at Quote , local symmetries map to forces in the Standard Model.
Appears to be a synonym for: gauge symmetry.
A local symmetry is one in which a different transformation can be applied at each point, instead of a single transformation for every point.
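As a standard textbook illustration (not taken from the sources listed here), consider a complex field $\psi(x)$ under phase rotations. The symbols $\psi$ and $\alpha$ are just the usual conventions, chosen here for the example:

```latex
% Global symmetry: one constant phase \alpha for every spacetime point x
\psi(x) \to e^{i\alpha}\,\psi(x)

% Local symmetry: the phase \alpha(x) may be different at each point x
\psi(x) \to e^{i\alpha(x)}\,\psi(x)

% Under the local version, derivatives pick up an extra term:
\partial_\mu \psi(x) \to e^{i\alpha(x)}\left(\partial_\mu \psi(x) + i\,(\partial_\mu \alpha(x))\,\psi(x)\right)
```

In the global case the extra term vanishes because $\partial_\mu \alpha = 0$. In the local case, something must be added to the theory to cancel it, which is the usual way a gauge field enters.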
TODO what's the point of a local symmetry?
Bibliography:
- lecture 3
- physics.stackexchange.com/questions/48188/local-and-global-symmetries
- www.physics.rutgers.edu/grad/618/lects/localsym.pdf by Joel Shapiro gives one nice high level intuitive idea:
In relativistic physics, global objects are awkward because the finite velocity with which effects can propagate is expressed naturally in terms of local objects. For this reason high energy physics is expressed in terms of a field theory.
- Quora:
Nice looking and expensive operating system by Apple. Ciro Santilli believes that:
- if you want to be ripped off, just use Microsoft Windows which has more software available
- or if you want to attain Enlightenment, just use Linux, which is free and open source
The story of how OS X was ported to x86 from PowerPC, with most of the initial work up to boot done by a single man, John Kullmann, in the year 2000, is really worth reading: www.quora.com/Apple-company/How-does-Apple-keep-secrets-so-well/answer/Kim-Scheinberg on Quora, see also:
Maxwell's equations imply that the speed of light is the same for all inertial reference frames
Molecular Sciences Course of the University of São Paulo
Good Portuguese overview: www.scielo.br/scielo.php?script=sci_arttext&pid=S1806-11172017000300301&lng=pt&tlng=pt
A fantastic-sounding full-time 4-year course that any student can transfer to, which teaches various natural science topics, notably mathematics, physics, chemistry and molecular biology.
Many past students Ciro talked to however share a common frustration with the course: in the first 2 years at least, the "basic cycle", you have infinitely many courses, no time to study, and no choice of what to study. It is only in the latter 2 years (the advanced cycle) that you get the choices.
Also, if you get low grades in a single subject, you're out. And exams are useless of course.
Here's a Quora question in Portuguese about the course: pt.quora.com/Como-funciona-o-tal-do-curso-secreto-da-USP, the only decent answer so far being: pt.quora.com/Como-funciona-o-tal-do-curso-secreto-da-USP/answer/Victor-Soares-31. Very disappointing to hear.
On the advanced cycle, you have a lot of academic freedom. You are basically supposed to pick a research project with an advisor and go for it, with a small amount of mandatory course hours. Ciro was told in 2022 that you can even have advisors from other universities or industry, and that it is perfectly feasible to take courses in another university and validate the course hours later on. Fantastic!!!
Students from the entire University of São Paulo can apply to transfer to it only after joining the university, with the guarantee that they can go back to their original courses if they don't adapt to the new course, which is great!
Not doing it is one of Ciro Santilli's regrets in life, see also: don't be a pussy.
Around 2007, they were in a really shady building of the University, but when Ciro checked in 2021, they had apparently moved to a shiny new entrepreneurship-focused building. Fantastic news!!!
One of the Brazilians who came to École Polytechnique together with Ciro was from this course. The fact that he is one of the most intelligent people Ciro knows gave further credit to that course in his eyes.
And the articles that really matter:
Web of Stories contains amazing interviews with many (mostly American) winners.
See the amazing comments about the Nobel Prize in the chapter "Alfred Nobel's Other Mistake" of Surely You're Joking, Mr. Feynman.
TODO who is the digital switch person he mentions?
- www.quora.com/unanswered/Who-was-Richard-Feynman-referring-to-in-the-book-Surely-Youre-Joking-Mr-Feynman-chapter-Alfred-Nobels-Other-Mistake-when-he-talks-about-A-friend-of-mine-whos-a-rich-man-he-invented-some-kind-of-simple-digital-switch on Quora
- github.com/cirosantilli/cirosantilli.github.io/issues/72
These are websites that offer somewhat overlapping services, many of which served as inspiration, along with why we think something different is needed to achieve our goals.
Notably, OurBigBook is the result of Ciro Santilli's experiences with:
- Wikipedia
- GitHub
- Stack Exchange (or as non techies might point out, Urban Dictionary, or Quora before it was such an incomprehensible shitshow)
OurBigBook could be seen as a cross between those three websites.
Quick mentions:
- handwiki.org/wiki/HandWiki:About: technically the same as Wikipedia, but with more aligned moderation policies
- ecotext.co/ similar goals. Their website seems quite broken now though as of 2021, can't see text properly. Crunchbase entry: www.crunchbase.com/organization/ecotext says they are from Durham, New Hampshire, United States. Cannot see how to publish, curated material only? Twitter: twitter.com/ecotextinc?lang=en One of the founders: twitter.com/BigNel_21 | www.linkedin.com/in/ecotextnelsonthomas/. Their LinkedIn: www.linkedin.com/company/ecotext/people/
- fiveable.me/ bad: separates students and teachers, as a student I don't see where to create my content. Good: focus on teaching university level stuff to people outside of university via Advanced Placement. Bad: Lots of video content. Bad: Can't see the issue tracker attached to each page.
- LessWrong: their website system does have some similar feature sets to what we want. Reputation, Q&A sections, links between articles most likely, sort by upvote everywhere.
- crowdpub.org collaborative writing website, somehow goes to paragraph level, TODO how they reconcile different authors? Closed beta as of writing, so hard to be sure. From quick presentation on beta website, appears to attempt to share revenue to authors proportionally to the size of their contribution. Some blockchain-based reputation. Meh.
- TODO migrate all from: github.com/booktree/booktree/blob/master/alternatives.md
- studynotes.ie/. Admin approval on everything. No ToC. Fixed tag list for university entry exams topics.
- mindstone.com: there appears to be no sharing focus? File upload based? Not sure.
- EverybodyWiki
- looking for open source Confluence-alternatives is an interesting way to go:
- lists:
- BookStack:
- fixed 3-level page hierarchy
- written in PHP
- Markdown support: www.bookstackapp.com/docs/user/markdown-editor/
- no source-level import-export apparently: www.bookstackapp.com/docs/admin/backup-restore/, youtu.be/WUvtzJfCAKE?t=904
- WYSIWYG: www.bookstackapp.com/docs/user/wysiwyg-editor/ via TinyMCE
- page content repeating: www.bookstackapp.com/docs/user/reusing-page-content/ (will be useful for course modelling)
- github.com/shuding/nextra converts Markdown links to Next.js links. We should look into how it works.
- zettelkasten.de/the-archive/ "The Archive" from zettelkasten.de/. Closed source. By German software engineer Christian Tietze twitter.com/ctietze?lang=en
- LLM generated wiki e.g.:
- docs.tigyog.app/cli beautiful website, but doesn't achieve much. Has a Markdown upload mechanism. Ah, those newbs who think the average user will care about markup upload to DB... Oh, wait...
- www.stuvia.com/en-gb/school/uk/oxford-university/physics. PDF uploads. In theory you have to own copyright: www.stuvia.com/en-gb/copyright/guidelines but it feels unlikely that most material was uploaded by the copyright owners. If those people are up, then why can't we? Maybe... Registered in the UK. People: some Dutch dudes:
- Project Xanadu: crazy overlaps, though that project is vaporware apparently?
Administrators of Project Xanadu have declared it superior to the World Wide Web, with the mission statement: "Today's popular software simulates paper. The World Wide Web (another imitation of paper) trivialises our original hypertext model with one-way ever-breaking links and no management of version or contents."
Static website-only alternatives:
- quarto.org/
- vitepress.dev. vitepress.dev/guide/markdown unmanaged internal links. Sample website: wiki.nikiv.dev/.
Conceptual:
- The Final Encyclopedia: science fiction concept, but the name was reused by Paul Allen in a research project
- second brain
- collective intelligence
Some possible/not possible sources that could be used to manually bootstrap content:
- LibreTexts. Good project. "Teacher-only-content" unfortunately as usual. But besides that fundamental flaw, they do exactly what we want to do in a sense.
- OpenStax: CC BY. This could be a great entry point, as they already have some university integration going on, and might be interested in this project.
- physics.stackexchange.com/questions/6157/list-of-freely-available-physics-books "List of freely available physics books" explicitly asks for "a list of physics books with open-source licenses, like Creative Commons, GPL", but the thread was locked, and basically none of the sources in the answers have free licenses, nor do they note it. It just seems that the physicists don't know what a free license is.
- MIT OpenCourseWare: CC BY-NC-SA, so not really usable
- github.com/certik/theoretical-physics: MIT License. Workable but wonky.
- subwiki.org/: wiki with some upper graduate math subjects presumably by this Indian dude: www.linkedin.com/in/vipul-naik-0ab1898/. Description on his homepage: vipulnaik.com/subwiki/. He's also got other interesting but not so relevant projects:
- pro freer immigration laws: vipulnaik.com/openborders/
- vipulnaik.com/cognito-mentoring/ free mentoring project for interested students
He's also into Stack Overflow, Quora and Wikipedia editing. That's a cool dude. He's into LessWrong it seems.
- massive mathematics books
- Infinite Napkin. CC BY-SA mathematics infinite book: github.com/vEnhance/napkin/issues/77. Very similar type of content to what we want in this project!
- Stacks Project
Existing lecture notes by students:
- github.com/mb2g17/NotesNetworkArchive Google Docs-based: docs.google.com/document/d/1OIcQ8dJ_FAhdkirU94M29-ZbNZ4oQs1LbWF3Nz-mq_U/edit#heading=h.vehxib58w1iw. An actual student uploading tons of lecture notes in one coherent system. CC BY-NC-SA unfortunately.
- academia.stackexchange.com/questions/148261/do-you-keep-your-study-notes-publicly-available mentions:
- Cambridge Mathematics Lecture Notes by Dexter Chua (2014-2018). Comments:
- Related: academia.stackexchange.com/questions/40381/how-common-is-it-that-professors-have-their-students-write-textbooks
Lecture note upload website:
- nexusnotes.com likely illegal reuploads of PDFs from teachers
- www.studocu.com/en-gb Paywall. PDF uploads. Unclear if simple teacher reuploads or actual novel notes.
- www.studydrive.net/
- Chinese GitHub repos. Some of these are very advanced in terms of content quantity and organizational quality! The Chinese are miles ahead in this area:
- github.com/PKUanonym/REKCARC-TSC-UHT Guidance for courses in Department of Computer Science and Technology, Tsinghua University. Chinese. Appears to try and store all past exams.
- github.com/lib-pku/libpku
- github.com/openwhu/OpenWHU: Wuhan University
- github.com/USTC-Resource/USTC-Course: USTC
- github.com/Zeal-L/UNSW: UNSW from Australia, but by a Chinese dude
- github.com/apachecn/mit-18.06-linalg-notes: translation of MIT course to Chinese
- github.com/chenyang1999/MyComputerCollegeCourses: TODO which university
- github.com/elder-frog/OpenCourseCatalog: nothing to do with this project, but since I'm making a list, this dude is copying YouTube videos to Bilibili. And he's edgy anti-CCP on Twitter, what a legend.
- github.com/TheBloodthirster/BUAA_Course_Sharing: en.wikipedia.org/wiki/Beihang_University
- github.com/1051727403/SHU-CS-Source-Share: ShangHai University CS course source code
- github.com/Willie169/tw-gifted-k12-notes: Taiwanese high school notes
Exams uploads:
- questions.tripos.org/part-ib/all/ University of Cambridge Mathematics past examinations
Ciro Santilli's favorites so far:
Bibliography of the bibliography:
- physics.stackexchange.com/questions/8441/what-is-a-complete-book-for-introductory-quantum-field-theory "What is a complete book for introductory quantum field theory?"
- www.quora.com/What-is-the-best-book-to-learn-quantum-field-theory-on-your-own on Quora
- www.amazon.co.uk/Lectures-Quantum-Field-Theory-Ashok-ebook/dp/B07CL8Y3KY
Recommendations by friend P. C.:
- The Global Approach to Quantum Field Theory
- Lecture Notes | Geometry and Quantum Field Theory | Mathematics ocw.mit.edu/courses/mathematics/18-238-geometry-and-quantum-field-theory-fall-2002/lecture-notes/
- Towards the mathematics of quantum field theory (Frederic Paugam)
- Path Integrals in Quantum Mechanics (J. Zinn–Justin)
- Quantum Theory for Mathematicians (B. Hall)
- Quantum Field Theory and the Standard Model (Schwartz)
- The Algebra of Grand Unified Theories (John C. Baez)
- Quantum Field Theory for the Gifted Amateur by Tom Lancaster (2015)