The term loosely refers to certain layers of the computer abstraction layer hierarchy, usually higher-level hardware internals such as the CPU pipeline, caching and the memory system. Basically exactly what gem5 models.
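For a concrete feel of what that means in practice, here is a hedged sketch of how one picks a microarchitecture in gem5's classic syscall-emulation example config; paths and flag names follow older gem5 releases and may differ in current ones:
# Hedged sketch: select an out-of-order CPU model and add a cache hierarchy on the
# command line. Assumes an X86 gem5 build and the pre-built "hello" test binary.
build/X86/gem5.opt configs/example/se.py \
  --cmd=tests/test-progs/hello/bin/x86/linux/hello \
  --cpu-type=DerivO3CPU --caches --l2cache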
Sometimes Ciro Santilli regrets not having done a PhD. But this section makes him feel better about himself. To be fair, part of the merit is his own: part of the reason he didn't move on was the strong odour of bullshit oozing down to Masters level. A good PhD might have opened interesting job opportunities however, given that you don't really learn anything useful before that point in your education.
Rise to the Top: The Habits and Mindset of Top 0.1% PhD Students by Andy Stapleton (2023)
Source.
Profzi scheme by PhD Comics.
A Ponzi scheme that trains people in new skills is not necessarily a terrible thing. It is a somewhat more useful version than standard exam-based education.
Perhaps the problem is "forcing" 35-year-olds to go down that path when they might also want to have boring stuff like families and security.
If people could get to the PhD level much, much sooner, it wouldn't be as obscene: Section "Students must be allowed to progress as fast as they want".
He quit Mozilla in 2016 to try and commercialize an rr extension called Pernosco. But as of 2022, he advertised himself as part of "Google Research", so maybe that went under, sample source: archive.ph/o9622. TODO when did he start? There's apparently an unrelated homonym: www.linkedin.com/in/rob-ocallahan/
He's apparently very religious, and very New Zealandish; twitter.com/rocallahan auto-describes:
Christian. Repatriate Kiwi.
Terry A. Davis and D. Richard Hipp come to mind. One is tempted to speculate a correlation even: the proportion amongst systems programmers feels so much higher than in other areas of programming! Maybe it is because you have to be a God to do it in the first place.
Robert O'Callahan interview by Toby Ho (2022)
Source.
Gospel of Matthew 7:13:
Enter through the narrow gate. For wide is the gate and broad is the road that leads to destruction, and many enter through it.
Ciro Santilli really likes this one, because he feels that where many people go, it is useless to go again. It is much better to explore that which few can do, but which many will need.
- James Legge (1891):
- ctext.org/zhuangzi side by side with Chinese, one chapter per page. Divides it into three parts:
- Inner Chapters
- Outer Chapters
- Miscellaneous Chapters
PageRank was apparently originally inspired by it.
They appear to piece together data from various sources. This is the most complete historical domain -> IP database we have so far. They don't have hugely more data than viewdns.info, but they do often offer something new. It feels like the key difference is that their data goes a bit further back into the critical time period.
TODO do they have historical reverse IP? The fact that they don't seem to have it suggests that they are just making historical reverse IP requests to a third party via some API?
E.g. searching thefilmcentre.com under historical data at securitytrails.com/domain/thefilmcentre.com/history/al gives the correct IP 62.22.60.55.
Account creation blacklists common email providers such as Gmail to force users to use a "corporate" email address. But using random domains like ciro@cirosantilli.com works fine.
Their data seems to date back to 2008 for our searches.
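The same historical data can presumably also be queried programmatically. A minimal sketch, assuming the SecurityTrails v1 DNS history endpoint and an API key exported as SECURITYTRAILS_API_KEY (not verified against their current docs):
# Hedged sketch: fetch historical A records for a domain from the SecurityTrails API.
curl -s -H "APIKEY: $SECURITYTRAILS_API_KEY" \
  "https://api.securitytrails.com/v1/history/thefilmcentre.com/dns/a" | jq .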
CIA 2010 covert communication websites: Expired domain trackers
When you Google most of the hit domains, many of them show up on "expired domain trackers", and above all Chinese expired domain trackers for some reason. This suggests that scraping these lists might be a good starting point for obtaining "all expired domains ever". Notably e.g.:
- hupo.com: e.g. static.hupo.com/expdomain_myadmin/2012-03-06(国际域名).txt. Heavily IP throttled; Tor hindered more than helped. Scraping script: ../cia-2010-covert-communication-websites/hupo.sh (a minimal sketch of such a per-day download loop is given after this list). Scraping does about 1 day every 5 minutes relatively reliably, so about 36 hours per year. Not bad. Results are stored under tmp/hupo/<day>.
Check for hit overlap:
grep -Fx -f <( jq -r '.[].host' ../media/cia-2010-covert-communication-websites/hits.json ) cia-2010-covert-communication-websites/tmp/hupo/*
The hits are very well distributed amongst days and months; at least they did a good job of hiding these potential timing fingerprints. This feels very deliberately designed.
There are lots of hits. The data set is very inclusive. Also, we understand that it must have been obtained through means other than Web crawling, since it contains so many of the hits.
Some of their files are simply missing however, unfortunately, e.g. neither of the following exist:
webmasterhome.cn did contain that one however: domain.webmasterhome.cn/com/2012-07-01.asp. Hmm, we might have better luck over there then?
2018-11-19 is corrupt in a new and wonderful way, with a bunch of trailing zeros:
wget -O hupo-2018-11-19 'http://static.hupo.com/expdomain_myadmin/2018-11-19%EF%BC%88%E5%9B%BD%E9%99%85%E5%9F%9F%E5%90%8D%EF%BC%89.txt'
hd hupo-2018-11-19
It ends in:
000ffff0  74 75 64 69 65 73 2e 63  6f 6d 0d 0a 70 31 63 6f  |tudies.com..p1co|
00100000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
0018a5e0  00 00 00 00 00 00 00 00  00                       |.........|
More generally, several files contain invalid domain names with non-ASCII characters, e.g. 2013-01-02 contains 365<D3>л<FA><C2><CC>.com. Domain names can only contain ASCII characters: stackoverflow.com/questions/1133424/what-are-the-valid-characters-that-can-show-up-in-a-url-host. Maybe we should get rid of any such lines as noise.
Some files around 2011-09-06 start with an empty line. 2014-01-15 starts with about twenty empty lines. Oh, and that last one also has some trash bytes at the end: <B7><B5><BB><D8>. Beauty.
- webmasterhome.cn: e.g. domain.webmasterhome.cn/com/2012-03-06.asp. Appears to contain the exact same data as static.hupo.com. Also has some randomly missing dates like hupo.com, though different missing ones from hupo, so they complement each other nicely. Some of the URLs are broken and don't indicate that with an HTTP status code; they just replace the results with the Chinese text 无法找到该页 (The requested page could not be found). Several URLs just return length-0 content. It is not fully clear if this is a throttling mechanism, or if the data is just missing entirely, e.g.:
curl -vvv http://domain.webmasterhome.cn/com/2015-10-31.asp
*   Trying 125.90.93.11:80...
* Connected to domain.webmasterhome.cn (125.90.93.11) port 80 (#0)
> GET /com/2015-10-31.asp HTTP/1.1
> Host: domain.webmasterhome.cn
> User-Agent: curl/7.88.1
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Sat, 21 Oct 2023 15:12:23 GMT
< Server: Microsoft-IIS/6.0
< X-Powered-By: ASP.NET
< Content-Length: 0
< Content-Type: text/html
< Set-Cookie: ASPSESSIONIDCSTTTBAD=BGGPAONBOFKMMFIPMOGGHLMJ; path=/
< Cache-control: private
<
* Connection #0 to host domain.webmasterhome.cn left intact
Starting around 2018, the IP limiting became very intense, 30 minutes to 1 hour per URL, so we just gave up. Therefore, data from 2018 onwards does not contain webmasterhome.cn data.
Starting from 2013-05-10 the format changes randomly. This also shows us that they just have all the HTML pages as static files on their server. E.g. with:
grep -a '<pre' *
we see:
2013-05-09:<pre style='font-family:Verdana, Arial, Helvetica, sans-serif; '><strong>2013<C4><EA>05<D4><C2>09<C8>յ<BD><C6>ڹ<FA><BC><CA><D3><F2><C3><FB></strong><br>0-3y.com
2013-05-10:<pre><strong>2013<C4><EA>05<D4><C2>10<C8>յ<BD><C6>ڹ<FA><BC><CA><D3><F2><C3><FB></strong>
- justdropped.com: e.g. www.justdropped.com/drops/010112com.html. First known working day: 2006-01-01. Unthrottled.
- yoid.com: e.g. yoid.com/bydate.php?d=2016-06-03&a=a. First known working day: 2016-06-01.
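As referenced in the hupo.com item above, here is a minimal sketch of the kind of per-day download loop that hupo.sh implements; the real script lives at ../cia-2010-covert-communication-websites/hupo.sh, and the date range and sleep value here are illustrative only:
# Hedged sketch of a per-day scraper for static.hupo.com. The URL suffix is the
# URL-encoded （国际域名） seen in the example links above.
mkdir -p tmp/hupo
d=2012-01-01
while [ "$d" != "2013-01-01" ]; do
  out="tmp/hupo/$d"
  if [ ! -s "$out" ]; then
    wget -q -O "$out" \
      "http://static.hupo.com/expdomain_myadmin/${d}%EF%BC%88%E5%9B%BD%E9%99%85%E5%9F%9F%E5%90%8D%EF%BC%89.txt" \
      || rm -f "$out"   # some days are simply missing, drop failed downloads
    sleep 300           # crude rate limiting: roughly one day of data every 5 minutes
  fi
  d=$(date -I -d "$d + 1 day")
done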
Data comparison for 2012-01-01, looking only at the .com domains. The lists are quite similar, however:
- webmastercn has just about ten more entries than justdropped; the rest is exactly the same
- justdropped has some extra and some missing compared to hupo
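A minimal sketch of how such a per-day comparison can be done, assuming the per-source tmp/<source>/<day> layout used for hupo above also holds for the other scrapers (those exact paths are an assumption):
# Hedged sketch: comm -3 shows lines unique to either file (common lines suppressed).
day=2012-01-01
comm -3 <(sort -u "tmp/hupo/$day") <(sort -u "tmp/webmastercn/$day") | head
# Domains present in hupo but missing from justdropped:
comm -23 <(sort -u "tmp/hupo/$day") <(sort -u "tmp/justdropped/$day") | head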
We've made the following pipeline for merging the hupo.com, webmasterhome.cn and justdropped.com data:
./hupo.sh &
./webmastercn.sh &
./justdropped.sh &
wait
./justdropped-post.sh
./hupo-merge.sh
# Export as small Google indexable files in a Git repository.
./hupo-repo.sh
# Export as per year zips for Internet Archive.
./hupo-zip.sh
# Obtain count statistics:
./hupo-wc.sh
Count unique domains in the repos:
( echo */*/*/* | xargs cat ) | sort -u | wc
The extracted data is present at:
- archive.org/details/expired-domain-names-by-day
- github.com/cirosantilli/expired-domain-names-by-day-* repos:
- github.com/cirosantilli/expired-domain-names-by-day-2006
- github.com/cirosantilli/expired-domain-names-by-day-2007
- github.com/cirosantilli/expired-domain-names-by-day-2008
- github.com/cirosantilli/expired-domain-names-by-day-2009
- github.com/cirosantilli/expired-domain-names-by-day-2010
- github.com/cirosantilli/expired-domain-names-by-day-2011 (~11M)
- github.com/cirosantilli/expired-domain-names-by-day-2012 (~18M)
- github.com/cirosantilli/expired-domain-names-by-day-2013 (~28M)
- github.com/cirosantilli/expired-domain-names-by-day-2014 (~29M)
- github.com/cirosantilli/expired-domain-names-by-day-2015 (~28M)
- github.com/cirosantilli/expired-domain-names-by-day-2016
- github.com/cirosantilli/expired-domain-names-by-day-2017
- github.com/cirosantilli/expired-domain-names-by-day-2018
- github.com/cirosantilli/expired-domain-names-by-day-2019
- github.com/cirosantilli/expired-domain-names-by-day-2020
- github.com/cirosantilli/expired-domain-names-by-day-2021
- github.com/cirosantilli/expired-domain-names-by-day-2022
- github.com/cirosantilli/expired-domain-names-by-day-2023
- github.com/cirosantilli/expired-domain-names-by-day-2024
Soon after uploading, these repos started getting some interesting traffic, presumably started by security trackers going "bling bling" on certain malicious domain names in their databases:
- GitHub trackers:
- admin-monitor.shiyue.com
- anquan.didichuxing.com
- app.cloudsek.com
- app.flare.io
- app.rainforest.tech
- app.shadowmap.com
- bo.serenety.xmco.fr 8 1
- bts.linecorp.com
- burn2give.vercel.app
- cbs.ctm360.com 17 2
- code6.d1m.cn
- code6-ops.juzifenqi.com
- codefend.devops.cndatacom.com
- dlp-code.airudder.com
- easm.atrust.sangfor.com
- ec2-34-248-93-242.eu-west-1.compute.amazonaws.com
- ecall.beygoo.me 2 1
- eos.vip.vip.com 1 1
- foradar.baimaohui.net 2 1
- fty.beygoo.me
- hive.telefonica.com.br 2 1
- hulrud.tistory.com
- kartos.enthec.com
- soc.futuoa.com
- lullar-com-3.appspot.com
- penetration.houtai.io 2 1
- platform.sec.corp.qihoo.net
- plus.k8s.onemt.co 4 1
- pmp.beygoo.me 2 1
- portal.protectorg.com
- qa-boss.amh-group.com
- saicmotor.saas.cubesec.cn
- scan.huoban.com
- sec.welab-inc.com
- security.ctrip.com 10 3
- siem-gs.int.black-unique.com 2 1
- soc-github.daojia-inc.com
- spigotmc.org 2 1
- tcallzgroup.blueliv.com
- tcthreatcompass05.blueliv.com 4 1
- tix.testsite.woa.com 2 1
- toucan.belcy.com 1 1
- turbo.gwmdevops.com 18 2
- urlscan.watcherlab.com
- zelenka.guru. Looks like a Russian hacker forum.
- LinkedIn profile views:
- "Information Security Specialist at Forcepoint"
Check for overlap of the merge:
grep -Fx -f <( jq -r '.[].host' ../media/cia-2010-covert-communication-websites/hits.json ) cia-2010-covert-communication-websites/tmp/merge/*
Next, we can start searching by keyword with Wayback Machine CDX scanning with Tor parallelization using our helper ../cia-2010-covert-communication-websites/hupo-cdx-tor.sh. E.g. to check domains that contain the terms "news" or "global":
./hupo-cdx-tor.sh mydir 'news|global' 2011 2019
produces per-year results for the regex news|global between those years under:
tmp/hupo-cdx-tor/mydir/2011
tmp/hupo-cdx-tor/mydir/2012
OK, let's:
./hupo-cdx-tor.sh out 'news|headline|internationali|mondo|mundo|mondi|iran|today'
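For reference, a minimal sketch of the kind of single-domain query that hupo-cdx-tor.sh parallelizes over Tor, using the Wayback Machine CDX API; the exact parameters the real script uses may differ:
# Hedged sketch: list a few captures of one candidate domain between 2011 and 2013.
curl -s "http://web.archive.org/cdx/search/cdx?url=thepyramidnews.com&output=json&from=2011&to=2013&limit=10&fl=timestamp,original"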
Other searches that are not dense enough for our patience:
world|global|[^.]info
OMG, and a few more. It's amazing.
The news search might be producing some golden, golden new hits!!! Going full into this. Hits:
- thepyramidnews.com
- echessnews.com
- tickettonews.com
- airuafricanews.com
- vuvuzelanews.com
- dayenews.com
- newsupdatesite.com
- arabicnewsonline.com
- arabicnewsunfiltered.com
- newsandsportscentral.com
- networkofnews.com
- trekkingtoday.com
- financial-crisis-news.com
CIA 2010 covert communication websites: iraniangoals.com JavaScript reverse engineering
Some reverse engineering was done at: twitter.com/hackerfantastic/status/1575505438111571969?lang=en.
Notably, the password is hardcoded and its hash is stored in the JavaScript itself. The result is then submitted back via a POST request to /cgi-bin/goal.cgi.
TODO: how is the SHA calculated? Appears to be manual.
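As a purely hypothetical sketch of the flow described above (the hash function, the form field name and the exact request shape are all assumptions, not confirmed by the reverse engineering thread; only the /cgi-bin/goal.cgi path comes from it):
# Hypothetical sketch: hash a candidate password client-side and POST it to goal.cgi.
# SHA-1 and the "hash" field name are assumptions. The site itself is long dead.
PASSWORD='example-password'
HASH=$(printf '%s' "$PASSWORD" | sha1sum | cut -d' ' -f1)
curl -X POST 'https://iraniangoals.com/cgi-bin/goal.cgi' --data-urlencode "hash=$HASH"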
whoisxmlapi WHOIS record on September 13, 2011
- Registrar Name: NETWORK SOLUTIONS, LLC
- Created Date: February 17, 2010 00:00:00 UTC
- Updated Date: February 17, 2010 00:00:00 UTC
- Expires Date: February 17, 2015 00:00:00 UTC
- Registrant Name: See, Megan|ATTN NOTICIASMUSICA.NET|care of Network Solutions
- Registrant Street: PO Box 459
- Registrant City: PA
- Registrant State/Province: US
- Registrant Postal Code: 18222
- Registrant Country: UNITED STATES
- Administrative Contact
- Administrative Name: See, Megan|ATTN NOTICIASMUSICA.NET|care of Network Solutions
- Administrative Street: PO Box 459
- Administrative City: Drums
- Administrative State/Province: PA
- Administrative Postal Code: 18222
- Administrative Country: UNITED STATES
- Administrative Email: hf3eg77c4nn@networksolutionsprivateregistration.com
- Administrative Phone: 5707088780
- Name Servers: NS45.WORLDNIC.COM|NS46.WORLDNIC.COM
2012:
- Registrant Country: PANAMA
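For completeness, a hedged sketch of how such historical records can be pulled programmatically; the WhoisXML API endpoint, parameters and response shape below are assumptions and not verified here:
# Hedged sketch: query WHOIS history for the domain (endpoint and parameters assumed).
curl -s "https://whois-history.whoisxmlapi.com/api/v1?apiKey=$WHOISXMLAPI_KEY&domainName=iraniangoals.com&mode=purchase" | jq .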
Pinned article: Introduction to the OurBigBook Project
Welcome to the OurBigBook Project! Our goal is to create the perfect publishing platform for STEM subjects, and get university-level students to write the best free STEM tutorials ever.
Everyone is welcome to create an account and play with the site: ourbigbook.com/go/register. We believe that students themselves can write amazing tutorials, but teachers are welcome too. You can write about anything you want, it doesn't have to be STEM or even educational. Silly test content is very welcome and you won't be penalized in any way. Just keep it legal!
Intro to OurBigBook. Source.
We have two killer features:
- topics: topics group articles by different users with the same title, e.g. here is the topic for the "Fundamental Theorem of Calculus": ourbigbook.com/go/topic/fundamental-theorem-of-calculus. Articles of different users are sorted by upvote within each article page. This feature is a bit like:
- a Wikipedia where each user can have their own version of each article
- a Q&A website like Stack Overflow, where multiple people can give their views on a given topic, and the best ones are sorted by upvote. Except you don't need to wait for someone to ask first, and any topic goes, no matter how narrow or broad
This feature makes it possible for readers to find better explanations of any topic created by other writers. And it allows writers to create an explanation in a place that readers might actually find it.
Figure 1. Screenshot of the "Derivative" topic page. View it live at: ourbigbook.com/go/topic/derivative
Video 2. OurBigBook Web topics demo. Source.
- local editing: you can store all your personal knowledge base content locally in a plaintext markup format that can be edited locally and published either (see the command-line sketch further below):
- to OurBigBook.com to get awesome multi-user features like topics and likes
- as HTML files to a static website, which you can host yourself for free on many external providers like GitHub Pages, and remain in full control
This way you can be sure that even if OurBigBook.com were to go down one day (which we have no plans to do as it is quite cheap to host!), your content will still be perfectly readable as a static site.
Figure 2. You can publish local OurBigBook lightweight markup files to either OurBigBook.com or as a static website.
Figure 3. Visual Studio Code extension installation.
Figure 5. You can also edit articles on the Web editor without installing anything locally.
Video 3. Edit locally and publish demo. Source. This shows editing OurBigBook Markup and publishing it using the Visual Studio Code extension.
- Infinitely deep tables of contents:
All our software is open source and hosted at: github.com/ourbigbook/ourbigbook
Further documentation can be found at: docs.ourbigbook.com
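For example, a minimal sketch of the local-editing workflow mentioned above, assuming the ourbigbook npm package and its CLI invocations as described in the documentation (flags may differ between versions):
# Hedged sketch: install the CLI, build a local directory of OurBigBook files to
# static HTML, and publish the same sources to ourbigbook.com.
npm install -g ourbigbook
ourbigbook .       # convert the lightweight markup in the current directory to HTML
ourbigbook --web   # publish to OurBigBook.com (assumed flag)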
Feel free to reach out to us for any help or suggestions: docs.ourbigbook.com/#contact