This program makes you respect GNU make a bit more. Good old make with
-j
can not only parallelize, but also take in account a dependency graph.Some examples under:
man parallel_exampes
To get the input argument explicitly job number use the magic string sample output:
{}
, e.g.:printf 'a\nb\nc\n' | parallel echo '{}'
a
b
c
To get the job number use sample output:
{#}
as in:printf 'a\nb\nc\n' | parallel echo '{} {#}'
a 1
b 2
c 3
c 3
{%}
contains which thread the job running in, e.g. if we limit it to 2
threads with -j2
:printf 'a\nb\nc\nd\n' | parallel -j2 echo '{} {#} {%}'
a 1 1
b 2 1
c 3 2
d 4 1
%
symbol in many programming languages such as C.To pass multiple CLI arguments per command you can use sample output:
-X
e.g.:printf 'a\nb\nc\nd\n' | parallel -j2 -X echo '{} {#} {%}'
a b 1 1
c d 2 2
CIA 2010 covert communication websites Internet Census 2012 Updated 2025-07-11 +Created 1970-01-01
Does not appear to have any reverse IP hits unfortunately: opendata.stackexchange.com/questions/1951/dataset-of-domain-names/21077#21077. Likely only has domains that were explicitly advertised.
We could not find anything useful in it so far, but there is great potential to use this tool to find new IP ranges based on properties of existing IP ranges. Part of the problem is that the dataset is huge, and is split by top 256 bytes. But it would be reasonable to at least explore ranges with pre-existing known hits...
We have started looking for patterns on
66.*
and 208.*
, both selected as two relatively far away ranges that have a number of pre-existing hits. 208 should likely have been 212 considering later finds that put several ranges in 212.tcpip_fp:
- 66.104.
- 66.104.175.41: grubbersworldrugbynews.com: 1346397300 SCAN(V=6.01%E=4%D=1/12%OT=22%CT=443%CU=%PV=N%G=N%TM=387CAB9E%P=mipsel-openwrt-linux-gnu),ECN(R=N),T1(R=N),T2(R=N),T3(R=N),T4(R=N),T5(R=N),T6(R=N),T7(R=N),U1(R=N),IE(R=N)
- 66.104.175.48: worlddispatch.net: 1346816700 SCAN(V=6.01%E=4%D=1/2%OT=22%CT=443%CU=%PV=N%DC=I%G=N%TM=1D5EA%P=mipsel-openwrt-linux-gnu),SEQ(SP=F8%GCD=3%ISR=109%TI=Z%TS=A),ECN(R=N),T1(R=Y%DF=Y%TG=40%S=O%A=S+%F=AS%RD=0%Q=),T1(R=N),T2(R=N),T3(R=N),T4(R=N),T5(R=Y%DF=Y%TG=40%W=0%S=Z%A=S+%F=AR%O=%RD=0%Q=),T6(R=N),T7(R=N),U1(R=N),IE(R=N)
- 66.104.175.49: webworldsports.com: 1346692500 SCAN(V=6.01%E=4%D=9/3%OT=22%CT=443%CU=%PV=N%DC=I%G=N%TM=5044E96E%P=mipsel-openwrt-linux-gnu),SEQ(SP=105%GCD=1%ISR=108%TI=Z%TS=A),OPS(O1=M550ST11NW6%O2=M550ST11NW6%O3=M550NNT11NW6%O4=M550ST11NW6%O5=M550ST11NW6%O6=M550ST11),WIN(W1=1510%W2=1510%W3=1510%W4=1510%W5=1510%W6=1510),ECN(R=N),T1(R=Y%DF=Y%TG=40%S=O%A=S+%F=AS%RD=0%Q=),T1(R=N),T2(R=N),T3(R=N),T4(R=N),T5(R=Y%DF=Y%TG=40%W=0%S=Z%A=S+%F=AR%O=%RD=0%Q=),T6(R=N),T7(R=N),U1(R=N),IE(R=N)
- 66.104.175.50: fly-bybirdies.com: 1346822100 SCAN(V=6.01%E=4%D=1/1%OT=22%CT=443%CU=%PV=N%DC=I%G=N%TM=14655%P=mipsel-openwrt-linux-gnu),SEQ(TI=Z%TS=A),ECN(R=N),T1(R=Y%DF=Y%TG=40%S=O%A=S+%F=AS%RD=0%Q=),T1(R=N),T2(R=N),T3(R=N),T4(R=N),T5(R=Y%DF=Y%TG=40%W=0%S=Z%A=S+%F=AR%O=%RD=0%Q=),T6(R=N),T7(R=N),U1(R=N),IE(R=N)
- 66.104.175.53: info-ology.net: 1346712300 SCAN(V=6.01%E=4%D=9/4%OT=22%CT=443%CU=%PV=N%DC=I%G=N%TM=50453230%P=mipsel-openwrt-linux-gnu),SEQ(SP=FB%GCD=1%ISR=FF%TI=Z%TS=A),ECN(R=N),T1(R=Y%DF=Y%TG=40%S=O%A=S+%F=AS%RD=0%Q=),T1(R=N),T2(R=N),T3(R=N),T4(R=N),T5(R=Y%DF=Y%TG=40%W=0%S=Z%A=S+%F=AR%O=%RD=0%Q=),T6(R=N),T7(R=N),U1(R=N),IE(R=N)
- 66.175.106
- 66.175.106.150: noticiasmusica.net: 1340077500 SCAN(V=5.51%D=1/3%OT=22%CT=443%CU=%PV=N%G=N%TM=38707542%P=mipsel-openwrt-linux-gnu),ECN(R=N),T1(R=N),T2(R=N),T3(R=N),T4(R=N),T5(R=Y%DF=Y%TG=40%W=0%S=Z%A=S+%F=AR%O=%RD=0%Q=),T6(R=N),T7(R=N),U1(R=N),IE(R=N)
- 66.175.106.155: atomworldnews.com: 1345562100 SCAN(V=5.51%D=8/21%OT=22%CT=443%CU=%PV=N%DC=I%G=N%TM=5033A5F2%P=mips-openwrt-linux-gnu),SEQ(SP=FB%GCD=1%ISR=FC%TI=Z%TS=A),ECN(R=Y%DF=Y%TG=40%W=1540%O=M550NNSNW6%CC=N%Q=),T1(R=Y%DF=Y%TG=40%S=O%A=S+%F=AS%RD=0%Q=),T2(R=N),T3(R=N),T4(R=N),T5(R=Y%DF=Y%TG=40%W=0%S=Z%A=S+%F=AR%O=%RD=0%Q=),T6(R=N),T7(R=N),U1(R=N),IE(R=N)
alljohnny.com
had a hit: ipinf.ru/domains/alljohnny.com/, and so Ciro started looking around... and a good number of other things have hits.Not all of them, definitely less data than viewdns.info.
But they do reverse IP, and they show which nearby reverse IPs have hits on the same page, for free, which is great!
Shame their ordering is purely alphabetical, doesn't properly order the IPs so it is a bit of a pain, but we can handle it.
OMG, Russians!!!
CIA 2010 covert communication websites iraniangoalkicks.com Updated 2025-07-11 +Created 1970-01-01
whoisxmlapi WHOIS history March 23, 2011:
whoisrequest.com/history/ mentions:
1 May, 2007: Domain created*, nameservers added. Nameservers:
1 May, 2007: Domain created*, nameservers added. Nameservers:
- ns1.qwknetllc.com
- ns2.qwknetllc.com
whoisxmlapi WHOIS history April 11, 2011:Folowed by reuters registration in 2022.
- Created Date: March 6, 2008 00:00:00 UTC
- Updated Date: March 7, 2011 00:00:00 UTC
- Expires Date: March 6, 2014 00:00:00 UTC
- Registrant Name: domainsbyproxy.com.
- Registrant Organization: Domains by Proxy, Inc.
- Registrant Street: 15111 N. Hayden Rd., Ste 160,
- Registrant City: Scottsdale
- Registrant State/Province: Arizona
- Registrant Postal Code: 85260
- Registrant Country: UNITED STATES
- Name servers: NS29.WORLDNIC.COM|NS30.WORLDNIC.COM
whoisrequest.com/history/ mentions:
- 1 Apr, 2008: Domain created*, nameservers added. Nameservers:
- ns1.webhostingpad.com
- ns2.webhostingpad.com
CIA 2010 covert communication websites iraniangoals.com JavaScript reverse engineering Updated 2025-07-11 +Created 1970-01-01
Some reverse engineering was done at: twitter.com/hackerfantastic/status/1575505438111571969?lang=en.
Notably, the password is hardcoded and its hash is stored in the JavaScript itself. The result is then submitted back via a POST request to
/cgi-bin/goal.cgi
.TODO: how is the SHA calculated? Appears to be manual.
CIA 2010 covert communication websites JavaScript reverse engineering Updated 2025-07-11 +Created 1970-01-01
CIA 2010 covert communication websites JavaScript with SHAs Updated 2025-07-11 +Created 1970-01-01
There are two types of JavaScript found so far. The ones with SHA and the ones without. There are only 2 examples of JS with SHA:Both files start with precisely the same string:
- iraniangoals.com: web.archive.org/web/20110202091909/http://iraniangoals.com/journal.js Commented at: iraniangoals.com JavaScript reverse engineering
- iranfootballsource.com: web.archive.org/web/20110202091901/http://iranfootballsource.com/futbol.js
- kukrinews.com: web.archive.org/web/20100513094909/http://kukrinews.com/news.js
- todaysnewsandweather-ru.com: web.archive.org/web/20110207094735/http://todaysnewsandweather-ru.com/blacksea.js
var ms="\u062F\u0631\u064A\u0627\u0641\u062A\u06CC",lc="\u062A\u0647\u064A\u0647 \u0645\u062A\u0646",mn="\u0628\u0631\u062F\u0627\u0632\u0634 \u062F\u0631 \u062C\u0631\u064A\u0627\u0646 \u0627\u0633\u062A...\u0644\u0637\u0641\u0627 \u0635\u0628\u0631 \u0643\u0646\u064A\u062F",lt="\u062A\u0647\u064A\u0647 \u0645\u062A\u0646",ne="\u067E\u0627\u0633\u062E",kf="\u062E\u0631\u0648\u062C",mb="\u062D\u0630\u0641",mv="\u062F\u0631\u064A\u0627\u0641\u062A\u06CC",nt="\u0627\u0631\u0633\u0627\u0644",ig="\u062B\u0628\u062A \u063A\u0644\u0637. \u062C\u0647\u062A \u062A\u062C\u062F\u064A\u062F \u062B\u0628\u062A \u0635\u0641\u062D\u0647 \u0631\u0627 \u0628\u0627\u0632\u0622\u0648\u0631\u06CC \u06A9\u0646\u064A\u062F",hs="\u063A\u064A\u0631 \u0642\u0627\u0628\u0644 \u0627\u062C\u0631\u0627. \u062E\u0637\u0627 \u062F\u0631 \u0627\u062A\u0651\u0635\u0627\u0644",ji="\u063A\u064A\u0631 \u0642\u0627\u0628\u0644 \u0627\u062C\u0631\u0627. \u062E\u0637\u0627 \u062F\u0631 \u0627\u062A\u0651\u0635\u0627\u0644",ie="\u063A\u064A\u0631 \u0642\u0627\u0628\u0644 \u0627\u062C\u0631\u0627. \u062E\u0637\u0627 \u062F\u0631 \u0627\u062A\u0651\u0635\u0627\u0644",gc="\u0633\u0648\u0627\u0631 \u06A9\u0631\u062F\u0646 \u062A\u06A9\u0645\u064A\u0644 \u0634\u062F",gz="\u0645\u0637\u0645\u0626\u0646\u064A\u062F \u06A9\u0647 \u0645\u064A\u062E\u0648\u0627\u0647\u064A\u062F \u067E\u064A\u0627\u0645 \u0631\u0627 \u062D\u0630\u0641 \u06A9\u0646\u064A\u062F\u061F"
Good fingerprint present in all of them:
throw new Error("B64 D.1");};if(at[1]==-1){throw new Error("B64 D.2");};if(at[2]==-1){if(f<ay.length){throw new Error("B64 D.3");};dg=2;}else if(at[3]==-1){if(f<ay.length){throw new Error("B64 D.4")
Tends to be Ciro Santilli's first attempt for quick and dirty graphing: github.com/cirosantilli/gnuplot-cheat.
When it doesn't, you Google for an hours, and then you give up in frustration, and fall back to Matplotlib.
Couldn't handle exploration of large datasets though: Survey of open source interactive plotting software with a 10 million point scatter plot benchmark by Ciro Santilli
JAR, SWF and CGI-bin scanning by path only is fine, since there are relatively few of those. But .js scanning by path only is too broad.
One option would be to filter out by size, an information that is contained on the CDX. Let's check typical ones:Ignoring some obvious unrelated non-comms files visually we get a range of about 2732 to 3632:This ignores the obviously atypical JavaScript with SHAs from iranfootballsource, and the particularly small old menu.js from cutabovenews.com, which we embed into ../cia-2010-covert-communication-websites/cdx-post-js.sh.
grep -f <(jq -r '.[]|select(select(.comms)|.comms|test("\\.js"))|.host' ../media/cia-2010-covert-communication-websites/hits.json) out | out.jshits.cdx
sort -n -k7 out.jshits.cdx
net,hollywoodscreen)/current.js 20110106082232 http://hollywoodscreen.net/current.js text/javascript 200 XY5NHVW7UMFS3WSKPXLOQ5DJA34POXMV 2732
com,amishkanews)/amishkanewss.js 20110208032713 http://amishkanews.com/amishkanewss.js text/javascript 200 S5ZWJ53JFSLUSJVXBBA3NBJXNYLNCI4E 3632
CIA 2010 covert communication websites Oleg Shakirov's findings Updated 2025-07-11 +Created 1970-01-01
Starting at twitter.com/shakirov2036/status/1746729471778988499, Russian expat Oleg Shakirov comments "Let me know if you are still looking for the Carson website".
He then proceeded to give Carson and 5 other domains in private communication. His name is given here with his consent. His advances besides not being blind were Yandexing for some of the known hits which led to pages that contained other hits:
- moyistochnikonlaynovykhigr.com contains a copy of myonlinegamesource.com, and both are present at www.seomastering.com/audit/pefl.ru/, an SEO tracker, because both have backlinks to
pefl.ru
, which is apparently a niche fantasy football website - 4 previously unknown hits from: "Mass Deface III" pastebin. He missed one which Ciro then found after inspecting all URLs on Wayback Machine, so leading to a total of 5 new hits from that source.
CIA 2010 covert communication websites securitytrails.com Updated 2025-07-11 +Created 1970-01-01
They appear to piece together data from various sources. This is the most complete historical domain -> IP database we have so far. They don't have hugely more data than viewdns.info, but many times do offer something new. It feels like the key difference is that their data goes further back in the critical time period a bit.
TODO do they have historical reverse IP? The fact that they don't seem to have it suggests that they are just making historical reverse IP requests to a third party via some API?
E.g. searching
thefilmcentre.com
under historical data at securitytrails.com/domain/thefilmcentre.com/history/al gives the correct IP 62.22.60.55.Account creation blacklists common email providers such as gmail to force users to use a "corporate" email address. But using random domains like
ciro@cirosantilli.com
works fine.Their data seems to date back to 2008 for our searches.
The CGI comms websites contain the only occurrence of HTTPS, so it might open up the door for a certificate fingerprint as proposed by user joelcollinsdc at: news.ycombinator.com/item?id=36280801!
crt.sh appears to be a good way to look into this:They all appear to use either of:
- backstage.musical-fortune.net:
- clients.smart-travel-consultant.com
- members.it-proonline.com
- members.metanewsdaily.com
- miembros.todosperuahora.com
- secure.altworldnews.com
- secure.driversinternationalgolf.com
- secure.freshtechonline.com
- secure.globalnewsbulletin.com
- secure.negativeaperture.com
- secure.riskandrewardnews.com
- secure.theworld-news.net
- secure.topbillingsite.com
- secure.worldnewsandent.com
- ssl.beyondnetworknews.com
- ssl.newtechfrontier.com
- www.businessexchangetoday.com
- heal.conquermstoday.com
- Go Daddy
- Thawte DV SSL CA
- Starfield Technologies, Inc.
crt.sh/?q=globalnewsbulletin.com has a hit to: crt.sh/?id=774803. With login we can see: search.censys.io/certificates/5078bce356a8f8590205ae45350b27f58f4ac04478ed47a389a55b539065cee8. Issued by www.thawte.com/repository/index.html. No hits for certificates with same public key: search.censys.io/search?resource=certificates&q=parsed.subject_key_info.fingerprint_sha256%3A+714b4a3e8b2f555d230a92c943ced4f34b709b39ed590a6a230e520c273705af or any other "same" queries though.
Let's try another one for secure.altworldnews.com: search.censys.io/certificates/e88f8db87414401fd00728db39a7698d874dbe1ae9d88b01c675105fabf69b94. Nope, no direct mega hits here either.
Their historic DNS and reverse DNS info was very valuable, and served as Ciro's the initial entry point to finding hits in the IP ranges given by Reuters.
Generic information about the website not specific on this project will be stored at: Section "viewdns.info".
Since this source is so scarce and valuable, we have been quite careful to note down all the domain and IP ranges that have been explored.
At news.ycombinator.com/item?id=38496244, the creator of the viewdns.info, "Hughesey", also stated that he'd able to give some free credits for public research projects such as this one. This would have saved up going to quite a few Cafes to get those sweet extra IPs! But it was more fun in hardmode, no doubt.
We do API access to IP ranges with this simple helper: ../cia-2010-covert-communication-websites/viewdns-info.sh, usage:e.g.:
./viewdns-info.sh <apikey> <start-ipv-address> <end-ipv-address>
./viewdns-info.sh 8b890b00b17ed2d66bbed878d51200b58d43d014 66.45.179.187 66.45.179.210
For domain to IP queries from the API you should use "iphistory" viewdns.info/api/docs/ip-history.php:
curl 'https://api.viewdns.info/iphistory/?domain=todaysengineering.com&apikey=$APIKEY&output=json'
Just beware of the viewdns.info reverse IP bug, that really sucks and led to us missing a ton of domains.
D'oh.
But to be serious. The Wayback Machine contains a very large proportion of all sites. It does happen sometime that a Wayback Machine archive is missing or broken and cqcounter has the screenshot. But the Wayback Machine is still the most complete database we have found so far. Some archives are very broken. But those are rare.
The only problem with the Wayback Machine is that there is no known efficient way to query its archives across domains. You have to have a domain in hand for CDX queries: Wayback Machine CDX scanning.
The Common Crawl project attempts in part to address this lack of querriability, but we haven't managed to extract any hits from it.
CDX + 2013 DNS Census + heuristics however has been fruitful however.
We have dumped all Wayback Machine archives of known websites to: github.com/cirosantilli/cia-2010-websites-dump using ../cia-2010-covert-communication-websites/download-websites.sh. This allows for better grepping and serves as a backup in case they ever go down.
CIA 2010 covert communication websites Wayback Machine CDX scanning Updated 2025-07-11 +Created 1970-01-01
The Wayback Machine has an endpoint to query cralwed pages called the CDX server. It is documented at: github.com/internetarchive/wayback/blob/master/wayback-cdx-server/README.md.
This allows to filter down 10 thousands of possible domains in a few hours. But 100s of thousands would be too much. This is because you have to query exactly one URL at a time, and they possibly rate limit IPs. But no IP blacklisting so far after several hours, so it's not that bad.
Once you have a heuristic to narrow down some domains, you can use this helper: ../cia-2010-covert-communication-websites/cdx.sh to drill them down from 10s of thousands down to hundreds or thousands.
We then post process the results of cdx.sh with ../cia-2010-covert-communication-websites/cdx-post.sh to drill them down from from thousands to dozens, and manually inspect everything.
From then on, you can just manually inspect for hist on your browser.
CIA 2010 covert communication websites Wayback Machine CDX scanning with Tor parallelization Updated 2025-07-11 +Created 1970-01-01
Dire times require dire methods: ../cia-2010-covert-communication-websites/cdx-tor.sh.
First we must start the tor servers with the and then use it on a newline separated domain name list to check;This creates a directory
tor-army
command from: stackoverflow.com/questions/14321214/how-to-run-multiple-tor-processes-at-once-with-different-exit-ips/76749983#76749983tor-army 100
./cdx-tor.sh infile.txt
infile.txt.cdx/
containing:infile.txt.cdx/out00
,out01
, etc.: the suspected CDX lines from domains from each tor instance based on the simple criteria that the CDX can handle directly. We split the input domains into 100 piles, and give one selected pile per tor instance.infile.txt.cdx/out
: the final combined CDX output ofout00
,out01
, ...infile.txt.cdx/out.post
: the final output containing only domain names that match further CLI criteria that cannot be easily encoded on the CDX query. This is the cleanest domain name list you should look into at the end basically.
Since archive is so abysmal in its data access, e.g. a Google BigQuery would solve our issues in seconds, we have to come up with creative ways of getting around their IP throttling.
Distilled into an answer at: stackoverflow.com/questions/14321214/how-to-run-multiple-tor-processes-at-once-with-different-exit-ips/76749983#76749983
This should allow a full sweep of the 4.5M records in 2013 DNS Census virtual host cleanup in a reasonable amount of time. After JAR/SWF/CGI filtering we obtained 5.8k domains, so a reduction factor of about 1 million with likely very few losses. Not bad.
5.8k is still a bit annoying to fully go over however, so we can also try to count CDX hits to the domains and remove anything with too many hits, since the CIA websites basically have very few archives:This gives us something like:sorted by increasing hit counts, so we can go down as far as patience allows for!
cd 2013-dns-census-a-novirt-domains.txt.cdx
./cdx-tor.sh -d out.post domain-list.txt
cd out.post.cdx
cut -d' ' -f1 out | uniq -c | sort -k1 -n | awk 'match($2, /([^,]+),([^)]+)/, a) {printf("%s.%s %d\n", a[2], a[1], $1)}' > out.count
12654montana.com 1
aeronet-news.com 1
atohms.com 1
av3net.com 1
beechstreetas400.com 1
The CIA really likes this registrar, e.g.:
- CIA 2010 covert communication websites
- 2014 www.newsweek.com/former-cia-officials-ready-defend-agency-after-torture-reports-release-290383A group of former CIA officials are gearing up to defend the agency when the Senate releases its long-awaited report investigating "enhanced interrogation" tactics used on prisoners after 9/11. The highlight of their PR push will be a website, "CIASAVEDLIVES.COM," which is set to go live when the report is released on Tuesday, Foreign Policy reported.
Design software for synthetic biological circuit.
The input is in Verilog! Overkill?
There are unlisted articles, also show them or only show them.