CIA 2010 covert communication websites / 2013 DNS census MX records Updated +Created
Let' see if there's anything in records/mx.xz.
mx.csv is 21GB.
They do have " in the files to escape commas so:
mx.py
import csv
import sys
writer = csv.writer(sys.stdout)
with open('mx.csv', 'r') as f:
    reader = csv.reader(f)
    for row in reader:
        writer.writerow([row[0], row[3]])
Would have been better with csvkit: stackoverflow.com/questions/36287982/bash-parse-csv-with-quotes-commas-and-newlines
then:
# uniq not amazing as there are often two or three slightly different records repeated on multiple timestamps, but down to 11 GB
python3 mx.py | uniq > mx-uniq.csv
sqlite3 mx.sqlite 'create table t(d text, m text)'
# 13 GB
time sqlite3 mx.sqlite ".import --csv --skip 1 'mx-uniq.csv' t"

# 41 GB
time sqlite3 mx.sqlite 'create index td on t(d)'
time sqlite3 mx.sqlite 'create index tm on t(m)'
time sqlite3 mx.sqlite 'create index tdm on t(d, m)'

# Remove dupes.
# Rows: 150m
time sqlite3 mx.sqlite <<EOF
delete from t
where rowid not in (
  select min(rowid)
  from t
  group by d, m
)
EOF

# 15 GB
time sqlite3 mx.sqlite vacuum
Let's see what the hits use:
awk -F, 'NR>1{ print $2 }' ../media/cia-2010-covert-communication-websites/hits.csv | xargs -I{} sqlite3 mx.sqlite "select distinct * from t where d = '{}'"
At around 267 total hits, only 84 have MX records, and from those that do, almost all of them have exactly:
smtp.secureserver.net
mailstore1.secureserver.net
with only three exceptions:
dailynewsandsports.com|dailynewsandsports.com
inews-today.com|mail.inews-today.com
just-kidding-news.com|just-kidding-news.com
We need to count out of the totals!
sqlite3 mx.sqlite "select count(*) from t where m = 'mailstore1.secureserver.net'"
which gives, ~18M, so nope, it is too much by itself...
Let's try to use that to reduce av.sqlite from 2013 DNS Census virtual host cleanup a bit further:
time sqlite3 mx.sqlite '.mode csv' "attach 'aiddcu.sqlite' as 'av'" '.load ./ip' "select ipi2s(av.t.i), av.t.d from av.t inner join t as mx on av.t.d = mx.d and mx.m = 'mailstore1.secureserver.net' order by av.t.i asc" > avm.csv
where avm stands for av with mx pruning. This leaves us with only ~500k entries left. With one more figerprint we could do a Wayback Machine CDX scanning scan.
Let's check that we still have most our hits in there:
grep -f <(awk -F, 'NR>1{print $2}' /home/ciro/bak/git/media/cia-2010-covert-communication-websites/hits.csv) avm.csv
At 267 hits we got 81, so all are still present.
secureserver is a hosting provider, we can see their blank page e.g. at: web.archive.org/web/20110128152204/http://emmano.com/. security.stackexchange.com/questions/12610/why-did-secureserver-net-godaddy-access-my-gmail-account/12616#12616 comments:
secureserver.net is the name GoDaddy use as the reverse DNS for IP addresses used for dedicated/virtual server hosting
We intersect 2013 DNS Census virtual host cleanup with 2013 DNS census MX records and that leaves 460k hits. We did lose a third on the the MX records as of 260 hits since secureserver.net is only used in 1/3 of sites, but we also concentrate 9x, so it may be worth it.
Then we Wayback Machine CDX scanning. it takes about 5 days, but it is manageale.
We did a full Wayback Machine CDX scanning for JAR, SWF and cgi-bin in those, but only found a single new hit:
CIA 2010 covert communication websites / Hits with nearby IP hits Updated +Created
62.22.60.49: telecom-headlines.com. UUNET in Spain. Found with: visual inspection of full 2013 DNS Census virtual host cleanup list just before worldnewsnetworking.com. Tested viewdns.info range: 62.22.60.34 - 62.22.60.66
  • 62.22.60.33: newsperk.com. Almost certainly a hit. Stylistically perfect, rss-item. But no comms not found. Ennerving! 2011. English. Egypt. news. Later legitimately reused.
  • 62.22.60.34: freeslideshow.net. Legit? Attempting to open any HTML archives leads to an infinite page load loop, e.g. 2010. A subpage however exists: web.archive.org/web/20101230001640/http://freeslideshow.net/index_files/a.htm and appears legit.
  • 62.22.60.40: travel-passage.com. Hit.
  • 62.22.60.42: newsupdatesite.com. Hit.
  • 62.22.60.46: flyingtimeline.com. Hit.
  • 62.22.60.47: globalemergenceadvisorsbkserver.com. Legit.
  • 62.22.60.48: currentcommunique.com. Hit.
  • 62.22.60.49: telecom-headlines.com. Hit.
  • 62.22.60.52: collectedmedias.com. Hit.
  • 62.22.60.54: romulusactualites.com. Hit.
  • 62.22.60.55: thefilmcentre.com. Hit.
  • 62.22.60.56: traveltimenews.com. Hit.
62.22.61.206 worldnewsnetworking.com. UUNET in Spain. Found with: 2013 DNS Census virtual host cleanup heuristic keyword searches. Tested viewdns.info range: 62.22.61.188 - 62.22.61.224
65.218.91.17 alljohnny.com. UUNET in United States. One of the Reuters websites.
63.131.229.12 cyberreportagenews.com. ADHOST in Coeur d'Alene - United States. Tested viewdns.info range: 63.131.228.248 - 63.131.229.30
  • 63.131.229.2: fightskillsresource.com. Hit
  • 63.131.229.4: unitedterritorynews.com. Hit
  • 63.131.229.9: show-dustry.com. Hit
  • 63.131.229.10: afghanpoetry.net. Hit. Also at 74.254.12.166 in another range.
  • 63.131.229.11: mythriftytrip.com. Hit
  • 63.131.229.12: cyberreportagenews.com. Hit.
  • 63.131.229.13: sunrise-news.com. Hit.
  • 63.131.229.15: cricketnewsforindia.com. Hit.
  • 63.131.229.16:
  • 63.131.229.18: itnl-xchange.com. Hit.
  • 63.131.229.20:
    • fixashion.net. Hit.
    • a few others
63.130.160.50 theglobalheadlines.com. CW Vodafone Group PLC in United States. Found with: 2013 DNS census secureserver.net MX records intersection 2013 DNS Census virtual host cleanup. Tested viewdns.info range: 63.130.160.35 - 63.130.160.75
64.16.204.55 holein1news.com. Saudi Telecom Company JSC in Saudi Arabia. Found with: 2013 DNS Census virtual host cleanup heuristic keyword searches. Tested viewdns.info range: 64.16.204.50 - 64.16.204.63. With did Wayback Machine have so few archives here? TODO stopping viewdns.info exploration a bit short due to that.
65.61.127.163 capture-nature.com. ADHOST in Greenacres - United States. whois.arin.net/rest/net/NET-65-61-96-0-1/pft?s=65.61.127.163: Net Range: 65.61.96.0 - 65.61.127.255. Organization. Name: TierPoint, LLC. Tested viewdns.info range: 65.61.127.149 -
  • 65.61.127.46: anahuacchamber.com 2012-12-22T14:59:01
  • 65.61.127.117: medicaresupplementalinsurance.com, 2013-08-21T09:49:41. Legit.
  • 65.61.127.121: counter-images.com 2013-08-22T11:14:44: web.archive.org/web/20110208173132/http://www.counter-images.com/ Empty.
  • 65.61.127.125 zaphound.com 2013-08-21T02:25:40. Legit.
  • 65.61.127.130: ambitions.org 2013-08-22T01:43:40. Legit.
  • 65.61.127.161: european-footballer.com. Hit.
  • 65.61.127.163: capture-nature.com. Hit.
  • 65.61.127.164: futbolistico.net. 2012-02-20T03:25:33. Legit. web.archive.org/web/20130509004058/http://futbolistico.net/
  • 65.61.127.165: travelconnectionsonline.com. Ciro initially though this might be a hit. But upon Googling it, there's now a mirror at: travelconn.tripod.com/. Combined with the lack of a standard communications mechanism and the 2001 copyright, maybe it isn't a hit after all
  • 65.61.127.166: globalnewsbulletin.com: Hit.
  • 65.61.127.167: internationalwhiskylounge.com. Hit.
  • 65.61.127.168: the-golden-rule.info 2013-09-20T02:13:52. Hit.
  • 65.61.127.169: crossovernews.net. Hit.
  • 65.61.127.170: newsidori.com. Hit.
  • 65.61.127.171: nrgconsultingandnews.com. Hit. 2013-08-13T18:45:05
  • 65.61.127.172: premierstriker.com. Hit. 2012-01-11
  • 65.61.127.174: dedrickonline.com. Hit.
  • 65.61.127.175: altworldnews.com. Hit.
  • 65.61.127.176: american-historyonline.com. Hit. 2011-09-08
  • 65.61.127.177: material-science.org. Hit.
  • 65.61.127.178: tee-shot.net. Hit.
  • 65.61.127.180: screencentral.info. Hit.
  • 65.61.127.181: worldnewsandtravel.com. Hit. 2011-11-13
  • 65.61.127.182: pangawana.com. Hit.
  • 65.61.127.183: cutabovenews.com. Hit.
  • 65.61.127.184: worldwildlifeadventure.com. Hit.
  • 65.61.127.186: explorealtmeds.com. Hit.
  • 65.61.127.194: 16 domains, so unclear.
  • 65.61.127.200: cdl-link.com (ipinf.ru). Legit.
  • 65.61.127.222: asianwhitecoffee.com 2012-07-16T09:21:05 web.archive.org/web/20110903080036/http://asianwhitecoffee.com/. Could be legit.
66.45.179.205 noticiasporjanua.com. ADHOST in Edmonds - United States. Found with: 2013 DNS Census virtual host cleanup. Tested viewdns.info range: 66.45.179.187 - 66.45.179.223
  • 66.45.179.187: mail03.gatesfoundation.org. Legit.
  • 66.45.179.192: thegraceofislam.com. Hit.
  • 66.45.179.193: arabicnewsunfiltered.com. Hit.
  • 66.45.179.194: raulsonsglobalnews.com. Hit.
  • 66.45.179.195: aryannews.net. Hit.
  • 66.45.179.199: attivitaestremi.com. Hit.
  • 66.45.179.200: foodwineandsuch.com. Hit.
  • 66.45.179.201: hitthepavementnow.com. Hit.
  • 66.45.179.203: noticiascontinental.com. Hit.
  • 66.45.179.205: noticiasporjanua.com. Hit.
  • 66.45.179.206: podisticamondiale.com. Hit.
  • 66.45.179.207: reflectordenoticias.com. Hit.
  • 66.45.179.208: havenofgamerz.com. Hit.
  • 66.45.179.209: vejaaeuropa.com. Hit.
  • 66.45.179.210: sa-michigan.com. Hit.
  • 66.45.179.211: absolutebearing.net. Hit.
  • 66.45.179.212: grandretirement.net. No archives. cqcounter.com/whois/www/grandretirement.net.html blank image.
  • 66.45.179.213: myportaltonews.com. Hit.
  • 66.45.179.214: investmentintellect.com. Hit.
  • 66.45.179.215: nigeriastar.net 2012-03-12. Hit.
66.104.169.184 bcenews.com. XO-AS15 in United States. Found with: 2013 DNS Census virtual host cleanup heuristic keyword searches. Tested viewdns.info range: 66.104.169.158 - 66.104.169.189
66.104.173.186 myworldlymusic.com. XO-AS15 in United States. Found with: 2013 DNS Census virtual host cleanup heuristic keyword searches. Tested viewdns.info range: 66.104.173.158 - 66.104.173.194
66.104.175.40 beyondnetworknews.com. XO-AS15 in United States. whois.arin.net/rest/net/NET-66-104-0-0-1/pft?s=66.104.175.40. Net Range:66.104.0.0 - 66.107.255.255. 2012 Internet Census puts most/all hits in this range under ip66-104-175-34.z175-104-66.customer.algx.net, algx.net redirects to verizon.com as of 2023. Related: superuser.com/questions/956568/why-are-my-pings-going-to-customer-algx-net. Tested viewdns.info range: 66.104.175.24 - unknown
66.175.106.148 activegaminginfo.com. UUNET in United States. whois.arin.net/rest/net/NET-66-175-106-128-1/pft?s=66.175.106.148: Net Range: 66.175.106.128 - 66.175.106.159. Customer Name: DIAMOND-COLESON. Tested viewdns.info range: 66.175.106.131 - 66.175.106.178
66.237.236.247 comunidaddenoticias.com. XO-AS15 in United States. Tested viewdns.info range: 66.237.236.222 - 66.237.236.254
69.84.156.90 stickshiftnews.com. COLOSPACE in Methuen - United States. Found with: 2013 DNS Census virtual host cleanup heuristic keyword searches. Tested viewdns.info range: 69.84.156.64 - 69.84.156.95
  • 69.84.156.69: al-ashak-news-me.com. Hit.
  • 69.84.156.70: theventurenews.info. Hit.
  • 69.84.156.71: worldfinancetoday.net. Hit.
  • 69.84.156.72: autonewsarabia.com. Hit.
  • 69.84.156.74: blue-moon-news.com. Hit.
  • 69.84.156.75: theoutergreen.com. No archives. Might have been another golf hit. cqcounter.com/whois/www/theoutergreen.com.html not found.
  • 69.84.156.76: tnc-urdu.com. Hit.
  • 69.84.156.79: jassimnews.com. No archives/broken. cqcounter.com/whois/www/jassimnews.com.html blank.
  • 69.84.156.80: noticiasdenuestromundo.com. Hit.
  • 69.84.156.82: arabicnewsonline.com. Hit.
  • 69.84.156.83: unganadormundial.com. Hit.
  • 69.84.156.84: focusonbokeh.com. Hit. Network Solutions, LLC.
  • 69.84.156.85: classic-rocktopia.com. Hit. domainsbyproxy.com.
  • 69.84.156.87: i7diver.com. Hit.
  • 69.84.156.88: diariodeelmundo.com. Hit.
  • 69.84.156.89: todaysarabnews.com. Hit.
  • 69.84.156.90: stickshiftnews.com. Hit.
  • 69.84.156.91: theinternationalgoal.com. Hit.
72.34.53.174 technologytodayandtomorrow.com. IHNET in United States. This IP is special. This IP is somehow closely linked to the "Mass Deface III" pastebin as it seems to have been hosted by Condor hosting. They also have many old sites, and links to Russia which is apparently where this was hosted.
74.116.72.236 techtopnews.com. OPTIMUM-WIFI2 in Brooklyn - United States. Found with: 2013 DNS Census virtual host cleanup heuristic keyword searches. Tested viewdns.info range: 74.116.72.215 - 74.116.72.254
74.254.12.168 non-stop-news.net. BELLSOUTH-NET-BLK in Atlantic Beach - United States. Found with: 2013 DNS Census virtual host cleanup heuristic keyword searches. Tested viewdns.info range: 74.254.12.158 - 74.254.12.195. This domain exceptionally also has a second IP also with multihits: 207.239.196.230. The fact that the range has rdns sources with hits from both 2013 DNS Census and viewdns.info suggests this range is correct.
173.208.81.2 LEASEWEB-USA-CHI in Lombard - United States:
199.85.212.118 just-kidding-news.com. ATT-INTERNET4 in United States.
204.176.38.143 noticiassofisticadas.com. UUNET in United States. Found with: 2013 DNS Census virtual host cleanup. Tested viewdns.info range: 204.176.38.125 - 204.176.38.154
  • 204.176.38.130: i-pressnews.com. Hit.
  • 204.176.38.132: turkishnewslinks.com. Hit.
  • 204.176.38.134: photographyarecord.com. Hit.
  • 204.176.38.135: breakingthewicket.com. Hit.
  • 204.176.38.136: politicalworldtoday.com. Hit.
  • 204.176.38.137: hi-tech-today.com. Hit.
  • 204.176.38.138: continental-business-news.com. TODO. rss-item, split images. 2011. Cannot find comms. Also header and footer are not limited width which is unusual. Further HTML similarity reversing would be needed.
  • 204.176.38.139: bigscreenbattles.com. Hit.
  • 204.176.38.141: rakotafootball.com. Hit.
  • 204.176.38.142: senderosdemontana.com. Hit.
  • 204.176.38.143: noticiassofisticadas.com. Hit.
  • 204.176.38.144: techno-today.com. Hit.
  • 204.176.38.145: tickettonews.com. Hit.
  • 204.176.38.146: dps-digitalphotosharing.com. Hit.
  • 204.176.38.147: theputtingreen.com. Hit.
  • 204.176.38.149: sportsnewstodayar.com. Hit.
  • 204.176.38.150: kairuafricanews.com. Hit.
204.176.39.115 globalprovincesnews.com. UUNET in United States. Tested viewdns.info range: 204.176.39.93 - 204.176.39.124
207.150.191.68 technologypresstoday.com. Saudi Telecom Company JSC in Saudi Arabia.
207.210.250.132 aeronet-news.com. AS17378 in United States. This is the Autonomous System Number for TierPoint, LLC. Found with: 2013 DNS Census virtual host cleanup heuristic keyword searches. Tested viewdns.info range: 207.210.250.126 - 207.210.250.157
  • 207.210.250.131: starrynightnews.com. Hit.
  • 207.210.250.132: aeronet-news.com. Hit.
  • 207.210.250.133: bakaribulletin.com. Hit.
  • 207.210.250.134: deprensaenlarevisiondehoy.com. Hit.
  • 207.210.250.135: icwb-news.com. Hit.
  • 207.210.250.136: sportsreelhighlights.com. Hit.
  • 207.210.250.137: fashionforward.info. No archives. cqcounter.com/whois/www/fashionforward.info.html innovative but has a "Member" section. Stock lady visible somwhere at westlahairgrowth.com/?page_id=12158 according to Google images but I couldn't find it easily in the page.
  • 207.210.250.138: inquiry-human-past.com. Hit.
  • 207.210.250.139: thefairwaysaregreen.com. Hit.
  • 207.210.250.142: russiaupdate.com. Hit.
  • 207.210.250.143: archaeologyreview.net. Hit.
  • 207.210.250.144: highspeed-news.com. No archives. cqcounter.com/whois/www/highspeed-news.com.html not found.
  • 207.210.250.146: noticias-caracas.com. Hit.
  • 207.210.250.147: bailandstump.com. Hit.
  • 207.210.250.148: classicalmusic4arab.com. Hit.
  • 207.210.250.149: globalventurestat.com. Hit.
  • 207.210.250.152: al-rashidrealestate.com. Hit.
  • 207.210.250.153: newsintheworld-ru.com. Hit.
  • 207.210.250.154: news-unlimited.info. Hit.
208.93.112.105 fastnews-online.com. TULIP-SYSTEMS in United States. Checked viewdns.info range: 208.93.112.90 - 208.93.112.155
208.254.38.39 todaysengineering.com. COLO-PREM-VZB in United States.
  • Tested viewdns.info range: 208.254.38.9 - 208.254.38.86. Weirdly empty, doesn't even show the domain iteslf!
  • 68.178.232.100: source: securitytrails.com. 2009-11-24 - 2009-12-11, GoDaddy.com, LLC
208.254.40.117 worldnewsandent.com. COLO-PREM-VZB in United States. whois.arin.net/rest/net/NET-208-192-0-0-1/pft?s=208.254.40.117: Net Range 208.192.0.0 - 208.255.255.255. Tested viewdns.info range: 208.254.40.92 - 208.254.40.135
  • 208.254.40.96: sixty2media.com. Hit.
  • 208.254.40.99: newspoliticssource.com. Hit.
  • 208.254.40.110 musical-fortune.net. Hit.
  • 208.254.40.113: ashoka-gemstones.com. Hit.
  • 208.254.40.117: worldnewsandent.com. Hit.
  • 208.254.40.124: riskandrewardnews.com. Hit.
  • 208.254.40.129: mailb.casella.com. Legit.
208.254.42.205 driversinternationalgolf.com. COLO-PREM-VZB in United States. Tested viewdns.info range: 208.254.42.178 - 208.254.42.233.
209.162.192.49 rastadirect.net. DF-PTL2-3 in Gresham - United States. Source: securitytrails.com and cqcounter.com/site/rastadirect.net.html. Tested viewdns.info: 209.162.192.30 209.162.192.70
* 209.162.192.44: thejewelofsouthamerica.com. Hit.
* 209.162.192.49: rastadirect.net. Hit.
* 209.162.192.51: yellow-chair-report.com. Hit.
* 209.162.192.54: tutkulu-turu.com. Possible hit. domainsbyproxy.com 2008-03-04. Weird style made up exclusively of cut up images, including the text itself where links would normally be. Turkish. Archive a bit weird with images on top of text. 2011 Copyright 2006. Unarchived link to web.archive.org/web/20110129065840/http://tutkulu-turu.com/login.html with title "Kullanıcı adı" (Username). Headline "Online seyahat etmek acenta" translates to "Online travel agency".
* 209.162.192.57: globalnewsreports.net. Hit.
* 209.162.192.59: easytravelsite.net. Hit.
* 209.162.192.70: phrio.com. Off date. viewdns.info/reverseip/?t=1&host=209.162.192.70
210.80.75.55 philippinenewsonline.net. UUNET in Australia. Tested viewdns.info range: 210.80.75.30 - 210.80.75.67
  • 210.80.75.35: aroundtheworldnews.net. No archives. ipinf.ru/domains/210.80.75.33/ disagrees and places it at .33.
  • 210.80.75.36: e-commodities.net. Hit.
  • 210.80.75.37: trekkingtoday.com. Hit.
  • 210.80.75.41: multinews-33.com. Hit.
  • 210.80.75.42: movimientodenticias.com. No archives. cqcounter.com/whois/www/movimientodenticias.com.html blank.
  • 210.80.75.43: gulfandmiddleeastnews.com. Hit.
  • 210.80.75.44: whirlybirdinflight.com. Hit.
  • 210.80.75.45: kings-game.net. Hit.
  • 210.80.75.46: topglobalnewsdaily.com. Hit.
  • 210.80.75.49: recipe-dujour.com. Hit.
  • 210.80.75.53: sportsman-elite.com. Hit.
  • 210.80.75.55: philippinenewsonline.net. Hit.
  • 210.80.75.56: technewsforme.com. Hit.
  • 210.80.75.59: goldeportesnoticias.com. Hit.
  • 210.80.75.68: gigabyte-usa.com. Legit.
212.4.16.232 mynewscheck.com. UUNET in Cassano d'Adda - Italy. Found with: 2013 DNS Census virtual host cleanup heuristic keyword searches. Tested viewdns.info range: 212.4.16.214 - 212.4.17.198. ipinf.ru/domains/?search=212.4.17.125&cust=1 says they are /19, so .16 and .17 are both the same range from a registration perspective::
212.4.17.38 fightwithoutrules.com. UUNET in Cassano d'Adda - Italy. whois.arin.net/rest/net/NET-208-192-0-0-1/pft?s=208.254.40.117. Net Range: 208.192.0.0 - 208.255.255.255. Organization: Name: Verizon Business. Tested viewdns.info range: see 212.4.16.* above
  • 212.4.17.38: fightwithoutrules.com. Hit.
  • 212.4.17.41: newtechfrontier.com. Hit.
  • 212.4.17.43: smart-travel-consultant.com. Hit.
  • 212.4.17.46: atentlaloc.com. Hit.
  • 212.4.17.53: newsresolution.net. Hit.
  • 212.4.17.56: lesummumdelafinance.com. Hit.
  • 212.4.17.56: thepinnacleoffinance.com. No Wayback machine archives. cqcounter.com/whois/www/thepinnacleoffinance.com.html blank.
  • 212.4.17.61: tech-stop.org. Archive: 2011. Feels likely. No commons found. .org hit? Has subdomain "gear.tech-stop.org" according to 2013 DNS Census, which suggests CGI comms, but no links to it
  • 212.4.17.98: topbillingsite.com. Hit.
  • 212.4.17.122: b2bworldglobal.com. Hit.
  • 212.4.17.125: worldaroundyunnan.com. Hit.
  • 212.4.17.160: localtoglobalnews.com. Hit.
There were also some other reverse IP hits for fightwithoutrules.com, but no CIA websites there:
  • 204.11.56.25 - British Virgin Islands - Confluence Networks Inc - 2013-09-26. Many domains.
  • 208.91.197.19 - British Virgin Islands - Confluence Networks Inc - 2013-05-20. Many domains.
Other hits:
  • 208.91.197.132. rdns source: viewdns.info: "location" : "British Virgin Islands", "owner" : "Confluence Networks Inc", "lastseen" : "2013-09-26". So this is after the previous one, unlikely to be correct.
  • 205.178.189.131. source: securitytrails.com
212.4.18.129 sightseeingnews.com. UUNET in Cassano d'Adda - Italy. Found with: 2013 DNS Census virtual host cleanup heuristic keyword searches. Tested viewdns.info range: 212.4.18.115 - 212.4.18.148. TODO expand. Interesting wide/sparse range? Or perhaps it's two separate ranges?
212.209.74.105 globalbaseballnews.com. UUNET in Sweden. Tested viewdns.info range: 212.209.74.100 - 212.209.74.132. Found with: 2013 DNS Census virtual host cleanup heuristic keyword searches
212.209.79.40 hydradraco.com. UUNET in Sweden. Found with: visual inspection of full 2013 DNS Census virtual host cleanup list just after globalbaseballnews.com. Tested viewdns.info range: 212.209.79.35 - 212.209.79.63
  • 212.209.79.34: fgnl.net. Hit. securitytrails.com provides IP history:
    • 212.209.79.34: 2008-09-01 - 2010-04-19.
    • 212.4.18.133: 2010-04-19 - 2019-06-19. Tested viewdns.info range: 212.4.18.122 - 212.4.18.148
    both under MCI Communications Services, Inc. d/b/a Verizon Business.
  • 212.209.79.37: fitness-sources.com. Hit.
  • 212.209.79.40: hydradraco.com. Hit.
  • 212.209.79.41: noticiasdelmundolatino.com. Hit.
  • 212.209.79.42: suparakuvi.com. Hit.
  • 212.209.79.44: myigadgets.net. Unclear. 2010. tech. Contains some helpers to: iGoogle. This page is very interesting. and quite different from the others, as it contains highly specialized functionality. No known comms found. The choice of homepage languages is also very suspicious: Arabic, Farsi, French, Chinese and Spanish.
  • 212.209.79.46: cetusdelph.com. Hit.
  • 212.209.79.47: willtoworship.com. Hit. domainsbyproxy.com
  • 212.209.79.48: themvconnection.com. Hit.
  • 212.209.79.51: pi-resources.net. Hit.
  • 212.209.79.52: newel-adserver.com. Redirects to newel.com which is legit. cqcounter.com/whois/www/newel-adserver.com.html blank.
  • 212.209.79.53: ourscubaworld.com. Hit.
  • 212.209.79.58: tech-love-home.com. Hit.
  • 212.209.79.60: first-solo-aviation.com. Hit.
  • 212.209.79.61: china-destinations.org. Hit.
212.209.90.84 thenewseditor.com. UUNET in Sweden. Found with: 2013 DNS Census virtual host cleanup heuristic keyword searches. Tested viewdns.info range: 212.209.90.64 - 212.209.90.99
  • 212.209.90.69: worldedgenews.com. Hit.
  • 212.209.90.72: talkingpointnews.info. Hit.
  • 212.209.90.74: globalinvestmentnews.net. Hit.
  • 212.209.90.75: prebitinvestment.com. Hit.
  • 212.209.90.77: energy-bulb.com 2011. English. energy. Comms not found, but has unarchived link to: web.archive.org/web/20110128182345/https://webmail.energy-bulb.com/login.html. CGI comms variant?
  • 212.209.90.79: freeblink.com. No archives for timerange, then legit. cqcounter.com/whois/www/freeblink.com.html off-style
  • 212.209.90.80: nsmovies.net. Hit.
  • 212.209.90.82: middleeastjournal.net. Hit.
  • 212.209.90.84: thenewseditor.com. Hit.
  • 212.209.90.87: newsandweathersource.com. Hit.
  • 212.209.90.89: pakisports.com. Hit.
  • 212.209.90.90: vriha-aesthetics.com. Hit.
  • 212.209.90.92: amishkanews.com. Hit.
  • 212.209.90.93: theentertainbiz.com. Hit.
  • 212.209.90.94: eurosportssummary.com. Hit.
  • 212.209.91.14: teracom.net. Legit
216.93.248.194 esmundonoticias.com. TWDX in Chelmsford - United States.
216.104.38.114 all-sport-headlines.com. SINGLEHOP-LLC in United States.
216.105.98.152: modernarabicnews.com. SAVVY-NET in United States. Found with: 2013 DNS Census virtual host cleanup heuristic keyword searches. Tested viewdns.info range: 216.105.98.125 - 216.105.98.167
  • 216.105.98.118:
  • 216.105.98.132: europeantravelcafe.com. Hit.
  • 216.105.98.134: fuenteneta.com. Hit.
  • 216.105.98.135: ilat-news.com. Hit.
  • 216.105.98.136: etherealinspirations.net. Hit.
  • 216.105.98.137: the-news-zone.com. Hit.
  • 216.105.98.138: photozoomnews.com. No archives. cqcounter.com/whois/www/photozoomnews.com.html empty
  • 216.105.98.139: cultura-digital.net. Hit.
  • 216.105.98.140: uaeshoppingspree.com. Hit.
  • 216.105.98.141: jabarifootball.com. No archives. "Jabari" is a Swahili/Arabic name[ref]. cqcounter.com/whois/www/jabarifootball.com.html not found.
  • 216.105.98.142: globalreview-ar.com. No archives. Shame, could have been our first Argentinian site. cqcounter.com/whois/www/globalreview-ar.com.html empty.
  • 216.105.98.144: garanziadellasicurezza.com. Hit.
  • 216.105.98.145: montanismoaventura.com. Hit.
  • 216.105.98.146: large-format-news.com. Hit.
  • 216.105.98.147: nepalnewsbrief.com. Hit. dnshistory.org marks it as having IP 2010-03-10 -> 2010-08-15 216.169.148.94 [ref]. This range does feel a bit different from the others, too many broken archives, and relatively early ones too. Explored viewdns.info range: 216.169.148.84 - 216.169.148.104, empty for period. domainsbyproxy.com.
  • 216.105.98.148: teclafinance.com. Hit.
  • 216.105.98.149: entreman.com. Hit.
  • 216.105.98.152: modernarabicnews.com. Hit.
  • 216.105.98.153: global-headlines.com. Hit.
  • 216.105.98.154: everythingcricket.org. Hit.
  • 216.105.98.156: familyhealthonline.net. Hit.
  • 216.105.98.157: delacorne.com. Hit.
  • 216.105.98.158: econfutures.com. Hit.
  • 216.105.98.161: kstcloud.com. No archives. cqcounter.com/whois/www/kstcloud.com.html not found
219.90.61.123 journeystravelled.com. UUNET in Taiwan. Tested viewdns.info range: 219.90.61.100 - 219.90.61.133
219.90.62.243 fitness-dawg.com. UUNET in Taiwan. whois.arin.net/rest/net/NET-219-0-0-0-1/pft?s=219.90.62.243. Net Type: Allocated to APNIC. Tested viewdns.info range: unknown - 219.90.62.255
First we must start the tor servers with the tor-army command from: stackoverflow.com/questions/14321214/how-to-run-multiple-tor-processes-at-once-with-different-exit-ips/76749983#76749983
tor-army 100
and then use it on a newline separated domain name list to check;
./cdx-tor.sh infile.txt
This creates a directory infile.txt.cdx/ containing:
  • infile.txt.cdx/out00, out01, etc.: the suspected CDX lines from domains from each tor instance based on the simple criteria that the CDX can handle directly. We split the input domains into 100 piles, and give one selected pile per tor instance.
  • infile.txt.cdx/out: the final combined CDX output of out00, out01, ...
  • infile.txt.cdx/out.post: the final output containing only domain names that match further CLI criteria that cannot be easily encoded on the CDX query. This is the cleanest domain name list you should look into at the end basically.
Since archive is so abysmal in its data access, e.g. a Google BigQuery would solve our issues in seconds, we have to come up with creative ways of getting around their IP throttling.
The CIA doesn't play fair. They're actually the exact opposite of fair. So neither shall we.
This should allow a full sweep of the 4.5M records in 2013 DNS Census virtual host cleanup in a reasonable amount of time. After JAR/SWF/CGI filtering we obtained 5.8k domains, so a reduction factor of about 1 million with likely very few losses. Not bad.
5.8k is still a bit annoying to fully go over however, so we can also try to count CDX hits to the domains and remove anything with too many hits, since the CIA websites basically have very few archives:
cd 2013-dns-census-a-novirt-domains.txt.cdx
./cdx-tor.sh -d out.post domain-list.txt
cd out.post.cdx
cut -d' ' -f1 out | uniq -c | sort -k1 -n | awk 'match($2, /([^,]+),([^)]+)/, a) {printf("%s.%s %d\n", a[2], a[1], $1)}' > out.count
This gives us something like:
12654montana.com 1
aeronet-news.com 1
atohms.com 1
av3net.com 1
beechstreetas400.com 1
sorted by increasing hit counts, so we can go down as far as patience allows for!
New results from a full CDX scan of 2013-dns-census-a-novirt.csv:
  • 219.90.61.123 journeystravelled.com