Source: /cirosantilli/cia-2010-covert-communication-websites/secure-subdomain-search-on-2013-dns-census

= secure subdomain search on 2013 DNS Census

Grepping the <2013 DNS Census> first by the overused <CGI comms> subdomains `secure.` and `ssl.` leaves 200k lines. Grepping those for the overused keyword "news" led to hits:
* secure.worldnewsandent.com,2012-02-13T21:28:15,208.254.40.117
* ssl.beyondnetworknews.com,2012-02-13T20:10:13,66.104.175.40
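
The two-stage grep above can be sketched in Python as follows. This is only an illustration, not the exact commands that were run: it assumes each census row has the `host,timestamp,ip` layout seen in the hits above, and the function name `filter_census` and the `www.example.com` row are made up for the example.

```python
# Subdomain prefixes that are grepped for first.
PREFIXES = ("secure.", "ssl.")


def filter_census(lines, keyword=None):
    """Yield (host, timestamp, ip) for rows whose host starts with
    secure. or ssl. and, if a keyword is given, also contains it."""
    for line in lines:
        parts = line.strip().split(",")
        if len(parts) != 3:
            continue  # skip rows that do not match the assumed layout
        host, timestamp, ip = parts
        host = host.lower()
        if not host.startswith(PREFIXES):
            continue
        if keyword is not None and keyword not in host:
            continue
        yield host, timestamp, ip


sample = [
    "secure.worldnewsandent.com,2012-02-13T21:28:15,208.254.40.117",
    "ssl.beyondnetworknews.com,2012-02-13T20:10:13,66.104.175.40",
    "secure.motorsportdealers.com,2012-04-10T20:19:09,64.73.117.38",
    "www.example.com,2012-01-01T00:00:00,192.0.2.1",  # placeholder row
]
news_hits = list(filter_census(sample, keyword="news"))
for row in news_hits:
    print(",".join(row))
```

On the real data, `lines` would be an open file handle over the census dump, so the filter streams through it without loading everything in memory.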

Also tried but failed:
* `sports`:
  * secure.motorsportdealers.com,2012-04-10T20:19:09,64.73.117.38 https://web.archive.org/web/20110501000000*/motorsportdealers.com[]

OK, after the initial successes in `secure.`, we went a bit more data intensive:
* took all `secure.*` `ssl.*` URLs in the <2013 DNS Census>, 70k entries
* cleaned up a bit, e.g. kept only `.com` or `.net`. This left 30k entries
* looped over all of them in archive CDX: <Wayback Machine CDX scanning>, searching for those that also have URLs ending in `.cgi` https://web.archive.org/cdx/search/cdx?url=\$domain&matchType=domain&filter=urlkey:.*.cgi&to=20140101000000[]. Took an afternoon, but no rate limit block.
* this left about 1000, so we looped over all of them manually on web archive with a script, and opened any that had the pattern of very few hits between 2010 and 2013 only, and on those checked for a visual/thematic style match. Careful not to make more than 15 requests per minute, or else you get a 5 min blacklist!
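
A rough sketch of how the CDX query URL and the request pacing from the steps above could be scripted. The query parameters are copied from the URL quoted above; everything else (the names `cdx_cgi_url` and `Throttle`, and the choice to space requests evenly rather than burst) is an assumption made for illustration, not the actual script used.

```python
import time
import urllib.parse

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_cgi_url(domain):
    """Build the CDX query quoted above for a single domain."""
    params = {
        "url": domain,
        "matchType": "domain",
        "filter": "urlkey:.*.cgi",
        "to": "20140101000000",
    }
    # Keep ':' and '*' unescaped so the URL matches the one in the text.
    return CDX_ENDPOINT + "?" + urllib.parse.urlencode(params, safe=":*")


class Throttle:
    """Space calls evenly so at most max_per_minute happen per minute."""

    def __init__(self, max_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / max_per_minute
        self.clock = clock
        self.sleep = sleep
        self.last = None  # time of the previous call, if any

    def wait(self):
        now = self.clock()
        if self.last is not None:
            remaining = self.min_interval - (now - self.last)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last = now


if __name__ == "__main__":
    throttle = Throttle(max_per_minute=15)
    for domain in ["secure.worldnewsandent.com", "ssl.beyondnetworknews.com"]:
        throttle.wait()  # call this right before each real request
        print(cdx_cgi_url(domain))
```

The `clock` and `sleep` parameters only exist so that the pacing can be tested without actually waiting. A slightly lower `max_per_minute` than 15 would leave some safety margin against the blacklist.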
New results: only one...
* secure.driversinternationalgolf.com,2012-02-13T10:42:20,208.254.42.205
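
The "very few hits between 2010 and 2013 only" pattern used to pick which domains to open by hand could be expressed roughly as below. This is a hypothetical helper: the name `looks_like_candidate` and the `max_hits=10` cutoff are assumptions, since the selection was done by eye, and the timestamps in the usage lines are made up. It takes the 14-digit `YYYYMMDDhhmmss` capture timestamps that the Wayback Machine uses.

```python
def looks_like_candidate(timestamps, max_hits=10):
    """Return True if there are a few captures, all within 2010 to 2013.

    timestamps: list of 14-digit capture timestamps (YYYYMMDDhhmmss).
    max_hits: assumed cutoff for "very few hits", not from the article.
    """
    if not timestamps or len(timestamps) > max_hits:
        return False
    # The first four digits of each timestamp are the capture year.
    return all(2010 <= int(ts[:4]) <= 2013 for ts in timestamps)


# Made-up capture lists, for illustration only.
print(looks_like_candidate(["20110203040506", "20120101000000"]))
print(looks_like_candidate(["20050101000000", "20110203040506"]))
```

Anything that passes still needs the visual/thematic check, since plenty of unrelated short-lived sites fit the same capture pattern.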

After <2013 DNS Census virtual host cleanup heuristic keyword searches> we later understood why there were so few hits here: the <2013 DNS Census> didn't capture the `secure.` subdomains of many domains it had for some reason. Shame, because if it had, this method would have yielded many more results.

\Image[https://raw.githubusercontent.com/cirosantilli/media/master/cia-2010-covert-communication-websites/archive-tabs.png]
{title=You can never have enough <Wayback Machine> tabs open}