Source: /cirosantilli/cia-2010-covert-communication-websites/secure-subdomain-search-on-2013-dns-census

= secure subdomain search on 2013 DNS Census

Grepping the <2013 DNS Census> first for the overused <CGI comms> subdomains `secure.` and `ssl.` leaves 200k lines. Grepping those for the overused keyword "news" led to hits:

Also tried but failed:
* `sports`
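
The two-stage grep described above can be sketched roughly as follows. This is a minimal sketch, not the exact commands that were run: it assumes that each census line starts with a hostname followed by comma-separated fields, and the function name `matching_hosts` and the sample hostnames are made up for illustration.

```python
import re

# Assumption: each census line starts with a hostname, followed by
# comma-separated fields that are ignored here.
SUBDOMAIN_RE = re.compile(r"^(secure|ssl)\.")

def matching_hosts(lines, keyword):
    """Yield hostnames starting with secure. or ssl. that contain keyword."""
    for line in lines:
        host = line.strip().split(",", 1)[0].lower()
        if SUBDOMAIN_RE.match(host) and keyword in host:
            yield host

# Made-up sample lines, not real census data.
sample = [
    "secure.worldnews-example.com,192.0.2.1",
    "www.worldnews-example.com,192.0.2.1",
    "ssl.shop-example.net,192.0.2.2",
]
print(list(matching_hosts(sample, "news")))
```

On the real data the input would be the lines of the census files rather than a hardcoded list.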

OK, after the initial successes in `secure.`, we went a bit more data intensive:
* took all `secure.*` and `ssl.*` URLs in the <2013 DNS Census>, 70k entries
* cleaned up a bit, e.g. kept only `.com` or `.net`. This left only 30k entries
* looped over all of them in archive CDX: <Wayback Machine CDX scanning>, searching for those that also end in `.cgi`\$domain&matchType=domain&filter=urlkey:.*.cgi&to=20140101000000[]. Took an afternoon, but no rate limit block.
* this leaves about 1000, so we looped over all of them on web archive with a script, manually opened any that had the pattern of very few hits between 2010 and 2013 only, and on those checked for a visual/thematic style match. Careful not to make more than 15 requests per minute, or else you get a 5 minute blacklist!
New results: only one...
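
The pipeline above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the script that was actually used: the endpoint `http://web.archive.org/cdx/search/cdx` and the `url` parameter name are assumed, the other query parameters are the ones mentioned above, and the helper names `keep_host`, `cdx_url`, `Throttle` and `looks_like_candidate` as well as the `max_captures` threshold are made up for illustration. No network request is made here, the code only builds the query URL and shows the pacing and filtering logic.

```python
import time
from urllib.parse import urlencode

# Assumption: the public Wayback Machine CDX server endpoint.
CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"

def keep_host(host):
    """Cleanup step: keep only .com and .net hostnames."""
    return host.endswith((".com", ".net"))

def cdx_url(domain):
    """Build a CDX query for .cgi captures under domain before 2014."""
    params = {
        "url": domain,
        "matchType": "domain",
        "filter": "urlkey:.*.cgi",
        "to": "20140101000000",
    }
    return CDX_ENDPOINT + "?" + urlencode(params)

class Throttle:
    """Space out calls so that at most max_per_minute happen per minute."""

    def __init__(self, max_per_minute=15, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / max_per_minute
        self.clock = clock
        self.sleep = sleep
        self.last = None

    def wait(self):
        now = self.clock()
        if self.last is not None:
            remaining = self.min_interval - (now - self.last)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last = now

def looks_like_candidate(timestamps, max_captures=20):
    """Heuristic: few captures, all of them between 2010 and 2013.

    timestamps are 14-digit CDX timestamps such as "20110315123000".
    max_captures is a hypothetical cutoff for "very few hits".
    """
    if not timestamps or len(timestamps) > max_captures:
        return False
    return all(2010 <= int(ts[:4]) <= 2013 for ts in timestamps)

# Made-up hostname, just to show the shape of the generated query.
print(cdx_url("secure.example.com"))
```

In a real run, `Throttle().wait()` would be called before each request to the web archive, which with the default of 15 requests per minute means at least 4 seconds between requests.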

After <2013 DNS Census virtual host cleanup heuristic keyword searches> we later understood why there were so few hits here: for some reason, the <2013 DNS Census> didn't capture the `secure.` subdomains of many of the domains that it did contain. Shame, because if it had, this method would have yielded many more results.

{title=You can never have enough <Wayback Machine> tabs open}