Grepping the 2013 DNS Census first by overused CGI comms subdomains
secure.
and ssl.
leaves 200k lines. Grepping for the overused "news" led to hits:- secure.worldnewsandent.com,2012-02-13T21:28:15,208.254.40.117
- ssl.beyondnetworknews.com,2012-02-13T20:10:13,66.104.175.40
Also tried but failed:
sports
:- secure.motorsportdealers.com,2012-04-10T20:19:09,64.73.117.38 web.archive.org/web/20110501000000*/motorsportdealers.com
OK, after the initial successes in New results: only one...
secure.
, we went a bit more data intensive:- took all
secure.*
ssl.*
URLs in the 2013 DNS Census, 70k entries - cleaned up a bit, e.g. only
.com
or.net
. this left only, 30k entries only - lopped over all of them in archive CDX: Wayback Machine CDX scanning, searching for those that also end in
.cgi
web.archive.org/cdx/search/cdx?url=$domain&matchType=domain&filter=urlkey:.*.cgi&to=20140101000000. Took an afternoon, but no rate limit block. - this leaves about 1000, so we loop over all of them manually on web archive with a script, and opened any that had the pattern of very vew hits between 2010 and 2013 only, and on those check for visual/thematic style match. Careful not to make more than 15 requests per minute or else 5 min blacklist!
- 208.254.42.205 secure.driversinternationalgolf.com,2012-02-13T10:42:20,
After 2013 DNS Census virtual host cleanup heuristic keyword searches we later understood why there were so few hits here: the 2013 DNS Census didn't capture the
secure.
subdomains of many domains it had for some reason. Shame, because if it had, this method would have yielded many more results.Articles by others on the same topic
There are currently no matching articles.