It can't be HTML crawl because presumably there wouldn't have been links to those websites? Presumably this is why Common Crawl doesn't seem to have any hits.
The same question also applies to the 2013 DNS Census. It has less hits, but still has many.
Whatever they did, we are so so glad that they did!
Articles by others on the same topic
There are currently no matching articles.