Common Crawl web graph
ID: common-crawl-web-graph
In 2017 Common Crawl apparently started publishing their own web graphs, i.e. they parse the HTML and extract the graph of what links to what.
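To make "extract the graph of what links to what" concrete, here is a minimal sketch using only the Python standard library. It is not the cc-webgraph pipeline: the `LinkExtractor` class, the `host_edges` helper, the choice of hosts as graph nodes, and the sample HTML are all assumptions made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkExtractor(HTMLParser):
    """Collect the href targets of <a> tags, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links such as "/about" to absolute URLs.
                self.links.append(urljoin(self.base_url, value))


def host_edges(page_url, html):
    """Return the set of (source host, target host) edges for one page.

    Hypothetical helper: using hosts as nodes is an illustrative assumption.
    """
    parser = LinkExtractor(page_url)
    parser.feed(html)
    src = urlsplit(page_url).hostname
    edges = set()
    for link in parser.links:
        dst = urlsplit(link).hostname
        # Skip links without a host (e.g. mailto:) and links back to the same host.
        if dst and dst != src:
            edges.add((src, dst))
    return edges


# Made-up page content, only for demonstration.
sample_html = """
<a href="/about">About</a>
<a href="https://commoncrawl.org/">Common Crawl</a>
<a href="mailto:someone@example.org">Mail</a>
<a href="https://github.com/commoncrawl/cc-webgraph">Source</a>
"""
edges = host_edges("https://example.com/index.html", sample_html)
print(sorted(edges))
```

Running this over many pages and taking the union of the edge sets would give a toy version of such a graph.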
Edit: actually, they already calculate PageRank for us!!! Fantastic!!! Main section: Section "Common Crawl web graph official PageRank".
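For intuition about what a PageRank score over such a graph means, here is a toy power-iteration sketch. It is not Common Crawl's implementation: the `pagerank` function, the damping factor of 0.85, the fixed iteration count, and the four-node example graph are all assumptions chosen for illustration.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Toy PageRank by power iteration over a list of (source, target) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    # Start from a uniform distribution over all nodes.
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is spread evenly over all nodes.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {n: base for n in nodes}
        for src in nodes:
            for dst in out_links[src]:
                # Each node splits its damped rank equally among its out-links.
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Made-up graph: a -> b -> c -> a, plus d -> a.
toy_edges = [
    ("a.example", "b.example"),
    ("b.example", "c.example"),
    ("c.example", "a.example"),
    ("d.example", "a.example"),
]
ranks = pagerank(toy_edges)
for node, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(node, round(score, 4))
```

In this toy graph `a.example` ends up with the highest score because it is the only node with two incoming links, and `d.example` ends up lowest because nothing links to it.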
A quick exploration of the graph can be seen at: github.com/cirosantilli/cirosantilli.github.io/issues/198
Their source code is at: github.com/commoncrawl/cc-webgraph