I wanted to do a quick exploration of open PageRank implementations and data.
My general motivation for this is that a PageRank-like algorithm could be useful for more accurate user and article ranking on OurBigBook, see: Section "PageRank-like ranking".
But it could also be just generally cool to apply it to other graph datasets, e.g. for computing a Wikipedia-internal PageRank.
A quick Google search reveals only Open PageRank, but their methods are apparently closed source.
Then I had a look at the Common Crawl web graph data to see if I could easily calculate it myself, and... they already have it! See: Section "Common Crawl web graph official PageRank".
Their graph dumps are in the BVGraph file format, the native format of the WebGraph framework, which implements both the format and algorithms such as PageRank.
The only thing I miss is a command line interface to calculate the PageRank. That would be so awesome.
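As a rough idea of what such a command line tool would compute, here is a minimal power-iteration PageRank sketch in pure Python. This is not the WebGraph implementation, nor necessarily how Common Crawl computes its ranks: the function name `pagerank`, the damping factor of 0.85, the fixed iteration count and the toy edge list are all assumptions made for illustration.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of (source, target) edges.

    Illustrative sketch only: names and default parameters are assumptions.
    """
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        # Every other node splits its damped rank equally among its targets.
        for src in nodes:
            if out_links[src]:
                share = damping * rank[src] / len(out_links[src])
                for dst in out_links[src]:
                    new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical toy graph, not real web data.
toy_edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a"), ("d", "c")]
ranks = pagerank(toy_edges)
for node, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(node, round(score, 4))
```

On a graph the size of the Common Crawl one, a dictionary-based loop like this would be far too slow and memory hungry, which is presumably why compressed formats such as BVGraph exist in the first place.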
The more I look at it the more I love Common Crawl.
In cc-main-2024-25-dec-jan-feb-domain-ranks.txt:
  • cirosantilli.com was ranked ~453k
  • ourbigbook.com was at ~606k
White-East asian mixed by Ciro Santilli 37 Created 2025-02-23 Updated 2025-07-16
Multiracial by Ciro Santilli 37 Created 2025-02-23 Updated 2025-07-16
White people by Ciro Santilli 37 Created 2025-02-23 Updated 2025-07-16
Atherton, California by Ciro Santilli 37 Created 2025-02-23 Updated 2025-07-16
San Mateo County by Ciro Santilli 37 Created 2025-02-23 Updated 2025-07-16
Municipality in Illinois by Ciro Santilli 37 Created 2025-02-23 Updated 2025-07-16
County in California by Ciro Santilli 37 Created 2025-02-23 Updated 2025-07-16
East Asian by Ciro Santilli 37 Created 2025-02-23 Updated 2025-07-16
Eastern Europe by Ciro Santilli 37 Created 2025-02-23 Updated 2025-07-16
Apple Verdiell by Ciro Santilli 37 Created 2025-02-23 Updated 2025-07-16
Apolline Verdiell by Ciro Santilli 37 Created 2025-02-23 Updated 2025-07-16
Aurelie Verdiell by Ciro Santilli 37 Created 2025-02-23 Updated 2025-07-16
Hoang Oanh Verdiell by Ciro Santilli 37 Created 2025-02-23 Updated 2025-07-16
Western pseudonym of East Asian person by Ciro Santilli 37 Created 2025-02-23 Updated 2025-07-16
Many East Asians, notably Chinese immigrants, choose to adopt a Western pseudonym to make it easier for Western people who don't speak their language to address them and remember their name.
Cowards! Ciro Santilli would much rather just torture foreigners into learning his language. But fair play.
More interestingly, however, some of the names chosen are not typical names, and some end up being very cute or mildly funny. Perhaps this is partly linked to how given names are getting weirder.
Marc Verdiell's children by Ciro Santilli 37 Created 2025-02-23 Updated 2025-07-16
