Katz centrality by Ciro Santilli 37 Created 2025-02-26 Updated 2025-07-16
Just imagine being famous only for being 44 years too early to a party.
The downside of "Katz centrality" compared to PageRank appears to be that if a big node links to many nodes, all of those earn a lot of reputation, regardless of how many outgoing links the big node has.
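A minimal sketch of that difference, using textbook formulations of both algorithms rather than any particular library. The toy graph, the node names (`fan0`, `hub`, `t0`) and the parameter values `alpha=0.1`, `beta=1.0` and `d=0.85` are all assumptions made up for illustration. A hub with ten incoming links points at either 2 or 20 targets, and we look at what a single target earns.

```python
def katz(nodes, edges, alpha=0.1, beta=1.0, iters=100):
    """Katz centrality by fixed-point iteration:
    x_i = beta + alpha * (sum of x_j over every link j -> i).
    Converges only if alpha is small enough; this toy graph is acyclic, so it does."""
    x = {n: 0.0 for n in nodes}
    for _ in range(iters):
        new = {n: beta for n in nodes}
        for src, dst in edges:
            new[dst] += alpha * x[src]  # full score of src, not divided by out-degree
        x = new
    return x


def pagerank(nodes, edges, d=0.85, iters=100):
    """PageRank by power iteration; dangling nodes spread their score uniformly."""
    n_nodes = len(nodes)
    outdeg = {n: 0 for n in nodes}
    for src, _ in edges:
        outdeg[src] += 1
    x = {n: 1.0 / n_nodes for n in nodes}
    for _ in range(iters):
        dangling = sum(x[n] for n in nodes if outdeg[n] == 0)
        new = {n: (1.0 - d) / n_nodes + d * dangling / n_nodes for n in nodes}
        for src, dst in edges:
            new[dst] += d * x[src] / outdeg[src]  # score of src is split among its links
        x = new
    return x


def make_graph(k):
    """Ten fans link to one hub; the hub links to the first k of 20 targets."""
    fans = [f"fan{i}" for i in range(10)]
    targets = [f"t{i}" for i in range(20)]
    nodes = fans + ["hub"] + targets
    edges = [(f, "hub") for f in fans] + [("hub", t) for t in targets[:k]]
    return nodes, edges


katz_few = katz(*make_graph(2))
katz_many = katz(*make_graph(20))
pr_few = pagerank(*make_graph(2))
pr_many = pagerank(*make_graph(20))

print("Katz     t0:", katz_few["t0"], "vs", katz_many["t0"])
print("PageRank t0:", pr_few["t0"], "vs", pr_many["t0"])
```

In this sketch the Katz score of `t0` is the same whether the hub links to 2 or 20 targets, while its PageRank score drops as the hub's outgoing links multiply.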
Eigenvector centrality by Ciro Santilli 37 Created 2025-02-26 Updated 2025-07-16
This is the family of algorithms to which PageRank belongs.
Open PageRank implementation and data by Ciro Santilli 37 Created 2025-02-26 Updated 2025-07-16
This section is about more "open" PageRank implementations, notably using either or both of:
As of 2025, the most open and reproducible implementation appears to be whatever Common Crawl web graph official PageRank does, which is to use WebGraph. It's quite beautiful.
Intel employee grade by Ciro Santilli 37 Created 2025-02-26 Updated 2025-07-16
Common Crawl web graph by Ciro Santilli 37 Created 2025-02-26 Updated 2025-07-16
In 2017 they apparently started making their own web graphs, i.e. they parse the HTML and extract the graph of what links to what.
This is exactly what we need for an open implementation of PageRank.
Edit: actually, they already calculate PageRank for us!!! Fantastic!!! Main section: Section "Common Crawl web graph official PageRank".
The graphs are dumped in BVGraph format.
A quick exploration of the graph can be seen at: github.com/cirosantilli/cirosantilli.github.io/issues/198
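To make "parse the HTML and extract the graph of what links to what" concrete, here is a minimal sketch using only the Python standard library. It is not Common Crawl's actual pipeline: the `LinkExtractor` and `host_edges` names, the two toy pages, and the choice to collapse URLs to host-level edges are all assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the absolute target of every <a href="..."> in one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the URL of the page itself.
                self.links.append(urljoin(self.base_url, value))


def host_edges(pages):
    """Turn {url: html} into a set of (source host, target host) edges.
    Links within the same host and links without a host (e.g. mailto:) are skipped."""
    edges = set()
    for url, html in pages.items():
        parser = LinkExtractor(url)
        parser.feed(html)
        src = urlparse(url).hostname
        for link in parser.links:
            dst = urlparse(link).hostname
            if dst and dst != src:
                edges.add((src, dst))
    return edges


# Hypothetical crawl of two pages.
pages = {
    "https://a.example/index.html":
        '<a href="https://b.example/">b</a> <a href="/local">self</a>',
    "https://b.example/":
        '<a href="https://a.example/x">a</a> <a href="https://c.example/">c</a>',
}
edges = host_edges(pages)
print(sorted(edges))
```

An edge list like this is the kind of input a PageRank computation needs. A real crawl would also have to deal with redirects, malformed markup and scale, none of which this sketch attempts.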
Intel department by Ciro Santilli 37 Created 2025-02-26 Updated 2025-07-16
Intel hardware by Ciro Santilli 37 Created 2025-02-26 Updated 2025-07-16
Intel employee by Ciro Santilli 37 Created 2025-02-26 Updated 2025-07-16
École Polytechnique alumnus by year by Ciro Santilli 37 Created 2025-02-26 Updated 2025-07-16
For example, Ciro Santilli's year started in 2009, though as a foreign student he arrived only at the start of 2010, and Ciro's promotion is usually known just as X09. Now that the century barrier has been crossed, we'll one day need to write it out in full as X2009.
Rob Pike by Ciro Santilli 37 Created 2025-02-26 Updated 2025-07-16
PhD thesis by Ciro Santilli 37 Created 2025-02-26 Updated 2025-07-16
WebGraph (software) by Ciro Santilli 37 Created 2025-02-26 Updated 2025-07-16
BVGraph by Ciro Santilli 37 Created 2025-02-26 Updated 2025-07-16
The native file format of WebGraph.
It is a binary format and highly storage efficient.
TODO meaning of "BV"?
A quick hands-on introduction to the format by Ciro Santilli can be found at: github.com/cirosantilli/cirosantilli.github.io/issues/198
Cancer research by Ciro Santilli 37 Created 2025-02-26 Updated 2025-07-16
