Fourier series by Ciro Santilli 37 Updated 2025-07-16
Approximates an original function by sines. If the function is "well behaved enough", the approximation is to arbitrary precision.
Fourier's original motivation, and a key application, is solving partial differential equations with the Fourier series.
The Fourier series behaves really nicely in , where it always exists and converges pointwise to the function: Carleson's theorem.
Video 1.
But what is a Fourier series? by 3Blue1Brown (2019)
Source. Amazing 2D visualization of the decomposition of complex functions.
Khronos Group by Ciro Santilli 37 Updated 2025-07-16
The fact that they kept the standard open source makes them huge heroes, see also: closed standard.
Shame that many (most?) of their proposals just die out.
torchvision ResNet by Ciro Santilli 37 Updated 2025-07-16
That example uses a ResNet pre-trained on the COCO dataset to do some inference, tested on Ubuntu 22.10:
cd python/pytorch
wget -O resnet_demo_in.jpg https://upload.wikimedia.org/wikipedia/commons/thumb/6/60/Rooster_portrait2.jpg/400px-Rooster_portrait2.jpg
./resnet_demo.py resnet_demo_in.jpg resnet_demo_out.jpg
This first downloads the model, which is currently 167 MB.
We know it is COCO because of the docs: pytorch.org/vision/0.13/models/generated/torchvision.models.detection.fasterrcnn_resnet50_fpn_v2.html which explains that
FasterRCNN_ResNet50_FPN_V2_Weights.DEFAULT
is an alias for:
FasterRCNN_ResNet50_FPN_V2_Weights.COCO_V1
The runtime is relatively slow on P51, about 4.7s.
After it finishes, the program prints the recognized classes:
['bird', 'banana']
so we get the expected bird, but also the more intriguing banana.
By looking at the output image with bounding boxes, we understand where the banana came from!
Figure 1.
python/pytorch/resnet_demo_in.jpg
. Source.
Figure 2.
python/pytorch/resnet_demo_out.jpg
. The beak was of course a banana, not a beak!
So far, no new domains have been found with Common Crawl, nor have any existing known domains been found to be present in Common Crawl. Our working theory is that Common Crawl never reached the domains How did Alexa find the domains?
Let's try and do something with Common Crawl.
Unfortunately there's no IP data apparently: github.com/commoncrawl/cc-index-table/issues/30, so let's focus on the URLs.
Hello world:
select * from "ccindex"."ccindex" limit 100;
Data scanned: 11.75 MB
Sample first output line:
#                            2
url_surtkey                  org,whwheelers)/robots.txt
url                          https://whwheelers.org/robots.txt
url_host_name                whwheelers.org
url_host_tld                 org
url_host_2nd_last_part       whwheelers
url_host_3rd_last_part
url_host_4th_last_part
url_host_5th_last_part
url_host_registry_suffix     org
url_host_registered_domain   whwheelers.org
url_host_private_suffix      org
url_host_private_domain      whwheelers.org
url_host_name_reversed
url_protocol                 https
url_port
url_path                     /robots.txt
url_query
fetch_time                   2021-06-22 16:36:50.000
fetch_status                 301
fetch_redirect               https://www.whwheelers.org/robots.txt
content_digest               3I42H3S6NNFQ2MSVX7XZKYAYSCX5QBYJ
content_mime_type            text/html
content_mime_detected        text/html
content_charset
content_languages
content_truncated
warc_filename                crawl-data/CC-MAIN-2021-25/segments/1623488519183.85/robotstxt/CC-MAIN-20210622155328-20210622185328-00312.warc.gz
warc_record_offset           1854030
warc_record_length           639
warc_segment                 1623488519183.85
crawl                        CC-MAIN-2021-25
subset                       robotstxt
So url_host_3rd_last_part might be a winner for CGI comms fingerprinting!
Naive one for one index:
select * from "ccindex"."ccindex" where url_host_registered_domain = 'conquermstoday.com' limit 100;
have no results... data scanned: 5.73 GB
Let's see if they have any of the domain hits. Let's also restrict by date to try and reduce the data scanned:
select * from "ccindex"."ccindex" where
  fetch_time < TIMESTAMP '2014-01-01 00:00:00' AND
  url_host_registered_domain IN (
   'activegaminginfo.com',
   'altworldnews.com',
   ...
   'topbillingsite.com',
   'worldwildlifeadventure.com'
 )
Humm, data scanned: 60.59 GB and no hits... weird.
Sanity check:
select * from "ccindex"."ccindex" WHERE
  crawl = 'CC-MAIN-2013-20' AND
  subset = 'warc' AND
  url_host_registered_domain IN (
   'google.com',
   'amazon.com'
 )
has a bunch of hits of course. Data scanned: 212.88 MB, WHERE crawl and subset are a must! Should have read the article first.
Let's widen a bit more:
select * from "ccindex"."ccindex" WHERE
  crawl IN (
    'CC-MAIN-2013-20',
    'CC-MAIN-2013-48',
    'CC-MAIN-2014-10'
  ) AND
  subset = 'warc' AND
  url_host_registered_domain IN (
    'activegaminginfo.com',
    'altworldnews.com',
    ...
    'worldnewsandent.com',
    'worldwildlifeadventure.com'
 )
Still nothing found... they don't seem to have any of the URLs of interest?
Suona by Ciro Santilli 37 Updated 2025-07-16
One of the most flashy chinese musical instrument!
It is very similar to the Indian shehnai.
Video 1.
Wang Jin beats Gao Qiu theme music from The Water Margin
. Source.
CIDARLAB/cello by Ciro Santilli 37 Updated 2025-07-16
The input is in Verilog! Overkill?
Then it essentially maps to a standard cell library of biological primitives!

Pinned article: Introduction to the OurBigBook Project

Welcome to the OurBigBook Project! Our goal is to create the perfect publishing platform for STEM subjects, and get university-level students to write the best free STEM tutorials ever.
Everyone is welcome to create an account and play with the site: ourbigbook.com/go/register. We belive that students themselves can write amazing tutorials, but teachers are welcome too. You can write about anything you want, it doesn't have to be STEM or even educational. Silly test content is very welcome and you won't be penalized in any way. Just keep it legal!
We have two killer features:
  1. topics: topics group articles by different users with the same title, e.g. here is the topic for the "Fundamental Theorem of Calculus" ourbigbook.com/go/topic/fundamental-theorem-of-calculus
    Articles of different users are sorted by upvote within each article page. This feature is a bit like:
    • a Wikipedia where each user can have their own version of each article
    • a Q&A website like Stack Overflow, where multiple people can give their views on a given topic, and the best ones are sorted by upvote. Except you don't need to wait for someone to ask first, and any topic goes, no matter how narrow or broad
    This feature makes it possible for readers to find better explanations of any topic created by other writers. And it allows writers to create an explanation in a place that readers might actually find it.
    Figure 1.
    Screenshot of the "Derivative" topic page
    . View it live at: ourbigbook.com/go/topic/derivative
  2. local editing: you can store all your personal knowledge base content locally in a plaintext markup format that can be edited locally and published either:
    This way you can be sure that even if OurBigBook.com were to go down one day (which we have no plans to do as it is quite cheap to host!), your content will still be perfectly readable as a static site.
    Figure 5. . You can also edit articles on the Web editor without installing anything locally.
    Video 3.
    Edit locally and publish demo
    . Source. This shows editing OurBigBook Markup and publishing it using the Visual Studio Code extension.
  3. https://raw.githubusercontent.com/ourbigbook/ourbigbook-media/master/feature/x/hilbert-space-arrow.png
  4. Infinitely deep tables of contents:
    Figure 6.
    Dynamic article tree with infinitely deep table of contents
    .
    Descendant pages can also show up as toplevel e.g.: ourbigbook.com/cirosantilli/chordate-subclade
All our software is open source and hosted at: github.com/ourbigbook/ourbigbook
Further documentation can be found at: docs.ourbigbook.com
Feel free to reach our to us for any help or suggestions: docs.ourbigbook.com/#contact