NIST Post-Quantum Cryptography Standardization
This post-quantum cryptography competition by NIST is a huge milestone for the field.
It was mind blowing when in 2022, after several years of selection, one of the 7 finalists was broken on a classical computer, not even on a quantum computer! news.ycombinator.com/item?id=30466063 | eprint.iacr.org/2022/214 Breaking Rainbow Takes a Weekend on a Laptop by Ward Beullens. Dude announced on Twitter that he had a break a few days before submission: twitter.com/WardBeullens/status/1492780462028300290 He's so young. Epic.
Edit: and then, after the third round, things were a bit unclear, so they made a fourth round with 4 candidates carried over from round 3, and in August 2022 one of the four was broken again on a classical CPU!!! OMG: arstechnica.com/information-technology/2022/08/sike-once-a-post-quantum-encryption-contender-is-koed-in-nist-smackdown/
Tesla (unit)
yolov5-pip
OK, now we're talking, two liner and you get a window showing bounding box object detection from your webcam feed!
python -m pip install -U yolov5==7.0.9
yolov5 detect --source 0
The accuracy is crap for anything but people. But still. Well done. Tested on Ubuntu 22.10, P51.
Video 1. fcakyon/yolov5-pip webcam object detection demo by Ciro Santilli (2023).
Common Crawl
Amazing project, that basically makes a more searchable Wayback Machine.
A bit hard to use their data though, partly due to size, but also due to the lack of free-to-use querying mechanisms, and how obtuse Amazon S3 is to use.
Notably, aws-cli with an account is the only reliable way, everything else is way too broken, e.g. trying to check an index such as index.commoncrawl.org/CC-MAIN-2023-06/ very often 500s.
But still, their project is amazing.
The only out-of-the-box search they seem to have is: urlsearch.commoncrawl.org/ for domains/URLs. It is good, but there could be so much more... notably IPs.
They could also document the data shape a bit better.
To explore the data, after login:
aws s3 ls s3://commoncrawl/crawl-data/CC-MAIN-2013-20/
Copy the toplevel directory only:
aws s3 cp s3://commoncrawl/crawl-data/CC-MAIN-2013-20/ . --recursive --exclude "*/*"
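The top-level directory copied above contains gzip-compressed listings such as wat.paths.gz and wet.paths.gz, one object key per line. A minimal Python sketch, assuming those keys are relative to the same s3://commoncrawl/ bucket used in the commands here, that turns a listing into full URLs to feed to aws s3 cp. The helper names to_s3_url and read_paths are made up for this illustration:

```python
import gzip

# Assumption: keys in the *.paths.gz listings are relative to this bucket,
# which is the one used by the aws commands in this section.
BUCKET = "s3://commoncrawl/"


def to_s3_url(key):
    """Turn one line of a *.paths.gz listing into a full S3 URL."""
    return BUCKET + key.strip()


def read_paths(paths_gz):
    """Yield an S3 URL for every non-empty line of a gzip-compressed listing."""
    with gzip.open(paths_gz, "rt", encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield to_s3_url(line)
```

For example, `next(read_paths("wat.paths.gz"))` would give the URL of the first WAT file of the crawl, which can then be passed to aws s3 cp.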
Copy some wet/wat files:
aws s3 cp s3://commoncrawl/crawl-data/CC-MAIN-2013-20/segments/1368696381249/wat/CC-MAIN-20130516092621-00000-ip-10-60-113-184.ec2.internal.warc.wat.gz .
aws s3 cp s3://commoncrawl/crawl-data/CC-MAIN-2013-20/segments/1368696381249/wet/CC-MAIN-20130516092621-00000-ip-10-60-113-184.ec2.internal.warc.wet.gz .
Directory structure:
  • cc-index.paths.gz (1K)
  • cc-index-table.paths.gz (1K)
  • segment.paths.gz (1.7K) Sample lines:
    crawl-data/CC-MAIN-2013-20/segments/1368696381249/
    crawl-data/CC-MAIN-2013-20/segments/1368696381630/
  • index.html (2.3K)
  • wat.paths.gz (98K) Sample lines:
    crawl-data/CC-MAIN-2013-20/segments/1368696381249/wat/CC-MAIN-20130516092621-00000-ip-10-60-113-184.ec2.internal.warc.wat.gz
    crawl-data/CC-MAIN-2013-20/segments/1368696381249/wat/CC-MAIN-20130516092621-00001-ip-10-60-113-184.ec2.internal.warc.wat.gz
  • wet.paths.gz (98K) Sample lines:
    crawl-data/CC-MAIN-2013-20/segments/1368696381249/wet/CC-MAIN-20130516092621-00000-ip-10-60-113-184.ec2.internal.warc.wet.gz
    crawl-data/CC-MAIN-2013-20/segments/1368696381249/wet/CC-MAIN-20130516092621-00001-ip-10-60-113-184.ec2.internal.warc.wet.gz
  • warc.paths.gz (99K) Sample lines:
    crawl-data/CC-MAIN-2013-20/segments/1368696381249/warc/CC-MAIN-20130516092621-00000-ip-10-60-113-184.ec2.internal.warc.gz
    crawl-data/CC-MAIN-2013-20/segments/1368696381249/warc/CC-MAIN-20130516092621-00001-ip-10-60-113-184.ec2.internal.warc.gz
  • segments: directory with actual data
    • 1368696381249: one of many segments, any meaning of name?
      • CC-MAIN-20130516092621-00000-ip-10-60-113-184.ec2.internal.warc.wet.gz (142M, 334M unzipped)
        A tiny bit of metadata, and then plaintext content from the website, e.g. the second one:
        WARC/1.0
        WARC-Type: conversion
        WARC-Target-URI: http://004eeb5.netsolhost.com/stephensilver.htm
        WARC-Date: 2013-05-18T08:11:02Z
        WARC-Record-ID: <urn:uuid:773b31ba-ddc6-47a5-ae24-d08141b9944d>
        WARC-Refers-To: <urn:uuid:4b1bdbff-4926-4ced-86f6-072f5bb3837a>
        WARC-Block-Digest: sha1:LQFSCR2LIJQYMPTXRHWU7HAPQTVSYS3A
        Content-Type: text/plain
        Content-Length: 12046
        
        Stephen Silver is a journalist and editor who specializes in the areas of politics, pop culture, film and sports. He works as an editor with the North American Publishing Co. and as a film critic with The Trend, a local newspaper in the Philadelphia area.
        No IP unfortunately.
      • CC-MAIN-20130516092621-00000-ip-10-60-113-184.ec2.internal.warc.wat.gz (329M, 1.4G unzipped)
        A lot of JSON metadata and no contents as desired. Contains IP! Some entries however are humongous with a ton of useless data, that's what bloats these so much:
        WARC/1.0
        WARC-Type: metadata
        WARC-Target-URI: CC-MAIN-20130516092621-00000-ip-10-60-113-184.ec2.internal.warc.gz
        WARC-Date: 2013-11-22T14:51:12Z
        WARC-Record-ID: <urn:uuid:ec54e493-8965-41be-b344-07596cc30b3a>
        WARC-Refers-To: <urn:uuid:cfeff436-7c4c-4119-aaa4-ec2ce27ad3e1>
        Content-Type: application/json
        Content-Length: 1180
        
        {"Envelope":{"Format":"WARC","WARC-Header-Length":"274","Block-Digest":"sha1:JCZOI4V3UOTXGIRLFMPLW4J2WPLAKGVR","Actual-Content-Length":"372","WARC-Header-Metadata":{"WARC-Type":"warcinfo","WARC-Filename":"CC-MAIN-20130516092621-00000-ip-10-60-113-184.ec2.internal.warc.gz","WARC-Date":"2013-11-22T14:51:12Z","Content-Length":"372","WARC-Record-ID":"<urn:uuid:cfeff436-7c4c-4119-aaa4-ec2ce27ad3e1>","Content-Type":"application/warc-fields"},"Payload-Metadata":{"Trailing-Slop-Length":"0","Actual-Content-Type":"application/warc-fields","Actual-Content-Length":"372","Headers-Corrupt":true,"WARC-Info-Metadata":{"robots":"classic","software":"Nutch 1.6 (CC)/CC WarcExport 1.0","description":"Wide crawl of the web with URLs provided by Blekko for Spring 2013","hostname":"ip-10-60-113-184.ec2.internal","format":"WARC File Format 1.0","isPartOf":"CC-MAIN-2013-20","operator":"CommonCrawl Admin","publisher":"CommonCrawl"}}},"Container":{"Compressed":true,"Gzip-Metadata":{"Footer-Length":"8","Deflate-Length":"453","Header-Length":"10","Inflated-CRC":"866052549","Inflated-Length":"650"},"Offset":"0","Filename":"CC-MAIN-20130516092621-00000-ip-10-60-113-184.ec2.internal.warc.gz"}}
        
        WARC/1.0
        WARC-Type: metadata
        WARC-Target-URI: http://%20jwashington@ap.org/Content/Press-Release/2012/How-AP-reported-in-all-formats-from-tornado-stricken-regions
        WARC-Date: 2013-05-18T05:48:54Z
        WARC-Record-ID: <urn:uuid:d519658f-7a63-46c1-849b-4cd92332ddb8>
        WARC-Refers-To: <urn:uuid:cefd363b-1fec-4590-8305-4c6fab2e095f>
        Content-Type: application/json
        Content-Length: 1501
        
        {"Envelope":{"Format":"WARC","WARC-Header-Length":"433","Block-Digest":"sha1:B2B6JDSGWCUQIIUGV54SXEE25RX4SANS","Actual-Content-Length":"302","WARC-Header-Metadata":{"WARC-Type":"request","WARC-Date":"2013-05-18T05:48:54Z","WARC-Warcinfo-ID":"<urn:uuid:cfeff436-7c4c-4119-aaa4-ec2ce27ad3e1>","Content-Length":"302","WARC-Record-ID":"<urn:uuid:cefd363b-1fec-4590-8305-4c6fab2e095f>","WARC-Target-URI":"http://%20jwashington@ap.org/Content/Press-Release/2012/How-AP-reported-in-all-formats-from-tornado-stricken-regions","WARC-IP-Address":"165.1.125.44","Content-Type":"application/http; msgtype=request"},"Payload-Metadata":{"Trailing-Slop-Length":"4","HTTP-Request-Metadata":{"Headers":{"Accept-Language":"en-us,en-gb,en;q=0.7,*;q=0.3","Host":"ap.org","Accept-Encoding":"x-gzip, gzip, deflate","User-Agent":"CCBot/2.0","Accept":"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"},"Headers-Length":"300","Entity-Length":"0","Entity-Trailing-Slop-Bytes":"0","Request-Message":{"Method":"GET","Version":"HTTP/1.0","Path":"/Content/Press-Release/2012/How-AP-reported-in-all-formats-from-tornado-stricken-regions"},"Entity-Digest":"sha1:3I42H3S6NNFQ2MSVX7XZKYAYSCX5QBYJ"},"Actual-Content-Type":"application/http; msgtype=request"}},"Container":{"Compressed":true,"Gzip-Metadata":{"Footer-Length":"8","Deflate-Length":"455","Header-Length":"10","Inflated-CRC":"453539965","Inflated-Length":"739"},"Offset":"453","Filename":"CC-MAIN-20130516092621-00000-ip-10-60-113-184.ec2.internal.warc.gz"}}
        Let's beautify one of them to see it better:
        
        {
          "Envelope": {
            "Format": "WARC",
            "WARC-Header-Length": "274",
            "Block-Digest": "sha1:JCZOI4V3UOTXGIRLFMPLW4J2WPLAKGVR",
            "Actual-Content-Length": "372",
            "WARC-Header-Metadata": {
              "WARC-Type": "warcinfo",
              "WARC-Filename": "CC-MAIN-20130516092621-00000-ip-10-60-113-184.ec2.internal.warc.gz",
              "WARC-Date": "2013-11-22T14:51:12Z",
              "Content-Length": "372",
              "WARC-Record-ID": "<urn:uuid:cfeff436-7c4c-4119-aaa4-ec2ce27ad3e1>",
              "Content-Type": "application/warc-fields"
            },
            "Payload-Metadata": {
              "Trailing-Slop-Length": "0",
              "Actual-Content-Type": "application/warc-fields",
              "Actual-Content-Length": "372",
              "Headers-Corrupt": true,
              "WARC-Info-Metadata": {
                "robots": "classic",
                "software": "Nutch 1.6 (CC)/CC WarcExport 1.0",
                "description": "Wide crawl of the web with URLs provided by Blekko for Spring 2013",
                "hostname": "ip-10-60-113-184.ec2.internal",
                "format": "WARC File Format 1.0",
                "isPartOf": "CC-MAIN-2013-20",
                "operator": "CommonCrawl Admin",
                "publisher": "CommonCrawl"
              }
            }
          },
          "Container": {
            "Compressed": true,
            "Gzip-Metadata": {
              "Footer-Length": "8",
              "Deflate-Length": "453",
              "Header-Length": "10",
              "Inflated-CRC": "866052549",
              "Inflated-Length": "650"
            },
            "Offset": "0",
            "Filename": "CC-MAIN-20130516092621-00000-ip-10-60-113-184.ec2.internal.warc.gz"
          }
        }
        Fuck no IP addresses either. But other entries do have it, why not this one?
        The reason these can be huge is the HTML-Metadata section, which contains all outlinks! gist.github.com/Smerity/e750f0ef0ab9aa366558#file-bbc-pretty-wat-L34
      • CC-MAIN-20130516092621-00000-ip-10-60-113-184.ec2.internal.warc.gz ()
        Obtain:
        aws s3 cp s3://commoncrawl/crawl-data/CC-MAIN-2013-20/segments/1368696381249/warc/CC-MAIN-20130516092621-00000-ip-10-60-113-184.ec2.internal.warc.gz .
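Since the WAT entries shown above keep the IP under Envelope, WARC-Header-Metadata, WARC-IP-Address, pulling the IPs out is mostly a matter of JSON parsing. A rough Python sketch, assuming (as in the samples above) that each JSON payload sits on a single line starting with {"Envelope". The function names are invented for this example:

```python
import gzip
import json


def warc_ip(wat_json_line):
    """Return the WARC-IP-Address of one WAT JSON payload, or None if absent."""
    record = json.loads(wat_json_line)
    header = record.get("Envelope", {}).get("WARC-Header-Metadata", {})
    return header.get("WARC-IP-Address")


def iter_ips(wat_gz_path):
    """Yield (target URI, IP) for every JSON payload line that carries an IP."""
    with gzip.open(wat_gz_path, "rt", encoding="utf-8", errors="replace") as f:
        for line in f:
            # Assumption: payloads are single lines, as in the samples above.
            if not line.startswith('{"Envelope"'):
                continue  # WARC header lines and blank separators
            header = json.loads(line)["Envelope"]["WARC-Header-Metadata"]
            ip = header.get("WARC-IP-Address")
            if ip is not None:
                yield header.get("WARC-Target-URI"), ip
```

Entries without the field, such as the warcinfo one beautified above, are simply skipped by iter_ips and give None from warc_ip.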
Competitive programming website
Complete metric space
In plain English: the space has no visible holes. If you start walking less and less on each step, you always converge to something that also falls in the space.
One notable example where completeness matters: Lebesgue integral of L^p is complete but Riemann isn't.
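To make the "no holes" intuition concrete, here is a small Python sketch of a space that is not complete: the rational numbers. It runs Newton's iteration for the square root of 2 using exact fractions, so every step stays rational and the steps get smaller and smaller, yet the value they home in on is not rational. The function name is just for this illustration:

```python
from fractions import Fraction


def newton_sqrt2_steps(n):
    """Return the first n + 1 rational Newton iterates for x*x = 2, from x = 1."""
    x = Fraction(1)
    steps = [x]
    for _ in range(n):
        x = (x + 2 / x) / 2  # stays an exact rational number
        steps.append(x)
    return steps


steps = newton_sqrt2_steps(5)
for a, b in zip(steps, steps[1:]):
    # Each walk is shorter than the previous one, but b*b never equals 2.
    print(b, float(abs(b - a)), b * b == 2)
```

The sequence walks less and less on each step, but what it converges to falls in a hole of the rationals. In the real numbers, which are complete, the limit exists.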
Create a test user in PostgreSQL
To create a test user that logs in with a password instead of peer authentication:
createuser -P user0
createdb user0
-P makes it prompt for the user's password.
Alternatively, to create the password non-interactively stackoverflow.com/questions/42419559/postgres-createuser-with-password-from-terminal:
psql -c "create role NewRole with login password 'secret'"
Can't find a way using the createuser helper.
We can then login with that password with:
psql -U user0 -h localhost
which asks for the password we've just set, because the -h option turns off peer authentication, and turns on password authentication.
The password can be given non-interactively as shown at stackoverflow.com/questions/6405127/how-do-i-specify-a-password-to-psql-non-interactively with the PGPASSWORD environment variable:
PGPASSWORD=a psql -U user0 -h localhost
Now let's create a test database which user0 can access with an existing superuser account:
createdb user0db0
psql -c 'GRANT ALL PRIVILEGES ON DATABASE user0db0 TO user0'
We can check this permission with:
psql -c '\l'
which now contains:
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
 user0db0  | ciro     | UTF8     | en_GB.UTF-8 | en_GB.UTF-8 | =Tc/ciro             +
           |          |          |             |             | ciro=CTc/ciro        +
           |          |          |             |             | user0=CTc/ciro
The permission letters are explained at:
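As a rough decoding aid for those letters, here is a small Python sketch that splits one access privilege entry of the form grantee=letters/grantor and expands the letters. The letter table covers the long-standing PostgreSQL abbreviations only, and newer server versions may add more, so treat it as an assumption to check against your version's documentation. The name parse_acl is made up for this example:

```python
# Long-standing PostgreSQL ACL abbreviations (newer versions may define more).
PRIVILEGES = {
    "r": "SELECT", "w": "UPDATE", "a": "INSERT", "d": "DELETE",
    "D": "TRUNCATE", "x": "REFERENCES", "t": "TRIGGER", "X": "EXECUTE",
    "U": "USAGE", "C": "CREATE", "c": "CONNECT", "T": "TEMPORARY",
}


def parse_acl(entry):
    """Split 'grantee=letters/grantor' into (grantee, privilege names, grantor)."""
    # psql marks wrapped cells with a trailing '+', so drop it if present.
    entry = entry.strip().rstrip("+").strip()
    grantee_and_letters, grantor = entry.rsplit("/", 1)
    grantee, letters = grantee_and_letters.split("=", 1)
    # '*' after a letter means "with grant option"; it is ignored here.
    names = [PRIVILEGES.get(ch, "unknown:" + ch) for ch in letters if ch != "*"]
    # An empty grantee stands for PUBLIC, i.e. every role.
    return (grantee or "PUBLIC", names, grantor)


print(parse_acl("user0=CTc/ciro"))
```

So the user0=CTc/ciro line from the listing reads as: user0 was granted CREATE, TEMPORARY and CONNECT on the database by ciro.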
user0 can now do the usual table operations on that database:
PGPASSWORD=a psql -U user0 -h localhost user0db0 -c 'CREATE TABLE table0 (int0 INT, char0 CHAR(16));'
PGPASSWORD=a psql -U user0 -h localhost user0db0 -c "INSERT INTO table0 (int0, char0) VALUES (2, 'two'), (3, 'three'), (5, 'five'), (7, 'seven');"
PGPASSWORD=a psql -U user0 -h localhost user0db0 -c 'SELECT * FROM table0;'
Cryogenic electron microscopy
This technique has managed to determine protein 3D structures for proteins that people were not able to crystallize for X-ray crystallography.
It is said however that cryoEM is even fiddlier than X-ray crystallography, so it is mostly attempted if crystallization attempts fail.
By looking at Figure 1. "A cryoEM image", you can easily understand the basics of cryoEM.
We just put a gazillion copies of our molecule of interest in a solution, and then image all of them in the frozen water.
Each one of them appears in the image in a random rotated view, so given enough of those point of view images, we can deduce the entire 3D structure of the molecule.
Ciro Santilli once watched a talk by Richard Henderson about cryoEM circa 2020, where he mentioned that he witnessed some students in the 1980's going to Germany, and coming into contact with early cryoEM. And when they came back, they just told their principal investigator: "I'm going to drop my PhD theme and focus exclusively on cryoEM". That's how hot the cryo thing was! So cool.
Figure 1. A cryoEM image. This is the type of image that you get out of a raw cryoEM experiment.
Video 1. The structure of our cells by Matteo Allegretti. The start is useless. But the end at this timestamp shows an interesting technique where they actually cut up cells in fine slices and image them, that's cool.
Fandom (website)
Ciro's Edict #4
Yitang Zhang's theorem
There are infinitely many primes with a neighbor not further apart than 70 million. This was the first such finite bound to be proven, and therefore a major breakthrough.
This implies that for at least one gap value below 70 million there are infinitely many repetitions, but we don't know which, e.g. we could have infinitely many prime pairs (k, k + 2), or infinitely many (k, k + 4), or infinitely many (k, k + 6), and so on for any even gap up to 70 million, but we don't know which of those.
The Prime k-tuple conjecture conjectures that it is all of them.
Also, if 70 million could be reduced down to 2, we would have a proof of the Twin prime conjecture, but this method would only work for (k, k + 2).
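To get a feel for the statement, a small Python sketch that counts prime pairs (k, k + gap) up to some limit for a few gaps. This is only a numerical illustration and proves nothing about infinitely many pairs. The function names and the choice of limit are arbitrary:

```python
def prime_sieve(n):
    """Return a bytearray where sieve[i] is 1 exactly when i is prime, for 0 <= i <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            # Cross out every multiple of i starting at i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return sieve


def count_pairs(limit, gap):
    """Count k such that k and k + gap are both prime and k + gap <= limit."""
    sieve = prime_sieve(limit)
    return sum(1 for k in range(2, limit - gap + 1) if sieve[k] and sieve[k + gap])


for gap in (2, 4, 6):
    print(gap, count_pairs(100_000, gap))
```

The counts keep growing as the limit is raised for each of these gaps, which is what the Prime k-tuple conjecture would predict, but the theorem only guarantees that this goes on forever for at least one gap below 70 million.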
Antoine de Saint-Exupéry
Computer manufacturer
This section is about companies that integrate parts and software from various other companies to make up fully working computer systems.
Computer science course of the University of Oxford
Course lists: www.cs.ox.ac.uk/teaching/courses/
True to form, courses appear to have identifiers, e.g. qi for the Quantum Information course of the University of Oxford, rather than the more arbitrary A1/A2/A3, B1/B2/B3 naming convention used by the Mathematics course of the University of Oxford and the Physics course of the University of Oxford. URLs can either have years or not:
The "course materials" section of each course leads to courses.cs.ox.ac.uk/ which is paywalled by IP (accessible via Eduroam): TODO which system does it use? Some courses place their materials directly on "www.cs.ox.ac.uk", and when that is the case they are publicly accessible. So it is very much hit and miss. E.g. www.cs.ox.ac.uk/teaching/courses/2022-2023/quantum/index.html from Quantum Processes and Computation course of the University of Oxford has the assignments such as www.cs.ox.ac.uk/people/aleks.kissinger/courses/qpc2022/assignment1.pdf publicly visible, but e.g. www.cs.ox.ac.uk/teaching/courses/2022-2023/modelsofcomputation/ has nothing.
Handbook:
Condenser microphone
Cuisine by region
Cycling lobbying entity
Daisy chain Bitcoin inscription
This is a term invented by Ciro Santilli, and refers to a loose set of uncommon Bitcoin inscription methods that involve inscribing one or a small number of payloads per Bitcoin transaction.
These methods are both inefficient and hard to detect and decode, partly because Bitcoin Core does not index spending transactions: bitcoin.stackexchange.com/questions/61794/bitcoin-rpc-how-to-find-the-transaction-that-spends-a-txo. This makes finding them all the more rewarding however.
On the other hand, they do have the advantage of not depending on any block size limits, as their individual transactions are very small.
Inscribing anything large would however take a very long time, as you'd have to wait until the previous payload chunk is confirmed before going to the next one. This alone makes the format impractical perhaps.
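Since there is no built-in index from an output to the transaction that spends it, decoding such a chain means building that index yourself. A hedged Python sketch of the idea, assuming transactions are already available as dicts shaped like Bitcoin Core's verbose JSON (a txid plus a vin list whose entries carry the txid and vout being spent), and assuming, purely for illustration, that the chain continues through output 0 of each transaction:

```python
def build_spender_index(transactions):
    """Map (previous txid, output index) to the txid of the spending transaction."""
    index = {}
    for tx in transactions:
        for txin in tx["vin"]:
            if "txid" in txin:  # coinbase inputs have no previous txid
                index[(txin["txid"], txin["vout"])] = tx["txid"]
    return index


def follow_chain(start_txid, index, vout=0):
    """Follow spenders of output `vout` from start_txid until it is unspent."""
    chain = [start_txid]
    while (chain[-1], vout) in index:
        chain.append(index[(chain[-1], vout)])
    return chain


# Toy data with made-up txids, only to show the shape of the inputs.
txs = [
    {"txid": "a", "vin": [{"coinbase": "00"}]},
    {"txid": "b", "vin": [{"txid": "a", "vout": 0}]},
    {"txid": "c", "vin": [{"txid": "b", "vout": 0}]},
    {"txid": "d", "vin": [{"txid": "b", "vout": 1}]},
]
print(follow_chain("a", build_spender_index(txs)))
```

Building the index requires one pass over every transaction of interest, which is part of why these inscriptions are laborious to find.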
Data breach