This ISA basically completely dominated the smartphone market of the 2010s and beyond, but it started appearing in other areas as the end of Moore's law made it more economical logical for large companies to start developing their own semiconductor, e.g. Google custom silicon, Amazon custom silicon.
It is exciting to see ARM entering the server, desktop and supercomputer market circa 2020, beyond its dominant mobile position and roots.
Ciro Santilli likes to see the underdogs rise, and bite off dominant ones.
The excitement also applies to RISC-V possibly over ARM mobile market one day conversely however.
Basically, as long as were a huge company seeking to develop a CPU and able to control your own ecosystem independently of Windows' desktop domination (held by the need for backward compatibility with a billion end user programs), ARM would be a possibility on your mind.
- in 2020, the Fugaku supercomputer, which uses an ARM-based Fujitsu designed chip, because the number 1 fastest supercomputer in TOP500: www.top500.org/lists/top500/2021/11/It was later beaten by another x86 supercomputer www.top500.org/lists/top500/2022/06/, but the message was clearly heard.
- 2012 hackaday.com/2012/07/09/pedal-powered-32-core-arm-linux-server/ pedal-powered 32-core Arm Linux server. A publicity stunt, but still, cool.
- AWS Graviton
These are the best articles ever authored by Ciro Santilli, most of them in the format of Stack Overflow answers.
Ciro posts update about new articles on his Twitter accounts.
A chronological list of all articles is also kept at: Section "Updates".
Some random generally less technical in-tree essays will be present at: Section "Essays by Ciro Santilli".
- Trended on Hacker News:
- CIA 2010 covert communication websites on 2023-06-11. 190 points, a mild success.
- x86 Bare Metal Examples on 2019-03-19. 513 points. The third time something related to that repo trends. Hacker news people really like that repo!
- again 2020-06-27 (archive). 200 points, repository traffic jumped from 25 daily unique visitors to 4.6k unique visitors on the day
- How to run a program without an operating system? on 2018-11-26 (archive). 394 points. Covers x86 and ARM
- ELF Hello World Tutorial on 2017-05-17 (archive). 334 points.
- x86 Paging Tutorial on 2017-03-02. Number 1 Google search result for "x86 Paging" in 2017-08. 142 points.
- x86 assembly
- What does "multicore" assembly language look like?
- What is the function of the push / pop instructions used on registers in x86 assembly? Going down to memory spills, register allocation and graph coloring.
- Linux kernel
- What do the flags in /proc/cpuinfo mean?
- How does kernel get an executable binary file running under linux?
- How to debug the Linux kernel with GDB and QEMU?
- Can the sys_execve() system call in the Linux kernel receive both absolute or relative paths?
- What is the difference between the kernel space and the user space?
- Is there any API for determining the physical address from virtual address in Linux?
- Why do people write the
#!/usr/bin/env
python shebang on the first line of a Python script? - How to solve "Kernel Panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)"?
- Single program Linux distro
- QEMU
- gcc and Binutils:
- How do linkers and address relocation works?
- What is incremental linking or partial linking?
- GOLD (
-fuse-ld=gold
) linker vs the traditional GNU ld and LLVM ldd - What is the -fPIE option for position-independent executables in GCC and ld? Concrete examples by running program through GDB twice, and an assembly hello world with absolute vs PC relative load.
- How many GCC optimization levels are there?
- Why does GCC create a shared object instead of an executable binary according to file?
- C/C++: almost all of those fall into "disassemble all the things" category. Ciro also does "standards dissection" and "a new version of the standard is out" answers, but those are boring:
- What does "static" mean in a C program?
- In C++ source, what is the effect of
extern "C"
? - Char array vs Char Pointer in C
- How to compile glibc from source and use it?
- When should
static_cast
,dynamic_cast
,const_cast
andreinterpret_cast
be used? - What exactly is
std::atomic
in C++?. This answer was originally more appropriately entitled "Let's disassemble some stuff", and got three downvotes, so Ciro changed it to a more professional title, and it started getting upvotes. People judge books by their covers. notmain.o 0000000000000000 0000000000000017 W MyTemplate<int>::f(int) main.o 0000000000000000 0000000000000017 W MyTemplate<int>::f(int)
- IEEE 754
- What is difference between quiet NaN and signaling NaN?
- In Java, what does NaN mean?
Without subnormals: +---+---+-------+---------------+-------------------------------+ exponent | ? | 0 | 1 | 2 | 3 | +---+---+-------+---------------+-------------------------------+ | | | | | | v v v v v v ----------------------------------------------------------------- floats * **** * * * * * * * * * * * * ----------------------------------------------------------------- ^ ^ ^ ^ ^ ^ | | | | | | 0 | 2^-126 2^-125 2^-124 2^-123 | 2^-127 With subnormals: +-------+-------+---------------+-------------------------------+ exponent | 0 | 1 | 2 | 3 | +-------+-------+---------------+-------------------------------+ | | | | | v v v v v ----------------------------------------------------------------- floats * * * * * * * * * * * * * * * * * ----------------------------------------------------------------- ^ ^ ^ ^ ^ ^ | | | | | | 0 | 2^-126 2^-125 2^-124 2^-123 | 2^-127
- Computer science
- Algorithms
- Is it necessary for NP problems to be decision problems?
- Polynomial time and exponential time. Answered focusing on the definition of "exponential time".
- What is the smallest Turing machine where it is unknown if it halts or not?. Answer focusing on "blank tape" initial condition only. Large parts of it are summarizing the Busy Beaver Challenge, but some additions were made.
- Algorithms
- Git
| 0 | 4 | 8 | C | |-------------|--------------|-------------|----------------| 0 | DIRC | Version | File count | ctime ...| 0 | ... | mtime | device | 2 | inode | mode | UID | GID | 2 | File size | Entry SHA-1 ...| 4 | ... | Flags | Index SHA-1 ...| 4 | ... |
tree {tree_sha} {parents} author {author_name} <{author_email}> {author_date_seconds} {author_date_timezone} committer {committer_name} <{committer_email}> {committer_date_seconds} {committer_date_timezone} {commit message}
- How do I clone a subdirectory only of a Git repository?
- Python
- Web technology
- OpenGL
- What are shaders in OpenGL?
- Why do we use 4x4 matrices to transform things in 3D?
- Image Processing with GLSL shaders? Compared the CPU and GPU for a simple blur algorithm.
- Node.js
- Ruby on Rails
- POSIX
- What is POSIX? Huge classified overview of the most important things that POSIX specifies.
- Systems programming
- What do the terms "CPU bound" and "I/O bound" mean?
+--------+ +------------+ +------+ | device |>---------------->| function 0 |>----->| BAR0 | | | | | +------+ | |>------------+ | | | | | | | +------+ ... ... | | |>----->| BAR1 | | | | | | +------+ | |>--------+ | | | +--------+ | | ... ... ... | | | | | | | | +------+ | | | |>----->| BAR5 | | | +------------+ +------+ | | | | | | +------------+ +------+ | +--->| function 1 |>----->| BAR0 | | | | +------+ | | | | | | +------+ | | |>----->| BAR1 | | | | +------+ | | | | ... ... ... | | | | | | +------+ | | |>----->| BAR5 | | +------------+ +------+ | | | ... | | | +------------+ +------+ +------->| function 7 |>----->| BAR0 | | | +------+ | | | | +------+ | |>----->| BAR1 | | | +------+ | | ... ... ... | | | | +------+ | |>----->| BAR5 | +------------+ +------+
- Electronics
- Computer security
- Media
- How to resize a picture using ffmpeg's sws_scale()?
- Is there any decent speech recognition software for Linux? ran a few examples manually on
vosk-api
and compared to ground truth.
- Eclipse
- Computer hardware
- Scientific visualization software
- Numerical analysis
- Computational physics
- Register transfer level languages like Verilog and VHDL
- Android
- Debugging
- Program optimization
- Data
- Mathematics
- Section "Formalization of mathematics": some early thoughts that could be expanded. Ciro almost had a stroke when he understood this stuff in his teens.
- Network programming
- Physics
- Biology
- Quantum computing
- Bitcoin
- GIMP
- Home DIY
- China
If you are a pussy and work a soul crushing job, this is one way to lie to yourself that your life is still worth living: do one cool thing every day.
Find a time in which your mind hasn't yet been destroyed by useless work, usually in the morning before work, and do one thing you actually like in life.
Work a little less well for you boss, and a little better for yourself. Ross Ulbricht:Selling drugs online is not advisable however.
I hated working for someone else and trading my time for money with no investment in myself
Even better, try to reach an official agreement with your employer to work 20% less than the standard work week. For example, you could work one day less every week, and do whatever you want on that day. It is not possible to push your passion to weekends, because your brain is too tired. "You keep all non-company-related IP you develop on that time" is a key clause obviously.
On a related note, good employers must allow employees to do whichever the fuck "crazy projects", "needed refactorings or other efficiency gains" and "learn things deeply" at least 20% of their time if employees want that: en.wikipedia.org/wiki/20%25_Project. Employees must choose if they want to do it one day a week or two hours per day. One day per month initiatives are bullshit. Another related name: genius hour.
Highly relevant on this topic: Video "What Predicts Academic Ability? by Jordan B Peterson (2017)".
Maybe you will be fired, but long term, having tried, or even succeeded your dream, or a one of its side effects, will be infinitely more satisfying.
The same goes for school, and maybe even more so because your parents can still support you there. Some Gods who actually followed this advice and didn't end up living under a bridge:
- George M. Church "[We] hope that whatever problems... contributed to your lack of success... at Duke will not keep you from a successful pursuit of a productive career." Lol, as of 2019 the dude is the most famous biotechnologist in the world, those "problems" certainly didn't keep him back.
- Freeman Dyson proved the equivalence of the three existing versions of quantum electrodynamics theories that were around at his time, and he has always been proud of not having a PhD!
- Ramanujan, from Wikipedia:
He received a scholarship to study at Government Arts College, Kumbakonam, but was so intent on mathematics that he could not focus on any other subjects and failed most of them, losing his scholarship in the process.
- Person that Ciro met personally and shall remain anonymous for now for his privacy: once Ciro was at a bar with work colleagues casually, it was cramped, and an older dude sat next to his group.The dude then started a conversation with Ciro, and soon he explained that he was a mathematician and software engineer.As a Mathematician, he had contributed to the classification of finite simple groups, and had a short Wiki page because of that.He never did a PhD, and said that academia was a waste of time, and that you can get as much done by working part time a decent job and doing your research part time, since you skip all the bullshit of academia like this.Yet, he was still invited by collaborating professors to give classes on his research subject in one of the most prestigious universities in the world. Students would call him Doctor X., and he would correct them: Mister X.As a software engineer, he had done a lot of hardcore assembly level optimizations for x86 for some mathematical libraries related to his mathematics interests. He started talking microarchitecture with Ciro's colleagues.And he currently worked on an awesome open source project backed by a company.At last but not least, he said he also fathered 17 children by donating his sperm to lesbian mothers found on a local gay magazine, and that he had met most/all of those children after they were born.A God. Possibly the most remarkable person Ciro ever met, and his jaw was truly dropped.
Gandhi TODO source:
You can chain me, you can torture me, you can even destroy this body, but you will never imprison my mind
This is the most important technical tutorial project that Ciro Santilli has done in his life so far as of 2019.
The scope is insane and unprecedented, and goes beyond Linux kernel-land alone, which is where it started.
It ended up eating every system programming content Ciro had previously written! Including:so that that repo would better be called "System Programming Cheat". But "Linux Kernel Module Cheat" sounds more hardcore ;-)
Other major things that could be added there as well in the future are:
- github.com/cirosantilli/algorithm-cheat
- computer architecture tutorials with gem5
Due to this project, some have considered Ciro to be (archive):which made Ciro smile, although "Linux kernel documenter God" would have been more precise.
some kind of Linux kernel god.
[ 1.451857] input: AT Translated Set 2 keyboard as /devices/platform/i8042/s1│loading @0xffffffffc0000000: ../kernel_modules-1.0//timer.ko
[ 1.454310] ledtrig-cpu: registered to indicate activity on CPUs │(gdb) b lkmc_timer_callback
[ 1.455621] usbcore: registered new interface driver usbhid │Breakpoint 1 at 0xffffffffc0000000: file /home/ciro/bak/git/linux-kernel-module
[ 1.455811] usbhid: USB HID core driver │-cheat/out/x86_64/buildroot/build/kernel_modules-1.0/./timer.c, line 28.
[ 1.462044] NET: Registered protocol family 10 │(gdb) c
[ 1.467911] Segment Routing with IPv6 │Continuing.
[ 1.468407] sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver │
[ 1.470859] NET: Registered protocol family 17 │Breakpoint 1, lkmc_timer_callback (data=0xffffffffc0002000 <mytimer>)
[ 1.472017] 9pnet: Installing 9P2000 support │ at /linux-kernel-module-cheat//out/x86_64/buildroot/build/
[ 1.475461] sched_clock: Marking stable (1473574872, 0)->(1554017593, -80442)│kernel_modules-1.0/./timer.c:28
[ 1.479419] ALSA device list: │28 {
[ 1.479567] No soundcards found. │(gdb) c
[ 1.619187] ata2.00: ATAPI: QEMU DVD-ROM, 2.5+, max UDMA/100 │Continuing.
[ 1.622954] ata2.00: configured for MWDMA2 │
[ 1.644048] scsi 1:0:0:0: CD-ROM QEMU QEMU DVD-ROM 2.5+ P5│Breakpoint 1, lkmc_timer_callback (data=0xffffffffc0002000 <mytimer>)
[ 1.741966] tsc: Refined TSC clocksource calibration: 2904.010 MHz │ at /linux-kernel-module-cheat//out/x86_64/buildroot/build/
[ 1.742796] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x29dc0f4s│kernel_modules-1.0/./timer.c:28
[ 1.743648] clocksource: Switched to clocksource tsc │28 {
[ 2.072945] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8043│(gdb) bt
[ 2.078641] EXT4-fs (vda): couldn't mount as ext3 due to feature incompatibis│#0 lkmc_timer_callback (data=0xffffffffc0002000 <mytimer>)
[ 2.080350] EXT4-fs (vda): mounting ext2 file system using the ext4 subsystem│ at /linux-kernel-module-cheat//out/x86_64/buildroot/build/
[ 2.088978] EXT4-fs (vda): mounted filesystem without journal. Opts: (null) │kernel_modules-1.0/./timer.c:28
[ 2.089872] VFS: Mounted root (ext2 filesystem) readonly on device 254:0. │#1 0xffffffff810ab494 in call_timer_fn (timer=0xffffffffc0002000 <mytimer>,
[ 2.097168] devtmpfs: mounted │ fn=0xffffffffc0000000 <lkmc_timer_callback>) at kernel/time/timer.c:1326
[ 2.126472] Freeing unused kernel memory: 1264K │#2 0xffffffff810ab71f in expire_timers (head=<optimized out>,
[ 2.126706] Write protecting the kernel read-only data: 16384k │ base=<optimized out>) at kernel/time/timer.c:1363
[ 2.129388] Freeing unused kernel memory: 2024K │#3 __run_timers (base=<optimized out>) at kernel/time/timer.c:1666
[ 2.139370] Freeing unused kernel memory: 1284K │#4 run_timer_softirq (h=<optimized out>) at kernel/time/timer.c:1692
[ 2.246231] EXT4-fs (vda): warning: mounting unchecked fs, running e2fsck isd│#5 0xffffffff81a000cc in __do_softirq () at kernel/softirq.c:285
[ 2.259574] EXT4-fs (vda): re-mounted. Opts: block_validity,barrier,user_xatr│#6 0xffffffff810577cc in invoke_softirq () at kernel/softirq.c:365
hello S98 │#7 irq_exit () at kernel/softirq.c:405
│#8 0xffffffff818021ba in exiting_irq () at ./arch/x86/include/asm/apic.h:541
Apr 15 23:59:23 login[49]: root login on 'console' │#9 smp_apic_timer_interrupt (regs=<optimized out>)
hello /root/.profile │ at arch/x86/kernel/apic/apic.c:1052
# insmod /timer.ko │#10 0xffffffff8180190f in apic_timer_interrupt ()
[ 6.791945] timer: loading out-of-tree module taints kernel. │ at arch/x86/entry/entry_64.S:857
# [ 7.821621] 4294894248 │#11 0xffffffff82003df8 in init_thread_union ()
[ 8.851385] 4294894504 │#12 0x0000000000000000 in ?? ()
│(gdb)
Whenever Ciro Santilli learns about molecular biology, he can't help but to feel that it feels like programming, and notably systems programming and computer hardware design.
In some sense, the comparison is obvious: DNA is clearly a programmable medium like any assembly language, but still, systems programming did give Ciro some further feelings.
- The most important analogy perhaps is observability, or more precisely the lack of it. For the computer, this is described at: The lower level you go into a computer, the harder it is to observe things.And then, when Ciro started learning a bit about biology techniques, he started to feel the exact same thing.For example when he played with E. Coli Whole Cell Model by Covert Lab, the main thing Ciro felt was: it is going to be hard to verify any of this data, because it is hard/impossible to know the concentration of each element in a cell as a function of time.More generally of course, this is exactly why making any biology discovery is so hard: we can't easily see what's going on inside the cell, and have to resort to indirect ways of doing so..This exact idea was highlighted by I should have loved biology by James Somers:
For a computer scientist, a biologist's methods can seem insane; the trouble comes from the fact that cells are too small, too numerous, too complex to analyze the way a programmer would, say in a step-by-step debugger.
And then just like in software, some of the methods biologists use to overcome the lack of visibility have direct software analogues:- add instrumentation to cells, e.g. GFP tagging comes to mind
- emulation, e.g. E. Coli Whole Cell Model by Covert Lab
- The boot process is another one. E.g. in x86 the way that you start in 16-bit mode, largely compatible into the 70's, then move to 32-bit and finally 64, does feel a lot the way a earlier stages of embryo development looks more and more like more ancient animals.
Ciro likes to think that maybe that is why a hardcore systems programmer like Bert Hubert got into molecular biology.
Some other people who mention similar things:
- I should have loved biology by James Somers highlights the computer abstraction layer analogy between the two:
The interface is a bit annoying, but the tool is really cool.
100 cycles of
matrixprod
:stress-ng -c1 --cpu-ops 100 --cpu-method matrixprod
man stress-ng
gives the list of possible --cpu-method
. It documents matrixprod
as:matrix product of two 128 × 128 matrices of double floats. Testing on 64 bit x86 hardware shows that this is provides a good mix of memory, cache and floating point operations and is probably the best CPU method to use to make a CPU run hot.
If you don't specify the
--cpu-method
it apparently loops through every method one by one.Limit time to 1s instead of limiting cycles:
stress-ng -c1 -t1 --cpu-method matrixprod
This tutorial explains the very basics of how paging works, with focus on x86, although most high level concepts will also apply to other instruction set architectures, e.g. ARM.
The goals are to:
- demonstrate minimal concrete simplified paging examples that will be useful to those learning paging for the first time
- explain the motivation behind paging
This tutorial was extracted and expanded from this Stack Overflow answer.