Ubuntu 25.04 GCC 14.2 -O0 x86_64 produces a horrendous:
11c8: 48 83 45 f0 01 addq $0x1,-0x10(%rbp)
11cd: 48 8b 45 f0 mov -0x10(%rbp),%rax
11d1: 48 3b 45 e8 cmp -0x18(%rbp),%rax
11d5: 72 f1 jb 11c8 <main+0x7f>
To run for about 1s on P14s we need 2.5 billion loop iterations:
time ./inc_loop.out 2500000000
and measuring the same run with hardware counters (the output is in perf stat format) gives:
1,052.22 msec task-clock # 0.998 CPUs utilized
23 context-switches # 21.858 /sec
12 cpu-migrations # 11.404 /sec
60 page-faults # 57.022 /sec
10,015,198,766 instructions # 2.08 insn per cycle
# 0.00 stalled cycles per insn
4,803,504,602 cycles # 4.565 GHz
20,705,659 stalled-cycles-frontend # 0.43% frontend cycles idle
2,503,079,267 branches # 2.379 G/sec
396,228 branch-misses # 0.02% of all branches
With -O3 the compiler eliminates the loop entirely and produces:
1078: e8 d3 ff ff ff call 1050 <strtoll@plt>
}
107d: 5a pop %rdx
107e: c3 ret
so it is smart enough to just return the return value of strtoll directly, as it is already in rax.
This is the only way that we've managed to reliably get a single inc instruction loop: by using inline assembly. E.g. on x86 we do:
loop:
inc %[i];
cmp %[max], %[i];
jb loop;
For about 1s on P14s with Ubuntu 25.04 GCC 14.2 -O0 x86_64 we need about 5 billion iterations:
time ./inc_loop_asm.out 5000000000
This is a quick microarchitectural benchmark to try and determine how many functional units our CPU has that can execute an inc instruction at the same time, due to its superscalar architecture. The generated programs run loops like the following, with different numbers of inc instructions:
loop:
inc %[i0];
inc %[i1];
inc %[i2];
...
inc %[i_n];
cmp %[max], %[i0];
jb loop;
Figure: c/inc_loop_asm_n.sh results for a few CPUs.
Quite clearly:
- AMD 7840U can run INC on 4 functional units
- Intel i7-7820HQ can run INC on 2 functional units
and both have low instruction count effects that destroy performance, AMD at 3 and Intel at 3 and 5. TODO it would be cool to understand those better.
Data from multiple CPUs was collated and plotted manually with c/inc_loop_asm_n_manual.sh.