= c/inc_loop.c
{file}
<Ubuntu 25.04> GCC 14.2 -O0 x86_64 produces a horrendous:
``
11c8: 48 83 45 f0 01 addq $0x1,-0x10(%rbp)
11cd: 48 8b 45 f0 mov -0x10(%rbp),%rax
11d1: 48 3b 45 e8 cmp -0x18(%rbp),%rax
11d5: 72 f1 jb 11c8 <main+0x7f>
``
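Those four instructions are one iteration of a counting loop: increment the counter in memory, reload it into `rax`, compare it against the limit, and branch back. A minimal sketch of a C loop with that shape follows. It assumes unsigned 64-bit variables, which the unsigned `jb` suggests. `count_to` is a hypothetical name, and the actual `inc_loop.c` may be written differently:

```c
#include <stdint.h>

/* Hypothetical helper, not copied from inc_loop.c.
 * Counts from 0 up to max with an empty loop body.
 * At -O0 both i and max are kept on the stack, so each iteration
 * pays for a memory add, a reload, a compare and a conditional branch. */
uint64_t count_to(uint64_t max) {
    uint64_t i;
    for (i = 0; i < max; i++)
        ;
    return i;
}
```

Under these assumptions, `main` would parse the limit from `argv[1]` and pass it to the loop.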
To run for about 1s on <Ciro Santilli's hardware/P14s> we need 2.5 billion loop iterations, which at 4 instructions per iteration is about 10 billion instructions:
``
time ./inc_loop.out 2500000000
``
and:
``
perf stat ./inc_loop.out 2500000000
``
gives:
``
1,052.22 msec task-clock # 0.998 CPUs utilized
23 context-switches # 21.858 /sec
12 cpu-migrations # 11.404 /sec
60 page-faults # 57.022 /sec
10,015,198,766 instructions # 2.08 insn per cycle
# 0.00 stalled cycles per insn
4,803,504,602 cycles # 4.565 GHz
20,705,659 stalled-cycles-frontend # 0.43% frontend cycles idle
2,503,079,267 branches # 2.379 G/sec
396,228 branch-misses # 0.02% of all branches
``
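As a sanity check, dividing the counters above by the 2.5 billion iterations requested on the command line gives roughly the per-iteration costs we would expect from the four-instruction loop:

$$
\begin{aligned}
\frac{10015198766 \text{ instructions}}{2500000000 \text{ iterations}} &\approx 4.01 \text{ instructions per iteration} \\
\frac{2503079267 \text{ branches}}{2500000000 \text{ iterations}} &\approx 1.00 \text{ branches per iteration} \\
\frac{4803504602 \text{ cycles}}{2500000000 \text{ iterations}} &\approx 1.92 \text{ cycles per iteration} \\
\frac{10015198766 \text{ instructions}}{4803504602 \text{ cycles}} &\approx 2.08 \text{ instructions per cycle}
\end{aligned}
$$

The small excess over exactly 4 instructions and 1 branch per iteration is presumably program startup and teardown.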
With -O3 it manages to optimize the loop away entirely, producing:
``
1078: e8 d3 ff ff ff call 1050 <strtoll@plt>
}
107d: 5a pop %rdx
107e: c3 ret
``
so it is smart enough to just return the return value from `strtoll` directly, as is, in `rax`.
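One way to see why the loop can be removed: an empty loop that counts an unsigned variable from 0 up to a limit always ends with the counter equal to the limit, so the result is known without iterating. The sketch below illustrates that equivalence. Both function names are hypothetical, and this is not a claim about how GCC derives the result internally:

```c
#include <stdint.h>

/* Hypothetical: the loop as written, costing one iteration per count. */
uint64_t loop_result(uint64_t max) {
    uint64_t i;
    for (i = 0; i < max; i++)
        ;
    return i;
}

/* Hypothetical: the closed form an optimizer can substitute.
 * The loop has no side effects and always exits with i == max. */
uint64_t closed_form_result(uint64_t max) {
    return max;
}
```

Both functions return the same value for every input, which is what allows the compiler to hand back the parsed argument unchanged.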