CPython JIT 2025-09-09
Added in CPython 3.13.
To enable tested on Ubuntu 25.04:
git clone https://github.com/python/cpython
cd cpython
git checkout v3.13.7
./configure --enable-experimental-jit
make -j
We can try to test it with python/inc_loop.py:
time ./python python/inc_loop.py 10000000
but the result is just as pathetic as without JIT currently, taking about 1 second for only 10m loops.
This can be compared with the optimal assembly from c/inc_loop_asm.c:
time ./inc_loop_asm.out 1000000000
which does 1 billion loops in about half a second on P14s.
For comparison, PyPy actually speeds things up and does 1 billion loops in about a second, so only 2x worse than native.
TODO triple check that JIT is enabled. Many threads say the command is:
./python -c 'import sysconfig; sysconfig.get_config_var("JIT_DEPS")'
but that fails with:
ModuleNotFoundError: No module named '_sysconfigdata__linux_x86_64-linux-gnu'
For comparison with a properly implemented dynamic language JIT running nodejs/inc_loop.js does 1 billion loops in 0.6s on v22.14.0, close to native.
tonybaloney.github.io/posts/python-gets-a-jit.html documents what the initial "JIT" implementation does. It is just an extremely naive concatenation of instructions that avoids a for + switch. No wonder it doesn't speed things up much at all.
Project Euler problem 943 Created 2025-10-27 Updated 2025-10-27
A naive T in Python is:
from collections import deque

def T(a: int, b: int, N: int) -> int:
    total = a
    q = deque([a] * (a - 1))
    is_a = False
    for i in range(N - 1):
        cur = q.popleft()
        total += cur
        q.extend([a if is_a else b] * cur)
        is_a = not is_a
    return total

assert T(2, 3, 10) == 25
assert T(4, 2, 10**4) == 30004
assert T(5, 8, 10**6) == 6499871
which passes the tests, but takes half a second on PyPy. So clearly it is not going to work for 22332223332233 which has 14 digits.
Maybe if T is optimized enough, then we can just bruteforce over the ~40k possible sum ranges 2 to 223. 1 second would mean 14 hours to do them all, so bruteforce but doable. Otherwise another optimization step may be needed at that level as well: one wonders if multiple sums can be factored out, or if the modularity can of the answer can help in a meaningful way. The first solver was ecnerwala using C/C++ in 1 hour, so there must be another insight missing, unless they have access to a supercomputer.
The first idea that comes to mind to try and optimize T is that this is a dynamic programming, but then the question is what is the recurrence relation.
The sequence appears to be a generalization of the Kolakoski sequence but to numbers other than 1 and 2.
maths-people.anu.edu.au/~brent/pd/Kolakoski-ACCMCC.pdf "A fast algorithm for the Kolakoski sequence" might provide the solution, the paper says:
It is conjectured that the algorithm runs in time and space , where
and provides exactly a recurrence relation and a dynamic programming approach.
www.reddit.com/r/algorithms/comments/8cv3se/kolakoski_sequence/ asks for an implementation but no one gave anything. Dupe question: math.stackexchange.com/questions/2740997/kolakoski-sequence contains an answer with Python and Rust code but just for the original 1,2 case.
github.com/runbobby/Kolakoski has some C++ code but it is not well documented so it's not immediately easy to understand what it actually does. It does appear to consider the m n case however.