AI Mathematical Olympiad 2025-11-30
Not too exciting because it stays at the high school knowledge olympiad level, but respectable.
AtCoder 2026-01-30
Saw this one mentioned on some Project Euler forum threads.
Ciro Santilli's e-soulmates Updated 2025-12-25
Some other idealists that are a bit further out but with some similarities:
Ciro Santilli also thinks of those people as being part of his 108 Stars of Destiny troupe.
Ciro sometimes ponders why it is so hard to find people online that you truly love and admire. Maybe it is for similar reasons to why it is also hard in the real world: the great variety of human interests, and the great limitation of our attention spans. But online, where we have access to "everyone", shouldn't it be easier? Not naturally finding such people is perhaps one of the greatest failings of our education system.
FrontierMath Created 2025-02-11 Updated 2025-11-21
arstechnica.com/ai/2024/11/new-secret-math-benchmark-stumps-ai-models-and-phds-alike/ mentions what the official website fails to state clearly:
The design of FrontierMath differs from many existing AI benchmarks because the problem set remains private and unpublished to prevent data contamination
The expected answer output for all problems is one single SymPy expression, which is kind of a cool approach, as it allows both for large integers like Project Euler and for irrational expressions to be given, e.g. "An optimization problem in BMO space" from the sample problems has answer:
Of course, when the output is not an integer, this raises the question of simplification equivalence: two expressions can look different and still denote the same number. Also, like Project Euler, solutions essentially expect you to write and execute code.
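To make the equivalence question concrete, here is a minimal sketch with SymPy. It is not how FrontierMath actually grades answers, which I don't know. The two expressions and the helper name same_value are made up for illustration, and the sketch assumes SymPy is installed.

```python
import sympy

def same_value(a, b):
    """Hypothetical helper: True only if SymPy can show a and b are equal."""
    # Expr.equals may return None when it cannot decide, so compare with True.
    return a.equals(b) is True

# Two different ways of writing the same irrational number.
a = 1 / (1 + sympy.sqrt(2))
b = sympy.sqrt(2) - 1

# Structural comparison: compares the expression trees, not the values.
print(a == b)

# Mathematical comparison: asks SymPy to simplify and compare.
print(same_value(a, b))
```

A judge that only compared the expression trees would reject a correct answer written in an unexpected form, which is why some canonical form or numerical check is needed.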
The most interesting aspect of this benchmark is the difficulty. Mathematical olympiad coach Evan Chen comments:[ref]
Problems in [the International Mathematical Olympiad] typically require creative insight while avoiding complex implementation and specialized knowledge [but for FrontierMath] they keep the first requirement, but outright invert the second and third requirement
ORCA Benchmark Created 2025-11-19 Updated 2025-11-30
This one doesn't seem too exciting to be honest, but it might be useful. Sample question:
If I deposit $50,000 at 5% APR, compounded weekly, what will my balance be after 18 months?
and it expects the correct answer down to the cents:
53892.27
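The expected value can be reproduced with the standard compound interest formula. This is a sketch under my own reading of the question: 52 compounding periods per year and 18 months taken as 1.5 years, so 78 periods. The function name is made up.

```python
def compound_balance(principal, apr, periods_per_year, years):
    """Balance after compounding `apr` over `years`, `periods_per_year` times a year."""
    periods = periods_per_year * years
    rate_per_period = apr / periods_per_year
    return principal * (1 + rate_per_period) ** periods

# Assumption: weekly compounding means 52 periods per year.
balance = compound_balance(50_000, 0.05, 52, 1.5)
print(f"{balance:.2f}")
```

Under those assumptions the rounded result matches the 53892.27 above. A different convention, e.g. counting actual days, could shift the cents.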
It should be noted that Project Euler has such "precision matters" problems.
Project Euler Lean solutions Created 2026-01-30 Updated 2026-02-08
Using Lean or other programmable proof assistants to solve Project Euler is the inevitable collision of two autisms. The interesting part is using Lean to prove that you have the correct solution: just making a Lean program that prints out the correct solution is likely trivial as of 2025, by asking an LLM to port a Python solution to the new language.
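The difference between printing and proving can be sketched on a toy in the style of the first problem, using the small case "below 10 gives 23" from the problem statement. The definition name sumMultiples is made up, and I have not checked this against a specific Lean 4 version, so take it as an illustration only.

```lean
-- Hypothetical helper: sum of the naturals below n divisible by 3 or 5.
def sumMultiples (n : Nat) : Nat :=
  ((List.range n).filter (fun k => k % 3 == 0 || k % 5 == 0)).foldl (· + ·) 0

-- Printing: this only runs the program.
#eval sumMultiples 10

-- Proving: this asks Lean to check the claimed value as a theorem.
example : sumMultiples 10 = 23 := by decide
```

Even this only proves that this particular program evaluates to 23. A real solution would also have to prove that a fast algorithm agrees with the naive specification, and for large inputs plain decide may not be enough.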
Some efforts:
Mentions:
In other proof assistants, therefore with similar beauty:
Project Euler problems typically involve finding or proving, and then using, a lemma that makes computation of the solution feasible without brute force. There is often an obvious brute force approach, but they pick problem sizes large enough that it is just not fast enough, while the non-brute-force approach is.
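As a toy illustration of the pattern, take the sum of all naturals below a limit that are divisible by 3 or 5. The brute force loop is linear in the limit, while the arithmetic series formula plus inclusion-exclusion is constant time. The function names are made up for this sketch.

```python
def sum_multiples_brute(limit):
    """O(limit): test every integer below limit."""
    return sum(k for k in range(limit) if k % 3 == 0 or k % 5 == 0)

def sum_divisible_by(d, limit):
    """Sum of the positive multiples of d below limit: d * (1 + 2 + ... + m)."""
    m = (limit - 1) // d
    return d * m * (m + 1) // 2

def sum_multiples_fast(limit):
    """O(1): inclusion-exclusion, multiples of 15 are counted twice otherwise."""
    return (sum_divisible_by(3, limit)
            + sum_divisible_by(5, limit)
            - sum_divisible_by(15, limit))

# Both agree on small inputs, but only the second scales to huge limits.
print(sum_multiples_brute(10), sum_multiples_fast(10))
print(sum_multiples_fast(10**18))
```

The lemma here is just the closed form for 1 + 2 + ... + m, but the structure is the same as in harder problems: prove something small, then the computation collapses.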
As such, they live in the intersection of mathematics and computer science.
news.ycombinator.com/item?id=7057408 which is mega high on Google says:
I love project euler, but I've come to the realization that its purpose is to beat programmers soundly about the head and neck with a big math stick. At work last week, we were working on project euler at lunch, and had the one CS PhD in our midst not jumped up and explained the chinese remainder theorem to us, we wouldn't have had a chance.
In many cases, the efficient solution involves dynamic programming.
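A minimal sketch of the dynamic programming idea, on a made-up counting problem: in how many ways can a target be written as a sum of given coin values, ignoring order? Enumerating all combinations blows up quickly, while a table indexed by the partial target needs only len(coins) * target steps.

```python
def count_ways(target, coins):
    """Number of multisets of `coins` that sum to `target`."""
    # ways[t] = number of ways to make t using the coins processed so far.
    ways = [0] * (target + 1)
    ways[0] = 1  # one way to make 0: use no coins
    for coin in coins:
        for t in range(coin, target + 1):
            # Either don't use this coin (ways[t] so far),
            # or use it at least once (ways[t - coin]).
            ways[t] += ways[t - coin]
    return ways[target]

print(count_ways(5, [1, 2, 5]))   # 5, 2+2+1, 2+1+1+1, 1+1+1+1+1
print(count_ways(10, [1, 2, 5]))
```

Processing one coin at a time in the outer loop is what makes order not matter: each combination is counted once, in the order the coins are listed.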
There is also a set of problems which are very numerical analysis in nature and require the approximation of some real number to a given precision. These are often very fiddly, as I doubt most people can prove that their chosen hyperparameters, e.g. number of iterations or working precision, guarantee the required precision.
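For contrast, here is a sketch of what a provable precision guarantee can look like in an easy case: computing e from its series with exact rationals and a rigorous tail bound. After summing 1/k! for k up to n, the remainder is positive and smaller than 1/(n * n!), so e lies strictly between the two bounds. The function name is made up and digits is assumed to be at least 1.

```python
from fractions import Fraction

def e_rounded(digits):
    """e rounded to `digits` decimal places, with the rounding justified."""
    scale = 10 ** digits
    term = Fraction(1)     # 1/0!
    partial = Fraction(1)  # sum of 1/k! for k = 0..n
    n = 0
    while True:
        n += 1
        term /= n          # now 1/n!
        partial += term
        # Rigorous bracket: partial < e < partial + 1/(n * n!)
        upper = partial + term / n
        low_digits = round(partial * scale)
        high_digits = round(upper * scale)
        # Rounding is monotone, so if both ends agree, e rounds the same way.
        if low_digits == high_digits:
            whole, frac = divmod(low_digits, scale)
            return f"{whole}.{frac:0{digits}d}"

print(e_rounded(10))
```

The stopping condition is the proof: no iteration count or precision had to be guessed. For most real problems no such clean bound is at hand, which is where the fiddliness comes from.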
Many problems ask for the solution modulo some number. In general, this is only so that C/C++ users don't have to resort to an arbitrary-precision arithmetic library and can fit everything into uint64 instead. Maybe it also helps the judge system slightly by having smaller strings to compare. The final modulos usually don't add any insight to the problems.
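A small sketch of why the modulus helps fixed-width languages, using Fibonacci numbers as a stand-in problem. The modulus 10**9 + 7 is picked only for illustration. Since addition commutes with reduction, reducing at every step gives the same residue as reducing the exact value at the end, while every intermediate value stays below 2 * mod.

```python
MOD = 10**9 + 7  # hypothetical modulus, chosen only for illustration

def fib_exact(n):
    """Exact n-th Fibonacci number: needs arbitrary-precision integers."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def fib_mod(n, mod=MOD):
    """n-th Fibonacci number modulo mod, reducing at every step."""
    a, b = 0, 1
    for _ in range(n):
        # a + b < 2 * mod, which fits in a uint64 for this modulus.
        a, b = b, (a + b) % mod
    return a % mod

print(fib_exact(1000).bit_length())  # far more than 64 bits
print(fib_mod(1000) == fib_exact(1000) % MOD)
```

Python hides the issue because its integers are unbounded, but the second function ports directly to C with uint64_t.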
Updates / Getting banned from Project Euler Created 2025-10-27 Updated 2025-11-05
I have been banned from Project Euler for life, and cannot log in to my previous account projecteuler.net/profile/cirosantilli.pn
The problem leaderboard contains several people who solved the problem within minutes of it being released, so almost certainly with an LLM.
I'm a huge believer in giving answers to problems, and I take the ban with pride.
It is funny to see that people waste their time policing this kind of useless stuff.
Project Euler likely has many fun problems, and can be a useful machine learning benchmark.
The "secret club" mentality is their only blemish, and incompatible with open science.
They should also make sure that LLMs don't one shot their future problems BEFORE publishing them!