AI Mathematical Olympiad 2025-11-30
Not too exciting because the knowledge required is only at high school olympiad level, but respectable.
- Every problem has one final integer answer. Also, unlike Project Euler and like the IMO, only limited computations are required, i.e. you are not expected to do full blown program generation to reach a final answer, which makes this even less exciting.
Ciro Santilli's e-soulmates Updated 2025-12-25
These are people whom Ciro never met personally, who might not know that Ciro exists, and who might never have had any direct one-to-one online contact with Ciro, but whom Ciro is convinced are his brothers in some other dimension due to how many opinions or behaviours he feels they share:
- Dan Dascalescu due to articles such as:
- English as a universal language by Dan Dascalescu (2008)
- www.reddit.com/r/TheoryOfReddit/comments/9oujwf/why_archiving_old_threads_is_a_bigger_problem/ see also online forums that lock threads after some time are evil
- web.archive.org/web/20130922192354/http://wiki.dandascalescu.com/reviews/online_services/web_page_archiving see also web archiving
- random posts on OpenStreetMap, and about China: help.openstreetmap.org/questions/29300/legality-status-of-mapping-activity-in-china?page=1&focusedAnswerId=42167#42167
- kenorb see also Ciro Santilli's Stack Overflow contributions
- Gwern Branwen
- From the LLM Mathematics scene, especially solving Project Euler problems openly online:
Ciro sometimes ponders why it is so hard to find people online whom you truly love and admire. Maybe it is for reasons similar to why it is also hard in the real world: the great variety of human interests, and the great limitation of our attention spans. But online, where we have access to "everyone", shouldn't it be easier? Not naturally finding such people is perhaps one of the greatest failings of our education system.
FrontierMath Created 2025-02-11 Updated 2025-11-21
Paper: arxiv.org/abs/2411.04872
arstechnica.com/ai/2024/11/new-secret-math-benchmark-stumps-ai-models-and-phds-alike/ mentions what the official website is unable to clearly state:
The design of FrontierMath differs from many existing AI benchmarks because the problem set remains private and unpublished to prevent data contamination
The expected answer output for all problems is one single SymPy expression, which is kind of a cool approach: it allows both for large integers, like Project Euler, and for irrational expressions to be given, e.g. "An optimization problem in BMO space" from the sample problems has a non-integer expression as its answer. Of course, when the output is not an integer, this leads to simplification equivalence questions. Also, like Project Euler, solutions essentially expect you to write and execute code.
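As a minimal sketch of the equivalence problem (using made-up expressions, not an actual FrontierMath answer), two SymPy expressions can denote the same number while comparing as structurally different:

```python
import sympy as sp

# Two different but equivalent forms of the same number:
a = sp.sqrt(2) + 1
b = 1 / (sp.sqrt(2) - 1)  # rationalizes to sqrt(2) + 1

print(a == b)                   # False: structural comparison only
print(sp.simplify(a - b) == 0)  # True: equivalence via simplification
```

Whether the official grader uses simplify, numerical evaluation or something else, I don't know; the paper would be the place to check.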
The most interesting aspect of this benchmark is the difficulty. Mathematical olympiad coach Evan Chen comments:[ref]
Problems in [the International Mathematical Olympiad] typically require creative insight while avoiding complex implementation and specialized knowledge [but for FrontierMath] they keep the first requirement, but outright invert the second and third requirement
ORCA Benchmark Created 2025-11-19 Updated 2025-11-30
This one doesn't seem too exciting to be honest, but it might be useful. The sample question expects the correct answer down to the cents:
53892.27
It should be noted that Project Euler also has such "precision matters" problems.
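When producing such answers programmatically, binary floats can miss the last cent; a minimal sketch using Python's decimal module (half-up rounding is my assumption here, not documented ORCA grading behavior):

```python
from decimal import Decimal, ROUND_HALF_UP

def to_cents(x):
    # Quantize to exactly two decimal places using decimal arithmetic,
    # avoiding binary float artifacts like 0.1 + 0.2 != 0.3.
    return Decimal(str(x)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

print(to_cents(53892.266))                         # 53892.27
print(to_cents(53892.266) == Decimal("53892.27"))  # True
```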
Project Euler Created 2025-03-20 Updated 2025-12-25
They don't have an actual online judge system: all problems simply have an integer or floating point solution, and they just check that you've found the value.
The only metric that matters is who solved the problem first after publication, e.g. projecteuler.net/fastest=454. The "language" in which problems were solved is just whatever the user put in their profile; they can't actually confirm that.
Project Euler problems typically involve finding or proving, and then using, a lemma that makes computation of the solution feasible without brute force. As such, they live at the intersection of mathematics and computer science.
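As an illustration with the classic (and long public) Problem 1, summing the multiples of 3 or 5 below 1000: brute force works at this size, but the arithmetic series lemma gives an O(1) solution that scales to arbitrarily large bounds, which is the flavor later problems demand:

```python
def sum_multiples_below(d, n):
    # Sum of all positive multiples of d strictly below n, using the
    # arithmetic series formula d * m * (m + 1) / 2 with m = (n - 1) // d.
    # Runs in O(1) instead of looping over n values.
    m = (n - 1) // d
    return d * m * (m + 1) // 2

def solve(n):
    # Inclusion-exclusion: multiples of 3 plus multiples of 5,
    # minus multiples of 15 which were counted twice.
    return (sum_multiples_below(3, n)
            + sum_multiples_below(5, n)
            - sum_multiples_below(15, n))

print(solve(1000))  # 233168, works instantly even for n = 10**18
```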
Repositories of numerical solutions:
- github.com/lucky-bai/projecteuler-solutions: final numerical answers only; this is the repository mentioned in the ban section below.
Repositories of code solutions (basically no one ever had the patience to solve them all; what we need is a collaborative solution):
- euler.stephan-brumme.com/ large number of solutions in C++, stopped around 600. Informal permissive license, e.g. at euler.stephan-brumme.com/243/: "All of my solutions can be used for any purpose and I am in no way liable for any damages caused." Asked for a more formal open license at: github.com/stbrumme/euler/issues/7
- www.ivl-projecteuler.com/home 330+ solutions in Python as of 2025. Random-looking problem selection. On GitHub: github.com/igorvanloo/Project-Euler-Explained under the Unlicense, a public domain license.
- www.nayuki.io/page/project-euler-solutions. Large number of solutions, primarily in Java and Python, but sometimes also Mathematica and Haskell. Proprietary license.
Problems are under CC BY-NC-SA: projecteuler.net/copyright
Once you solve a problem, you can then access its "private" forum thread, e.g. projecteuler.net/thread=950, where people post a bunch of code solutions.
How problems are chosen:
projecteuler.net says it started as a subsection of mathschallenge.net, and in 2006 moved to its own domain. WhoisXMLAPI WHOIS history says it was registered by domainmonster.com, but details are anonymous. TODO: sample problem on mathschallenge.net on the Wayback Machine? Likely wouldn't reveal much anyway though, as there is no attribution to problem authors on that site.
www.hackerrank.com/contests/projecteuler/challenges holds challenges with an actual judge and sometimes multiple test cases, so just printing the final solution number is not enough.
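A judge-style submission therefore reads its parameters from stdin rather than printing one hardcoded value; a sketch for the HackerRank variant of Problem 1, assuming the usual format of a test-case count followed by one bound per line:

```python
import sys

def sum_multiples_below(d, n):
    # Closed-form sum of positive multiples of d strictly below n.
    m = (n - 1) // d
    return d * m * (m + 1) // 2

def solve(n):
    # Inclusion-exclusion over multiples of 3, 5 and 15.
    return (sum_multiples_below(3, n) + sum_multiples_below(5, n)
            - sum_multiples_below(15, n))

data = sys.stdin.read().split()
t = int(data[0])                 # number of test cases (assumed format)
for i in range(1, t + 1):
    print(solve(int(data[i])))
```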
Project Euler as an AI benchmark Created 2025-03-24 Updated 2025-10-14
The beauty of Project Euler is that it would serve both as an AI code generation benchmark and as an AI math benchmark!
Getting banned from Project Euler Created 2025-10-27 Updated 2025-11-05
I have been banned from Project Euler for life, and cannot log in to my previous account: projecteuler.net/profile/cirosantilli.pn
The ban happened within 12 hours of me publishing a solution to Project Euler problem 961 (github.com/lucky-bai/projecteuler-solutions/pull/94), which was one-shot by a free GPT-5 account, as MathArena had alerted me was possible: matharena.ai/?comp=euler--euler&task=4&model=GPT-5+%28high%29&run=1
The problem leaderboard shows several people who solved the problem within minutes of it being released, so almost certainly with an LLM.
The "secret club" mentality is their only blemish, and incompatible with open science.
They should also make sure that LLMs can't one-shot their future problems BEFORE publishing them!