MathArena Apex 2025-12-13
A subset of problems that they curate from competitions.
The extreme overfitting case of training is to have a map where each input leads to one output.
However, it is cool that this overfit does not let you compute the output for the final input, for which there is no known output.
This therefore forces the creation of more general solution rules.
While in some cases a solution works for any input, in many others it requires specific assumptions about the input. The model could, however, simply check that those assumptions hold for all inputs and then use them in the final algorithm.
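A minimal sketch of the point above, on a made-up toy task (the input/output pairs, the squaring rule, and the "positive integer" assumption are all hypothetical and only there for illustration): a pure lookup table reproduces every known pair but has nothing to say about the final input, while a general rule that fits the known pairs, plus a check that its assumptions hold on every input, does produce an answer.

```python
# Hypothetical toy task: known input -> output pairs,
# plus one final input for which no output is known.
known = {1: 1, 2: 4, 3: 9}
final_input = 12


def memorized(x):
    """Extreme overfit: a lookup table over the known pairs only."""
    return known.get(x)  # None for anything outside the table


def general_rule(x):
    """A rule that explains every known pair and extends to new inputs."""
    return x * x


def assumption_holds(x):
    """Illustrative input assumption: the rule is only trusted for positive integers."""
    return isinstance(x, int) and x > 0


all_inputs = list(known) + [final_input]
rule_fits = all(general_rule(x) == y for x, y in known.items())
assumptions_ok = all(assumption_holds(x) for x in all_inputs)

print(memorized(final_input))  # the lookup table has no entry for the final input
if rule_fits and assumptions_ok:
    print(general_rule(final_input))
```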
Verina 2025-12-13
AI code generation benchmark, part of which requires producing a formal Lean proof of the implementation. Sweet.
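To give a feel for what "implementation plus formal proof" means, here is a tiny Lean 4 sketch. It is not taken from Verina and does not follow its task format; `double` and `double_spec` are made-up names, and the example assumes a recent Lean 4 where the `omega` tactic is available.

```lean
-- Implementation: a hypothetical function under test.
def double (n : Nat) : Nat := n + n

-- Specification and proof: the implementation agrees with `2 * n` for every input.
theorem double_spec (n : Nat) : double n = 2 * n := by
  unfold double
  omega
```

The point is that the proof covers all inputs at once, rather than the finite set of cases a unit test would check.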
Principia Labs 2025-12-13
www.principialabs.org
We combine large-scale pretraining with reinforcement learning to create models that can rederive and learn from the entire corpus of human mathematics. Our goal is automated mathematical discovery: AI that does the creative, generative work that was previously only possible for the world's best researchers—and can be deployed on the hardest problems in science and engineering.
www.math.inc/careers
Suppose that today is June 1, 2025. We call a date "square" if all of its components (day, month, and year) are perfect squares. I was born in the last millennium, and my next birthday (relative to that date) will be the last square date in my life. If you sum the square roots of the components of that upcoming square birthday (day, month, year), you obtain my age on June 1, 2025. My mother would have been born on a square date if the month were a square number; in reality it is not a square date, but both the month and day are perfect cubes. When was I born, and when was my mother born?
One-shotted by GPT-5.1, though obviously possibly contaminated:
You were born on 25 September 1971.
Your mother was born on 1 August 1936.
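The answer can be sanity-checked by brute force. The sketch below is one possible reading of the puzzle, with some assumptions that are not in the problem statement: "last millennium" is taken as years 1001 to 2000, "last square date in my life" is checked against a maximum lifespan of 120 years, and the mother is assumed to have been between 12 and 60 years old when the child was born. All function names are made up for this example.

```python
import calendar
from datetime import date, timedelta
from math import isqrt

TODAY = date(2025, 6, 1)


def is_square(n):
    return isqrt(n) ** 2 == n


def is_cube(n):
    return any(r ** 3 == n for r in range(n + 1))


def is_square_date(d):
    return is_square(d.day) and is_square(d.month) and is_square(d.year)


def has_square_date_between(after, until):
    """True if some square date d satisfies after < d <= until."""
    for y in range(after.year, until.year + 1):
        if not is_square(y):
            continue
        for m in (1, 4, 9):
            for d in (1, 4, 9, 16, 25):
                if after < date(y, m, d) <= until:
                    return True
    return False


def my_birth_candidates(max_lifespan=120):
    """Birth dates consistent with the clues about the narrator."""
    out = []
    for offset in range(1, 366):  # every possible next birthday after TODAY
        birthday = TODAY + timedelta(days=offset)
        if not is_square_date(birthday):
            continue
        # Clue: age on TODAY equals the sum of the square roots.
        age = isqrt(birthday.day) + isqrt(birthday.month) + isqrt(birthday.year)
        # The birthday is still ahead, so the narrator turns age + 1 on it.
        birth = date(birthday.year - age - 1, birthday.month, birthday.day)
        if not 1001 <= birth.year <= 2000:  # assumption: "last millennium"
            continue
        # Assumption: nobody lives past max_lifespan years.
        life_end = date(birth.year + max_lifespan, birth.month, birth.day)
        if has_square_date_between(birthday, life_end):
            continue  # a later square date would still fall within the lifetime
        out.append(birth)
    return out


def mother_birth_candidates(child_birth, min_age=12, max_age=60):
    """Square year, square day, and a month and day that are both cubes,
    with a non-square month. The age bounds are an assumption."""
    out = []
    for y in range(child_birth.year - max_age, child_birth.year - min_age + 1):
        if not is_square(y):
            continue
        for m in range(1, 13):
            if is_square(m) or not is_cube(m):
                continue
            for d in range(1, calendar.monthrange(y, m)[1] + 1):
                if is_square(d) and is_cube(d):
                    out.append(date(y, m, d))
    return out


me = my_birth_candidates()
print([d.isoformat() for d in me])
print([d.isoformat() for d in mother_birth_candidates(me[0])])
```

Under these assumptions each search yields a single candidate, and both agree with the GPT-5.1 answer above: 1971-09-25 and 1936-08-01.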
Axiom Math 2025-12-13
Not to be confused with the tutoring company "Axiom Maths", which shows up at the top of Google results: axiommaths.com/ lol fuck.
