Cool deeptech ones:
Boring ones:
International ones with a British presence:
ARC-AGI visualization Created 2025-10-14 Updated 2025-10-18
www.kaggle.com/code/allegich/arc-agi-2025-visualization-all-1000-120-tasks contains plots of all questions and answers. It is truly very convenient.
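For a quick look at a task without opening the notebook, a text dump is often enough. The following is a minimal sketch, assuming the public ARC task layout (a JSON object with "train" and "test" lists of input/output grids of integers 0 to 9). The inline task and the helper names render_grid and render_task are made up for illustration and are not taken from the Kaggle notebook.

```python
import json

# Hypothetical inline task in the public ARC JSON layout (not a real task).
TASK_JSON = """
{
  "train": [{"input": [[0, 1], [1, 0]], "output": [[1, 0], [0, 1]]}],
  "test":  [{"input": [[0, 0], [1, 1]], "output": [[1, 1], [0, 0]]}]
}
"""


def render_grid(grid):
    """Render a grid of ints 0-9 as text, one row per line, '.' for 0."""
    return "\n".join(
        "".join("." if cell == 0 else str(cell) for cell in row) for row in grid
    )


def render_task(task):
    """Render every input/output pair of the train and test splits."""
    parts = []
    for split in ("train", "test"):
        for i, pair in enumerate(task[split]):
            parts.append(f"{split}[{i}] input:\n{render_grid(pair['input'])}")
            # Test outputs may be withheld in some releases, so check first.
            if "output" in pair:
                parts.append(f"{split}[{i}] output:\n{render_grid(pair['output'])}")
    return "\n\n".join(parts)


task = json.loads(TASK_JSON)
print(render_task(task))
```

To use it on a real task, replace TASK_JSON with the contents of one task file.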
LeanAgent 2025-10-14
LeanAgent does have a database system, which is interesting.
Putnam-AXIOM 2025-10-14
We introduce Putnam-AXIOM, a benchmark of 522 university-level competition problems drawn from the prestigious William Lowell Putnam Mathematical Competition, and Putnam-AXIOM Variation, an unseen companion set of 100 functional variants generated by programmatically perturbing variables and constants.
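To make "programmatically perturbing variables and constants" concrete, here is a toy sketch. It is not the benchmark's generator: the problem template, the Variant class and make_variant are all hypothetical. The idea shown is only that the variable name and constants are sampled, and the reference answer is recomputed from the sampled constants.

```python
import random
from dataclasses import dataclass


@dataclass
class Variant:
    statement: str
    a: int
    n: int
    answer: int


def make_variant(rng):
    """Sample one variant of a hypothetical toy problem (not from the benchmark)."""
    name = rng.choice(["x", "t", "u", "w"])  # perturbed variable name
    a = rng.randint(2, 9)                    # perturbed constant
    n = rng.randint(3, 12)                   # perturbed constant
    statement = (
        f"Let f({name}) = {a}{name} + 1. "
        f"Compute f(1) + f(2) + ... + f({n})."
    )
    # Closed form: sum_{k=1}^{n} (a*k + 1) = a*n*(n+1)/2 + n
    answer = a * n * (n + 1) // 2 + n
    return Variant(statement, a, n, answer)


rng = random.Random(0)  # fixed seed so the variant set is reproducible
for variant in (make_variant(rng) for _ in range(3)):
    print(variant.statement, "->", variant.answer)
```

Because the answer is derived from the sampled constants rather than stored, a model that memorized the original statement gets no help on the variants.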
MathArena 2025-10-14
This project tests various models against various competitions.
How they "ensure" that models are not contaminated:
By evaluating models as soon as new problems are released, we effectively eliminate the risk of contamination
Most of their problems come from high-school-level olympiads, and they are therefore completely irrelevant for 2025 LLMs.
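The contamination argument quoted above amounts to a date comparison: only score a model on problems published after the model was released. A minimal sketch of that filter follows, assuming made-up model names, release dates and competition dates. None of these values come from MathArena, and uncontaminated is a hypothetical helper.

```python
from datetime import date

# Hypothetical data: names and dates are invented for illustration.
MODEL_RELEASES = {
    "model-a": date(2025, 1, 15),
    "model-b": date(2025, 6, 1),
}
COMPETITIONS = {
    "comp-1": date(2025, 2, 10),
    "comp-2": date(2025, 7, 20),
}


def uncontaminated(model_release, competitions):
    """Names of competitions published strictly after the model's release."""
    return sorted(
        name for name, published in competitions.items() if published > model_release
    )


for model, released in MODEL_RELEASES.items():
    print(model, uncontaminated(released, COMPETITIONS))
```

Note that this only rules out the exact problems being in the training data. It says nothing about how close they are to older problems.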
LIGO 2025-10-14
Video 1. LIGO documentary by Advanced LIGO Documentary Project.
