This section is about unofficial ARC-AGI-like problem sets.
These are interesting from both a:
  • practical point of view, as they provide more training data for potential solvers, provided of course that you believe they are representative.
  • theoretical point of view, as they might help to highlight missing or excessive presumptions of the official datasets.
github.com/neoneye/arc-dataset-collection contains a fantastic collection of such datasets, with a visualization at: neoneye.github.io/arc/
MathArena Apex 2025-12-13
A subset of problems that they curate from competitions.
The extreme overfitting case of training is a map in which each known input leads to its known output.
However, the cool thing is that this overfit does not let you compute the output for the final input, for which there is no known output.
This therefore forces the creation of more general solution rules.
While in some cases solutions work for any input, in many others they require specific assumptions about the input. The model could, however, simply check that the assumptions apply to all inputs and then use them in the final algorithm.
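A minimal sketch of the contrast, using a made-up toy task rather than a real ARC problem: the grids, the hidden "mirror each row" rule, and every name below are hypothetical. It shows that a pure input-to-output map has nothing to say about the final input, while a general rule does, and that an assumption (here, that grids are non-empty and rectangular) can be checked against every available input, which is read here as including the final one.

```python
from typing import Optional

Grid = tuple[tuple[int, ...], ...]

# Hypothetical toy task: the hidden rule mirrors each row left to right.
train_pairs: list[tuple[Grid, Grid]] = [
    (((1, 0), (2, 0)), ((0, 1), (0, 2))),
    (((3, 3, 0),), ((0, 3, 3),)),
]
# The final input: no output is known for it.
test_input: Grid = ((0, 5), (0, 7))

# Extreme overfit: a map from each known input to its known output.
lookup: dict[Grid, Grid] = dict(train_pairs)


def solve_by_lookup(grid: Grid) -> Optional[Grid]:
    """Return the memorized output, or None if the input was never seen."""
    return lookup.get(grid)


def mirror_rows(grid: Grid) -> Grid:
    """A more general candidate rule: reverse every row."""
    return tuple(tuple(reversed(row)) for row in grid)


def fits_training(rule) -> bool:
    """Check that a candidate rule reproduces every known pair."""
    return all(rule(inp) == out for inp, out in train_pairs)


def assumption_holds(grid: Grid) -> bool:
    """Hypothetical assumption: the grid is non-empty and rectangular."""
    return len(grid) > 0 and len({len(row) for row in grid}) == 1


# Assumptions can be checked on every input, since inputs are always known.
all_inputs = [inp for inp, _ in train_pairs] + [test_input]

if __name__ == "__main__":
    print(solve_by_lookup(test_input))   # the lookup has no answer here
    print(fits_training(mirror_rows))    # the general rule fits the known pairs
    print(all(assumption_holds(g) for g in all_inputs))
    print(mirror_rows(test_input))
```

The lookup fits the training pairs perfectly yet returns nothing for the final input, which is the sense in which the task format pushes towards general rules.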
Verina 2025-12-13
AI code generation benchmark, part of which involves producing a formal Lean proof of the implementation. Sweet.
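To give a feel for what "a formal Lean proof of the implementation" means, here is a tiny hypothetical example. It is not taken from the benchmark, and the names `double` and `double_spec` are made up. It assumes a recent Lean 4 toolchain in which the `omega` tactic is available.

```lean
-- Hypothetical implementation (not a benchmark task).
def double (n : Nat) : Nat := n + n

-- Hypothetical specification, proved against the implementation:
-- `double` agrees with multiplication by 2 for every natural number.
theorem double_spec (n : Nat) : double n = 2 * n := by
  unfold double
  omega
```

The point is that the proof is checked by Lean itself, so a passing proof rules out the implementation being correct only on the tested inputs.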
