We introduce Putnam-AXIOM, a benchmark of 522 university-level competition problems drawn from the prestigious William Lowell Putnam Mathematical Competition, and Putnam-AXIOM Variation, an unseen companion set of 100 functional variants generated by programmatically perturbing variables and constants.
How they "ensure" that models are not contaminated:
Most of their problems come from high school knowledge olympiads and they are therefore completely irrelevant for 2025 LLMs.
- youtu.be/dX4vCNi544w?t=1866 this section about numerical simulation is quite interesting. They don't know how to solve the inverse problem, so they just simulated a bunch of different mass combinations
- youtu.be/dX4vCNi544w?t=2290 they use Ubuntu everywhere, default purple desktop. God bless. It's good enough for gravitational wave detection, just not good enough for some shitty enterprise application. Lol.
The Collatz function is not very elegant in that the result of the odd case, 3n + 1, is always even because 3n is odd, so it is always predictably followed by a division by two. This is not true of the even case, where the result n/2 can be either even or odd.
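
The parity claim is easy to check numerically. The following is a minimal sketch, assuming the standard definition (n/2 for even n, 3n + 1 for odd n); the names `collatz_step` and `trajectory` are made up for illustration. Note that `trajectory` only terminates if the input actually reaches 1, which is known for small inputs but is exactly what the Collatz conjecture leaves open in general.

```python
def collatz_step(n: int) -> int:
    """One application of the standard Collatz function to a positive integer."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    # Even case: halve. Odd case: 3n + 1.
    return n // 2 if n % 2 == 0 else 3 * n + 1


def trajectory(n: int) -> list[int]:
    """Values visited starting from n, up to and including the first 1."""
    values = [n]
    while n != 1:
        n = collatz_step(n)
        values.append(n)
    return values


# Results of the odd case for n = 1, 3, ..., 19: every one of them is even.
odd_results = [collatz_step(n) for n in range(1, 20, 2)]

# Results of the even case for n = 2, 4, ..., 20: a mix of odd and even values.
even_results = [collatz_step(n) for n in range(2, 21, 2)]

print(all(r % 2 == 0 for r in odd_results))
print(sorted({r % 2 for r in even_results}))
print(trajectory(6))
```

Because the odd step is always followed by a halving, some presentations merge the two into a single step (3n + 1)/2 for odd n, which removes the predictable division.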