I've created a quick fork of ARC-DSL, which defines a hand-crafted Domain Specific Language (DSL) to help solve ARC-AGI problems.
I basically just merged outstanding pull requests on the original repo that were needed to make things run.
It would be cool to see if those rules also solve ARC-AGI-2 problems well, but I'm too lazy to try right now.
ARC-AGI-2 is a very interesting benchmark which mixes some symbolic and other visual elements, and is readily solvable by non-expert humans, but has so far resisted transformers to a large degree.
Part of me would like to focus more on less visual aspects of AI, but it is still of interest.
It is funny how many early (semi-)retired fintech/bigtech bros are interested in the project; I saw several of them on the forums.
I'd be tempted if I were in that position too, I must confess. Maybe in 15 years' time for me, the way things are looking.
Kudos to these people who do something cool and open when they don't need money: www.reddit.com/r/Fire/comments/15x4w7r/comment/jx7dn16/ This is also the case for Jimmy Wales of Wikipedia, for example, who used to work in finance.
Video: New Zealand Flag Debate.
Video: Why New Zealand Fired its Official Wizard by Qxir.
It seems to have been tested on something older than Ubuntu 24.04, as installing on 24.04 requires some porting. I started that process at: github.com/cirosantilli/ARC-AGI-solution/tree/ubuntu-24-04 but gave up to try Ubuntu 22.04 instead.
Ubuntu 22.04 Docker install worked without patches, after installing Poetry, e.g. to try and solve 1ae2feb7:
git clone https://github.com/aviad12g/ARC-AGI-solution
cd ARC-AGI-solution
git checkout f3283f727488ad98fe575ea6a5ac981e4a188e49
poetry install
git clone https://github.com/arcprize/ARC-AGI-2
`poetry env activate`
export PYTHONPATH="$PWD/src:$PYTHONPATH"
python3 -m arc_solver.cli.main solve ARC-AGI-2/data/evaluation/1ae2feb7.json
but towards the end we have:
{
  "success": false,
  "error": "Search failed: no_multi_example_solution",
  "search_stats": {
    "nodes_expanded": 21,
    "nodes_generated": 903,
    "termination_reason": "no_multi_example_solution",
    "candidates_generated": 25,
    "examples_validated": 3,
    "validation_success_rate": 0.0,
    "multi_example_used": true
  },
  "predictions": [
    null,
    null,
    null
  ],
  "computation_time": 30.234344280001096,
  "task_id": "1ae2feb7",
  "task_file": "ARC-AGI-2/data/evaluation/1ae2feb7.json",
  "solver_version": "0.1.0",
  "total_time": 30.24239572100123,
  "timestamp": 1760353369.9701269
}
Task: 1ae2feb7.json
Success: False
Error: Search failed: no_multi_example_solution
Multi-example validation: ENABLED
Training examples validated: 3
Candidates generated: 25
Validation success rate: 0.0%
Computation time: 30.23s
Total time: 30.24s
so it failed.
Let's see if any of them work at all as advertised:
ls ARC-AGI-2/data/evaluation/ | xargs -I'{}' python3 -m arc_solver.cli.main solve 'ARC-AGI-2/data/evaluation/{}' |& tee tmp.txt
and at the end:
grep 'Success: True' tmp.txt | wc
has only 7 successes.
Also, weirdly, the log only has 102 hits in total, but there were 120 JSON tasks in that folder. I searched for the missing executions:
diff -u <(grep Task: tmp.txt | cut -d' ' -f2) <(ls ARC-AGI-2/data/evaluation)
The first missing one is 135a2760, which blows up with:
ERROR: Solve command failed: Object of type HorizontalLinePredicate is not JSON serializable
and grepping ERROR gives us the following, reported at github.com/aviad12g/ARC-AGI-solution/issues/1:
ERROR: Solve command failed: Object of type HorizontalLinePredicate is not JSON serializable
ERROR: Solve command failed: Object of type SizePredicate is not JSON serializable
ERROR: Solve command failed: Object of type HorizontalLinePredicate is not JSON serializable
ERROR: Solve command failed: Object of type HorizontalLinePredicate is not JSON serializable
ERROR: Solve command failed: Object of type ndarray is not JSON serializable
ERROR: Solve command failed: Object of type HorizontalLinePredicate is not JSON serializable
ERROR: Solve command failed: Object of type ndarray is not JSON serializable
ERROR: Solve command failed: Object of type HorizontalLinePredicate is not JSON serializable
ERROR: Solve command failed: Object of type VerticalLinePredicate is not JSON serializable
ERROR: Solve command failed: Object of type VerticalLinePredicate is not JSON serializable
ERROR: Solve command failed: Object of type ndarray is not JSON serializable
ERROR: Solve command failed: Object of type VerticalLinePredicate is not JSON serializable
ERROR: Solve command failed: Object of type ndarray is not JSON serializable
ERROR: Solve command failed: Object of type HorizontalLinePredicate is not JSON serializable
ERROR: Solve command failed: Object of type HorizontalLinePredicate is not JSON serializable
ERROR: Solve command failed: Object of type HorizontalLinePredicate is not JSON serializable
ERROR: Solve command failed: Object of type VerticalLinePredicate is not JSON serializable
ERROR: Solve command failed: Object of type VerticalLinePredicate is not JSON serializable
A(x) = x + 1
Z(u)(v) = v
S(u)(v)(w) = v(u(v)(w))

S
(S)
(S(S))
(S(Z))
(A)
(0)

S
(S)
(
  S
  (S(S))
  (S(Z))
)
(A)
(0)

S
(S(S))
(S(Z))
(
  S
  (
    S
    (S(S))
    (S(Z))
  )
  (A)
)
(0)

S
(Z)
(
  S(S)
  (S(Z))
  (
    S
    (
      S
      (S(S))
      (S(Z))
    )
    (A)
  )
)
(0)

S(S)
(S(Z))
(
  S
  (
    S
    (S(S))
    (S(Z))
  )
  (A)
)
(
  Z
  (
    S(S)
    (S(Z))
    (
      S
      (
        S
        (S(S))
        (S(Z))
      )
      (A)
    )
  )
  (0)
)

S
(S)
(S(Z))
(
  S
  (
    S
    (S(S))
    (S(Z))
  )
  (A)
)
(0)
So we see that all of these rules resolve quite quickly and do not go into each other.
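The reduction above can be cross-checked mechanically. The following is a minimal sketch, assuming the three rules are read as curried Python functions; the helper name `church` is hypothetical and does not come from these notes. Under that reading, Z and S behave like the Church numeral zero and the successor function.

```python
# Curried Python versions of the three rules A, Z and S.
def A(x):
    return x + 1

def Z(u):
    # Z(u)(v) = v: ignores its first argument.
    return lambda v: v

def S(u):
    # S(u)(v)(w) = v(u(v)(w))
    return lambda v: lambda w: v(u(v)(w))

def church(n):
    """Hypothetical helper: apply S to Z n times, i.e. S(S(...S(Z)))."""
    c = Z
    for _ in range(n):
        c = S(c)
    return c

# The same expression that is reduced step by step above.
result = S(S)(S(S))(S(Z))(A)(0)
print(result)

# Sanity check: S applied n times to Z, given A and 0, applies A n times.
print([church(n)(A)(0) for n in range(5)])
```

Python evaluates eagerly, so this brute-force evaluation is only practical for small expressions like this one.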
S however offers some problems, in that:
C_0 = Z
C_i = S(C_{i-1})
D_i = C_i(S)(S)
Calculate the first nine digits of:
D_a(D_b)(D_c)(C_d)(A)(e)
Removing D_a:
S^a(Z)(S)(S)(D_b)(D_c)(C_d)(A)(e)
Solution:
233168
Solutions to the ProjectEuler+ version:
The original can be found with:
printf '1\n1000\n' | euler/1.py
This was a registration CAPTCHA problem as of 2025:
Among the first 510 thousand square numbers, what is the sum of all the odd squares?
Python solution:
s = 0
for i in range(1, 510001, 2):
    s += i*i
print(s)
At: euler/0.py
As mentioned at euler.stephan-brumme.com, these tend to be harder, as they have their own judge system that actually runs programs, and can therefore test multiple input test cases against their reference implementation rather than just checking a hardcoded result for a single input.
It only goes up to Project Euler problem 254 as of 2025, which had been published much earlier, in 2009, so presumably they've stopped there.
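Because the judge runs the program on several inputs, a per-query loop can be too slow, and a closed form helps. Here is a minimal sketch for problem 1, assuming the input is a test-case count followed by one N per line, as the printf invocation for euler/1.py suggests. The function names are hypothetical, and this is not the contents of euler/1.py.

```python
def sum_multiples_below(n):
    """Sum of all multiples of 3 or 5 strictly below n, in constant time."""
    def multiples_sum(k):
        m = (n - 1) // k              # number of positive multiples of k below n
        return k * m * (m + 1) // 2   # k * (1 + 2 + ... + m)
    # Inclusion-exclusion: multiples of 15 would otherwise be counted twice.
    return multiples_sum(3) + multiples_sum(5) - multiples_sum(15)

def solve(text):
    """Parse a count T followed by T values of N; return one answer per case."""
    tokens = text.split()
    t = int(tokens[0])
    return [sum_multiples_below(int(tok)) for tok in tokens[1:1 + t]]

# Same input as the printf invocation: one test case with N = 1000.
answers = solve("1\n1000\n")
print("\n".join(str(a) for a in answers))
```

Each query costs a handful of integer operations regardless of N, which matters once the judge feeds many large test cases.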