
{c}
{tag=Math AI benchmark}
{title2=2025}
{wiki}

Contains highly specialized questions in various academic fields, including <mathematics>. Each problem is answered with a number, a multiple-choice selection, or free text.

* https://arxiv.org/abs/2501.14249
* https://huggingface.co/datasets/cais/hle
* https://agi.safe.ai/
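
To illustrate what the three answer formats mean for scoring, here is a minimal sketch of how each could be compared against a reference answer. It is not the benchmark's actual grading procedure: the `grade` function, the answer-type labels `"number"`, `"multiple_choice"` and `"free_text"`, and the tolerance `tol` are all hypothetical names chosen for this example, not fields of the dataset.

```python
import re


def normalize_text(s: str) -> str:
    """Lowercase, trim, and collapse whitespace for a lenient string match."""
    return re.sub(r"\s+", " ", s.strip().lower())


def grade(answer_type: str, predicted: str, expected: str, tol: float = 1e-9) -> bool:
    """Compare a predicted answer to the expected one.

    The answer_type labels are assumptions made for this sketch.
    """
    if answer_type == "number":
        # Numeric answers: parse both sides and compare within a tolerance.
        try:
            return abs(float(predicted) - float(expected)) <= tol
        except ValueError:
            return False
    if answer_type == "multiple_choice":
        # Multiple choice: compare the option letter, ignoring case and spaces.
        return predicted.strip().upper() == expected.strip().upper()
    if answer_type == "free_text":
        # Free text: exact match after normalization (a deliberately crude rule).
        return normalize_text(predicted) == normalize_text(expected)
    raise ValueError(f"unknown answer type: {answer_type}")


if __name__ == "__main__":
    print(grade("number", "3.0", "3"))
    print(grade("multiple_choice", " b ", "B"))
    print(grade("free_text", "  Euler's   formula ", "euler's formula"))
```

Exact matching after normalization is far too strict for real free-text answers, where equivalent phrasings are common, so it should be read only as a placeholder for whatever comparison a real evaluation uses.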


Humanity's Last Exam