OurBigBook About$ Donate
 Sign in+ Sign up
by Ciro Santilli (@cirosantilli, 37)

BigCodeBench (2024)

Updated 2025-05-26, created 2025-03-20
  • github.com/bigcode-project/bigcodebench
  • bigcode-bench.github.io/
  • arxiv.org/abs/2406.15877
Their most interesting subset, the -hard one, appears to be present at: huggingface.co/datasets/bigcode/bigcodebench-hard in Parquet format. OMG why.
The tests make free use of the Python standard library and other major external libraries, e.g. huggingface.co/datasets/bigcode/bigcodebench-hard/viewer/default/v0.1.0_hf?views%5B%5D=v010_hf&row=0 uses ftplib. Kind of cool.
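A test that touches something like ftplib presumably cannot rely on a real FTP server, so one plausible approach is to mock the library. The following is only a minimal sketch of that idea, not code taken from the benchmark: the function list_remote_files, the host name and the file names are all hypothetical, and only ftplib.FTP, unittest and unittest.mock.patch are real standard library APIs.

```python
import ftplib
import unittest
from unittest.mock import patch


def list_remote_files(host, user, password, directory):
    """Hypothetical task solution: return the file names in an FTP directory."""
    ftp = ftplib.FTP(host)
    ftp.login(user, password)
    ftp.cwd(directory)
    names = ftp.nlst()
    ftp.quit()
    return names


class TestListRemoteFiles(unittest.TestCase):
    @patch("ftplib.FTP")
    def test_lists_files_without_network(self, mock_ftp_cls):
        # The patched class returns a MagicMock instead of opening a socket.
        mock_ftp = mock_ftp_cls.return_value
        mock_ftp.nlst.return_value = ["a.txt", "b.txt"]

        # Hypothetical host and credentials: nothing is actually contacted.
        result = list_remote_files("ftp.example.com", "user", "pw", "/data")

        self.assertEqual(result, ["a.txt", "b.txt"])
        mock_ftp_cls.assert_called_once_with("ftp.example.com")
        mock_ftp.login.assert_called_once_with("user", "pw")
        mock_ftp.cwd.assert_called_once_with("/data")
```

Saved to a file, this can be run with python -m unittest. The test only checks that the solution makes the expected calls and returns what the fake server returned, which is about all a sandboxed benchmark can verify for network code.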
They even test graph plotting? huggingface.co/datasets/bigcode/bigcodebench-hard/viewer/default/v0.1.0_hf?views%5B%5D=v010_hf&row=11 How do they evaluate that?
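One plausible answer to the plotting question, not verified here against the dataset: the task function returns the matplotlib Axes object, and the test inspects its properties instead of comparing pixels. A minimal sketch under that assumption follows. The function plot_squares and its labels are hypothetical, and it needs matplotlib installed.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend, so no display is needed

import matplotlib.pyplot as plt
from matplotlib.axes import Axes


def plot_squares(n):
    """Hypothetical task solution: plot x**2 for x in 0..n-1, return the Axes."""
    xs = list(range(n))
    ys = [x * x for x in xs]
    fig, ax = plt.subplots()
    ax.plot(xs, ys)
    ax.set_title("Squares")
    ax.set_xlabel("x")
    ax.set_ylabel("x^2")
    return ax


def check_plot(ax, expected_ys):
    """Inspect the Axes object instead of looking at the rendered image."""
    assert isinstance(ax, Axes)
    assert ax.get_title() == "Squares"
    assert ax.get_xlabel() == "x"
    # The plotted data is still available on the Line2D objects.
    line = ax.get_lines()[0]
    assert list(line.get_ydata()) == expected_ys


ax = plot_squares(4)
check_plot(ax, [0, 1, 4, 9])
plt.close(ax.figure)
```

This kind of check verifies titles, labels and plotted data, but says nothing about whether the image actually looks right.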

 About$ Donate Content license: CC BY-SA 4.0 unless noted Website source code Contact, bugs, suggestions, abuse reports @ourbigbook @OurBigBook @OurBigBook