lighteval
8b66e16d - Use `n=16` samples to estimate `pass@1` for AIME benchmarks (#661)

Commit
263 days ago
Use `n=16` samples to estimate `pass@1` for AIME benchmarks (#661)

* Use n=16 samples to estimate pass@1 for AIME benchmarks
* Remove other metrics
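
For context, a minimal sketch of how `pass@1` can be estimated from `n=16` samples per problem. It uses the standard unbiased pass@k estimator, which for `k=1` reduces to the fraction of correct samples. The function names `pass_at_k` and `mean_pass_at_1` are hypothetical and chosen for illustration; this is not taken from the lighteval code touched by this commit.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate from n samples, c of which are correct.

    Hypothetical helper for illustration only.
    """
    if n - c < k:
        # Fewer than k incorrect samples: every size-k subset has a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_1(correct_counts: list[int], n: int = 16) -> float:
    """Average pass@1 over problems, given the correct-sample count per problem."""
    return sum(pass_at_k(n, c, 1) for c in correct_counts) / len(correct_counts)


# Example: three problems with 16, 0, and 8 correct samples out of 16.
score = mean_pass_at_1([16, 0, 8])
```

Averaging over several samples per problem reduces the variance of the score, which matters for a small benchmark where a single greedy answer per problem gives a noisy estimate.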