huggingface/lighteval
Pull Requests
Open
[EVAL] BIG-Bench Extra Hard
#1099 opened 2025-12-05 06:06 by
jgyasu
When customizing the save path, modify the "save_details" location
#1092 opened 2025-11-29 09:23 by
Guncuke
fix(tasks): print also tasks not prefixed by the suite name
#1087 opened 2025-11-27 10:56 by
bram-pramono
[EVAL] SciCode
new-task
#1086 opened 2025-11-27 08:02 by
akshathmangudi
Enable loading data sets from files for custom tasks
#1083 opened 2025-11-24 19:31 by
davebiagioni
Evals on the hub
#1082 opened 2025-11-24 12:42 by
NathanHB
feat: add MathVista benchmark
new-task
#1081 opened 2025-11-22 09:57 by
omkar-334
Feature/tvd mi metric
feature
#1080 opened 2025-11-22 00:27 by
zrobertson466920
[EVAL] MultiChallenge
new-task
#1075 opened 2025-11-21 12:24 by
akshathmangudi
[EVAL] Long Horizon Execution
new-task
#1074 opened 2025-11-21 10:57 by
akshathmangudi
diskcache for caching
breaking
enhancement
#1068 opened 2025-11-19 10:29 by
f14-bertolotti
graceful shutdown of vllm async
bug
#1064 opened 2025-11-17 13:45 by
f14-bertolotti
remove forbidden characters in files, caches and details
bug
#1062 opened 2025-11-17 10:08 by
NathanHB
Adds Profbench
new-task
#1041 opened 2025-11-06 12:49 by
NathanHB
Fix PERPLEXITY task
#1037 opened 2025-11-04 19:26 by
ScottHoang
Legal NLP tasks on Swiss data
#1032 opened 2025-10-31 17:54 by
rolshoven
Add support to vllm==0.11.0
#1027 opened 2025-10-22 18:08 by
anmarques
Fixes #1023: add custom processing logic for MetricGrouping
#1025 opened 2025-10-22 01:07 by
colinzuo
Wrap vllm inputs to be compatible with VLLM>=0.10.2
#1003 opened 2025-10-02 15:03 by
JIElite
Fix caching logic
#994 opened 2025-09-25 22:05 by
jxmorris12
Fix deberta overflow error
bug
#990 opened 2025-09-24 07:14 by
amstu2
run slow tests against vllm and transformers main
#985 opened 2025-09-23 08:55 by
NathanHB
vllm 0.10.2 makes integration tests fail
bug
ignore-for-release
#980 opened 2025-09-22 13:33 by
NathanHB
added HELMET task with json datasets and tests
new-task
#971 opened 2025-09-17 01:21 by
nayana1729
Add ChartQA
new-task
#954 opened 2025-09-11 01:31 by
0xjunhao
adds `concurrent_requests` parameter for litellm backend
enhancement
#953 opened 2025-09-10 21:32 by
anupam-dewan
[RFC] Rework the suites to be inferred
feature
refacto
breaking
#952 opened 2025-09-09 15:32 by
LysandreJik
Initial implementation for chat template parameters
#904 opened 2025-08-06 09:41 by
LysandreJik
Debug continuous batching
#900 opened 2025-08-05 11:11 by
clefourrier
Fix #742
#860 opened 2025-07-14 19:06 by
mcleish7