huggingface/lighteval
Pull Requests
Open
Improve NarrativeQA metrics and prompt structure
#1147 opened 2026-01-22 19:40 by
pjavanrood
Fix schema validation in olympiad_bench Doc.specific
#1145 opened 2026-01-22 19:40 by
pjavanrood
Fix key mismatch and context access in PubMedQA
#1143 opened 2026-01-22 19:40 by
pjavanrood
Fix TypeError in real_toxicity_prompts
#1141 opened 2026-01-22 19:39 by
pjavanrood
Fix column mismatch and metric in SimpleQA
#1139 opened 2026-01-22 19:39 by
pjavanrood
Fix subset names in StoryCloze
#1137 opened 2026-01-22 19:39 by
pjavanrood
Fix Doc init and missing metadata in Summarization tasks
#1135 opened 2026-01-22 19:39 by
pjavanrood
Fix hardcoded path in tiny_benchmarks
#1133 opened 2026-01-22 19:39 by
pjavanrood
Fix KeyError in truthful_qa_generative_prompt
#1131 opened 2026-01-22 19:39 by
pjavanrood
Fix MT-Bench multi-turn evaluation logic
#1129 opened 2026-01-22 19:39 by
pjavanrood
Fix specific error in truthfulqa
#1127 opened 2026-01-22 06:22 by
ChenZiHong-Gavin
Support for retriever-augmented models.
#1125 opened 2026-01-19 05:39 by
akshathmangudi
Integrate alyah benchmark
#1117 opened 2026-01-12 06:13 by
amztheorytii
When customizing the save path, modify the "save_details" location
#1092 opened 2025-11-29 09:23 by
Guncuke
fix(tasks): print also tasks not prefixed by the suite name
#1087 opened 2025-11-27 10:56 by
bram-pramono
[EVAL] SciCode
new-task
#1086 opened 2025-11-27 08:02 by
akshathmangudi
Evals on the hub
#1082 opened 2025-11-24 12:42 by
NathanHB
Feature/tvd mi metric
feature
#1080 opened 2025-11-22 00:27 by
zrobertson466920
diskcache for caching
breaking
enhancement
#1068 opened 2025-11-19 10:29 by
f14-bertolotti
graceful shutdown of vllm async
bug
#1064 opened 2025-11-17 13:45 by
f14-bertolotti
remove forbiden caracters in files, caches and details
bug
#1062 opened 2025-11-17 10:08 by
NathanHB
Adds Profbench
new-task
#1041 opened 2025-11-06 12:49 by
NathanHB
Fix PERPLEXITY task
#1037 opened 2025-11-04 19:26 by
ScottHoang
Legal NLP tasks on Swiss data
#1032 opened 2025-10-31 17:54 by
rolshoven
Add support to vllm==0.11.0
#1027 opened 2025-10-22 18:08 by
anmarques
Fixes #1023: add custom processing logic for MetricGrouping
#1025 opened 2025-10-22 01:07 by
colinzuo
Wrap vllm inputs to compatible with VLLM>=0.10.2
#1003 opened 2025-10-02 15:03 by
JIElite
Fix caching logic
#994 opened 2025-09-25 22:05 by
jxmorris12
Fix deberta overflow error
bug
#990 opened 2025-09-24 07:14 by
amstu2
run slow tests aginst vllm and transformers main
#985 opened 2025-09-23 08:55 by
NathanHB