huggingface/lighteval
Pull Requests
Open
feat(utils): show count of evaluated samples in Markdown summary table
#1188 opened 2026-03-13 06:03 by anzzyspeaksgit
Add count column to markdown results summary
#1187 opened 2026-03-13 04:12 by Bortlesboat
Fix typos in math_comparison.py and sample_comparison.py
#1186 opened 2026-03-12 12:25 by joshuaswanson
squad_v2: include unanswerable questions in evaluation
#1185 opened 2026-03-09 11:01 by Matteovanypersele
Update vllm version requirement to 0.17.0
#1183 opened 2026-03-09 10:43 by NathanHB
Fail fast on non-retriable LiteLLM status codes
#1182 opened 2026-03-08 03:58 by yangbaechu
Redact model config credentials in saved and returned results
#1181 opened 2026-03-08 03:03 by yangbaechu
fix(normalizations): guard against index out of range in LogProbToken…
#1180 opened 2026-03-06 11:25 by inakiLakunza
Korean completed and Basque fixed
#1179 opened 2026-03-06 10:46 by inakiLakunza
[LiteLLM] Add cross-provider reasoning_effort support (first step) + token budget fixes
#1178 opened 2026-03-05 15:42 by dyurchenko98
Fix LiteLLM split iteration in greedy_until to avoid duplicate API requests
#1177 opened 2026-03-05 14:56 by dyurchenko98
Fix corpus reference orientation for chrF/chrF++/TER metrics
#1176 opened 2026-03-05 14:40 by dyurchenko98
FIX : handle empty choices in Doc.get_golds() to prevent IndexError
#1174 opened 2026-02-23 21:20 by nandeanie
Fix: pass through custom_tasks and enable multilingual in eval command
#1172 opened 2026-02-19 07:58 by dzautner
Fix IndexError in LogProbTokenNorm when choices_tokens is shorter than choices_logprob
#1171 opened 2026-02-18 16:33 by worksbyfriday
Add jfinqa: Japanese Financial Numerical Reasoning QA
#1169 opened 2026-02-17 17:59 by ajtgjmdjp
fix: restore task list display logic
#1166 opened 2026-02-10 08:49 by s1eeping-king
fix: Transformers Model no template cast stop_sequences to list
#1165 opened 2026-02-07 17:52 by mrsndmn
Fix TypeError in aa_omniscience_prompt
#1161 opened 2026-01-22 19:41 by pjavanrood
Fix split loading error in bigbench
#1159 opened 2026-01-22 19:41 by pjavanrood
Fix CoQA metric and support multi-doc loading
#1157 opened 2026-01-22 19:40 by pjavanrood
Fix RecursionError in imdb_contrastset_prompt
#1155 opened 2026-01-22 19:40 by pjavanrood
Fix legal_summarization keys and SummaC metric
#1153 opened 2026-01-22 19:40 by pjavanrood
Fix non-existent evaluation splits in lextreme
#1151 opened 2026-01-22 19:40 by pjavanrood
Fix evaluation split config in lsat_qa
#1149 opened 2026-01-22 19:40 by pjavanrood
Improve NarrativeQA metrics and prompt structure
#1147 opened 2026-01-22 19:40 by pjavanrood
Fix schema validation in olympiad_bench Doc.specific
#1145 opened 2026-01-22 19:40 by pjavanrood
Fix key mismatch and context access in PubMedQA
#1143 opened 2026-01-22 19:40 by pjavanrood
Fix TypeError in real_toxicity_prompts
#1141 opened 2026-01-22 19:39 by pjavanrood
Fix column mismatch and metric in SimpleQA
#1139 opened 2026-01-22 19:39 by pjavanrood