Pull Requests huggingface/lighteval

Fix caching logic

#994 opened 2025-09-25 22:05 by jxmorris12

Fix deberta overflow error bug

#990 opened 2025-09-24 07:14 by amstu2

run slow tests against vllm and transformers main

#985 opened 2025-09-23 08:55 by NathanHB

vllm 0.10.2 makes integration tests fail bug ignore-for-release

#980 opened 2025-09-22 13:33 by NathanHB

added HELMET task with json datasets and tests new-task

#971 opened 2025-09-17 01:21 by nayana1729

Add ChartQA new-task

#954 opened 2025-09-11 01:31 by 0xjunhao

adds `concurrent_requests` parameter for litellm backend enhancement

#953 opened 2025-09-10 21:32 by anupam-dewan

[RFC] Rework the suites to be inferred feature refacto breaking

#952 opened 2025-09-09 15:32 by LysandreJik

Fix reading results example code snippet

#942 opened 2025-08-29 14:22 by mariagrandury

Initial implementation for chat template parameters

#904 opened 2025-08-06 09:41 by LysandreJik

Debug continuous batching

#900 opened 2025-08-05 11:11 by clefourrier

Fix #742

#860 opened 2025-07-14 19:06 by mcleish7

Save and output number of samples of each task

#851 opened 2025-07-03 18:26 by itsmejul

Multilingual Overhaul

#833 opened 2025-06-25 21:08 by hynky1999

Adding a range of multilingual evals

#832 opened 2025-06-25 14:32 by clefourrier

Add support for vLLM KV-cache quantization

#773 opened 2025-05-22 10:08 by eldarkurtic

Add Chinese (zh) Translation of Documentation documentation

#744 opened 2025-05-19 11:43 by CassiopeiaCode

Adds RULER benchmark new-task

#722 opened 2025-05-15 11:45 by NathanHB

[WIP] Fix nanotron compatibility

#706 opened 2025-05-06 17:17 by duynht

added vllm lora support. feature

#611 opened 2025-03-11 05:07 by haizhou-shi

Add draft functionality for a generic sandboxed code running

#580 opened 2025-02-21 12:44 by plaguss

Add CodeElo new-task

#575 opened 2025-02-19 07:49 by plaguss

Adding verbal-reasoning-challenge as a Community Task

#551 opened 2025-02-11 20:35 by aryawu0513

Improved stability of litellm models for reasoning models.

#538 opened 2025-02-05 14:47 by JoelNiklaus

Multi node vLLM

#530 opened 2025-01-31 13:53 by ncassereau

Fix attribute and parameter names in loggers

#476 opened 2024-12-26 09:47 by albertvillanova

feat: add JGLUE tasks

#469 opened 2024-12-19 16:15 by ryan-minato

Add swiss legal evals as new community tasks

#389 opened 2024-11-11 11:03 by JoelNiklaus