Pull Requests huggingface/lighteval

Add MCQ support to Yourbench evaluation

#734 opened 2025-05-16 07:51 by alozowski

Adds RULER benchmark

#722 opened 2025-05-15 11:45 by NathanHB

Added Flores

#717 opened 2025-05-14 13:31 by clefourrier

Revert the task detail serialization to be compliant with PyArrow

#715 opened 2025-05-11 04:45 by alvin319

update for CB

#714 opened 2025-05-09 11:16 by ArthurZucker

refacto prompt building

#709 opened 2025-05-07 11:59 by NathanHB

[WIP] Fix nanotron compatibility

#706 opened 2025-05-06 17:17 by duynht

Making bootstrap_iters an arg (label: feature/enhancement)

#697 opened 2025-05-01 02:57 by pratyushmaini

Async vllm (label: feature/enhancement)

#693 opened 2025-04-28 14:28 by clefourrier

Add way to load local datafile (label: feature/enhancement)

#687 opened 2025-04-24 19:49 by ScottHoang

Adds multimodal support (label: feature/enhancement)

#675 opened 2025-04-15 11:38 by NathanHB

Nanotron model updates (label: feature/enhancement)

#652 opened 2025-03-31 09:32 by anton-l

Add CLI arg `generate_until_token` to support reasoning and CoT models

#617 opened 2025-03-17 17:17 by mapmeld

added vllm lora support. (label: feature/enhancement)

#611 opened 2025-03-11 05:07 by haizhou-shi

Add draft functionality for a generic sandboxed code running

#580 opened 2025-02-21 12:44 by plaguss

Add CodeElo (label: new-task)

#575 opened 2025-02-19 07:49 by plaguss

new metrics and pr-fouras dataset add

#558 opened 2025-02-13 17:05 by BertrandCabotIDRIS

Adding verbal-reasoning-challenge as a Community Task

#551 opened 2025-02-11 20:35 by aryawu0513

Improved stability of litellm models for reasoning models.

#538 opened 2025-02-05 14:47 by JoelNiklaus

Multi node vLLM

#530 opened 2025-01-31 13:53 by ncassereau

Fix `TGI` (Text Generation Inference) Endpoint Inference and TGI JSON Grammar Generation

#502 opened 2025-01-15 14:31 by cpcdoy

Initial proposal for model lazy loading

#497 opened 2025-01-11 21:15 by JoelNiklaus

Added diskcache to base model.

#480 opened 2024-12-30 08:31 by JoelNiklaus

Fix attribute and parameter names in loggers

#476 opened 2024-12-26 09:47 by albertvillanova

Config fixes for VLLMModel

#472 opened 2024-12-20 16:07 by anton-l

feat: add JGLUE tasks

#469 opened 2024-12-19 16:15 by ryan-minato

Use config dict in TaskConfigLogger for easier serialization

#454 opened 2024-12-18 09:58 by albertvillanova

Add swiss legal evals as new community tasks

#389 opened 2024-11-11 11:03 by JoelNiklaus

Adding chat completion task to endpoint models

#281 opened 2024-08-27 08:46 by sadra-barikbin