lighteval
9288bd84 - Refacto and remove bloated code (#709)

Commit
173 days ago
Refacto and remove bloated code (#709)

## What does this PR do?

This PR gives the prompt building logic in lighteval a much-needed spring cleaning. The main goal: ditch legacy bloat, make things less painful for users and contributors, and unlock support for more complex benchmarks 🔥

### Highlights

- **Prompt Manager Overhaul:** Each model now owns its own PromptManager instance, with custom params for every flavor of prompt (multimodal, API, multiturn, you name it).
  - **system-prompt**: now part of the model config
  - **use-chat-template**: now part of the model config
- **Metrics Slimdown:** Metrics now only care about `samplingMethod` (generative or loglikelihood). Say goodbye to `use_case` and all those old request types.
- **Request Layer Gone:** Models get the raw `Doc` directly, with no more unnecessary `request` wrappers bloating the code.
- **Unified ModelResponse:** All models return a single `ModelResponse` type, whether generative or loglikelihood. This means simpler logging and metric computation.
- **Consistent Metric Signatures:** Every metric now uses the same function signature: `compute(doc: Doc, model_response: ModelResponse)`.
- **Standardized Details:** Each sample's details now always include three fields: doc, metric, and model_response.
- **Generative Metrics Unified:** All generative metrics now work the same way. If users want greedy generation, they need to set temperature to 0. **An exception will be raised if the user tries to run a sampling metric with temperature = 0.**
- **Removed Loglikelihood Single Token:** bloated and rarely used.
- **Tests:** All tests pass, and no changes were needed to expected values.

### Why?

- Less code, fewer headaches.
- Easier to add new benchmarks (including weird and wonderful ones).
- More user-friendly inspection tools.
- A single, unified way to handle prompts, responses, and metrics.
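To make the unified metric signature and the temperature rule from the highlights concrete, here is a minimal, self-contained sketch. It does not use lighteval's real classes: `Doc`, `ModelResponse`, `SamplingMethod`, the two metric classes, and `check_temperature` are simplified stand-ins invented for illustration, and their fields are assumptions. Only the `compute(doc, model_response)` signature and the "sampling metric with temperature 0 raises" behavior come from the description above.

```python
from dataclasses import dataclass, field
from enum import Enum


class SamplingMethod(Enum):
    # Hypothetical stand-in: the PR only says metrics are generative or loglikelihood.
    GENERATIVE = "generative"
    LOGLIKELIHOOD = "loglikelihood"


@dataclass
class Doc:
    # Hypothetical, simplified fields; the real Doc carries more information.
    query: str
    choices: list[str]
    gold_index: int


@dataclass
class ModelResponse:
    # One response type for both generative and loglikelihood runs (assumed fields).
    text: list[str] = field(default_factory=list)
    logprobs: list[float] = field(default_factory=list)


class ExactMatch:
    sampling_method = SamplingMethod.GENERATIVE

    def compute(self, doc: Doc, model_response: ModelResponse) -> float:
        # Compare the first generation with the gold choice, ignoring outer whitespace.
        gold = doc.choices[doc.gold_index].strip()
        return float(model_response.text[0].strip() == gold)


class LoglikelihoodAccuracy:
    sampling_method = SamplingMethod.LOGLIKELIHOOD

    def compute(self, doc: Doc, model_response: ModelResponse) -> float:
        # Pick the choice with the highest log-probability and compare with the gold index.
        scores = model_response.logprobs
        best = max(range(len(scores)), key=scores.__getitem__)
        return float(best == doc.gold_index)


def check_temperature(requires_sampling: bool, temperature: float) -> None:
    # Hypothetical guard mirroring the rule: sampling metrics cannot run greedily.
    if requires_sampling and temperature == 0:
        raise ValueError("Sampling metrics need temperature > 0; 0 means greedy generation.")


doc = Doc(query="2 + 2 =", choices=["3", "4"], gold_index=1)
for metric, response in [
    (ExactMatch(), ModelResponse(text=[" 4 "])),
    (LoglikelihoodAccuracy(), ModelResponse(logprobs=[-2.0, -0.1])),
]:
    # Every metric is called the same way, whatever its sampling method.
    print(type(metric).__name__, metric.compute(doc, response))
```

Because both metrics share one call shape, the evaluation loop needs no per-metric branching, which is the simplification the PR description is pointing at.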
---------

Co-authored-by: Clémentine Fourrier <22726840+clefourrier@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: clementine@huggingface.co <clementine@huggingface.co>