lighteval
Legal NLP tasks on Swiss data
#1032
Merged

rolshoven force-pushed from 481bc4f5 to f3a626dc 208 days ago
rolshoven Legal NLP tasks on Swiss data
83c8a079
rolshoven refactor: split Swiss legal multilingual tasks into modular package
8d19996d
rolshoven refactor: update higher_is_better type in MetricGrouping
4032ce45
rolshoven refactor: Updated prompts and implementation to match the latest SwiL…
ca7e8342
rolshoven refactor: Enhance COMET and GEMBA metric loading with error handling
606417b8
rolshoven Add Gemba dependency for Swiss legal evaluations and remove `suite` p…
8c449c2c
rolshoven Fix batched metric aggregation for grouped metric names
9113d537
rolshoven Fixed missing system prompt
6d3fdf3b
rolshoven Judge models now are used through OpenRouter
bba0e914
rolshoven fix reasoning model token handling when max_tokens is unset
cc22ecb6
rolshoven force-pushed from ac784b97 to cc22ecb6 71 days ago
rolshoven chore: trigger PR update
a4e0ba13
NathanHB Merge branch 'main' into community_task_slds
406953fd
NathanHB commented on 2026-05-20
NathanHB approved these changes on 2026-05-20
rolshoven fix: return raw score for BLEU, CHRF, and TER metrics instead of scal…
052586aa
rolshoven fix: replaced accidental default value assignment with intended type …
fa341ac9
rolshoven fix: add error handling for unsupported languages in Swiss Landmark D…
011c4047
rolshoven fix: avoid huge negative BERTScore from baseline rescaling
df00dc16
JoelNiklaus merged 8d29839e into main 7 days ago
