lighteval
Add extended task for LiveCodeBench codegeneration
#548
Merged

plaguss
plaguss Add draft for livecodebench code generation
9369a375
plaguss Add extra argument version_tag
2001e7b7
HuggingFaceDocBuilderDev
plaguss Fix import name
fece5528
plaguss Remove unused typed dict
e46fc2a5
plaguss Checkpoint, not ready yet, try simplifying code running and reuse pas…
6a3c0077
plaguss Add notes for expected values
987eb2a3
plaguss Pass version tag to downloader
42fb0f57
NathanHB commented on 2025-02-11
NathanHB commented on 2025-02-11
NathanHB commented on 2025-02-11
NathanHB commented on 2025-02-11
plaguss Modify helper module and remove dataset version tag
b700dc49
plaguss Remove version_tag
29b2bbe3
plaguss Initial version for lcb:codegeneration
a60e6620
plaguss Remove outdated argument docs
05a7f019
plaguss Remove hardcoded system prompt and pass it via arg
deea663a
plaguss Merge branch 'main' into lcb-codegeneration
a2863f93
plaguss Add kwargs to allow passing other arguments
44f45b5f
plaguss Make generic function to parse the metric name and obtain the number …
127b4cdb
plaguss Change metric name to make it more informative
a372e057
plaguss Add experimental way of passing the number of samples for a metric fr…
53ab4176
plaguss
lewtun
plaguss Add more processes to run the tests
f6a7c4f3
plaguss
plaguss marked this pull request as ready for review 306 days ago
plaguss Allow reading the generation parameters from the CLI
d6abcd07
plaguss requested a review from NathanHB 305 days ago
plaguss
plaguss Update parsing arguments from CLI
158d660d
plaguss Remove dead code and fix test value
54fa0320
NathanHB approved these changes on 2025-02-17
plaguss Fix num_samples update
4a0fe89b
plaguss Add docs for the new metric_options
f945fdf4
NathanHB merged fd479ee6 into main 304 days ago
