lighteval
52d3d333 - Parametrizing the sampling evals from the CLI (#926)

Committed 146 days ago
Parametrizing the sampling evals from the CLI (#926)

This PR does several things.

Constrain the `Metric` object creation. We go from

```python
f1_score_macro = CorpusLevelMetric(
    metric_name="f1",
    sample_level_fn=GenerativePreparator().prepare,
    category=SamplingMethod.GENERATIVE,
    corpus_level_fn=CorpusLevelF1Score(average="macro").compute,
    higher_is_better=True,
)
```

to

```python
f1_score_macro = CorpusLevelMetric(
    metric_name="f1",
    sample_level_fn=GenerativePreparator(),
    category=SamplingMethod.GENERATIVE,
    corpus_level_fn=CorpusLevelF1Score(average="macro"),
    higher_is_better=True,
)
```

`sample_level_fn` must derive from either a `SampleLevelComputation` or a `Preparator` class; the former must implement a `compute` method, the latter a `prepare` one. `corpus_level_fn` is either a plain function (`np.mean` and so on) or a `CorpusLevelComputation`, which must implement a `compute_corpus` method.

All metrics with a parametrizable `sample_level_fn` can now be parametrized at CLI call time, for example: `"lighteval|math_500@k=1|0|0"` (users can also pass normalization function names if they are correctly defined in the normalization file). Corpus-level parametrization is not supported, but probably could be if we chose another separator symbol. This parametrization of `Metrics.MyMetric` relies on a trick that makes the enum callable.

The metric list has been simplified to remove duplicate metrics that differed only in some parameters, so all tasks have been updated to use the new metric names. The test suite has been updated accordingly.
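The class contract described above (`compute` / `prepare` / `compute_corpus`) can be sketched as abstract base classes. This is a minimal sketch, not the actual lighteval code: the method argument names (`doc`, `model_response`, `items`) are assumptions, only the class and method names come from the description.

```python
from abc import ABC, abstractmethod


class SampleLevelComputation(ABC):
    """Computes a score for a single sample (assumed interface)."""

    @abstractmethod
    def compute(self, doc, model_response):
        ...


class Preparator(ABC):
    """Prepares a sample's outputs for corpus-level aggregation (assumed interface)."""

    @abstractmethod
    def prepare(self, doc, model_response):
        ...


class CorpusLevelComputation(ABC):
    """Aggregates prepared sample outputs into one corpus score (assumed interface)."""

    @abstractmethod
    def compute_corpus(self, items):
        ...
```

With this split, a metric either scores each sample directly (`SampleLevelComputation`) or only reshapes per-sample outputs (`Preparator`) and defers scoring to a `CorpusLevelComputation`.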
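The "callable enum" trick works because enum members are instances of their enum class, so defining `__call__` on the class makes `Metrics.MyMetric(k=4)` build a parametrized copy of the member's metric. A minimal sketch, assuming a hypothetical `Metric` holder and member name (the real lighteval types differ):

```python
from enum import Enum


class Metric:
    """Hypothetical metric holder: a name plus keyword parameters."""

    def __init__(self, name, **params):
        self.name = name
        self.params = params


class Metrics(Enum):
    # Hypothetical member; real lighteval metric names differ.
    pass_at_k = Metric("pass@k", k=1)

    def __call__(self, **overrides):
        # Calling a member returns a NEW Metric with overridden params,
        # leaving the enum member's default value untouched.
        return Metric(self.value.name, **{**self.value.params, **overrides})
```

A CLI spec like `@k=4` can then be turned into `Metrics.pass_at_k(k=4)` without registering a separate enum member per parameter combination, which is what allows the duplicate metrics to be removed.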
Author
Parents
Loading