lighteval
52d3d333 - Parametrizing the sampling evals from the CLI (#926)

Committed 146 days ago
Parametrizing the sampling evals from the CLI (#926)

This PR does several things.

Constrain the `Metric` object creation. We go from

```python
f1_score_macro = CorpusLevelMetric(
    metric_name="f1",
    sample_level_fn=GenerativePreparator().prepare,
    category=SamplingMethod.GENERATIVE,
    corpus_level_fn=CorpusLevelF1Score(average="macro").compute,
    higher_is_better=True,
)
```

to

```python
f1_score_macro = CorpusLevelMetric(
    metric_name="f1",
    sample_level_fn=GenerativePreparator(),
    category=SamplingMethod.GENERATIVE,
    corpus_level_fn=CorpusLevelF1Score(average="macro"),
    higher_is_better=True,
)
```

`sample_level_fn` must derive from either a `SampleLevelComputation` or a `Preparator` class; the former must implement a `compute` method, the latter a `prepare` one. `corpus_level_fn` is either a plain function (`np.mean` and so on) or a `CorpusLevelComputation`, which must implement a `compute_corpus` method.

All metrics with a parametrizable `sample_level_fn` can now be parametrized at CLI call time, for example: `"lighteval|math_500@k=1|0|0"` (users can also pass normalization function names if they are correctly defined in the normalization file). Corpus-level parametrization is not supported, but probably could be if we chose another separator symbol. This parametrization of `Metrics.MyMetric` relies on a trick that makes the enum callable.

The metric list has been simplified to remove duplicate metrics that differed only in some parameters, so all tasks have been updated to use the new metric names. The test suite has been updated accordingly.
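The class contract described above (`compute` / `prepare` / `compute_corpus`) can be sketched as abstract base classes. This is a minimal sketch, not the actual lighteval code: the method argument names (`doc`, `model_response`, `items`) are assumptions, only the class and method names come from the description.

```python
from abc import ABC, abstractmethod


class SampleLevelComputation(ABC):
    """Computes a score for a single sample (assumed interface)."""

    @abstractmethod
    def compute(self, doc, model_response):
        ...


class Preparator(ABC):
    """Prepares a sample's outputs for corpus-level aggregation (assumed interface)."""

    @abstractmethod
    def prepare(self, doc, model_response):
        ...


class CorpusLevelComputation(ABC):
    """Aggregates prepared sample outputs into one corpus score (assumed interface)."""

    @abstractmethod
    def compute_corpus(self, items):
        ...
```

With this split, a metric either scores each sample directly (`SampleLevelComputation`) or only reshapes per-sample outputs (`Preparator`) and defers scoring to a `CorpusLevelComputation`.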
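The "callable enum" trick works because enum members are instances of their enum class, so defining `__call__` on the class makes `Metrics.MyMetric(k=4)` build a parametrized copy of the member's metric. A minimal sketch, assuming a hypothetical `Metric` holder and member name (the real lighteval types differ):

```python
from enum import Enum


class Metric:
    """Hypothetical metric holder: a name plus keyword parameters."""

    def __init__(self, name, **params):
        self.name = name
        self.params = params


class Metrics(Enum):
    # Hypothetical member; real lighteval metric names differ.
    pass_at_k = Metric("pass@k", k=1)

    def __call__(self, **overrides):
        # Calling a member returns a NEW Metric with overridden params,
        # leaving the enum member's default value untouched.
        return Metric(self.value.name, **{**self.value.params, **overrides})
```

A CLI spec like `@k=4` can then be turned into `Metrics.pass_at_k(k=4)` without registering a separate enum member per parameter combination, which is what allows the duplicate metrics to be removed.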
Author
Parents
Loading