lighteval
Add docstring docs
#413
Merged

albertvillanova 157 days ago (edited 156 days ago) ❤ 1

Add docstring docs.

I see this PR as setting up the docs for classes/functions using their docstrings.

Future PRs could add missing docstrings and improve the existing ones, if needed.
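
As a rough illustration of what docstring-based reference docs rely on, the sketch below defines a hypothetical `ExamplePipeline` class (not lighteval's actual `Pipeline`) with a Google-style docstring, plus a hypothetical helper `missing_docstrings` that lists public methods still lacking one. It only assumes that a docs generator reads each object's `__doc__`; how the real docs builder consumes docstrings is not shown.

```python
import inspect


class ExamplePipeline:
    """Run an evaluation end to end (hypothetical class, not lighteval's).

    Args:
        tasks (str): Comma-separated task names to evaluate.
    """

    def __init__(self, tasks: str):
        self.tasks = tasks

    def evaluate(self):
        """Evaluate the configured tasks and return a dict of scores."""
        return {}

    def save(self):
        return None  # deliberately left undocumented


def missing_docstrings(cls) -> list:
    """Return the names of public functions on `cls` that have no docstring."""
    return sorted(
        name
        for name, member in inspect.getmembers(cls, inspect.isfunction)
        if not name.startswith("_") and not member.__doc__
    )


# The cleaned class docstring is the raw material for a reference page.
class_doc = inspect.getdoc(ExamplePipeline)
undocumented = missing_docstrings(ExamplePipeline)
print(undocumented)
```

A helper like this could guide the follow-up PRs by pointing at the methods whose reference entry would otherwise render empty.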

albertvillanova Add Reference docs with Pipeline docs
df27a43b
albertvillanova marked this pull request as draft 157 days ago
albertvillanova 157 days ago (edited 157 days ago) 👍 1

There is an incompatibility issue within the lighteval[dev] environment while generating the docs:

  • thinc-8.2.4 requires numpy<2
  • numpy-2.1.3 is installed
  • apparently there is a package that removed the requirement numpy<2

https://github.com/huggingface/lighteval/actions/runs/12084402368/job/33699519327?pr=413

   File "thinc/backends/numpy_ops.pyx", line 1, in init thinc.backends.numpy_ops
ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject

I pin numpy < 2.
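
The conflict above comes down to an installed version exceeding a declared upper bound. A minimal sketch of that check follows, using the two version strings from the report. The helper names `parse_release` and `violates_upper_bound` are made up for illustration, and the parsing assumes plain numeric release segments; real resolvers follow PEP 440, which handles many more cases.

```python
def parse_release(version: str) -> tuple:
    """Turn '2.1.3' into (2, 1, 3); assumes purely numeric, dot-separated segments."""
    return tuple(int(part) for part in version.split("."))


def violates_upper_bound(installed: str, upper_bound: str) -> bool:
    """Return True when `installed` does NOT satisfy the constraint `< upper_bound`."""
    return parse_release(installed) >= parse_release(upper_bound)


# thinc-8.2.4 requires numpy<2, but numpy-2.1.3 is installed.
conflict = violates_upper_bound("2.1.3", "2")
# A 1.x release would satisfy the pin.
pinned_ok = not violates_upper_bound("1.26.4", "2")
print(conflict, pinned_ok)
```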

albertvillanova Pin numpy<2
afb0ce2c
HuggingFaceDocBuilderDev 157 days ago

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

albertvillanova 157 days ago

Preliminary docs with docstrings (well, we need to add the docstrings): https://moon-ci-docs.huggingface.co/docs/lighteval/pr_413/en/package_reference/pipeline

albertvillanova Add Tasks docs
696416f7
albertvillanova Add more Tasks docs
89b25815
albertvillanova Add Models docs
77f779ab
albertvillanova Fix Models docs
aaeee7b1
albertvillanova Remove AdapterModel that requires peft
e46b1d3f
albertvillanova Remove NanotronLightevalModel and VLLMModel that require nanotron and…
17c8088e
albertvillanova Fix markdown comment syntax
ee937bbe
albertvillanova Add Metrics docs
6ad6a2fc
albertvillanova Fix typo
7874f1bd
albertvillanova Remove Main classes section
bb1a20ac
albertvillanova Add Datasets docs
d281f10f
albertvillanova Create Main classes section with Pipeline
812ef353
albertvillanova Add EvaluationTracker docs
632e89b0
albertvillanova Add ModelConfig docs
3e53ddb8
albertvillanova Add ParallelismManager to Pipeline docs
ea6af22f
albertvillanova Add inter-links from using-the-python-api
7a413acc
albertvillanova Fix inter-links
4e9c80b6
albertvillanova Add more Metrics docs
9955186c
albertvillanova Comment Metrics enum
82a8fcbb
albertvillanova Fix typo
6a08f011
albertvillanova Add explanation and GH issue to comment in Metrics enum
95ac6d52
albertvillanova Add inter-link to Metrics
5962be66
albertvillanova Add subsection titles to LightevalTask
6eb23480
albertvillanova Add inter-link to LightevalTaskConfig
bb4c95c1
albertvillanova Add inter-link to section heading anchor
7153bfe0
albertvillanova Add more Metrics docs
ae8ce621
albertvillanova Add inter-link to SampleLevelMetric and Grouping
9849a96a
albertvillanova Add inter-link to LightevalTaskConfig
c5250e77
albertvillanova Fix section title with trailing colon
f2ead25e
albertvillanova 156 days ago

Maybe we should rename the section "API" (with available metrics and tasks) to avoid confusion with the "Reference" section (containing the docstrings of classes/functions).

albertvillanova marked this pull request as ready for review 156 days ago
albertvillanova requested a review from clefourrier 156 days ago
albertvillanova 156 days ago

Ready for review:

  • I see this PR as setting up the docs for classes/functions using their docstrings.
  • Future PRs could add missing docstrings and improve the existing ones, if needed.
clefourrier approved these changes on 2024-11-30
clefourrier 156 days ago

Looking very nice!

Some things (I scrolled the PR but mostly inspected the published docs):

  • models does not have a sidebar, so I suggested a split by provider (transformers, tgi/inference endpoints, nanotron, vllm, etc)
  • I would put the models as one of the main classes since it's also possible to start an evaluation with Pipeline + Tracker + Model,
  • we're missing the doc of the rest of the logging system
  • do you know where the doc icon is defined? should be a 🌤️

Not sure about the API name for the lists of possible metrics/tasks. Now that we have the doc from the code, we should probably use that, but let's think about it for another PR.

I took a look at the final docs and they are neat (they also make it very easy to see where the docstrings parse cleanly and where they are messy). Tysm for this work!

docs/source/package_reference/models.mdx
# Models

clefourrier 156 days ago

I would split into sections to separate:

  • LightevalModel (our abc iirc)
  • BaseModel/DeltaModel/AdapterModel (transformers/accelerate)
  • InferenceEndpoints (allows to launch providers), maybe grouped with ModelClient
  • Nanotron
  • VLLM
albertvillanova 153 days ago

Done: b2d82e3

docs/source/using-the-python-api.mdx
- Lighteval can be used from a custom python script. To evaluate a model you will
- need to setup an `evaluation_tracker`, `pipeline_parameters`, `model_config`
- and a `pipeline`.
+ Lighteval can be used from a custom python script. To evaluate a model you will need to setup an
+ [`~logging.evaluation_tracker.EvaluationTracker`], [`~pipeline.PipelineParameters`],
+ [`model_config`](package_reference/model_config) and a [`~pipeline.Pipeline`].

clefourrier 156 days ago

Either model config or model, since PR #390.

albertvillanova 153 days ago

Done: 39e145b
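
The "either a model or a model config" pattern discussed in this thread can be sketched as below. Everything here is hypothetical: `ToyModel`, `ToyModelConfig`, and `resolve_model` are stand-ins, not lighteval classes, and preferring an explicit model when both are given is an assumption of this sketch rather than documented behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyModelConfig:
    """Hypothetical stand-in for a model config (not a lighteval class)."""

    name: str


class ToyModel:
    """Hypothetical stand-in for an already-loaded model (not a lighteval class)."""

    def __init__(self, name: str):
        self.name = name


def resolve_model(
    model: Optional[ToyModel] = None,
    model_config: Optional[ToyModelConfig] = None,
) -> ToyModel:
    """Return the model to evaluate from either argument.

    Assumption of this sketch: an explicit `model` wins over `model_config`.
    """
    if model is not None:
        return model
    if model_config is not None:
        # Build the model lazily from its config.
        return ToyModel(model_config.name)
    raise ValueError("Pass either `model` or `model_config`.")


from_config = resolve_model(model_config=ToyModelConfig(name="toy"))
preloaded = ToyModel("preloaded")
from_model = resolve_model(model=preloaded)
print(from_config.name, from_model.name)
```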

docs/source/package_reference/datasets.mdx
# Datasets

clefourrier 156 days ago

This I would put in the task as it's mostly batching logic iirc

albertvillanova 153 days ago

Done: 83043b3

albertvillanova Add sections to Models docs
b2d82e3e
albertvillanova Move Models docs to Main classes section
c4ea6998
albertvillanova Document you can pass either model or model config to Pipeline
39e145b6
albertvillanova Move Datasets docs to Tasks docs
83043b3f
albertvillanova Add logging docs
0893e55d
albertvillanova 153 days ago

  • models does not have a sidebar, so I suggested a split by provider (transformers, tgi/inference endpoints, nanotron, vllm, etc)

Done for the models whose docs can be generated. The other models raise an import error during docs generation, so I have commented out their docs entries; I might open an issue for this.

  • I would put the models as one of the main classes since it's also possible to start an evaluation with Pipeline + Tracker + Model,

Done:

  • Put the models into Main classes.
  • Update the "Using the Python API" docs to indicate that a model can be passed as well.

  • we're missing the doc of the rest of the logging system

Done: I added the Logging docs, covering all the info loggers.

  • do you know where the doc icon is defined?

Not done: I don't know. I might also open an Issue to address this in an upcoming PR.
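
The import errors mentioned above arise for models that depend on optional packages (the commits name peft and nanotron). One way to decide up front which doc entries can be generated is to check whether the required module is installed, without importing it. The sketch below is an assumption-laden illustration: `is_available`, `documentable`, and the entry names in the mapping are hypothetical, and the mapping uses the standard-library `json` module and a made-up module name so the result does not depend on the environment.

```python
import importlib.util


def is_available(module_name: str) -> bool:
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def documentable(entries: dict) -> list:
    """Keep the doc entries whose required module (if any) is installed."""
    return [
        name
        for name, module in entries.items()
        if module is None or is_available(module)
    ]


# Hypothetical mapping: doc entry name -> optional module it needs (None = no extra).
entries = {
    "CoreEntry": None,
    "JsonBackedEntry": "json",
    "OptionalEntry": "surely_not_an_installed_module_xyz",
}
kept = documentable(entries)
print(kept)
```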

clefourrier 153 days ago 👍 1

Perfect, thanks for your changes, merging :)

clefourrier merged 9bfa1ead into main 153 days ago
