llm-foundry
ac783541 - Adds support for chat formatted finetuning input data. (#884)

Commit

1 year ago

Adds support for chat formatted finetuning input data. (#884) * fix conflicting formatting linting guidelines * used older union operator for legacy support * did the same thing in another place * isort ignore specific lines * fixes * isort do not skip line * address comments * renamed some more things * split tests and add some verification for tokenization split * fix formatting * added docstrings * added end-to-end-test with HF dataset * fix code style * renamed file and fixed tests * use chat template diff * addressed comment * Update llmfoundry/data/finetuning/tasks.py Co-authored-by: Daniel King <43149077+dakinggg@users.noreply.github.com> * Update llmfoundry/data/finetuning/tasks.py Co-authored-by: Daniel King <43149077+dakinggg@users.noreply.github.com> * fixed type of TokenizedExample * use cast * use _ALLOWED_{PROMPT, RESPONSE}_KEYS * updated tests * fix * fix? * Update llmfoundry/data/finetuning/tasks.py Co-authored-by: Daniel King <43149077+dakinggg@users.noreply.github.com> * Update llmfoundry/data/finetuning/tasks.py Co-authored-by: Daniel King <43149077+dakinggg@users.noreply.github.com> --------- Co-authored-by: Daniel King <43149077+dakinggg@users.noreply.github.com>

References

#884 - Adds support for chat formatted finetuning input data.

Author

milocress

Parents

26349879

llm-foundry ac783541 - Adds support for chat formatted finetuning input data. (#884)

llm-foundry
ac783541 - Adds support for chat formatted finetuning input data. (#884)