transformers
ae6b6963 - Allow use of pre-computed lengths when grouping by length. (#10953)

Commit

4 years ago

Allow use of pre-computed lengths when grouping by length. (#10953)

A new argument `length_column_name` has been added to `TrainingArguments`, with default value `"length"`. If a column with this name exists in the dataset and `group_by_length` is `True`, the train sampler uses it for grouping instead of computing the lengths before training starts. This optimization lets the user prepare the data ahead of time for fast processing and avoids the sequential access to the dataset described in issue #10909.
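The fallback behavior described in this commit message can be sketched in plain Python. This is a minimal illustration only, not the code from the commit: the helpers `add_length_column` and `get_lengths`, the `input_ids` field, and the list-of-dicts dataset are all hypothetical stand-ins chosen for the example.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def add_length_column(
    rows: List[Row],
    length_column_name: str = "length",
    input_column: str = "input_ids",
) -> List[Row]:
    """Hypothetical preprocessing step: store each row's length once, up front."""
    return [{**row, length_column_name: len(row[input_column])} for row in rows]


def get_lengths(
    rows: List[Row],
    length_column_name: str = "length",
    input_column: str = "input_ids",
) -> List[int]:
    """Hypothetical helper: prefer the pre-computed column, else measure each row."""
    if rows and all(length_column_name in row for row in rows):
        # Fast path: the lengths were prepared ahead of time.
        return [row[length_column_name] for row in rows]
    # Slow path: visit every row and compute its length.
    return [len(row[input_column]) for row in rows]


rows = [{"input_ids": [1, 2, 3]}, {"input_ids": [4]}, {"input_ids": [5, 6]}]
prepared = add_length_column(rows)
lengths = get_lengths(prepared)
```

With the library itself, the equivalent preparation would be to add a column named after `length_column_name` to the training dataset and set `group_by_length` to `True` in `TrainingArguments`.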

References

#10953 - Use pre-computed lengths, if available, when grouping by length

Author

pcuenca

Parents

4002f95e
