Granite speech speedups (#39197)
* ensure the query is updated during training
this avoids unused parameters, which DDP does not tolerate
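A minimal sketch of the failure mode this avoids (module and parameter names are hypothetical, not the actual Granite Speech code): if a learned query never contributes to the loss, DistributedDataParallel reports it as an unused parameter unless `find_unused_parameters=True` is set, which is slower.

```python
import torch
import torch.nn as nn

class ProjectorSketch(nn.Module):
    """Hypothetical sketch: a learned query that must stay in the graph."""

    def __init__(self, dim: int = 64, num_queries: int = 8):
        super().__init__()
        self.query = nn.Parameter(torch.randn(num_queries, dim))
        self.proj = nn.Linear(dim, dim)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Buggy pattern: a detached / constant copy of the query gets no gradient,
        # so DDP flags the parameter as unused.
        # query = self.query.detach()
        # Fixed pattern: always route the trainable query through the graph.
        query = self.query.unsqueeze(0).expand(hidden_states.size(0), -1, -1)
        attn = torch.softmax(query @ hidden_states.transpose(1, 2), dim=-1)
        return self.proj(attn @ hidden_states)
```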
* avoid a crash when `kwargs` contain `padding=True`
trainers often pass this argument automatically
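A sketch of the pattern, with hypothetical class and helper names: if padding is handled internally, an injected `padding=True` should be dropped rather than forwarded to a call that rejects it.

```python
class AudioProcessorSketch:
    """Hypothetical sketch: tolerate `padding=True` arriving via **kwargs."""

    def __call__(self, audio, sampling_rate: int = 16000, **kwargs):
        # Trainers often inject padding=True; it is already handled by the
        # internal padding logic, so do not forward it downstream.
        kwargs.pop("padding", None)
        return self._extract_features(audio, sampling_rate=sampling_rate, **kwargs)

    def _extract_features(self, audio, sampling_rate: int):
        raise NotImplementedError  # placeholder for the real feature extraction
```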
* minor
* Remove the mel_spec lazy init and rename it to mel_filters.
this ensures save_pretrained will not crash when saving the processor during training
https://github.com/huggingface/transformers/blob/d5d007a1a0f0c11a726a54c8f00bd71825f84d02/src/transformers/feature_extraction_utils.py#L595
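A sketch of the general pattern, assuming `transformers.audio_utils.mel_filter_bank` (the concrete class and defaults below are illustrative, not the actual Granite Speech extractor): a plain numpy filter bank built eagerly in `__init__` serializes cleanly with `save_pretrained` (numpy arrays are converted to lists in the JSON config), whereas a lazily created torchaudio transform stored on the instance would not.

```python
from transformers import SequenceFeatureExtractor
from transformers.audio_utils import mel_filter_bank

class MelFeatureExtractorSketch(SequenceFeatureExtractor):
    """Hypothetical sketch: build the mel filter bank eagerly in __init__."""

    def __init__(self, feature_size=80, sampling_rate=16000, padding_value=0.0, n_fft=400, **kwargs):
        super().__init__(
            feature_size=feature_size, sampling_rate=sampling_rate, padding_value=padding_value, **kwargs
        )
        self.n_fft = n_fft
        # Plain numpy array: survives to_dict()/save_pretrained without special handling.
        self.mel_filters = mel_filter_bank(
            num_frequency_bins=1 + n_fft // 2,
            num_mel_filters=feature_size,
            min_frequency=0.0,
            max_frequency=sampling_rate / 2,
            sampling_rate=sampling_rate,
            norm="slaney",
            mel_scale="slaney",
        )
```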
* minor - most feature extractors have a `sampling_rate` property
* speed up relative position embeddings
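A sketch of the kind of speedup meant here, with hypothetical names: precompute the relative-distance index once and register it as a buffer, instead of rebuilding it (often on CPU) on every forward.

```python
import torch
import torch.nn as nn

class RelPosEmbeddingSketch(nn.Module):
    """Hypothetical sketch: precompute the relative-distance index once."""

    def __init__(self, max_seq_len: int, max_distance: int, dim: int):
        super().__init__()
        self.embed = nn.Embedding(2 * max_distance + 1, dim)
        pos = torch.arange(max_seq_len)
        dist = (pos[None, :] - pos[:, None]).clamp(-max_distance, max_distance) + max_distance
        # Lives on the model's device; no per-forward arange or host-to-device copy.
        self.register_buffer("rel_index", dist, persistent=False)

    def forward(self, seq_len: int) -> torch.Tensor:
        # (seq_len, seq_len, dim) relative position embeddings via one lookup.
        return self.embed(self.rel_index[:seq_len, :seq_len])
```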
* fix several issues in model saving/loading (see the sketch below):
- avoid modifying `self._hf_peft_config_loaded` when saving
- adapter_config automatically points to the original base model; a finetuned version should point to the model save dir.
- fix model weight names that are changed by adding an adapter.
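A minimal sketch of the adapter_config point, using the standard `peft` API; the model class, hub id, and LoRA settings below are placeholders, not the actual Granite Speech setup. Pointing `base_model_name_or_path` at the save directory keeps a finetuned checkpoint from silently loading the untouched original base model.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base_id = "my-org/my-base-model"  # hypothetical example id
base = AutoModelForCausalLM.from_pretrained(base_id)
model = get_peft_model(base, LoraConfig(r=8, target_modules=["q_proj", "v_proj"]))

# ... training ...

save_dir = "./finetuned-checkpoint"
# Point adapter_config.json at the directory that also holds the finetuned
# weights, instead of the original hub id.
model.peft_config["default"].base_model_name_or_path = save_dir
model.save_pretrained(save_dir)
```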
* minor
* minor
* minor
* fix a crash when PEFT is not active
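A sketch of the guard pattern (the helper name is hypothetical): PEFT-specific logic should only run when the library is installed and an adapter is actually loaded.

```python
from transformers.utils import is_peft_available

def maybe_save_adapter(model, save_dir: str) -> None:
    """Hypothetical sketch: only touch PEFT state when an adapter is present."""
    if not (is_peft_available() and getattr(model, "_hf_peft_config_loaded", False)):
        return  # nothing to do without an active PEFT adapter
    model.save_pretrained(save_dir)  # adapter-aware save path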
* add todo to replace einsum
* granite speech speedups (see the sketch below):
1. register attention_dist as a buffer to avoid a CPU-to-GPU transfer in every layer.
2. pad_sequence is much faster than per-sample padding + concat.
3. avoid moving audio back to the CPU when using a compute device.
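A compact sketch of the three points above; module and function names are illustrative, not the actual Granite Speech code.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence

class EncoderSketch(nn.Module):
    """Hypothetical sketch of point (1)."""

    def __init__(self, max_pos: int):
        super().__init__()
        # (1) register the distance matrix once so it lives on the model's device
        #     and is not rebuilt / copied from the CPU in every layer.
        pos = torch.arange(max_pos)
        self.register_buffer("attention_dist", (pos[None, :] - pos[:, None]).abs(), persistent=False)

def batch_features(features: list[torch.Tensor]) -> torch.Tensor:
    """Hypothetical sketch of points (2) and (3)."""
    # (2) pad the whole list in one vectorized call instead of padding each
    #     sample and concatenating.
    batch = pad_sequence(features, batch_first=True, padding_value=0.0)
    # (3) the result stays on the tensors' existing (compute) device; no .cpu().
    return batch
```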
* support audio.shape=(1,L)
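A sketch of the shape handling (the helper name is hypothetical): accept a single mono clip either as a 1-D waveform or with a leading channel dimension.

```python
import torch

def normalize_audio_shape(audio: torch.Tensor) -> torch.Tensor:
    """Hypothetical sketch: accept mono audio as either (L,) or (1, L)."""
    if audio.dim() == 2 and audio.shape[0] == 1:
        audio = audio.squeeze(0)  # drop the channel dim for a single mono clip
    return audio
```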