DeepSpeed
02da3732 - ALST/UlyssesSP: more intuitive API wrt variable seqlen (#7656)

64 days ago
As I was integrating ALST/Ulysses SP into HF Accelerate/Trainer I noticed that the initial `UlyssesSPAttentionHF.register_with_transformers` API was a bit inflexible/confusing wrt variable seqlen.

This PR deprecates the misleading `max_length` arg name, replaces it with `seq_length` and makes the latter optional if `seq_length_is_variable` is True.

Updated tests and docs.

Signed-off-by: Stas Bekman <stas@stason.org>
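
The argument handling described above could look roughly like the sketch below. This is not the DeepSpeed implementation: `resolve_seq_length` is a hypothetical standalone helper, and its signature, warning text, and error message are assumptions made for illustration. Only the argument names `max_length`, `seq_length`, and `seq_length_is_variable` come from the commit message.

```python
import warnings
from typing import Optional


def resolve_seq_length(
    *,
    seq_length_is_variable: bool,
    seq_length: Optional[int] = None,
    max_length: Optional[int] = None,
) -> Optional[int]:
    """Hypothetical helper illustrating the deprecation pattern.

    Returns the effective sequence length, or None when the sequence
    length is variable and no fixed length was supplied.
    """
    if max_length is not None:
        # The old name still works, but the caller is told to migrate.
        warnings.warn(
            "`max_length` is deprecated; use `seq_length` instead",
            DeprecationWarning,
            stacklevel=2,
        )
        # An explicitly passed `seq_length` takes precedence over the old name.
        if seq_length is None:
            seq_length = max_length

    # A fixed length is only mandatory when the length is not variable.
    if seq_length is None and not seq_length_is_variable:
        raise ValueError(
            "`seq_length` is required when `seq_length_is_variable` is False"
        )
    return seq_length
```

With this shape, a caller using variable-length batches can omit the length entirely, while a caller still passing `max_length` keeps working and sees a `DeprecationWarning`.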