transformers
[CP] Add attention_mask to the buffer when the mask is causal
#40619
Merged
