transformers
use the enable_gqa param in torch.nn.functional.scaled_dot_product_at…
#39412
Merged
