transformers
49aff18f
- Actually i think the attention casting only makes sense when we use torch.float16
Commit
3 years ago
Actually i think the attention casting only makes sense when we use torch.float16
References
#18344 - [BLOOM] Clean modeling code
Author
thomasw21
Parents
5fcc118a