Megatron-DeepSpeed
Fix causal attention mask #306
Merged

thomasw21 merged 8 commits into main from thomas/fix_causal_attention_mask
thomasw21 Rename attn_mask as it's bad IMO
dc9709b7
thomasw21 WIP: make sure that custom kernels have the same API as their torch v…
fbabf8ca
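
This commit aligns the custom-kernel call signature with the torch reference. A minimal sketch of the idea, with hypothetical function names (the actual Megatron fused-softmax kernels differ): the custom kernel must accept and return exactly the same tensors as the torch version so call sites can swap one for the other.

```python
import torch

def torch_scaled_masked_softmax(scores: torch.Tensor, mask: torch.Tensor, scale: float) -> torch.Tensor:
    # Torch reference: scale, mask out forbidden positions, then softmax.
    scores = scores * scale
    scores = scores.masked_fill(mask, torch.finfo(scores.dtype).min)
    return torch.softmax(scores, dim=-1)

def fused_scaled_masked_softmax(scores: torch.Tensor, mask: torch.Tensor, scale: float) -> torch.Tensor:
    # A real custom kernel would dispatch to CUDA here; the point is only
    # that its API mirrors the torch version above exactly.
    return torch_scaled_masked_softmax(scores, mask, scale)
```
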
thomasw21 Add a test that we are in fact using a custom kernel
3a857ddb
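
A sketch of what such a test could look like, assuming a hypothetical module layout (the stand-in kernel and helper below are illustrative, not the PR's actual test): spy on the kernel function and fail if the forward path silently falls back to the plain torch implementation.

```python
import torch
from unittest import mock

def fused_softmax(scores: torch.Tensor) -> torch.Tensor:
    # Stand-in for the custom kernel (hypothetical name).
    return torch.softmax(scores, dim=-1)

def attention_softmax(scores: torch.Tensor) -> torch.Tensor:
    # Looks the kernel up at call time so a test can patch it.
    return fused_softmax(scores)

def test_uses_custom_kernel():
    scores = torch.randn(2, 4, 4)
    # The test fails if a refactor or config change reroutes the
    # forward pass away from the custom kernel.
    with mock.patch(f"{__name__}.fused_softmax", wraps=fused_softmax) as spy:
        attention_softmax(scores)
        spy.assert_called_once()
```
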
thomasw21 Woops wrong import
80613763
thomasw21 Woops
946ded09
thomasw21 Fix device issue + relax test to assert close instead of equal
038bddcb
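
Relaxing the comparison is the standard move when checking a custom kernel against the torch reference: the two rarely agree bit-for-bit (different reduction orders, fused math), so the test asserts closeness within float tolerances. A minimal sketch, with a double-precision recomputation standing in for the kernel output:

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
scores = torch.randn(2, 4, 8, device=device)

# Compare within float tolerances instead of exact equality, with both
# tensors on the same device.
expected = torch.softmax(scores, dim=-1)
actual = torch.softmax(scores.double(), dim=-1).to(scores.dtype)  # stand-in for kernel output
torch.testing.assert_close(actual, expected)
```
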
thomasw21 Woops fix the causal mask
96823a83
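
For reference, the standard construction of a causal attention mask, which is what this PR is about (a generic sketch, not the repo's exact code): query position i may attend to keys 0..i and never to future keys, so the positions to hide form the upper triangle strictly above the diagonal.

```python
import torch

def causal_mask(seq_len: int, device=None) -> torch.Tensor:
    # True marks what to hide: every key strictly after the query
    # position, i.e. the upper triangle above the diagonal.
    ones = torch.ones(seq_len, seq_len, dtype=torch.bool, device=device)
    return torch.triu(ones, diagonal=1)

mask = causal_mask(4)
scores = torch.zeros(4, 4).masked_fill(mask, float("-inf"))
probs = torch.softmax(scores, dim=-1)
# Row 0 can only see key 0, so all its probability mass lands there.
assert torch.allclose(probs[0], torch.tensor([1.0, 0.0, 0.0, 0.0]))
```
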
thomasw21 marked this pull request as ready for review 3 years ago
thomasw21 requested a review from stas00 3 years ago
thomasw21 requested a review from Muennighoff 3 years ago
thomasw21 commented on 2022-07-07
stas00 commented on 2022-07-07
thomasw21 Update test + use getattr
c20dd196
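
The getattr pattern referenced here typically replaces hasattr branching when an attribute is only set in some configurations. A minimal sketch with hypothetical names (the PR's actual use of getattr may differ):

```python
import torch

def fused_softmax(scores: torch.Tensor) -> torch.Tensor:
    # Hypothetical stand-in for the compiled custom kernel.
    return torch.softmax(scores, dim=-1)

class Attention(torch.nn.Module):
    def __init__(self, use_fused: bool):
        super().__init__()
        if use_fused:
            self.softmax_fn = fused_softmax  # only set when the kernel exists

attn = Attention(use_fused=False)
# getattr with a default falls back to the plain torch path whenever the
# custom-kernel attribute was never set.
softmax_fn = getattr(attn, "softmax_fn", lambda s: torch.softmax(s, dim=-1))
```
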
thomasw21 requested a review from stas00 3 years ago
Muennighoff approved these changes on 2022-07-07
stas00 approved these changes on 2022-07-07
thomasw21 merged 38607ae9 into main 3 years ago
thomasw21 deleted the thomas/fix_causal_attention_mask branch 3 years ago
