[inference] ds-attention refactor w.r.t. ops #2623
refactor of ds-attn to use new op bindings
4c6a2574
Merge branch 'master' into jeffra/op-shim
4436042c
fix imports
dd6088f6
fix for softmax
8c1f1aab
fix typo
e7610c43
jeffra
marked this pull request as ready for review 3 years ago
jeffra
changed the title from "Inference backend refactor" to "[inference] ds-attention refactor w.r.t. ops" 3 years ago
Merge branch 'master' into jeffra/op-shim
3be21698
address comments and consolidate imports
cabf95a7
Merge branch 'jeffra/op-shim' of github.com:microsoft/DeepSpeed into …
f413336c
fix import issue
804e1b94
move bloom specific attn into its own class
7a845df9
cmikeh2
approved these changes
on 2022-12-21
remove dead code
ba1ffafc
Merge branch 'master' into jeffra/op-shim
7eec8848
Merge branch 'master' into jeffra/op-shim
7c0c1dcf
Merge branch 'master' into jeffra/op-shim
f722064e
Merge branch 'master' into jeffra/op-shim
ba0690c2
move softmax op to BloomSelfAttention
3b239335
only load op in base to avoid excessive logging
3b99b579
jeffra
merged commit bb68c526 into master 3 years ago
jeffra
deleted the jeffra/op-shim branch 3 years ago