DeepSpeed
Optimize Softmax Kernel
#3112
Merged
Commits (18)
Simplify kernel (molly-smith committed 2 years ago)
Coalesce memory attempt 1. Logits divergence. (molly-smith committed 2 years ago)
Logits fix? (molly-smith committed 2 years ago)
sync after every global mem access (molly-smith committed 2 years ago)
template on iterations. Down to 8.3% cuda time for 8k tokens (molly-smith committed 2 years ago)
Up to 64 iterations (molly-smith committed 2 years ago)
Add alibi/mask check (molly-smith committed 2 years ago)
fp32 (molly-smith committed 2 years ago)
Revert builder.py (molly-smith committed 2 years ago)
naming. precommit (molly-smith committed 2 years ago)
Revert "naming. precommit" (molly-smith committed 2 years ago)
naming. spacing (molly-smith committed 2 years ago)
Spacing. simplify checks (molly-smith committed 2 years ago)
remove bsyncs (molly-smith committed 2 years ago)
missed bsyncs (molly-smith committed 2 years ago)
Merge branch 'master' into mosm/softmax (molly-smith committed 2 years ago)
precommit (molly-smith committed 2 years ago)
Merge branch 'master' into mosm/softmax (molly-smith committed 2 years ago)