Transformer kernel release #242
Transformer kernels (#49)
06076c63
update DSE
3b1ef351
remove warning note about 44min tutorial coming soon
c6311b21
add the transformer tutorial (#50)
c2ca8073
bump version number to 0.2.0
385acb33
only all-reduce grads if dp world size > 1
99c8ed49
Merge branch 'jeffra/staging' of github.com:microsoft/DeepSpeed-inter…
b8c6c198
revert previous commit, an issue with zero-2 with this change
d7fa8e1c
add transformer kernel API in website (#51)
05b9749c
Bert Tutorial update (#52)
4291d969
update DSE
6f043ef9
add master addr/port to local launching
9d79d817
minor cleanup of transformer tutorial
883de502
add initial version of bert deep dive post
c5d80e18
update img paths for staging
a16088a8
center table
c2cbb986
update tput images and table
f1951136
space between figures
5af86b24
center table
be8f176d
update image
fa242988
un-center table
8f4f96b2
references
41bbe209
add softmax animation
c81ffb98
add laynorm gif
a0a4baad
update gifs
a73b3610
add a space between gifs
6c9dbdc2
update image paths
434dc450
add stochastic_mode in API doc (#53)
f3b14e54
update images
fea90fe2
tmp img path for staging
02e3048f
update the tutorials for fine-tuning (#54)
bf5182e1
update images
195a7acb
Merge branch 'jeffra/staging' of github.com:microsoft/DeepSpeed-inter…
9668591d
fix img path for live
4e7ca29f
Merge branch 'master' into kernel-staging
af560da8
jeffra merged commit 734d8991 into master 5 years ago
jeffra deleted the kernel-staging branch 5 years ago