DeepSpeed-Triton for Inference #3748
[squash] styoun/triton fp16 transformer (#530)
6d291dbc
Triton kernels and BERT inference using triton in float16 (#459)
b978eae4
readme for blog
f41e279c
typo in readme
2d499bf1
readme
42b8de41
readme
d16aaa20
readme
b4fee896
readme
468d6983
plots
d5fff4fe
readme
643a2fcc
readme
e6d44c98
readme
fdb87060
readme
b29157b9
readme
8078e04d
typo in readme
f73bd90a
readme revision after the feedbacks
c2bd7dd1
typo
19d4231c
refined the writing in readme
314c5eef
readme
4da70a41
readme
bad841c5
removed obsolete comments from matmul_ext.py
47928a21
typo
db16fa0d
Merge branch 'master' into staging-triton-bert-v1
39cbe478
stephen-youn
changed the title from "[squash] styoun/triton fp16 transformer (#530)" to "DeepSpeed-Triton for Inference" 2 years ago
jeffra
commented
on 2023-06-22
jeffra
commented
on 2023-06-22
jeffra
approved these changes
on 2023-06-22
cmikeh2
approved these changes
on 2023-06-22
readme change from pr comments
219a1b8c
Merge branch 'staging-triton-bert-v1' of https://github.com/microsoft…
072fe374
Merge branch 'master' into staging-triton-bert-v1
7f7b76d2
removed obsolete codes and comments
223ad1fc
Merge branch 'staging-triton-bert-v1' of https://github.com/microsoft…
afc34fa5
awan-10
approved these changes
on 2023-06-22
readme
1eadebdf
Merge branch 'master' into staging-triton-bert-v1
4cb8b371
Merge branch 'master' into staging-triton-bert-v1
2835e0da
jeffra
merged
4dc65f7b
into master 2 years ago
jeffra
deleted the staging-triton-bert-v1 branch 2 years ago