Stable Diffusion CUDA Optimizations #14428
Add benchmark
6d8402ec
Add GroupNorm fusion
0ac952e9
Merge branch 'main' into tlwu/optimize_sd
c1de9fac
[CUDA] Add GroupNormalization operator
29c47dc3
tianleiwu
marked this pull request as draft 2 years ago
Add Cast for fp16 group_norm
5f40aa8c
Add SplitGelu fusion
a4c4302d
support float type in GroupNorm
f722c5aa
Add SplitGelu operator
4a7bf0d8
format
98b90ca5
format
ea69aec9
misc
9eacd843
update group norm test data to NHWC
c566679a
Fuse Bias and SplitGelu
a9ebeec3
update bias split gelu
53a539f1
update GroupNorm doc
a0c4957b
tianleiwu
force pushed from 66c8992e to d5b9e4d7 2 years ago
packed kv in cross attention
82383dcb
tianleiwu
force pushed from d5b9e4d7 to 82383dcb 2 years ago
fix pyright warnings
966b3e72
Add unit test of bias split gelu
4a8583e9
fix typo
982663a0
fix code scanning warnings
73045bbe
fix code scanning warnings
86d57950
address review feedback
efa6d4f4
Add NhwcConv
7a75ce18
fix training api build error
f4d41033
Add float16 test
55a74680
fix type warning
3ff1fe67
update op doc; exclude from hipify
b3a4c014
tianleiwu
marked this pull request as ready for review 2 years ago
tianleiwu
changed the title from "[WIP] Stable Diffusion CUDA Optimizations" to "Stable Diffusion CUDA Optimizations" 2 years ago
wangyems
dismissed these changes on 2023-02-02
add input checks; clean debug code
1fe78af9
tianleiwu
dismissed their stale review via 1fe78af9 2 years ago
yufenglee
approved these changes on 2023-02-03
wangyems
approved these changes on 2023-02-03
tianleiwu
merged a6c5ba01 into main 2 years ago
tianleiwu
deleted the tlwu/optimize_sd branch 2 years ago
faxu
added triage:approved
faxu
removed release:1.14