Stable Audio integration #8716
WIP modeling code and pipeline
6151db56
add custom attention processor + custom activation + add to init
656561b7
correct ProjectionModel forward
819d7468
add stable audio to __init__
8a1a9d88
add autoencoder and update pipeline and modeling code
960339dc
add half Rope
51c838f4
add partial rotary v2
87f1e261
add temporary modifs to scheduler
2f2bb8a0
add EDM DPM Solver
dc3f0eb1
remove TODOs
07fc3c37
clean GLU
b49a3d5f
remove att.group_norm to attn processor
d1b3e207
revert back src/diffusers/schedulers/scheduling_dpmsolver_multistep.py
23be1a3a
refactor GLU -> SwiGLU
9d324088
Merge branch 'main' into add-stable-audio
661d4f19
remove redundant args
3689af07
add channel multiples in autoencoder docstrings
282e4788
changes in docstrings and copyright headers
c9fef252
clean pipeline
e51ffb20
further cleaning
ab6824c6
remove peft and lora and fromoriginalmodel
eeb19fee
Delete src/diffusers/pipelines/stable_audio/diffusers.code-workspace
a43dfc51
make style
e7185e56
dummy models
3c6715e3
fix copied from
14fa2bf6
add fast oobleck tests
21d0171b
add brownian tree
9cc7c02b
oobleck autoencoder slow tests
c5eeafef
remove TODO
0a2d065a
fast stable audio pipeline tests
29e794b9
add slow tests
1bad2878
make style
cf15409a
add first version of docs
dec61b31
wrap is_torchsde_available to the scheduler
1961cc9e
fix slow test
3c7df741
test with input waveform
92392fda
add input waveform
d826f0fd
remove some todos
94c2a25a
create stableaudio gaussian projection + make style
ad8660e3
add pipeline to toctree
55b2a148
fix copied from
42a05c58
ylacombe
changed the title from "[WIP] Stable Audio integration" to "Stable Audio integration" 1 year ago
Merge branch 'huggingface:main' into add-stable-audio
8919ba03
make quality
2df8e416
refactor timestep_features->time_proj
68a5b56a
refactor joint_attention_kwargs->cross_attention_kwargs
a81f46d7
remove forward_chunk
8e910d34
move StableAudioDitModel to transformers folder
406f02a1
correct convert + remove partial rotary embed
3a1dddba
apply suggestions from yiyixuxu -> removing attn.kv_heads
c44d0a43
remove temb
e5859f1c
remove cross_attention_kwargs
d35451df
further removal of cross_attention_kwargs
76debd5b
remove text encoder autocast to fp16
acde6d52
continue removing autocast
566972d6
make style
f187d65a
Merge branch 'huggingface:main' into add-stable-audio
af4f2ab8
refactor how text and audio are embedded
8aa2e11e
add paper
58ca32c5
update example code
a4b69307
make style
c0873dc9
unify projection model forward + fix device placement
bc369337
make style
f318e15f
remove fuse qkv
8382156c
Merge branch 'huggingface:main' into add-stable-audio
6ff9cf6a
apply suggestions from review
f91b0849
Update src/diffusers/pipelines/stable_audio/pipeline_stable_audio.py
29dc552c
make style
ff620351
smaller models in fast tests
d61a1a9e
pass sequential offloading fast tests
f1c95853
add docs for vae and autoencoder
88933735
Merge branch 'main' into add-stable-audio
0b938042
make style and update example
264dd6df
yiyixuxu
approved these changes
on 2024-07-29
remove useless import
0277c7fa
add cosine scheduler
1565d8ae
dummy classes
d820e688
cosine scheduler docs
fea9f8e2
Merge branch 'main' into add-stable-audio
8abdb61f
better description of scheduler
81dedd91
Merge branch 'huggingface:main' into add-stable-audio
6d5d663c
sayakpaul
merged
69e72b1d
into main 1 year ago