diffusers
Add AudioLDM 2
#4549
Merged
sanchit-gandhi
merged 60 commits into
huggingface:main
from audioldm-v2
364bf817
unet down + mid
c214f157
vae, clap, flan-t5
a72144cc
start sequence audio mae
251158ea
iterate on audioldm encoder
ea48b182
finish encoder
e0fe8192
finish weight conversion
c259ee5f
text pre-processing
06036752
gpt2 pre-processing
807b8d4f
fix projection model
35860e65
working
9dc3d417
unet equivalence
d01e73b4
finish in base
5358db5e
add unet cond
c8995f03
finish unet
fc67871f
finish custom unet
f3bf3002
start clean-up
9f10d1fb
revert base unet changes
74459f46
refactor pre-processing
bcf13ad9
tests: from audioldm
220c391f
fix some tests
872e18e4
more fixes
227211f2
iterate on tests
167c309a
sanchit-gandhi
requested a review
from
williamberman
2 years ago
sanchit-gandhi
requested a review
from
sayakpaul
2 years ago
sanchit-gandhi
commented on 2023-08-16
make fix copies
60d3a011
sayakpaul
commented on 2023-08-17
harden fast tests
7c2afd04
slow integration tests
bd5ca3e2
finish tests
3a88695e
update checkpoint
c1ca58da
update copyright
fc96707d
sayakpaul
commented on 2023-08-17
sayakpaul
commented on 2023-08-17
sayakpaul
commented on 2023-08-17
sayakpaul
commented on 2023-08-17
sayakpaul
commented on 2023-08-17
sayakpaul
commented on 2023-08-17
docs
312b83be
remove outdated method
8229558e
sayakpaul
commented on 2023-08-17
add docstring
f884bd5a
make style
96557950
remove decode latents
8233cf73
enable cpu offload
8a167bf1
(text_encoder_1, tokenizer_1) -> (text_encoder, tokenizer)
e36cb521
more clean up
d42dca62
more refactor
d5802143
build pr docs
68206a5f
sayakpaul
commented on 2023-08-17
sayakpaul
commented on 2023-08-17
sayakpaul
commented on 2023-08-17
sayakpaul
commented on 2023-08-17
sayakpaul
approved these changes on 2023-08-17
Update docs/source/en/api/pipelines/audioldm2.md
880b9853
small clean
3c597438
williamberman
commented on 2023-08-17
williamberman
commented on 2023-08-17
williamberman
commented on 2023-08-17
williamberman
approved these changes on 2023-08-17
tidy conversion
22ea9fad
update for large checkpoint
cf522ff0
generate -> generate_language_model
925318db
full clap model
9e1895ca
shrink clap-audio in tests
d23d0bee
fix large integration test
5ace2088
fix fast tests
7cfe24f4
use generation config
7274f388
make style
40ebb18d
update docs
0e166449
finish docs
d03ed7bf
finish doc
ee295331
update tests
c3628696
fix last test
9d754318
syntax
69346edc
finalise tests
14de2964
refactor projection model in prep for TTS
e9328fcd
fix fast tests
94873eaa
style
9302cb02
sanchit-gandhi
merged
7a24977c
into main
2 years ago
Reviewers
sayakpaul
williamberman
Assignees
No one assigned
Labels
None yet
Milestone
No milestone