[Feat] add I2VGenXL for image-to-video generation #6665
let's see
bb7c4121
better conditioning for class_embed_type
d537b6c5
determine in_channels programmatically.
15f16071
worse condition
a329b73e
fix: sample_size.
5660ba1e
Merge branch 'main' into convert-i2vgen-xl
eb8ea72b
separate script for i2vgen
011329d2
changes
3e0015d0
fix: basic transformer block init.
f09c2dd7
check
7dd0cb02
revert block_out_channels.
d6f1e6d7
debug info
da5b83c0
debug info
0ecef35c
debug info
13ecc11a
debug
6778b3f8
correct ffn inner dim
20aeaf3b
debug info
ef85c84a
input channels should be 8.
7d031624
input channels corrected
a7366940
Revert "input channels corrected"
34e7349b
better input channels
896d626a
Revert "better input channels"
02b76b5e
rectify conversion script
15a6fbde
conversion script.
5a097226
conversion
bcccfdfd
push_to_hub
1c68e056
remove print
3b5940b8
let's see.
aaae0320
safeguard.
1c72370d
device placement
25527f86
comment to remind that writing good code is important
c38ef7af
device placement.
4f4d4e6a
correct layernorm condition.
e717630f
norm3 condition
292668ee
correct norm3
17a2418e
incorporate einops
d5b76930
image_embeddings
35b15f28
okay
693b2cee
dtype debug
642cbe4b
dtype fix
105ecc55
dtype fix.
d7e6b2cd
simplify code.
5de43480
remove print
2852de16
debug
76772c5b
debug
600ffd85
debug
ecf0070c
debug
b88d9a9f
debug
87eff5ed
debug
3178e742
remove print
32f6151d
add: dummy pipeline implementation too.
87e70abc
pipeline draft
5e7f17ff
complete conversion script.
28b9d57f
add new unet to modules
7943c91f
enable chunked decoding on vae.
7f3d5593
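The chunked VAE decoding this commit enables can be sketched in plain Python. The `decode_chunked` helper and the `decode_fn` callback below are hypothetical names for illustration; in the real pipeline the callback is the VAE's decode step over latent frames:

```python
def decode_chunked(latent_frames, decode_fn, chunk_size=2):
    """Decode latent frames a few at a time to bound peak memory.

    latent_frames: sequence of per-frame latents
    decode_fn: hypothetical stand-in for the VAE decode call
    """
    frames = []
    for i in range(0, len(latent_frames), chunk_size):
        # Each chunk is decoded independently, so only `chunk_size`
        # frames' worth of decoder activations are alive at once.
        frames.extend(decode_fn(latent_frames[i : i + chunk_size]))
    return frames
```

As a later commit notes, chunked decoding should be optional, so in practice a helper like this would sit behind a pipeline flag.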
correct image latent behaviour
26d87c29
remove comment
5d03574f
correct dtype
989c707a
correct output type.
7b88ad37
Merge branch 'main' into convert-i2vgen-xl
eec8791c
init fix
6ff96068
fix-copies
b44e0532
Merge branch 'main' into convert-i2vgen-xl
51fdf304
chunked decoding should be optional
9bd5f16c
what happens if we take mode instead?
cc7e9754
fix: type
734274ad
back to sampling and clean up tensorification
b48f0945
better variable name
761c08ec
try to follow the original implementation closely.
a0c00c02
proper repetition
c6d35e2d
fix: fps condition check
0ebad2e9
fix: masking
da309d5b
fix: masking
5b0b5dfc
go back to negative_image_image_latents.
ef4dd348
make type casting for fps explicit
670488e1
original implementation image_latents.
80a6f1a7
Revert "original implementation image_latents."
85d364c2
sinusoidal embedding?
b0865dd6
simple bilinear resizing.
87742e92
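The resize-then-crop preprocessing these commits converge on (a bilinear resize followed by a centered crop) can be sketched with helpers that only compute the geometry. The names and the cover-style scaling rule here are assumptions for illustration, not the pipeline's actual `_resize_bilinear`/`_center_crop_wide` code:

```python
def fit_cover(w, h, target_w, target_h):
    # Scale so the image fully covers the target while preserving
    # aspect ratio; a bilinear filter then resamples to this size.
    scale = max(target_w / w, target_h / h)
    return round(w * scale), round(h * scale)

def center_crop_box(w, h, crop_w, crop_h):
    # (left, top, right, bottom) box for a centered crop.
    left = (w - crop_w) // 2
    top = (h - crop_h) // 2
    return left, top, left + crop_w, top + crop_h
```

For a 1280x720 input and a square 704x704 target, the cover scale is set by the shorter side, and the excess width is then trimmed symmetrically by the crop.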
remove the sinusoidal implementation from i2vgenxl
89024408
resolve conflicts
e9cd8397
harmonize with main
585a6b6e
fix: tensor2vid
90d91a8c
fix: tensor2vid
4a7d4aee
fix: tensor2vid
ab9569f8
fix: doc
58844fe1
fix model offload sequence.
11fd6469
update
6778e6b9
update
9f737924
add docs
0ecd79b8
update
eefa6ccb
update
a9fecb33
update
27817919
update
0d1ea8c4
update
1d3846d4
update
db0213a1
improve docs.
f2964ba8
docstring to the pipeline
2d5071e5
licensing in the pipeline scripts.
4cd00836
clean up the docstring of the UNet.
6012362d
Merge branch 'main' into convert-i2vgen-xl
09519a1e
make _resize_bilinear and _center_crop_wide accept torch tensors as w…
23935a97
data type fix
57b20ee1
unint8 > uint8
4d51fe8a
channels_last
14404b2a
debug
24d813e5
fix download path for the example image
bf1eb40f
fix: download path again
f35f3d8a
use cross_attention_dim to initialize
28804420
debug
698f9c11
debug
68cbe593
reduce hidden size of the vision encoder
45c682ec
go
3a701a21
debug more
0a4c6866
reduce more hidden dim
758acc0f
remove callback and callback_steps from required params check
0bfd0427
remove print
dd5a8f04
assertions for the default case.
50d46062
skip test_attention_slicing_forward_pass as it's deprecated.
2a4c7272
feature_extractor.
66034c52
feature_extractor.
b1819cda
relax precision
48e7694d
relax more.
0f230f9c
torch.manual_seed(0)
836fb678
relax precision
a5cb5b1a
uncomment batching tests
947e63ad
debug
216b9ddc
debug more
7c810525
debug more
31409afd
make the pt to pil utilities better
2faaffed
debug
a29c2011
format string
4adc8519
okay
9cb0b846
force_feature_extractor_resize
0b9a9ef7
debug
7cffe74a
expand to sample's shape
3d0ef8b4
check
ca422ef8
fix: batching behaviour for fps
cfafe51e
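The fps batching fix comes down to broadcasting a scalar conditioning value to the batch. A minimal sketch with a hypothetical helper name:

```python
def broadcast_fps(fps, batch_size):
    # A scalar fps is repeated once per batch element; a per-sample
    # list must already match the batch size exactly.
    if not isinstance(fps, (list, tuple)):
        fps = [fps]
    if len(fps) == 1:
        fps = list(fps) * batch_size
    if len(fps) != batch_size:
        raise ValueError(f"got {len(fps)} fps values for a batch of {batch_size}")
    return fps
```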
test_inference_batch_single_identical
d85bd2d2
relax test_inference_batch_single_identical
73242917
relax a bit more.
bb103025
test_num_videos_per_prompt
5e79f3d1
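`test_num_videos_per_prompt` checks that prompt-level conditioning is repeated once per requested video. The interleaved repetition it exercises can be sketched as follows (the helper name is hypothetical):

```python
def repeat_per_video(per_prompt_items, num_videos_per_prompt):
    # ["a", "b"] with 2 videos per prompt -> ["a", "a", "b", "b"]:
    # each prompt's conditioning is duplicated for every video
    # generated from that prompt, keeping prompts grouped together.
    return [
        item
        for item in per_prompt_items
        for _ in range(num_videos_per_prompt)
    ]
```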
let's go.
f3c58a24
remove extra prints.
7cb384c8
remove force_feature_extractor_resize
f64c3d25
remove force_feature_extractor_resize
1675b075
fix: test_num_videos_per_prompt
8c20445d
fix: test_num_videos_per_prompt
2221bbc8
fix: test_num_videos_per_prompt
65849883
fix a bit more
7fad5851
style
61091789
add: slow test
7b281b53
flattened image slice
24cfc1d9
variant
edd6cc5e
assertion
6ebb2653
add: note about memory optimization
a03649fa
sayakpaul marked this pull request as ready for review 2 years ago
bring to cpu before calling numpy()
c78617ee
finish slow test fixes
76a42b38
Merge branch 'main' into convert-i2vgen-xl
ca4a977a
Empty-Commit
182447af
pin peft dependencies.
a54facc0
remove attention slicing and unload_lora
72e466e2
remove attention mask
5ba9b4ad
timesteps.
693d8270
add missing entries in the unet docstring
88f03a39
Apply suggestions from code review
9bf706e4
remove textual inversion
a682dca0
remove _to_tensor on fps.
a9c23e8b
leverage VaeImageProcessor.
ec5694ad
remove unnecessary config vars.
e6c07b56
use num_attention_heads
8c980df1
clean up conv_out layer creation
5cbaf2e7
refactor attention logic for cleaning up norm handling
be518c87
Apply suggestions from code review
b8fde102
simplify norm_type checks in the forwards.
a0e6db10
add copied from statement where missing
c6c0b310
move _center_crop and _resize_bilinear out of the encode image function
5038c214
Merge branch 'main' into convert-i2vgen-xl
6ac5ec59
yiyixuxu approved these changes on 2024-01-31
update
6d7fb879
Merge branch 'main' into convert-i2vgen-xl
2c1caeaa
clean up
513ab1f6
update
13fcc20b
update
7b7f0751
change checkpoints.
fe50995d
yiyixuxu merged 04cd6adf into main 2 years ago
yiyixuxu deleted the convert-i2vgen-xl branch 2 years ago