PR #9082 Add CogVideoX text-to-video generation model

Create autoencoder_kl3d.py

c8e5491b

vae draft

c341786f

initial draft of cogvideo transformer

bd6efd5f

add imports

bb917755

fix attention mask

59e6669f

fix layernorms

45cb1f92

fix with some review guide

84ff56eb

rename

a3d827fb

fix error

dc7e6e81

Update autoencoder_kl3d.py

aff72ec5

fix nasty bug in 3d sincos pos embeds

cb5348a0

refactor

e9828817

update conversion script for latest modeling changes

d963b1aa

remove debug prints

16967589

make style

21a0fc1b

add workflow to rebase with upstream main nightly.

d83c1f84

add upstream

dfeb3297

Revert "add workflow to rebase with upstream main nightly."

71bcb1e1

add workflow for rebasing with upstream automatically.

0980f4dc

follow review guide

ee40f0e1

add

8fe54bcd

remove deriving and using nn.module

1c661ce3

Merge branch 'cogvideox' into cogvideox-common-draft-1

73b041e7

add skeleton for pipeline

b3052807

make fix-copies

6bcafcba

Merge branch 'main' into cogvideox-common-draft-2

ec9508c8

undo unnecessary changes added on cogvideo-vae by mistake

3ae94139

groups->norm_num_groups

2be74698

verify CogVideoXSpatialNorm3D implementation

9f9d0cbb

minor refactor and repositioning of code in order of invocation

c43a8f5b

reorder upsampling/downsampling blocks in order of invocation

5f183bfe

minor refactor

470815ce

implement encode prompt

e67cc5ae

make style

d45d199b

make fix-copies

73469f95

fix bug in handling long prompts

45f7127a

update conversion script

a449ceb3

add doc draft

4498cfc9

Merge branch 'cogvideox-common-draft-2' of https://github.com/hugging…

2956866e

add clear_fake_cp_cache

bb4740ce

refactor vae

e05f8347

modeling fixes

03c28eef

make style

712ddbea

add pipeline implementation

03ee7cd1

using with 226 instead of 225 of final weight

a31db5f9

remove 0.transformer_blocks.encoder.embed_tokens.weight

351d1f00

update

d0b8db2b

ensure tokenizer config correctly uses 226 as text length

fe6f5d64

add cogvideo specific attn processor

4c2e8870

remove debug prints

41da084f

add pipeline docs

77558f31

make style

e12458e1

remove incorrect copied from

c33dd021

vae problem fix

71e7c82a

schedule

ec53a30a

remove debug prints

551c884a

update

3def9052

Merge pull request #4 from huggingface/cogvideox-refactor-to-diffusers

65f6211f

fp16 problem

21509aa7

fix some comment

b42b0792

fix

477e12b2

timestep fix

fd0831c5

Restore the timesteps parameter

d99528be

Update downsampling.py

c7ee165c

remove chunked ff code; reuse and refactor to support temb directly i…

61c6da07

make inference 2-3x faster (by fixing the bug i introduced) 🚀😎

fa7fa9cc

new schedule with dpm

6988cc3a

remove attention mask

ba4223ac

apply suggestions from review

312f7dc4

make style

1b1b26b6

add workflow to rebase with upstream main nightly.

ba1855c0

add upstream

7360ea1d

Revert "add workflow to rebase with upstream main nightly."

2f1b7870

add workflow for rebasing with upstream automatically.

90aa8be5

Merge branch 'huggingface:main' into main

5781e017

make fix-copies

92c8c007

Merge branch 'main' into cogvideox-common-draft-2

fd11c0fb

remove cogvideox-specific attention processor

03580c07

update docs

01c2dff3

update docs

311845fc

cogvideox branch

1b1b737a

add CogVideoX team, Tsinghua University & ZhipuAI

2d9602cc

Merge branch 'cogvideox-common-draft-2' of github.com:huggingface/dif…

fb6130fe

merge remote branch

511c9ef5

zRzRzRzRzRzRzR changed the title ~~Cogvideox 2b~~ Add CogVideoX text-to-video generation model 1 year ago

a-r-r-o-w commented on 2024-08-05

Merge branch 'main' into cogvideox-2b

123ecef2

DN6 commented on 2024-08-05

DN6 commented on 2024-08-05

DN6 commented on 2024-08-05

DN6 commented on 2024-08-05

DN6 commented on 2024-08-05

fix some error

cf7369d4

DN6 commented on 2024-08-05

rename unsample and add some docs

9c6b8894

messages

22dcceb8

update

e4d65ccd

Merge branch 'cogvideox-2b' of github.com:zRzRzRzRzRzRzR/diffusers in…

6f4e60b5

use num_frames instead of num_seconds

70a54a82

Merge branch 'main' into cogvideox-2b

b3428ad5

a-r-r-o-w commented on 2024-08-05

restore

9a0b9065

Update lora_conversion_utils.py

32da2e76

remove dynamic guidance scale

878f609a

yiyixuxu commented on 2024-08-05

yiyixuxu commented on 2024-08-06

sayakpaul commented on 2024-08-06

address review comments

de9e0b2f

dynamic cfg; fix cfg support

9c086f5a

address review comments

62d94aaa

update tests

5e4dd151

Merge branch 'main' into cogvideox-2b

884ddd09

fix docs error

d1c575ad

sayakpaul commented on 2024-08-06

alternative implementation to context parallel cache

11224d95

a-r-r-o-w commented on 2024-08-06

yiyixuxu approved these changes on 2024-08-06

Update docs/source/en/api/pipelines/cogvideox.md

70cea915

stevhliu approved these changes on 2024-08-06

remove tiling and slicing until their implementations are complete

cbc4d32d

yiyixuxu commented on 2024-08-06

Merge branch 'main' into cogvideox-2b

14698d04

Merge branch 'main' into cogvideox-2b

8be845d3

Apply suggestions from code review

827a70ae

sayakpaul commented on 2024-08-07

yiyixuxu merged 2dad462d into main 1 year ago

hkunzhe commented on 2024-10-18

zRzRzRzRzRzRzR deleted the cogvideox-2b branch 1 year ago

diffusers
Add CogVideoX text-to-video generation model
#9082

Merged