CogVideoX-5b-I2V support (#9418)

* draft init
* draft
* vae encode image
* make style
* image latents preparation
* remove image encoder from conversion script
* fix minor bugs
* make pipeline work
* make style
* remove debug prints
* fix imports
* update example
* make fix-copies
* add fast tests
* fix import
* update vae
* update docs
* update image link
* apply suggestions from review
* apply suggestions from review
* add slow test
* make use of learned positional embeddings
* apply suggestions from review
* doc change
* Update convert_cogvideox_to_diffusers.py
* make style
* final changes
* make style
* fix tests
---------
Co-authored-by: Aryan <aryan@huggingface.co>