Add LoRA support for Cosmos Predict 2.5 and fix pipeline to match official Cosmos repo (#13664)
* Support LoRA for Cosmos Predict 2.5
* Fix inconsistencies with cosmos official repo in VAE encoding, text encoder attention implementation, and timestep scaling
* Support f_min and f_max in linear_scheduler warmup
* Add requirements and dataset preprocessing scripts to run examples
* Add LoRA training scripts
* Add LoRA eval scripts
* Add assets for blog post
* Fix(scheduler): device mismatch from upstream b114620 - move rk and b to device before torch.stack
* Always upcast to fp32
* Directly inherit from LoraBaseMixin
* Remove flash-attn2
* Use _keep_in_fp32_modules instead of autocast
* Remove the get_latent_shape_cthw method and fix style
* Simplify the eval script to make it more user-friendly
* Overwrite scheduling_unipc_multistep.py with main's version
* Remove network_alphas and add # Copied from
* Remove figures and assets
* Revert scheduler
* Revert fp32 upcast and support batch size > 1
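The f_min / f_max warmup change above can be sketched roughly as follows. This is a minimal illustration with assumed names (`linear_warmup_factor` is not the actual training-script function): the multiplier ramps linearly from f_min to f_max over the warmup steps instead of always starting from zero.

```python
# Hypothetical sketch of a linear warmup supporting f_min and f_max
# (names and signature are assumptions, not the merged implementation).
def linear_warmup_factor(step: int, warmup_steps: int,
                         f_min: float = 0.0, f_max: float = 1.0) -> float:
    """Return the LR multiplier: ramp f_min -> f_max, then hold f_max."""
    if step >= warmup_steps:
        return f_max
    return f_min + (f_max - f_min) * step / warmup_steps
```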
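The scheduler device-mismatch fix is of this general shape (a CPU-runnable sketch; `stack_on_device` is an illustrative helper, assuming rk and b are intermediates created on CPU while sampling runs on another device): move both operands to the target device before calling torch.stack, which raises when its inputs sit on different devices.

```python
import torch

def stack_on_device(rk: torch.Tensor, b: torch.Tensor,
                    device: torch.device) -> torch.Tensor:
    # torch.stack requires all inputs on the same device; moving rk and b
    # first avoids the mismatch when they were created on CPU.
    return torch.stack([rk.to(device), b.to(device)])
```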
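The _keep_in_fp32_modules change replaces a runtime autocast context with module-level dtype pinning. A hedged emulation of the idea follows (the `KEEP_IN_FP32` tuple, `cast_keep_fp32` helper, and toy model are illustrative, not the diffusers internals):

```python
import torch
import torch.nn as nn

KEEP_IN_FP32 = ("norm",)  # assumed: name substrings of modules to pin to fp32

def cast_keep_fp32(model: nn.Module, dtype: torch.dtype) -> nn.Module:
    # Cast everything to the low-precision dtype first...
    model.to(dtype)
    # ...then force the numerically sensitive modules back to float32,
    # so no torch.autocast context is needed at inference time.
    for name, module in model.named_modules():
        if any(key in name for key in KEEP_IN_FP32):
            module.to(torch.float32)
    return model
```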
---------
Co-authored-by: Ting-Yun Chang <tingyunc@nvidia.com>