Add Support for LTX-2.3 Models (#13217)
* Initial implementation of perturbed attn processor for LTX 2.3
* Update DiT block for LTX 2.3 + add self_attention_mask
* Add flag to control using perturbed attn processor for now
* Add support for new video upsampling blocks used by LTX-2.3
* Support LTX-2.3 BigVGAN-v2-style vocoder
* Initial implementation of LTX-2.3 vocoder with bandwidth extender
* Initial support for LTX-2.3 per-modality feature extractor
* Refactor so that text connectors own all text encoder hidden_states normalization logic
* Fix some bugs for inference
* Fix LTX-2.X DiT block forward pass
* Support prompt timestep embeds and prompt cross attn modulation
* Add LTX-2.3 configs to conversion script
* Support converting LTX-2.3 DiT checkpoints
* Support converting LTX-2.3 Video VAE checkpoints
* Support converting LTX-2.3 Vocoder with bandwidth extender
* Support converting LTX-2.3 text connectors
* Don't convert any upsamplers for now
* Support self attention mask for LTX2Pipeline
* Fix some inference bugs
* Support self attn mask and sigmas for LTX-2.3 I2V, Cond pipelines
* Support STG and modality isolation guidance for LTX-2.3
* make style and make quality
* Default audio guidance values to the video guidance values
* Update to LTX-2.3 style guidance rescaling
* Support cross timesteps for LTX-2.3 cross attention modulation
* Fix RMS norm bug for LTX-2.3 text connectors
* Perform guidance rescale in sample (x0) space following original code
* Support LTX-2.3 Latent Spatial Upsampler model
* Support LTX-2.3 distilled LoRA
* Support LTX-2.3 Distilled checkpoint
* Support LTX-2.3 prompt enhancement
* Make LTX-2.X processor non-required so that tests pass
* Fix test_components_function tests for LTX2 T2V and I2V
* Fix LTX-2.3 Video VAE configuration bug causing pixel jitter
* Apply suggestions from code review
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* Refactor LTX-2.X Video VAE upsampler block init logic
* Refactor LTX-2.X guidance rescaling to use rescale_noise_cfg
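The rescaling referenced above follows the `rescale_noise_cfg` pattern from "Common Diffusion Noise Schedules and Sample Steps are Flawed" (Lin et al., 2023). A minimal NumPy sketch of that technique — illustrative names, not the exact LTX-2.3 implementation, and applied here to generic predictions rather than specifically in sample (x0) space:

```python
import numpy as np

def rescale_cfg(pred_cond: np.ndarray, pred_cfg: np.ndarray,
                guidance_rescale: float = 0.7) -> np.ndarray:
    """Rescale the CFG output so its std matches the conditional prediction.

    Counteracts the overexposure/contrast drift that large guidance
    scales introduce, by pulling the CFG result's per-sample standard
    deviation back toward that of the conditional branch.
    """
    # Per-sample std over all non-batch axes.
    axes = tuple(range(1, pred_cond.ndim))
    std_cond = pred_cond.std(axis=axes, keepdims=True)
    std_cfg = pred_cfg.std(axis=axes, keepdims=True)
    rescaled = pred_cfg * (std_cond / std_cfg)
    # Blend between the fully rescaled and the raw CFG prediction.
    return guidance_rescale * rescaled + (1.0 - guidance_rescale) * pred_cfg
```

With `guidance_rescale=1.0` the output's std exactly matches the conditional branch; with `0.0` the CFG output passes through unchanged.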
* Use generator initial seed to control prompt enhancement if available
* Remove self attention mask logic as it is not used in any current pipelines
* Commit fixes suggested by Claude Code (guidance in sample (x0) space, denormalize after timestep conditioning)
* Use constant shift following original code
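The "constant shift" above refers to the standard flow-matching timestep shift (a fixed value rather than a resolution-dependent one). A hedged sketch of that transform — the function name and the default shift value are assumptions, not LTX-2.3's actual configuration:

```python
import numpy as np

def shift_sigmas(sigmas: np.ndarray, shift: float = 3.0) -> np.ndarray:
    """Apply a constant flow-matching shift to a sigma schedule.

    sigma' = shift * sigma / (1 + (shift - 1) * sigma)

    Fixed points at 0 and 1 are preserved; shift > 1 pushes intermediate
    sigmas upward, concentrating sampling steps at higher noise levels.
    """
    return shift * sigmas / (1.0 + (shift - 1.0) * sigmas)
```

For example, `sigma = 0.5` with `shift = 3.0` maps to `0.75`, while the endpoints `0.0` and `1.0` are unchanged.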
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>