diffusers
5dde9fc1 - [LTX-2.3] Fix bf16 parity between diffusers and reference implementation

Commit
10 days ago
[LTX-2.3] Fix bf16 parity between diffusers and reference implementation

Seven fixes to achieve bit-identical output between the diffusers LTX-2.3 pipeline and the reference Lightricks/LTX-2 implementation in bf16 on GPU:

1. encode_video: use truncation (.astype) instead of .round() for float→uint8, matching the reference's .to(torch.uint8) behavior.
2. Scheduler sigma computation: compute time_shift and stretch_shift_to_terminal in torch float32 instead of numpy float64 to match reference precision.
3. Initial sigmas: use torch.linspace (float32) instead of np.linspace (float64) to produce bit-identical sigma schedules.
4. CFG formula: use the reference formula cond + (scale - 1) * (cond - uncond) instead of uncond + scale * (cond - uncond) to match bf16 arithmetic order.
5. Euler step: upcast model_output to the sample dtype before multiplying by dt, avoiding bf16 precision loss from 0-dim tensor type-promotion rules.
6. x0→velocity division: use sigma.item() (a Python float) instead of a 0-dim tensor, matching the reference's to_velocity, which calls sigma.item() internally.
7. RoPE: remove the float32 upcast in apply_interleaved_rotary_emb and apply_split_rotary_emb and cast cos/sin to the input dtype instead; the reference computes RoPE in the model dtype (bf16) without upcasting.

Also updates RMSNorm to use torch.nn.functional.rms_norm for consistency.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
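Fix 1 hinges on the difference between truncating and rounding when converting float pixel values to uint8: torch's `.to(torch.uint8)` truncates toward zero, so rounding first changes pixel values. A minimal NumPy sketch (the pipeline itself uses torch; the array here is illustrative):

```python
import numpy as np

# float frame values in [0, 255], as they exist before the uint8 conversion
frames = np.array([0.9, 127.5, 254.9], dtype=np.float32)

# truncation, matching torch's .to(torch.uint8): the fraction is dropped
truncated = frames.astype(np.uint8)        # [0, 127, 254]

# rounding first produces different pixel values
rounded = frames.round().astype(np.uint8)  # [1, 128, 255]
```

Even a one-unit difference per pixel breaks bit-identical parity of the decoded video.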
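Fixes 2 and 3 are about doing the sigma-schedule arithmetic in the same precision as the reference: computing in float64 and casting down afterwards need not round the same way as computing natively in float32. A NumPy illustration, using a shift function of the common form `shift * sigma / (1 + (shift - 1) * sigma)` (the exact formula in the pipeline may differ):

```python
import numpy as np

def time_shift(sigma, shift, dtype):
    # carry out every operation in `dtype`, as the fix does in torch float32
    sigma = np.asarray(sigma, dtype=dtype)
    shift = dtype(shift)
    return shift * sigma / (1 + (shift - 1) * sigma)

sigmas64 = np.linspace(1.0, 0.0, 50)                    # float64 schedule
sigmas32 = np.linspace(1.0, 0.0, 50, dtype=np.float32)  # float32 schedule

# casting the float64 result down at the end is not guaranteed to be
# bit-identical to computing in float32 from the start
a = time_shift(sigmas64, 3.0, np.float64).astype(np.float32)
b = time_shift(sigmas32, 3.0, np.float32)
```

The two schedules agree to float32 tolerance but can differ in the last bits, which is exactly what bit-identical parity rules out.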
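The two CFG formulas in fix 4 are algebraically identical, but in bf16 each intermediate product and sum is rounded, so the two orderings need not produce the same bits. A sketch of the equivalence in exact arithmetic (values are illustrative):

```python
# classifier-free guidance: combine conditional and unconditional predictions
cond, uncond, scale = 0.8125, -0.375, 7.5

diff = cond - uncond

# reference formulation, adopted by the fix
ref = cond + (scale - 1.0) * diff

# previous diffusers formulation
old = uncond + scale * diff

# identical in float64; in bf16 the intermediates round differently,
# so matching the reference requires matching its arithmetic order
assert abs(ref - old) < 1e-12
```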
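Fix 5: when `dt` is a 0-dim tensor, torch's type-promotion rules can decide the dtype of `dt * model_output`, losing precision relative to the reference; casting `model_output` to the sample's dtype first sidesteps that. A NumPy sketch of the corrected step order (function name and signature are hypothetical, and float16 stands in for bf16):

```python
import numpy as np

def euler_step(sample, model_output, sigma, sigma_next):
    """One Euler step of the sampling ODE (illustrative sketch)."""
    dt = sigma_next - sigma
    # cast model_output to the sample's dtype *before* multiplying by dt,
    # so the product is not carried out under scalar/0-dim promotion rules
    model_output = model_output.astype(sample.dtype)
    return sample + dt * model_output

sample = np.array([1.0, -2.0], dtype=np.float32)
velocity = np.array([0.5, 0.25], dtype=np.float16)  # stand-in for bf16
stepped = euler_step(sample, velocity, sigma=1.0, sigma_next=0.9)
```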
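Fix 6 applies the same idea to the x0→velocity conversion: dividing by `sigma.item()` (a Python float) instead of a 0-dim tensor keeps the division's dtype behavior identical to the reference `to_velocity`. A sketch with a hypothetical helper, assuming the usual flow-matching relation velocity = (sample - x0) / sigma:

```python
import numpy as np

def to_velocity(sample, x0, sigma):
    # extract a plain Python float, mirroring the reference's sigma.item(),
    # before dividing -- a 0-dim tensor/array would promote differently
    sigma = sigma.item() if hasattr(sigma, "item") else float(sigma)
    return (sample - x0) / sigma

v = to_velocity(np.array([1.0], dtype=np.float32),
                np.array([0.5], dtype=np.float32),
                np.array(0.25))
```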
Author
yiyi@huggingface.co