[Bug] qwen2_5_omni: cap generation length to be less than the max_position_embedding in DiT (#43068)
* qwen2_5_omni: make max_mel_frames an inference-time knob
* do not fail by raising ValueError; instead keep running with a target_duration that is capped and aligned
* add unit tests for Token2WavShape shape mismatch
* make fixup
* remove unit test that takes too much GPU memory
* reduce GPU memory usage of the unit test
* address review comments
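
The cap-and-align behavior described above can be sketched as follows. This is a minimal illustration, not the code from this change: `cap_and_align`, `limit`, and `multiple` are hypothetical names, and rounding the request up to a multiple before capping is an assumption. How `limit` is derived from the DiT position-embedding size is not shown here.

```python
def cap_and_align(requested: int, limit: int, multiple: int) -> int:
    """Return a length that is a multiple of `multiple` and never exceeds `limit`.

    Hypothetical helper: instead of raising when `requested` is too long,
    it falls back to the largest aligned length that still fits.
    """
    if requested <= 0 or multiple <= 0:
        raise ValueError("requested and multiple must be positive")
    if limit < multiple:
        raise ValueError("limit must allow at least one aligned block")

    # Largest aligned length that fits under the limit.
    max_aligned = (limit // multiple) * multiple
    # Round the request up to the next multiple (assumption: round up, not down).
    wanted = -(-requested // multiple) * multiple
    # Over-long requests are capped rather than rejected.
    return min(wanted, max_aligned)
```

With this sketch, a request that fits is only rounded to the alignment, while an over-long request silently falls back to the largest aligned length under the limit.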
---------
Signed-off-by: Dong Wang <dongw2019@gmail.com>