[modular] wan! (#12611)
* update, remove intermediate_inputs
* support image2video
* revert dynamic steps to simplify
* refactor vae encoder block
* support flf2video!
* add support for wan2.2 14B
* style
* Apply suggestions from code review
* input dynamic step -> additional input step
* up
* fix init
* update dtype