diffusers
a487e2f1 - Add PRXPixelPipeline: pixel-space PRX text-to-image pipeline (#13928)

Commit
1 day ago
Add PRXPixelPipeline: pixel-space PRX text-to-image pipeline (#13928)

Adds a pixel-space variant of PRX that denoises raw RGB directly (no VAE), conditioned on a Qwen3-VL text encoder:

- PRXTransformer2DModel: new optional config args `bottleneck_size` (two-layer img_in projection for large patch dims) and `resolution_embeds` (PRXResolutionEmbedder conditions the timestep modulation on the latent resolution)
- PRXPipeline: support for subclass-tuned tokenizer max length, light text cleaning, x-prediction flow matching (x0 -> velocity conversion), and non-unit initial noise scale
- PRXPixelPipeline: thin subclass wiring the above together (vae optional/None, vae_scale_factor=1, 1024px default)
- conversion script support for the pixel checkpoint format
- registration in __init__ files + dummy objects, docs autodoc entry, fast pipeline tests

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
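The "x0 -> velocity conversion" mentioned for x-prediction flow matching can be illustrated with a small sketch. This is not the code from the commit: it assumes the common linear path `x_t = (1 - t) * x0 + t * noise` with `t = 1` as pure noise, and the helper names `x0_to_velocity` and `euler_step` are hypothetical. Under that assumption the velocity is `noise - x0`, which equals `(x_t - x0) / t`, so a model that predicts the clean image can still drive a velocity-based sampler.

```python
# Illustrative sketch only; the actual PRXPipeline code may use a different
# time convention, clamping, or scaling. All names here are hypothetical.

def x0_to_velocity(x_t, x0_pred, t, eps=1e-6):
    """Convert a predicted clean sample into a flow-matching velocity.

    Assumes x_t = (1 - t) * x0 + t * noise, so
    v = d(x_t)/dt = noise - x0 = (x_t - x0) / t.
    """
    t = max(t, eps)  # guard against division by zero at t = 0
    return [(xt - x0) / t for xt, x0 in zip(x_t, x0_pred)]


def euler_step(x_t, velocity, t, t_next):
    """Advance the sample from time t to t_next along the velocity."""
    return [xt + (t_next - t) * v for xt, v in zip(x_t, velocity)]


# Toy two-"pixel" example with a perfect x0 prediction.
x0 = [0.5, -0.25]
noise = [2.0, 1.0]
t = 0.5
x_t = [(1 - t) * a + t * n for a, n in zip(x0, noise)]

v = x0_to_velocity(x_t, x0, t)
x_final = euler_step(x_t, v, t, 0.0)
```

With an exact `x0` prediction, a single Euler step to `t = 0` recovers the clean sample, which is a quick sanity check on the sign and time convention. A non-unit initial noise scale would change how the starting noise is drawn, not this identity.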