transformers
181beb3b - 🚨 Modeling changes for export, compile, and hybrid-attention standardization (#46738)

Commit

3 days ago

🚨 Modeling changes for export, compile, and hybrid-attention standardization (#46738)

* model fixes for export compatibility
* add minicpmv vision utils
* can compile
* make ssms fully compilable
* more ssms
* address new comments
* fast ci fixes
* fix minimax
* fix the last falcon fast ci failure
* cleanup
* use can_return_tuple in patchmixer
* avoid 5D attention
* granite export fix
* address modeling comments
* update
* address more comments, unify linear attention and make more models support full graph compile
* style
* style
* modeling changes for attn impl compatibility
* fix
* fixes
* fix lfm2 conv errors
* address most of anton's review comments
* address cyril's comments
* revert conv rename
* simpler modular
* keep max_batch_size check
* fixes
* fix moshi cuda graphs
* added layer type class attr in mixers
* fix repo
* address comments
* address comments
* Merge branch 'main' into hf-exporters-models
* fix merge
* claude review findings
* docs comments
* address comments
* style
* claude review
* small fix caught by copilot

References

#46738 - 🚨 Modeling changes for export, compile, and hybrid-attention standardization

Author

IlyasMoutawwakil

Parents

45b004d7
