transformers
4520b549 - Optimize MoEs for decoding using batched_mm (#43126)

Commit
22 days ago
Optimize MoEs for decoding using batched_mm (#43126)

* optimize model for decoding
* only optimize when grouped_mm
* fixes
* fix training compile failures
* no need to skip
* style
* fix
* Apply suggestion from @IlyasMoutawwakil
* Apply suggestion from @IlyasMoutawwakil
* info once