vllm
[Model] add optimal triton fused moe configs for NemotronH MoE
#27967
Merged

tomeras91 Add NemotronHForCausalLM arch to benchmark_moe.py
ffa56141
tomeras91 Add triton moe configs for nemotronH for TP=1,2 / H100 / L40S (BF16)
17472d3a
tomeras91 requested a review from mgoin 202 days ago
tomeras91 requested a review from pavanimajety 202 days ago
mergify added the performance label
gemini-code-assist commented on 2025-11-03
heheda12345 approved these changes on 2025-11-04
heheda12345 changed the title from "[Model] app optimal triton fused moe configs for NemotronH MoE" to "[Model] add optimal triton fused moe configs for NemotronH MoE" 201 days ago
heheda12345 enabled auto-merge (squash) 201 days ago
github-actions added the ready label
tomeras91 Merge branch 'main' into add-nemotronH-moe-configs
7aaa7af9
heheda12345 merged e4ee6586 into main 201 days ago
tomeras91 deleted the add-nemotronH-moe-configs branch 201 days ago
