vllm
e59ca942 - Add option to use DeepGemm contiguous grouped gemm kernel for fused MoE operations. (#13932)

Commit
102 days ago
Add option to use DeepGemm contiguous grouped gemm kernel for fused MoE operations. (#13932)

Signed-off-by: Bill Nell <bnell@redhat.com>
  • benchmarks/kernels
    • benchmark_moe.py
  • tests/kernels
    • test_block_fp8.py
  • vllm
    • _custom_ops.py
    • envs.py
    • model_executor/layers
      • fused_moe
        • fused_moe.py
      • quantization
        • fp8.py