vllm
Commit e59ca942
Add option to use DeepGemm contiguous grouped gemm kernel for fused MoE operations. (#13932)
Commit (102 days ago)
Add option to use DeepGemm contiguous grouped gemm kernel for fused MoE operations. (#13932)
Signed-off-by: Bill Nell <bnell@redhat.com>
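A contiguous grouped GEMM operates on activations that have been permuted so that every expert's tokens occupy one contiguous block, letting a single grouped kernel sweep over all experts instead of launching one GEMM per expert. The sketch below is a plain PyTorch reference for that idea only; it is not the DeepGEMM kernel and not vLLM's fused_moe code, and every name, shape, and the top-1 routing simplification in it are illustrative assumptions.

# Conceptual reference for a contiguous grouped GEMM in MoE (illustrative
# only; assumes top-1 routing, whereas real fused MoE handles top-k routing
# with per-token weights).
import torch

def contiguous_grouped_gemm_reference(
    hidden: torch.Tensor,      # [num_tokens, hidden_dim]
    expert_ids: torch.Tensor,  # [num_tokens], expert index per token
    w: torch.Tensor,           # [num_experts, hidden_dim, out_dim]
) -> torch.Tensor:
    num_experts = w.shape[0]
    # Sort tokens by expert so each expert's tokens form a contiguous block.
    order = torch.argsort(expert_ids)
    grouped = hidden[order]
    counts = torch.bincount(expert_ids, minlength=num_experts)

    out = torch.empty(hidden.shape[0], w.shape[2],
                      dtype=hidden.dtype, device=hidden.device)
    start = 0
    for e in range(num_experts):
        n = int(counts[e])
        if n == 0:
            continue
        # One GEMM over this expert's contiguous token block; a grouped
        # kernel fuses all of these per-segment GEMMs into one launch.
        out[start:start + n] = grouped[start:start + n] @ w[e]
        start += n

    # Undo the permutation so results line up with the original token order.
    unpermuted = torch.empty_like(out)
    unpermuted[order] = out
    return unpermuted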
References
#13932 - Add option to use DeepGemm contiguous grouped gemm kernel for fused MoE operations.
Author
bnellnm
Parents
a57a3044
Files (6)
benchmarks/kernels/benchmark_moe.py
tests/kernels/test_block_fp8.py
vllm/_custom_ops.py
vllm/envs.py
vllm/model_executor/layers/fused_moe/fused_moe.py
vllm/model_executor/layers/quantization/fp8.py
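Since vllm/envs.py is among the changed files, the option is presumably surfaced as an environment variable. A minimal usage sketch follows, assuming a flag named VLLM_USE_DEEP_GEMM; that name is a guess, so check vllm/envs.py in this commit for the actual variable.

# Hypothetical usage: opt into the DeepGEMM grouped-GEMM path before
# constructing the engine. The name VLLM_USE_DEEP_GEMM is an assumption;
# the real definition lives in vllm/envs.py in this commit.
import os

os.environ["VLLM_USE_DEEP_GEMM"] = "1"  # hypothetical flag name

# The rest of the workflow is unchanged, e.g.:
# from vllm import LLM
# llm = LLM(model="deepseek-ai/DeepSeek-V2-Lite", trust_remote_code=True)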