vllm 3e887d2e - permute/unpermute kernel for moe optimization (#14568)
Commit
17 days ago
permute/unpermute kernel for moe optimization (#14568)

Signed-off-by: Caleb_Du <Caleb_Du@zju.edu.cn>
References
#14568 - permute/unpermute kernel for moe optimization
Author
CalebDu
Parents
0f87d8f7
Files
19
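CMakeLists.txt
benchmarks/kernels/benchmark_grouped_gemm_cutlass.py
benchmarks/kernels/benchmark_moe.py
benchmarks/kernels/benchmark_moe_permute_unpermute.py
csrc/moe/moe_permute_unpermute_op.cu
csrc/moe/permute_unpermute_kernels/dispatch.h
csrc/moe/permute_unpermute_kernels/moe_permute_unpermute_kernel.cu
csrc/moe/permute_unpermute_kernels/moe_permute_unpermute_kernel.h
csrc/moe/permute_unpermute_kernels/moe_permute_unpermute_kernel.inl
csrc/moe/torch_bindings.cpp
tests/kernels/moe/test_moe.py
tests/kernels/moe/test_moe_permute_unpermute.py
tests/kernels/quantization/test_awq_marlin.py
tests/kernels/quantization/test_block_fp8.py
vllm/model_executor/layers/fused_moe/fused_marlin_moe.py
vllm/model_executor/layers/fused_moe/fused_moe.py
vllm/model_executor/layers/fused_moe/layer.py
vllm/model_executor/layers/fused_moe/moe_permute_unpermute.py
vllm/model_executor/models/arctic.py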