vllm
3e887d2e - permute/unpermute kernel for moe optimization (#14568)

17 days ago
permute/unpermute kernel for moe optimization (#14568)

Signed-off-by: Caleb_Du <Caleb_Du@zju.edu.cn>
  • CMakeLists.txt
  • benchmarks/kernels
    • benchmark_grouped_gemm_cutlass.py
    • benchmark_moe.py
    • benchmark_moe_permute_unpermute.py
  • csrc/moe
    • moe_permute_unpermute_op.cu
    • permute_unpermute_kernels
      • dispatch.h
      • moe_permute_unpermute_kernel.cu
      • moe_permute_unpermute_kernel.h
      • moe_permute_unpermute_kernel.inl
    • torch_bindings.cpp
  • tests/kernels
    • moe
      • test_moe.py
      • test_moe_permute_unpermute.py
    • quantization
      • test_awq_marlin.py
      • test_block_fp8.py
  • vllm/model_executor
    • layers/fused_moe
      • fused_marlin_moe.py
      • fused_moe.py
      • layer.py
      • moe_permute_unpermute.py
    • models
      • arctic.py