vllm
ec8c1cf7 - squashed commits

Commit
195 days ago
squashed commits

Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Co-authored-by: simon-mo <simon.mo@hey.com>
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
  • csrc
    • cache.h
    • cache_kernels.cu
    • torch_bindings.cpp
  • tests/kernels
    • test_triton_decode_attention.py
  • vllm
    • _custom_ops.py
    • attention
      • backends
        • abstract.py
        • mla
          • __init__.py
          • utils.py
        • triton_mla.py
      • layer.py
      • ops
        • triton_decode_attention.py
      • selector.py
    • config.py
    • engine
      • arg_utils.py
    • envs.py
    • model_executor
      • model_loader
        • loader.py
      • models
        • deepseek_v2.py
    • platforms
      • cpu.py
      • cuda.py
      • hpu.py
      • interface.py
      • openvino.py
      • rocm.py
      • tpu.py
      • xpu.py
    • worker
      • cache_engine.py
      • model_runner.py