vllm
de0526f6 - [Misc][Quark] Upstream Quark format to VLLM (#10765)

Commit
210 days ago
[Misc][Quark] Upstream Quark format to VLLM (#10765)

Signed-off-by: kewang-xlnx <kewang@xilinx.com>
Signed-off-by: kewang2 <kewang2@amd.com>
Co-authored-by: kewang2 <kewang2@amd.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
  • tests/quantization
    • test_quark.py
  • vllm
    • config.py
    • model_executor
      • layers
        • linear.py
        • quantization
          • __init__.py
          • base_config.py
          • compressed_tensors
            • compressed_tensors.py
            • triton_scaled_mm.py
            • utils.py
          • quark
            • __init__.py
            • quark.py
            • quark_moe.py
            • schemes
              • __init__.py
              • quark_scheme.py
              • quark_w8a8_fp8.py
              • quark_w8a8_int8.py
            • utils.py
      • models
        • aria.py
        • commandr.py
        • dbrx.py
        • exaone.py
        • gemma2.py
        • gpt_j.py
        • granite.py
        • llama.py
        • mixtral.py
        • mllama.py
        • nemotron.py
        • phimoe.py
        • qwen2.py
        • solar.py
      • parameter.py
    • platforms
      • rocm.py