vllm
de0526f6
- [Misc][Quark] Upstream Quark format to VLLM (#10765)
Commit
210 days ago
[Misc][Quark] Upstream Quark format to VLLM (#10765)

Signed-off-by: kewang-xlnx <kewang@xilinx.com>
Signed-off-by: kewang2 <kewang2@amd.com>
Co-authored-by: kewang2 <kewang2@amd.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
References
#10765 - [Misc][Quark] Upstream Quark format to VLLM
Author
kewang-xlnx
Parents
5ecf3e0a
Files
32
tests/quantization
    test_quark.py
vllm
    config.py
    model_executor
        layers
            linear.py
            quantization
                __init__.py
                base_config.py
                compressed_tensors
                    compressed_tensors.py
                    triton_scaled_mm.py
                    utils.py
                quark
                    __init__.py
                    quark.py
                    quark_moe.py
                    schemes
                        __init__.py
                        quark_scheme.py
                        quark_w8a8_fp8.py
                        quark_w8a8_int8.py
                    utils.py
        models
            aria.py
            commandr.py
            dbrx.py
            exaone.py
            gemma2.py
            gpt_j.py
            granite.py
            llama.py
            mixtral.py
            mllama.py
            nemotron.py
            phimoe.py
            qwen2.py
            solar.py
        parameter.py
    platforms
        rocm.py