vllm
b129136c - [ROCm][Quantization] GPT_OSS in amd-quark format model loading and emulations (#29008)

Commit

3 days ago

[ROCm][Quantization] GPT_OSS in amd-quark format model loading and emulations (#29008)

Signed-off-by: xuebwang-amd <xuebwang@amd.com>
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>

References

#29008 - [ROCm][Quantization] GPT_OSS in amd-quark format model loading and emulations

Author

xuebwang-amd

Parents

599e4335
