optimum
53240c3f - Allow GPTQModel to auto select Marlin or faster kernels for inference only ops (#2138)

Commit
340 days ago
Allow GPTQModel to auto select Marlin or faster kernels for inference only ops (#2138)

* select quant_linear with pack
* up GPTQMODEL_MINIMUM_VERSION
* Update quantizer.py
* update gptqmodel version

---------

Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>