onnxruntime
Route fp16 HQNBIT_CompInt8 (4-bit and 8-bit) through fp32 MLAS path in MatMulNBits
#27820
Merged
Commits
Route fp16 HQNBIT_CompInt8 through fp32 MLAS path for 4-bit and 8-bit
jambayk committed 91 days ago
Remove dead HQ4BitGemm_CompInt8 and HQ8BitGemm_CompInt8 MLAS code
jambayk committed 90 days ago
lint
jambayk committed 90 days ago
Fix HQNBIT_CompInt8 PrePack bugs for 4-bit and 8-bit
jambayk committed 90 days ago
Address review: ORT_ENFORCE for scales, move SQNBIT check to GetComputeType
jambayk committed 90 days ago