onnxruntime
Add fp16 support for 8-bit MatMulNBits on ARM64 and fix pre-existing bugs
#27692

Merged

jambayk merged 12 commits into main from jambayk/mnb-arm-16

Add HQNBIT_CompFp16 support for 8-bit MatMulNBits on ARM64 NEON

93529bc8

acc 4 support for HQ

566df8df

fix bias bug, compint8 support for 8 bits

cd67587a

remove MLAS_TARGET_AMD64_IX86 guard on QuantBDataWorkspace

128908ae

fix scale packing

8415246a

only pack for 8 bit

6332d94a

jambayk requested a review from copilot-pull-request-reviewer 11 days ago

jambayk requested a review from hariharans29 11 days ago

copilot-pull-request-reviewer commented on 2026-03-17

jambayk requested a review from copilot-pull-request-reviewer 11 days ago

jambayk commented on 2026-03-17

copilot-pull-request-reviewer commented on 2026-03-17

address reviews

218a6f9b

jambayk requested a review from copilot-pull-request-reviewer 11 days ago

copilot-pull-request-reviewer commented on 2026-03-17

more reviews

ce7d5335

fix DequantB8Bit reference in dequant test

d4064093

Fix Float16_8b_ARM_CompFp16 SIGTRAP in Debug builds

886971a8

hariharans29 commented on 2026-03-17

Add fp16 CompInt8 8-bit tests and improve N/K/BlockSize coverage

8359fe0b

hariharans29 dismissed these changes on 2026-03-17

jambayk enabled auto-merge (squash) 10 days ago

Increase fp16 8-bit test tolerances for large-K cases

e0a1834d

jambayk dismissed their stale review via e0a1834d 10 days ago

jambayk requested a review from hariharans29 10 days ago

hariharans29 approved these changes on 2026-03-17

jambayk merged c1f38c03 into main 10 days ago

jambayk deleted the jambayk/mnb-arm-16 branch 10 days ago

Reviewers

hariharans29

copilot-pull-request-reviewer

