llama.cpp
Add experimental ggml-hexagon backend for the Hexagon NPU
#16547
Merged
max-krasnyansky merged 37 commits into ggml-org:master from CodeLinaro:hexagon
github-actions added documentation
github-actions added ggml
max-krasnyansky requested a review from lhez 64 days ago
max-krasnyansky force pushed 63 days ago
max-krasnyansky force pushed 62 days ago
github-actions added devops
max-krasnyansky force pushed 62 days ago
max-krasnyansky force pushed 61 days ago
github-actions added script
ggerganov commented on 2025-10-16
max-krasnyansky marked this pull request as ready for review 59 days ago
max-krasnyansky requested a review from CISC 59 days ago
max-krasnyansky requested a review from slaren 59 days ago
github-actions added python
max-krasnyansky force pushed 59 days ago
max-krasnyansky force pushed 57 days ago
slaren commented on 2025-10-21
slaren commented on 2025-10-22
ggerganov commented on 2025-10-22
model: add support for extra bufs for all devices
06253824
hexagon: add experimental ggml-hexagon backend for the Hexagon NPU
80dc8e80
hexagon: fix format checker errors
ec4436f7
hexagon: update readme and cmake presets
aa65f212
ci: add android-ndk-build jobs that build plain ARM64 and Snapdragon …
647fa3de
hexagon: add simple graph optimizer for stacking MUL_MAT ops with the…
da7caac4
hexagon: move ADB helper scripts into scripts/snapdragon/adb
bbbc8eae
hexagon: replace all f/printfs with GGML_LOG_...
cc7dbd4b
readme: add hexagon to the list supported backends
69a8047e
hexagon: stack malmuts with quantized inputs only
debdb3b4
hexagon: add TODO for fixing issues in hexagon_graph_optimize
3475e29b
hexagon: update to hex-sdk 6.4.0 and add scripts for running on QDC
1e750df0
scripts: fix lint errors
8e7d8b5a
scripts: update qdc pytest script to make linter happy
20aa6897
hexagon: add reduce sum in fp32
03e2b9c3
hexagon: reduce number of vector stores in matmul output
384164dc
hexagon: remove the need for vdelta in reduce-multiply-x8
a314eb69
hexagon: consistent use of reduce_sum_fp32 for row_sums
7f2d00bd
hexagon: some more matmul optimizations and comments
5de19f8b
hexagon: update cmake presets
cf0242e3
hexagon: add OPMASK support for run-bench.sh wrapper
250e3a66
hexagon: update to use GGML_BACKEND_API
08a97e63
hexagon: remove unused logic for setting tensor flags for the views
6d2d0bd2
hexagon: add asserts to set/get_tensor to make sure we handle complet…
18d7d204
hexagon: use cpy_tensor slow path for non-host buffers
26a90a0b
hexagon: error checks in the buffer allocator
a8e5ad82
cmake: move include(extProj) under ggml-hexagon
dc001b9f
hexagon: don't forget to delete the backend on free
c749b869
hexagon: set/get_tensor size assert apply only to quantized tensors
0c01229e
hexagon: reintroduce HEX_VERBOSE wrapper for GGML_LOG_DEBUG for now
62ef4eba
docs: typos in hexagon developer docs (libggm-...)
19041f7d
hexagon: overhaul error handling in the session/device allocation
3e4ff739
max-krasnyansky force pushed to 3e4ff739 54 days ago
ggerganov commented on 2025-10-22
hexagon: update cmake presets to enable fp16 vectors
6acc2854
slaren commented on 2025-10-22
hexagon: remove unused time_usec function
dda466cf
slaren approved these changes on 2025-10-22
hexagon: don't forget to release buffer contexts
b0e5beb9
hexagon: fixed indents in hvx-utils (missed clang-format auto-format …
3049de50
hexagon: remove custom can_repeat function and use ggml_can_repeat
f7d74118
max-krasnyansky merged 63d2fc46 into master 54 days ago
Reviewers: slaren, ggerganov, jeffbolznv, lhez, CISC
Assignees: No one assigned
Labels: documentation, script, python, devops, ggml
Milestone: No milestone