llama.cpp
ggml: initial IBM zDNN backend
#14975
Merged
taronaeo merged 91 commits into ggml-org:master from taronaeo:feat/backend-zdnn
ggml-zdnn: inital backend impl
e084821a
ggml-zdnn: tensor->extra logging check
fd4914b0
ggml-zdnn: add output buffer check
02cfcfb2
ggml-zdnn: run compute and store into tensor->extra
36d76c30
ggml-zdnn: add set_tensor
1989fc9b
ggml-zdnn: add more loggers
b9756b6d
ggml-zdnn: update set_tensor logging to check only for matmul
60b9874d
ggml-zdnn: last working matmul version
529bdb9f
ggml-zdnn: add comments to prevent accidentally deleting lines
11d58d29
ggml-zdnn: support op out_prod
77a75329
ggml-zdnn: update op out_prod to use tensor->extra
04ddb2ac
ggml-zdnn: rewrite the backend implementation
7c6395f8
ggml-zdnn: bugfix new impl
ae2f656d
ggml-zdnn: fix compiler warnings and bugfixes
af9f4f00
ggml-zdnn: test ztensor finding in init_tensor
9e84742e
ggml-zdnn: implement at least 1 op to test
13c05872
ggml-zdnn: assign tensor->extra to buffer
13c64448
ggml-zdnn: add check for view tensors to prevent init_tensor
ee0ed78d
ggml-zdnn: rework init_tensor to create new buffers
b7f4b6fd
ggml-zdnn: switch to std vector instead of array
63fbc45e
ggml-zdnn: switch buffers back and set to arbitrary number
da2e0e70
ggml-zdnn: impl init_tensor
18658b86
ggml-zdnn: update supports_op matmul matrix
82851965
ggml-zdnn: fix incorrect ztensor shape, reduce memory padding
c1653ab6
ggml-zdnn: code clean up
59e9805a
ggml-zdnn: impl matmul
a1d8568c
ggml-zdnn: fix compiler error missing type
1c75ed63
ggml-zdnn: fix missing data transform call
f263f5d9
ggml-zdnn: add bias init_tensor
aef93b39
ggml-zdnn: tighten memory usage, change string allocation
bee7dd30
ggml-zdnn: add bias ztensor and data free
f800c802
ggml-zdnn: add bias data transform
4b2f1cb1
ggml-zdnn: add more debug info for extra buffer transform
6d71749c
ggml-zdnn: add logger to check if mat mul ops go through set_tensor
f7e8d6f2
ggml-zdnn: activate bias transform in matmul
092fa3a3
ggml-zdnn: move weights transform into mulmat
f239bbb0
ggml-zdnn: add more safeguards in matmul
cf0e190c
ggml-zdnn: fix sequencing of transforms
032dce5a
ggml-zdnn: bugfix transform ztensor vs origtensor
08de84ef
ggml-zdnn: figure out why sigtrap is happening
fc692ed4
ggml-zdnn: fix sigsegv
eefa943b
ggml-zdnn: move everything back to local declaration
6f425701
ggml-zdnn: move bias data to local also
4cc62cb6
ggml-zdnn: bring back working matmul
03ec5d3e
ggml-zdnn: rewrite into mre
09051683
ggml-zdnn: fix missing vector import
f99b274c
ggml-zdnn: fix missing vector import in header
e0549c29
ggml-zdnn: attempt to fix sigsegv
fc9260de
ggml-zdnn: fix missing load tensor
2cfa118f
ggml-zdnn: fix invalid ztensor buffer release
2872276d
ggml-zdnn: add logging to debug free buffer
1a0520a5
ggml-zdnn: remove free_buffer debug info
1c6ca76c
ggml-zdnn: add parmblkformat detections
a9438925
ggml-zdnn: add nnpa installed detection
0ae2d303
ggml-zdnn: add zdnn_init call for static libs
ab60ae6c
ggml-zdnn: add init_tensor
2d45ee25
ggml-zdnn: attempt at fixing invalid buffer
34468074
ggml-zdnn: switch to using deque to fix pointer deref problem
b28b4238
ggml-zdnn: add weights logging to check
b1376ad0
ggml-zdnn: attempt to use unique ptr
8dbca74f
ggml-zdnn: add tensor to pre_tfm_desc logging
e695e857
ggml-zdnn: add inputs logging
213f1d2a
ggml-zdnn: disable op_none initialisation for testing
4493b148
ggml-zdnn: fix missing return from init_tensor
e30b1ffb
ggml-zdnn: load ztensors in cgraph exec
fd766bdd
ggml-zdnn: work on moving output ztensor as well
b4dffed9
ggml-zdnn: disable logging and breakpoints for full test
ad0cb302
ggml-zdnn: attempt at manually changing the layout
7b50d057
ggml-zdnn: attempt at using default nwhc format instead
4fb6bee1
ggml-zdnn: disable global load ztensor for now
20d69b6c
ggml-zdnn: fix errorenous output load tensor
4d5edb22
ggml-zdnn: add guards to prevent loading ztensor if transformed
b7a77cf6
ggml-zdnn: code cleanup
1eb7c35e
ggml-zdnn: bring load ztensor back to init routine
70224e6c
ggml-zdnn: code clean up
803dde3b
ggml-zdnn: fix ztensor deallocation abort
e67feafc
ggml-zdnn: clean up matmul selection
90d460c2
ggml-zdnn: clean up project structure
92a17ed9
ggml-zdnn: update documentation, prepare for upstream
cf8cdcd3
chore: add codeowners
867d3f32
Merge branch 'master' into feat/backend-zdnn
12e6b8b6
github-actions added documentation
github-actions added devops
github-actions added ggml
ggml-zdnn: disable batched matmul
732df731
ggml-zdnn: attempt at fixing tensor views during matmul
6b6ebb9b
ggml-zdnn: deny all view tensors directly
fb0241bc
slaren
commented on 2025-08-08
ggml-zdnn: fix pr comments
e3904152
docs: update ops docs for zdnn
c3d2096a
taronaeo requested a review from slaren 220 days ago
slaren
commented on 2025-08-14
ggml-zdnn: redo test-backend-ops for ops.md
1746e0c7
ggml-zdnn: fix typo in build-s390x.md
e1fa4f2e
codeowners: remove taronaeo for now
411ea4ed
Revert "codeowners: remove taronaeo for now"
09c1c1cc
slaren
approved these changes on 2025-08-14
ggerganov
approved these changes on 2025-08-14
ggml-zdnn: remove unused ggml_zdnn macro
c9e88604
taronaeo merged ff27f80a into master 219 days ago
Reviewers: ggerganov, slaren
Assignees: No one assigned
Labels: documentation, devops, ggml
Milestone: No milestone