llama.cpp
ggml: initial IBM zDNN backend
#14975
Merged

taronaeo
taronaeo ggml-zdnn: initial backend impl
e084821a
taronaeo ggml-zdnn: tensor->extra logging check
fd4914b0
taronaeo ggml-zdnn: add output buffer check
02cfcfb2
taronaeo ggml-zdnn: run compute and store into tensor->extra
36d76c30
taronaeo ggml-zdnn: add set_tensor
1989fc9b
taronaeo ggml-zdnn: add more loggers
b9756b6d
taronaeo ggml-zdnn: update set_tensor logging to check only for matmul
60b9874d
taronaeo ggml-zdnn: last working matmul version
529bdb9f
taronaeo ggml-zdnn: add comments to prevent accidentally deleting lines
11d58d29
taronaeo ggml-zdnn: support op out_prod
77a75329
taronaeo ggml-zdnn: update op out_prod to use tensor->extra
04ddb2ac
taronaeo ggml-zdnn: rewrite the backend implementation
7c6395f8
taronaeo ggml-zdnn: bugfix new impl
ae2f656d
taronaeo ggml-zdnn: fix compiler warnings and bugfixes
af9f4f00
taronaeo ggml-zdnn: test ztensor finding in init_tensor
9e84742e
taronaeo ggml-zdnn: implement at least 1 op to test
13c05872
taronaeo ggml-zdnn: assign tensor->extra to buffer
13c64448
taronaeo ggml-zdnn: add check for view tensors to prevent init_tensor
ee0ed78d
taronaeo ggml-zdnn: rework init_tensor to create new buffers
b7f4b6fd
taronaeo ggml-zdnn: switch to std vector instead of array
63fbc45e
taronaeo ggml-zdnn: switch buffers back and set to arbitrary number
da2e0e70
taronaeo ggml-zdnn: impl init_tensor
18658b86
taronaeo ggml-zdnn: update supports_op matmul matrix
82851965
taronaeo ggml-zdnn: fix incorrect ztensor shape, reduce memory padding
c1653ab6
taronaeo ggml-zdnn: code clean up
59e9805a
taronaeo ggml-zdnn: impl matmul
a1d8568c
taronaeo ggml-zdnn: fix compiler error missing type
1c75ed63
taronaeo ggml-zdnn: fix missing data transform call
f263f5d9
taronaeo ggml-zdnn: add bias init_tensor
aef93b39
taronaeo ggml-zdnn: tighten memory usage, change string allocation
bee7dd30
taronaeo ggml-zdnn: add bias ztensor and data free
f800c802
taronaeo ggml-zdnn: add bias data transform
4b2f1cb1
taronaeo ggml-zdnn: add more debug info for extra buffer transform
6d71749c
taronaeo ggml-zdnn: add logger to check if mat mul ops go through set_tensor
f7e8d6f2
taronaeo ggml-zdnn: activate bias transform in matmul
092fa3a3
taronaeo ggml-zdnn: move weights transform into mulmat
f239bbb0
taronaeo ggml-zdnn: add more safeguards in matmul
cf0e190c
taronaeo ggml-zdnn: fix sequencing of transforms
032dce5a
taronaeo ggml-zdnn: bugfix transform ztensor vs origtensor
08de84ef
taronaeo ggml-zdnn: figure out why sigtrap is happening
fc692ed4
taronaeo ggml-zdnn: fix sigsegv
eefa943b
taronaeo ggml-zdnn: move everything back to local declaration
6f425701
taronaeo ggml-zdnn: move bias data to local also
4cc62cb6
taronaeo ggml-zdnn: bring back working matmul
03ec5d3e
taronaeo ggml-zdnn: rewrite into mre
09051683
taronaeo ggml-zdnn: fix missing vector import
f99b274c
taronaeo ggml-zdnn: fix missing vector import in header
e0549c29
taronaeo ggml-zdnn: attempt to fix sigsegv
fc9260de
taronaeo ggml-zdnn: fix missing load tensor
2cfa118f
taronaeo ggml-zdnn: fix invalid ztensor buffer release
2872276d
taronaeo ggml-zdnn: add logging to debug free buffer
1a0520a5
taronaeo ggml-zdnn: remove free_buffer debug info
1c6ca76c
taronaeo ggml-zdnn: add parmblkformat detections
a9438925
taronaeo ggml-zdnn: add nnpa installed detection
0ae2d303
taronaeo ggml-zdnn: add zdnn_init call for static libs
ab60ae6c
taronaeo ggml-zdnn: add init_tensor
2d45ee25
taronaeo ggml-zdnn: attempt at fixing invalid buffer
34468074
taronaeo ggml-zdnn: switch to using deque to fix pointer deref problem
b28b4238
taronaeo ggml-zdnn: add weights logging to check
b1376ad0
taronaeo ggml-zdnn: attempt to use unique ptr
8dbca74f
taronaeo ggml-zdnn: add tensor to pre_tfm_desc logging
e695e857
taronaeo ggml-zdnn: add inputs logging
213f1d2a
taronaeo ggml-zdnn: disable op_none initialisation for testing
4493b148
taronaeo ggml-zdnn: fix missing return from init_tensor
e30b1ffb
taronaeo ggml-zdnn: load ztensors in cgraph exec
fd766bdd
taronaeo ggml-zdnn: work on moving output ztensor as well
b4dffed9
taronaeo ggml-zdnn: disable logging and breakpoints for full test
ad0cb302
taronaeo ggml-zdnn: attempt at manually changing the layout
7b50d057
taronaeo ggml-zdnn: attempt at using default nwhc format instead
4fb6bee1
taronaeo ggml-zdnn: disable global load ztensor for now
20d69b6c
taronaeo ggml-zdnn: fix erroneous output load tensor
4d5edb22
taronaeo ggml-zdnn: add guards to prevent loading ztensor if transformed
b7a77cf6
taronaeo ggml-zdnn: code cleanup
1eb7c35e
taronaeo ggml-zdnn: bring load ztensor back to init routine
70224e6c
taronaeo ggml-zdnn: code clean up
803dde3b
taronaeo ggml-zdnn: fix ztensor deallocation abort
e67feafc
taronaeo ggml-zdnn: clean up matmul selection
90d460c2
taronaeo ggml-zdnn: clean up project structure
92a17ed9
taronaeo ggml-zdnn: update documentation, prepare for upstream
cf8cdcd3
taronaeo chore: add codeowners
867d3f32
taronaeo Merge branch 'master' into feat/backend-zdnn
12e6b8b6
github-actions added documentation
github-actions added devops
github-actions added ggml
taronaeo ggml-zdnn: disable batched matmul
732df731
taronaeo ggml-zdnn: attempt at fixing tensor views during matmul
6b6ebb9b
taronaeo ggml-zdnn: deny all view tensors directly
fb0241bc
slaren commented on 2025-08-08
taronaeo ggml-zdnn: fix pr comments
e3904152
taronaeo docs: update ops docs for zdnn
c3d2096a
taronaeo requested a review from slaren 220 days ago
slaren commented on 2025-08-14
taronaeo ggml-zdnn: redo test-backend-ops for ops.md
1746e0c7
taronaeo ggml-zdnn: fix typo in build-s390x.md
e1fa4f2e
taronaeo codeowners: remove taronaeo for now
411ea4ed
taronaeo Revert "codeowners: remove taronaeo for now"
09c1c1cc
slaren approved these changes on 2025-08-14
ggerganov approved these changes on 2025-08-14
taronaeo ggml-zdnn: remove unused ggml_zdnn macro
c9e88604
taronaeo merged ff27f80a into master 219 days ago
