sync : ggml #3665

ggerganov merged 35 commits into master from sync-ggml-26-02-16
ORippler CUDA: Fix non-contig rope (llama/19338)
b9c178d4
ggerganov cuda : extend GGML_OP_PAD to work with non-cont src0 (llama/19429)
0cbf122c
hipudding CANN: implement quantized MUL_MAT_ID for MoE models (llama/19228)
fd5cfe14
rauletorresc CANN: Remove unnecessary wrapper for `ggml_backend_buft_is_cann` (llam…
30b52c6a
k4ss4n ggml : use noexcept overload for is_regular_file in backend registrat…
2f2b2f5d
Alcpz ggml-cpu: arm64: q6_K repack gemm and gemv (and generic) implementati…
0d0bf3a9
nikhilJain17 Plug memory leaks and free resources on shutdown (llama/19315)
3415a2e4
ORippler CUDA : Update CCCL-tag for 3.2 to final release from RC (llama/19486)
99418ad3
ggerganov metal : consolidate unary ops (llama/19490)
be449af3
ggerganov ggml : extend bin bcast for permuted src1 (llama/19484)
012bd607
max-krasnyansky hexagon: Add ARGSORT, DIV, SQR, SQRT, SUM_ROWS, GEGLU (llama/19406)
b26b2b19
ggerganov metal : extend l2_norm support for non-cont src0 (llama/19502)
81aa77c3
ggerganov ggml : unary ops support non-cont src0 + metal F16 unary ops (llama/1…
6f46caea
lhez opencl: add general Q6_K mm and Q4_K mv (llama/19347)
bf58e6f3
max-krasnyansky hexagon: further optimization and tuning of matmul and dot kernels (l…
faa42605
superm1 Add a workaround for compilation with ROCWMMA_FATTN and gfx9 (llama/1…
771827e8
ggerganov metal : update sum_rows kernel to support float4 (llama/19524)
91bbbdf6
lhez opencl: add basic support for q4_1 (llama/19534)
a0bcad8d
FanShupei hexagon: fix typo in vtcm_needs_release (llama/19545)
8e43b5a5
ggerganov metal : support GGML_OP_SET (llama/19548)
4ed6faf2
ggerganov metal : improve concurrency (llama/19555)
31c389d0
ORippler CUDA: Do not mutate cgraph for fused ADDs (llama/19566)
76896039
am17an CUDA: loop over ne2*ne3 in case it overflows (llama/19538)
9bdef053
ymcki fix vulkan ggml_acc only works in 3d but not 4d (llama/19426)
8f9ca9ce
Alcpz Fix wrong memcpy length for block_interleave == 4 (llama/19575)
0ae3e1df
jeffbolznv vulkan: restore -inf check in FA shaders (llama/19582)
fd765784
max-krasnyansky hexagon: further optimizations and refactoring for flash attention (l…
cacd47af
strongtz vulkan: Add vendor id for Qualcomm drivers (llama/19569)
73d40946
jeffbolznv vulkan: support GGML_OP_SET (llama/19584)
b915a235
jeffbolznv vulkan: support L2_NORM with contiguous rows (llama/19604)
c2655cfe
ggerganov metal : fix ACC op (llama/19427)
0d9def9f
angt ggml : fix GGML_DEBUG with OpenMP (llama/19599)
4a206c84
ggerganov models : optimize qwen3next graph (llama/19375)
07db21f8
ggerganov sync : ggml
5a11c727
ggerganov talk-llama : sync llama.cpp
3413099c
ggerganov merged 364c77f4 into master 19 days ago
ggerganov deleted the sync-ggml-26-02-16 branch 19 days ago
