ggml / sync : llama.cpp #1311 (Merged)
ggerganov merged 23 commits into master from sync-llama.cpp-25-07-24
1d1866d7  Vulkan: Fix fprintf format-security warning (llama/14770)
c5262c2c  vulkan: Add logging for bf16 features to ggml_vk_print_gpu_info (#132…
d45d283a  ggml: adds CONV_2D op and direct GEMM Vulkan implementation (llama/14…
27435226  vulkan/cuda: Fix im2col when KW!=KH (llama/14789)
aa65fde7  kleidiai: add support for get_rows (llama/14676)
de18e9a7  sycl: Fix im2col (llama/14797)
58f48321  opencl: add conv2d kernel (llama/14403)
64088bbc  opencl: fix `im2col` when `KW!=KH` (llama/14803)
5485663d  cuda: remove linking to cublasLt (llama/14790)
5b97e8ed  opencl: remove unreachable `return` (llama/14806)
1dad821b  cuda : implement bf16 cpy ops and enable bf16 cont (llama/14763)
6bc9c977  vulkan: fix rms_norm_mul to handle broadcasting dim0 (llama/14817)
b06d9cbf  CUDA: add fused rms norm (llama/14800)
0ed1969e  CANN: weight format to NZ for Ascend310P3 (llama/14407)
a44689f0  ggml: fix loongarch quantize_row_q8_1 error (llama/14827)
d4951584  tests : add non-cont K,V FA tests
e27e2cd6  CUDA: fix quantized KV cache + multiple sequences (llama/14822)
21c3ebd0  CUDA: fix compilation with GGML_CUDA_F16 (llama/14837)
5d1cc399  CUDA: fix overflow in FA, tune performance (llama/14840)
1d54f61a  sycl: fix undefined variable in work group size check (llama/14843)
e2829867  metal : fix fusion across different encoders (llama/14849)
8dcd3dc4  sycl: fixed semantics of block offset calculation (llama/14814)
ee456e8d  sync : llama.cpp
danbev approved these changes on 2025-07-24
ggerganov merged ac842675 into master 161 days ago
ggerganov deleted the sync-llama.cpp-25-07-24 branch 161 days ago
Reviewers: danbev
Assignees: no one assigned
Labels: none yet
Milestone: no milestone