ggml
sync : llama.cpp
#1160
Merged
ggerganov merged 56 commits into master from sync-llama.cpp-25-03-27
ggml : skip intermediate .air file when compiling .metallib (llama/12…
975e013e
ggml-backend : make path_str compatible with C++20 (llama/12269)
6f1482f0
tests : fix test-quantize-fns to init the CPU backend (llama/12306)
3b9ab121
opencl: use OpenCL C standard supported by the device (llama/12221)
bd498fb4
musa: support new arch mp_31 and update doc (llama/12296)
f5489240
mat vec double buffer (llama/12188)
0a9761e9
metal : Cache the Metal library at the device context level (llama/12…
9091eeae
ggml-backend : fix backend search path (llama/12330)
60bd86a0
CUDA/HIP: refractor mmqv to unify the calculation of nwarps and rows …
51e47802
vulkan: fix bug in coopmat1 mul_mat_id (llama/12316)
fbedb178
CUDA/HIP: Fix fattn-vec-* when device warp size is not 32 (llama/12315)
505422f2
sycl : variable sg_size support for mmvq kernels (llama/12336)
ab5b0d11
MUL_MAT optimization (llama/12382)
eb84db8f
SYCL : support non-contiguous tensors in binary ops (add, sub, etc) (…
96c5d142
SYCL: Delete redundant plus sign and space (llama/12391)
afbf61d5
SYCL: set extras only on GGML_TYPE_Q4_0 (llama/12366)
1c8153d1
cmake : enable building llama.cpp using system libggml (llama/12321)
f6bf093d
vulkan: Adjust coopmat2 tile sizes and selection heuristic (llama/12258)
5c70888c
vulkan: Pad N dimension of B matrix for coopmat2 perf, to avoid bound…
64386ff0
vulkan: use fp32 in coopmat2 q4_k dequant function (llama/12309)
a373dd2b
vulkan: subgroup size tuning (llama/12087)
2662d5da
vulkan: Add N/2 and N/4 optimized paths in coopmat2 shader (llama/12312)
26c8697c
ggml-vulkan: remove unused find_program(glslc) (llama/12416)
915c4f8a
cuda : enable CUDA Graph on CUDA Toolkit < 12.x (llama/12394)
5aef5af2
llama: Add support for RWKV v7 architecture (llama/12412)
f8d81ea6
fixed compilation warnings in ggml-sycl (llama/12424)
7018b32b
Vulkan: Default to 1GB allocations instead of 4GB to avoid fragmentat…
3eb60975
ggml : add SVE support for q6_K_q8_K (llama/12361)
6a168b10
SYCL: using graphs is configurable by environment variable and compil…
245b76f4
musa: override warp_size of musa device to 32 (llama/12445)
2a94810e
opencl: improve profiling (llama/12442)
0a05a500
vulkan: Submit once enough matmul work has been recorded (llama/12406)
9f34a512
Fix visionOS build and add CI (llama/12415)
cb26fbfb
vulkan: optimize iq1 coopmat2 dequant functions (llama/12427)
39748c11
CUDA: Improve flash decoding kernel GPU occupancy for BS=1 case (llam…
2a02e67a
ggml : block interleaving support for Q4_K quantization for x86 AVX2 …
bb1d5da2
sycl: cleanup oneDNN related code (llama/12097)
e8d18a49
Fix build on Windows when ccache enabled (#9954) (llama/9976)
61b74517
vulkan: workaround for AMD Windows driver 16 bit unpack8 bug (llama/1…
d44565f6
Vulkan: RTE rounding for cpy to quant (llama/12480)
f7be1eb5
vulkan: Optimize mul_mat_vec p021 and nc shaders (llama/12505)
60beda8c
musa: refine compute capability (llama/12493)
d385fc7f
ggml : fix quantized cpy op (llama/12310)
80ce83df
vulkan: fix mul_mat_vec failure in backend tests (llama/12529)
51937836
CUDA: Fix clang warnings (llama/12540)
3b5918f2
opencl: simplify kernel embedding logic in cmakefile (llama/12503)
935dab3a
SYCL: disable Q4_0 reorder optimization (llama/12560)
e52a9edf
ggml-cpu : update KleidiAI to v1.5.0 (llama/12568)
25177579
ggml : fix MUL_MAT_ID repack with Q8_K (llama/12544)
1f4255ce
metal : refactor mat-vec code (llama/12569)
6f57c7b0
HIP: Add support for RDNA4 targets (llama/12372)
44fd23e5
SYCL: implement memset ggml backend buffer interface (llama/12580)
5bc40006
llamafile : ppc64le MMA implementation for Q4_0. (llama/12489)
ff66af54
ggml : sync/merge cmake,riscv,powerpc, add common.cmake (#0)
89205af3
sync : llama.cpp
4c4f07ab
files : remove old wkv6 sources (#0)
c838c22e
ggerganov merged 660def06 into master 1 year ago
ggerganov deleted the sync-llama.cpp-25-03-27 branch 1 year ago
Reviewers: No reviews
Assignees: No one assigned
Labels: None yet
Milestone: No milestone