sync : ggml #2868

ggerganov merged 38 commits into master from sync-ggml-25-03-08
petterreinholdtsen Told cmake to install ggml-cpp.h as a public header file. (ggml/1126)
e1df78f2
cmdr2 cuda/cpu: Increase support for fp16 unary operations (ggml/1125)
0ad15e3f
cmdr2 cuda/vulkan: specify fp32-only support for some operations in support…
a34489ce
cmdr2 cuda: unary ops as float + de-duplicate (ggml/1130)
0c02f62d
MollySophia ggml-cpu: Fix build with sve (llama/12059)
0894863c
jeffbolznv vulkan: fix assertion when qy_needs_dequant (llama/12068)
efe4017b
vvuksanovic cmake: Fix ggml backend dependencies and installation (llama/11818)
6e7f839f
daniandtheweb vulkan: improve im2col (llama/11826)
c270cead
netrunnereve vulkan: matmul dequantization improvements (llama/12015)
cd143b89
hipudding CANN: Fix build error with GCC 13 (llama/11990)
a252113f
Vithulep ggml: aarch64: implement SVE kernels for q2_k_q8_k vector dot (llama/…
be928d35
JohannesGaessler CUDA: fix logic for V100 + GGML_CUDA_FORCE_MMQ (llama/12098)
00b059e1
remyoudompheng vulkan: add specific MMV kernels for IQ2 and IQ3 quants + optimizatio…
4f45b431
WilliamTambellini ggml : upgrade init_tensor API to return a ggml_status (llama/11854)
27243464
Green-Sky CUDA: compress mode option and default to size (llama/12029)
85e04616
slaren ggml-backend : keep paths in native string type when possible (llama/…
aa27e01b
qnixsynapse SYCL: Move CPY kernels to a separate file and add few missing kernels…
51679796
ag2s20150909 ggml : fix kleidiai build (llama/12159)
79146701
hjc4869 HIP: implement FlashAttention via rocWMMA for CDNA and RDNA3+ (llama/…
9a517d61
mgroeber9110 ggml : portability fixes for VS 2017 (llama/12150)
e38d29cc
ggerganov vulkan : sync (llama/0)
66e42fa3
vmobilis ggml : ggml_compute_forward_concat() for arbitrary tensor type (ggml/…
ea027a44
pminev ggml : fix GGMLMetalClass ODR (llama/12200)
6979ba82
qnixsynapse SYCL: Disable f16 Unary OPs as not supported by the kernels (llama/12…
c19d8eb2
remyoudompheng ggml-cpu: Faster IQ1 mul_mat_vec on AVX2 using BMI2 instructions (lla…
2b2c567f
simon886212 opencl : fix profile-related errors (llama/12095)
dccfba4c
linehill opencl : fix `ulong` kernel args were set from `int` variables (llama…
03b31a0c
linehill opencl : fix buffer alignment (llama/12197)
c6e70f7e
IMbackK HIP/CUDA: set the paramerter value in maintain_cuda_graph instead of …
8a3a93e6
JohannesGaessler CUDA: fix FA logic for PTX 7.0 and CC >= 7.5 (llama/12222)
425cefdf
hbuxiaofei cmake : fix undefined reference errors for std::filesystem in ggml (#…
f7551669
lhez opencl: Noncontiguous `norm`, `rms_norm`, disable `fp16` for some ops…
dbf9384f
danbev metal : fix default.metallib build (llama/12224)
41079a9a
BB-fat metal : simplify kernel arguments using a struct (ggml/3229) (llama/1…
5141a2c9
remyoudompheng ggml-cpu: faster AVX2 variant for IQ1_M (llama/12216)
16754cd2
ggerganov sync : ggml
0b4956d1
ggerganov cmake : fix ggml-config (ggml/0)
ed5c4942
ggerganov objc : fix build, tmp remove GPU support, use C++17
209e1f37
ggerganov force pushed from a102b03e to 209e1f37 358 days ago
ggerganov merged 7d140057 into master 358 days ago
ggerganov deleted the sync-ggml-25-03-08 branch 358 days ago
