sync : ggml #2779

ggerganov merged 53 commits into master from sync-ggml-25-02-03
issixx ggml-cpu : fix ggml_graph_compute_thread did not terminate on abort. …
ab7df17f
WilliamTambellini ggml : add option to not print stack on abort (ggml/1081)
eba54c73
qnixsynapse SYCL: Add gated linear attention kernel (llama/11175)
bda34512
JohannesGaessler RoPE: fix back, CUDA support for back + noncont. (llama/11240)
11eb7f4b
sparkleholic fix: ggml: fix vulkan-shaders-gen build (llama/10448)
3a1f8339
netrunnereve vulkan: scale caching for k quants + misc fixes (llama/11081)
a89b1325
fj-y-saito ggml: aarch64: implement SVE kernels for q4_K_q8_K vector dot (llama/…
220d0b3d
JohannesGaessler CUDA: backwards pass for misc. ops, add tests (llama/11257)
0965fe92
jeffbolznv vulkan: optimize coopmat2 q2_k dequant function (llama/11130)
2e452fee
jeffbolznv vulkan: optimize coopmat2 q4_k/q5_k dequant functions. (llama/11206)
aa8adfa7
jeffbolznv vulkan: support copy from f32 to q4_0/q4_1/q5_0/q5_1/q8_0/iq4_nl (lla…
b956395f
rgerganov rpc : early register backend devices (llama/11262)
315a2247
jeffbolznv vulkan: fix coopmat2 flash attention for non-contiguous inputs (llama…
529b36e3
ggerganov cmake : add sanitizer flags for llama.cpp (llama/11279)
1213918a
s-Nick SYCL: Introducing memory host pool (llama/11251)
0b94a0b2
jeffbolznv vulkan: fix coopmat2 validation failures (llama/11284)
9cf3fdd7
ggerganov metal : fix out-of-bounds write (llama/11314)
f23199cf
rgerganov rpc : better caching of the base buffer pointer (llama/11331)
939144c5
jeffbolznv vulkan: fix diag_mask_inf (llama/11323)
596288cd
jeffbolznv vulkan: sort shaders for more deterministic binary (llama/11315)
7a137f9b
AMD-dwang Vulkan-run-test: fix mmq_wg_denoms (llama/11343)
f2826733
cmake : avoid -march=native when reproducible build is wanted (llama/…
276cc41f
JohannesGaessler CPU/CUDA: fix (GQA) mul mat back, add CUDA support (llama/11380)
4e6471e3
IMbackK rocBLAS: Avoid fp32->fp16->fp32 conversion on cdna (llama/11356)
9789f26c
JohannesGaessler CUDA: fix FP16 cuBLAS GEMM (llama/11396)
c9a1d359
IMbackK hip : Add hipGraph and VMM support to ROCM (llama/11362)
41cd5dbe
IMbackK Hip: disable VMM on hip as it seams that it dosent work in some confi…
d5376335
jeffbolznv vulkan: compile shaders on-demand (llama/11406)
3f911fa7
mtmcp cmake: add ggml find package (llama/11369)
36473b82
ggerganov metal : use residency sets (llama/11427)
48ca09de
metal: Handle null returned from MTLCreateSystemDefaultDevice() (llam…
94dbfb2f
Haus1 AMD: parse the architecture as supplied by gcnArchName (llama/11244)
cd000b56
qnixsynapse SYCL : SOFTMAX F16 mask support and other fixes (llama/11261)
a60c461f
someone13574 cmake : don't fail on `GGML_CPU=OFF` (llama/11457)
3a06dc46
sARY77 HIP: Only call rocblas_initialize on rocblas versions with the multip…
7664f6c4
IMbackK HIP: Supress transformation warning in softmax.cu
76355dd0
jeffbolznv vulkan: Catch pipeline creation failure and print an error message (l…
b2c7108c
remyoudompheng vulkan: implement initial support for IQ2 and IQ3 quantizations (llam…
2178b0c1
IMbackK CUDA/HIP: add warp_size to cuda_device_info
0ef996a6
IMbackK HIP: Prepare reduction operators for wave 64
1c036478
IMbackK HIP: require at least HIP 5.5
69bbc392
ochafik `ci`: use sccache on windows instead of ccache (llama/11545)
f157854b
JohannesGaessler CUDA: use mma PTX instructions for FlashAttention (llama/11583)
8b531bae
IMbackK HIP: add GGML_CUDA_CC_IS_* for amd familys as increasing cc archtectu…
86bc5ad1
IMbackK CUDA/HIP: add support for selectable warp size to mmv (llama/11519)
11400236
JohannesGaessler HIP: fix flash_attn_stream_k_fixup warning (llama/11604)
4ca2fcd0
JohannesGaessler CUDA: fix Volta FlashAttention logic (llama/11615)
f2fb21f4
ggerganov scripts : fix sync paths
4e570fd5
ggerganov sync : ggml
2542c1ac
ggerganov cmake : sync cmake scripts
6086e987
ggerganov ci : use ubuntu-22.04 instead of ubuntu-latest
94bad053
ggerganov force pushed from 323e0069 to 94bad053 1 year ago
ggerganov ci : install git
cbf347b9
ggerganov ci : more git
0ab5ba0c
ggerganov merged 90e3c5fc into master 1 year ago
ggerganov deleted the sync-ggml-25-02-03 branch 1 year ago
