whisper.cpp
sync : ggml #2779
Merged
ggerganov merged 53 commits into master from sync-ggml-25-02-03
ab7df17f ggml-cpu : fix ggml_graph_compute_thread did not terminate on abort. …
eba54c73 ggml : add option to not print stack on abort (ggml/1081)
bda34512 SYCL: Add gated linear attention kernel (llama/11175)
11eb7f4b RoPE: fix back, CUDA support for back + noncont. (llama/11240)
3a1f8339 fix: ggml: fix vulkan-shaders-gen build (llama/10448)
a89b1325 vulkan: scale caching for k quants + misc fixes (llama/11081)
220d0b3d ggml: aarch64: implement SVE kernels for q4_K_q8_K vector dot (llama/…
0965fe92 CUDA: backwards pass for misc. ops, add tests (llama/11257)
2e452fee vulkan: optimize coopmat2 q2_k dequant function (llama/11130)
aa8adfa7 vulkan: optimize coopmat2 q4_k/q5_k dequant functions. (llama/11206)
b956395f vulkan: support copy from f32 to q4_0/q4_1/q5_0/q5_1/q8_0/iq4_nl (lla…
315a2247 rpc : early register backend devices (llama/11262)
529b36e3 vulkan: fix coopmat2 flash attention for non-contiguous inputs (llama…
1213918a cmake : add sanitizer flags for llama.cpp (llama/11279)
0b94a0b2 SYCL: Introducing memory host pool (llama/11251)
9cf3fdd7 vulkan: fix coopmat2 validation failures (llama/11284)
f23199cf metal : fix out-of-bounds write (llama/11314)
939144c5 rpc : better caching of the base buffer pointer (llama/11331)
596288cd vulkan: fix diag_mask_inf (llama/11323)
7a137f9b vulkan: sort shaders for more deterministic binary (llama/11315)
f2826733 Vulkan-run-test: fix mmq_wg_denoms (llama/11343)
276cc41f cmake : avoid -march=native when reproducible build is wanted (llama/…
4e6471e3 CPU/CUDA: fix (GQA) mul mat back, add CUDA support (llama/11380)
9789f26c rocBLAS: Avoid fp32->fp16->fp32 conversion on cdna (llama/11356)
c9a1d359 CUDA: fix FP16 cuBLAS GEMM (llama/11396)
41cd5dbe hip : Add hipGraph and VMM support to ROCM (llama/11362)
d5376335 Hip: disable VMM on hip as it seams that it dosent work in some confi…
3f911fa7 vulkan: compile shaders on-demand (llama/11406)
36473b82 cmake: add ggml find package (llama/11369)
48ca09de metal : use residency sets (llama/11427)
94dbfb2f metal: Handle null returned from MTLCreateSystemDefaultDevice() (llam…
cd000b56 AMD: parse the architecture as supplied by gcnArchName (llama/11244)
a60c461f SYCL : SOFTMAX F16 mask support and other fixes (llama/11261)
3a06dc46 cmake : don't fail on `GGML_CPU=OFF` (llama/11457)
7664f6c4 HIP: Only call rocblas_initialize on rocblas versions with the multip…
76355dd0 HIP: Supress transformation warning in softmax.cu
b2c7108c vulkan: Catch pipeline creation failure and print an error message (l…
2178b0c1 vulkan: implement initial support for IQ2 and IQ3 quantizations (llam…
0ef996a6 CUDA/HIP: add warp_size to cuda_device_info
1c036478 HIP: Prepare reduction operators for wave 64
69bbc392 HIP: require at least HIP 5.5
f157854b `ci`: use sccache on windows instead of ccache (llama/11545)
8b531bae CUDA: use mma PTX instructions for FlashAttention (llama/11583)
86bc5ad1 HIP: add GGML_CUDA_CC_IS_* for amd familys as increasing cc archtectu…
11400236 CUDA/HIP: add support for selectable warp size to mmv (llama/11519)
4ca2fcd0 HIP: fix flash_attn_stream_k_fixup warning (llama/11604)
f2fb21f4 CUDA: fix Volta FlashAttention logic (llama/11615)
4e570fd5 scripts : fix sync paths
2542c1ac sync : ggml
6086e987 cmake : sync cmake scripts
94bad053 ci : use ubuntu-22.04 instead of ubuntu-latest
ggerganov force pushed from 323e0069 to 94bad053 1 year ago
cbf347b9 ci : install git
0ab5ba0c ci : more git
ggerganov merged 90e3c5fc into master 1 year ago
ggerganov deleted the sync-ggml-25-02-03 branch 1 year ago
Reviewers: No reviews
Assignees: No one assigned
Labels: None yet
Milestone: No milestone