whisper.cpp
sync : ggml #2779
Merged
ggerganov merged 53 commits into master from sync-ggml-25-02-03
ab7df17f ggml-cpu : fix ggml_graph_compute_thread did not terminate on abort. …
eba54c73 ggml : add option to not print stack on abort (ggml/1081)
bda34512 SYCL: Add gated linear attention kernel (llama/11175)
11eb7f4b RoPE: fix back, CUDA support for back + noncont. (llama/11240)
3a1f8339 fix: ggml: fix vulkan-shaders-gen build (llama/10448)
a89b1325 vulkan: scale caching for k quants + misc fixes (llama/11081)
220d0b3d ggml: aarch64: implement SVE kernels for q4_K_q8_K vector dot (llama/…
0965fe92 CUDA: backwards pass for misc. ops, add tests (llama/11257)
2e452fee vulkan: optimize coopmat2 q2_k dequant function (llama/11130)
aa8adfa7 vulkan: optimize coopmat2 q4_k/q5_k dequant functions. (llama/11206)
b956395f vulkan: support copy from f32 to q4_0/q4_1/q5_0/q5_1/q8_0/iq4_nl (lla…
315a2247 rpc : early register backend devices (llama/11262)
529b36e3 vulkan: fix coopmat2 flash attention for non-contiguous inputs (llama…
1213918a cmake : add sanitizer flags for llama.cpp (llama/11279)
0b94a0b2 SYCL: Introducing memory host pool (llama/11251)
9cf3fdd7 vulkan: fix coopmat2 validation failures (llama/11284)
f23199cf metal : fix out-of-bounds write (llama/11314)
939144c5 rpc : better caching of the base buffer pointer (llama/11331)
596288cd vulkan: fix diag_mask_inf (llama/11323)
7a137f9b vulkan: sort shaders for more deterministic binary (llama/11315)
f2826733 Vulkan-run-test: fix mmq_wg_denoms (llama/11343)
276cc41f cmake : avoid -march=native when reproducible build is wanted (llama/…
4e6471e3 CPU/CUDA: fix (GQA) mul mat back, add CUDA support (llama/11380)
9789f26c rocBLAS: Avoid fp32->fp16->fp32 conversion on cdna (llama/11356)
c9a1d359 CUDA: fix FP16 cuBLAS GEMM (llama/11396)
41cd5dbe hip : Add hipGraph and VMM support to ROCM (llama/11362)
d5376335 Hip: disable VMM on hip as it seams that it dosent work in some confi…
3f911fa7 vulkan: compile shaders on-demand (llama/11406)
36473b82 cmake: add ggml find package (llama/11369)
48ca09de metal : use residency sets (llama/11427)
94dbfb2f metal: Handle null returned from MTLCreateSystemDefaultDevice() (llam…
cd000b56 AMD: parse the architecture as supplied by gcnArchName (llama/11244)
a60c461f SYCL : SOFTMAX F16 mask support and other fixes (llama/11261)
3a06dc46 cmake : don't fail on `GGML_CPU=OFF` (llama/11457)
7664f6c4 HIP: Only call rocblas_initialize on rocblas versions with the multip…
76355dd0 HIP: Supress transformation warning in softmax.cu
b2c7108c vulkan: Catch pipeline creation failure and print an error message (l…
2178b0c1 vulkan: implement initial support for IQ2 and IQ3 quantizations (llam…
0ef996a6 CUDA/HIP: add warp_size to cuda_device_info
1c036478 HIP: Prepare reduction operators for wave 64
69bbc392 HIP: require at least HIP 5.5
f157854b `ci`: use sccache on windows instead of ccache (llama/11545)
8b531bae CUDA: use mma PTX instructions for FlashAttention (llama/11583)
86bc5ad1 HIP: add GGML_CUDA_CC_IS_* for amd familys as increasing cc archtectu…
11400236 CUDA/HIP: add support for selectable warp size to mmv (llama/11519)
4ca2fcd0 HIP: fix flash_attn_stream_k_fixup warning (llama/11604)
f2fb21f4 CUDA: fix Volta FlashAttention logic (llama/11615)
4e570fd5 scripts : fix sync paths
2542c1ac sync : ggml
6086e987 cmake : sync cmake scripts
94bad053 ci : use ubuntu-22.04 instead of ubuntu-latest
ggerganov force pushed from 323e0069 to 94bad053 1 year ago
cbf347b9 ci : install git
0ab5ba0c ci : more git
ggerganov merged 90e3c5fc into master 1 year ago
ggerganov deleted the sync-ggml-25-02-03 branch 1 year ago
Reviewers: No reviews
Assignees: No one assigned
Labels: None yet
Milestone: No milestone