PR #2868 sync : ggml - SemanticDiff

sync : ggml #2868

ggerganov merged 38 commits into master from sync-ggml-25-03-08

petterreinholdtsen

Told cmake to install ggml-cpp.h as a public header file. (ggml/1126)

e1df78f2

cmdr2

cuda/cpu: Increase support for fp16 unary operations (ggml/1125)

0ad15e3f

cmdr2

cuda/vulkan: specify fp32-only support for some operations in support…

a34489ce

cmdr2

cuda: unary ops as float + de-duplicate (ggml/1130)

0c02f62d

MollySophia

ggml-cpu: Fix build with sve (llama/12059)

0894863c

jeffbolznv

vulkan: fix assertion when qy_needs_dequant (llama/12068)

efe4017b

vvuksanovic

cmake: Fix ggml backend dependencies and installation (llama/11818)

6e7f839f

daniandtheweb

vulkan: improve im2col (llama/11826)

c270cead

netrunnereve

vulkan: matmul dequantization improvements (llama/12015)

cd143b89

hipudding

CANN: Fix build error with GCC 13 (llama/11990)

a252113f

Vithulep

ggml: aarch64: implement SVE kernels for q2_k_q8_k vector dot (llama/…

be928d35

JohannesGaessler

CUDA: fix logic for V100 + GGML_CUDA_FORCE_MMQ (llama/12098)

00b059e1

remyoudompheng

vulkan: add specific MMV kernels for IQ2 and IQ3 quants + optimizatio…

4f45b431

WilliamTambellini

ggml : upgrade init_tensor API to return a ggml_status (llama/11854)

27243464

Green-Sky

CUDA: compress mode option and default to size (llama/12029)

85e04616

slaren

ggml-backend : keep paths in native string type when possible (llama/…

aa27e01b

qnixsynapse

SYCL: Move CPY kernels to a separate file and add few missing kernels…

51679796

ag2s20150909

ggml : fix kleidiai build (llama/12159)

79146701

hjc4869

HIP: implement FlashAttention via rocWMMA for CDNA and RDNA3+ (llama/…

9a517d61

mgroeber9110

ggml : portability fixes for VS 2017 (llama/12150)

e38d29cc

ggerganov

vulkan : sync (llama/0)

66e42fa3

vmobilis

ggml : ggml_compute_forward_concat() for arbitrary tensor type (ggml/…

ea027a44

pminev

ggml : fix GGMLMetalClass ODR (llama/12200)

6979ba82

qnixsynapse

SYCL: Disable f16 Unary OPs as not supported by the kernels (llama/12…

c19d8eb2

remyoudompheng

ggml-cpu: Faster IQ1 mul_mat_vec on AVX2 using BMI2 instructions (lla…

2b2c567f

simon886212

opencl : fix profile-related errors (llama/12095)

dccfba4c

linehill

opencl : fix `ulong` kernel args were set from `int` variables (llama…

03b31a0c

linehill

opencl : fix buffer alignment (llama/12197)

c6e70f7e

IMbackK

HIP/CUDA: set the paramerter value in maintain_cuda_graph instead of …

8a3a93e6

JohannesGaessler

CUDA: fix FA logic for PTX 7.0 and CC >= 7.5 (llama/12222)

425cefdf

hbuxiaofei

cmake : fix undefined reference errors for std::filesystem in ggml (#…

f7551669

lhez

opencl: Noncontiguous `norm`, `rms_norm`, disable `fp16` for some ops…

dbf9384f

danbev

metal : fix default.metallib build (llama/12224)

41079a9a

BB-fat

metal : simplify kernel arguments using a struct (ggml/3229) (llama/1…

5141a2c9

remyoudompheng

ggml-cpu: faster AVX2 variant for IQ1_M (llama/12216)

16754cd2

ggerganov

sync : ggml

0b4956d1

ggerganov

cmake : fix ggml-config (ggml/0)

ed5c4942

ggerganov

objc : fix build, tmp remove GPU support, use C++17

209e1f37

ggerganov force pushed from a102b03e to 209e1f37 1 year ago

ggerganov merged 7d140057 into master 1 year ago

ggerganov deleted the sync-ggml-25-03-08 branch 1 year ago
