ochafik/llama.cpp

fix json include

ochafik committed 1 year ago

ef5512d4

ochafik committed 1 year ago

36280603

ci : add env variable in ggml-ci and document the same in SYCL.md (#12736)

AD2605 committed 1 year ago

Verified 2004644b

sync : minja (inclusionAI/Ling) and update tests (#12699)

yeahdongcn committed 1 year ago

Verified 5f696e88

fix MUSA compiler warning (#12704)

A3shTnT committed 1 year ago

Verified 193c3e03

CANN: Support operator SIN COS ARGMAX (#12709)

noemotiovon committed 1 year ago

Verified 65cfe136

Simplify and improve CUDA graphs through use of indirect copy pointers (#9017)

agray3 committed 1 year ago

Verified 3f9da22c

CANN: Fix failed test cases (#12708)

hipudding committed 1 year ago

Verified 2a0dc97e

opencl: use `max_alloc_size` in backend ctx instead of querying again (#12705)

lhez committed 1 year ago

Verified 97a20c01

vulkan: Implement split_k for coopmat2 flash attention. (#12627)

jeffbolznv committed 1 year ago

Verified f01bd023

cmake: remove caching from vulkan coopmat checks (#12719)

mtmcp committed 1 year ago

Verified 6f3bd386

vulkan: Implement grouped query attention in the coopmat2 FA shader (#12559)

jeffbolznv committed 1 year ago

Verified be0a0f8c

Vulkan: Fix mmq int dot float cache size (#12722)

0cc4m committed 1 year ago

Verified 92e3006b

model : print tensor size during load (#12711)

ggerganov committed 1 year ago

Verified 833e2b74

llama : add option to override model tensor buffers (#11397)

slaren committed 1 year ago

Verified e0e912f4

llama : refactor kv cache guard (#12695)

ggerganov committed 1 year ago

Verified a10b36c9

vocab : BailingMoE : change possessive quantifiers to greedy (#12677)

CISC committed 1 year ago

Verified 83a88bd6

common : remove json.hpp from common.cpp (#12697)

ngxson committed 1 year ago

Verified 42eb248f

[CANN] get_rows and dup optimization (#12671)

noemotiovon committed 1 year ago

Verified 9bacd6b3

common : refactor downloading system, handle mmproj with -hf option (#12694)

ngxson committed 1 year ago

Verified 267c1399

opencl : fix memory allocation size (#12649)

sparkleholic committed 1 year ago

Verified f423981a

llama : use LLM_KV_GENERAL_FILE_TYPE instead of gguf_find_key (#12672)

jklincn committed 1 year ago

Verified e39e727e

convert : BailingMoE : fix qkv split when head_dim is 0 (#12687)

CISC committed 1 year ago

Verified 5936a616

metal : use F32 prec in FA kernels (#12688)

ggerganov committed 1 year ago

Verified 3fd072a5

Fix clang warning in gguf_check_reserved_keys (#12686)

yeahdongcn committed 1 year ago

Verified a6f32f0b

vulkan: fix build when glslc doesn't support coopmat (#12683)

wbruna committed 1 year ago

Verified 2bb3597e

SYCL: Rename oneMKL to oneMath (#12192)

Rbiessy committed 1 year ago

Verified 82939705

SYCL: switch to SYCL namespace (#12674)

qnixsynapse committed 1 year ago

Verified 8bbf2608

convert : BailingMoE : avoid setting rope_dim to 0 (#12678)

CISC committed 1 year ago

Verified 35782aee

vocab : add special infill tokens for CodeLlama (#11850)

danbev committed 1 year ago

Verified c80a7759
