ggerganov/llama.cpp

pydantic : fix Python 3.9 and 3.10 support

compilade committed 1 year ago

f89eaa92

pydantic : replace uses of __annotations__ with get_type_hints

compilade committed 1 year ago

eed299f0

vulkan : cmake integration (#8119)

mtmcp committed 1 year ago

Verified 17eb6aa8

metal : template-ify some of the kernels (#8447)

ggerganov committed 1 year ago

Verified c917b67f

server : handle content array in chat API (#8449)

ggerganov committed 1 year ago

Verified 4e24cffd

main : print error on empty input (#8456)

ggerganov committed 1 year ago

Verified 6af51c0d

llama : suppress unary minus operator warning (#8448)

danbev committed 1 year ago

Verified f5322624

server : ensure batches are either all embed or all completion (#8420)

iamlemec committed 1 year ago

Verified c3ebcfa1

docker : fix filename for convert-hf-to-gguf.py in tools.sh (#8441)

kriation committed 1 year ago

Verified 8a4441ea

convert : remove fsep token from GPTRefactForCausalLM (#8237)

jpodivin committed 1 year ago

Verified 5aefbce2

examples : sprintf -> snprintf (#8434)

ggerganov committed 1 year ago

Verified 71c1121d

ggml : minor naming changes (#8433)

ggerganov committed 1 year ago

Verified 370b1f7e

[SYCL] fix the mul_mat_id ut issues (#8427)

ClarkChin08 committed 1 year ago

Verified b549a1bb

ggml : add NVPL BLAS support (#8329) (#8425)

nicholaiTukanov committed 1 year ago

Verified 36864569

cuda : suppress 'noreturn' warn in no_device_code (#8414)

danbev committed 1 year ago

Verified b078c619

CUDA: optimize and refactor MMQ (#8416)

JohannesGaessler committed 1 year ago

Verified 808aba39

gitignore : deprecated binaries

ggerganov committed 1 year ago

Verified a977c115

tokenize : add --no-parse-special option (#8423)

compilade committed 1 year ago

Verified 9a55ffe6

llama : use F32 precision in Qwen2 attention and no FA (#8412)

ggerganov committed 1 year ago

Verified 7a221b67

Initialize default slot sampling parameters from the global context. (#8418)

HanClinto committed 1 year ago

Verified 278d0e18

Name Migration: Build the deprecation-warning 'main' binary every time (#8404)

HanClinto committed 1 year ago

Verified dd07a123

[SYCL] Use multi_ptr to clean up deprecated warnings (#8256)

AidanBeltonS committed 1 year ago

Verified f4444d99

ggml : move sgemm sources to llamafile subfolder (#8394)

ggerganov committed 1 year ago

Verified 6b2a849d

ggml : add AArch64 optimized GEMV and GEMM Q4 kernels (#5780)

Dibakar committed 1 year ago

Verified 0f1a39f3

gguf-py rel pipeline (#8410)

monatis committed 1 year ago

Verified 83321c69

llama : C++20 compatibility for u8 strings (#8408)

iboB committed 1 year ago

Verified cc61948b

msvc : silence codecvt c++17 deprecation warnings (#8395)

iboB committed 1 year ago

Verified 7a80710d

llama : add assert about missing llama_encode() call (#8400)

fairydreaming committed 1 year ago

Verified a8be1e6f

py : fix converter for internlm2 (#8321)

RunningLeon committed 1 year ago

Verified e4dd31ff

py : fix extra space in convert_hf_to_gguf.py (#8407)

laik committed 1 year ago

Verified 8f0fad42
