ngxson/llama.cpp

Initialize default slot sampling parameters from the global context. (#8418)

HanClinto committed 1 year ago

Verified 278d0e18

Name Migration: Build the deprecation-warning 'main' binary every time (#8404)

HanClinto committed 1 year ago

Verified dd07a123

[SYCL] Use multi_ptr to clean up deprecated warnings (#8256)

AidanBeltonS committed 1 year ago

Verified f4444d99

ggml : move sgemm sources to llamafile subfolder (#8394)

ggerganov committed 1 year ago

Verified 6b2a849d

ggml : add AArch64 optimized GEMV and GEMM Q4 kernels (#5780)

Dibakar committed 1 year ago

Verified 0f1a39f3

gguf-py rel pipeline (#8410)

monatis committed 1 year ago

Verified 83321c69

llama : C++20 compatibility for u8 strings (#8408)

iboB committed 1 year ago

Verified cc61948b

msvc : silence codecvt c++17 deprecation warnings (#8395)

iboB committed 1 year ago

Verified 7a80710d

llama : add assert about missing llama_encode() call (#8400)

fairydreaming committed 1 year ago

Verified a8be1e6f

py : fix converter for internlm2 (#8321)

RunningLeon committed 1 year ago

Verified e4dd31ff

py : fix extra space in convert_hf_to_gguf.py (#8407)

laik committed 1 year ago

Verified 8f0fad42

Server: Enable setting default sampling parameters via command-line (#8402)

HanClinto committed 1 year ago

Verified a59f8fdc

Update README.md to fix broken link to docs (#8399)

andysalerno committed 1 year ago

Verified fd560fe6

Deprecation warning to assist with migration to new binary names (#8283)

HanClinto committed 1 year ago

Verified e500d613

make/cmake: LLAMA_NO_CCACHE -> GGML_NO_CCACHE (#8392)

JohannesGaessler committed 1 year ago

Verified a03e8dd9

sycl : Reenabled mmvq path for the SYCL Nvidia Backend (#8372)

Alberto Cabrera Pérez committed 1 year ago

Verified 5b0b8d8c

cmake : allow external ggml (#8370)

iboB committed 1 year ago

Verified 9925ca40

readme : fix typo [no ci] (#8389)

daghanerdonmez committed 1 year ago

Verified 9beb2dda

gguf-py : do not use internal numpy types (#7472)

compilade committed 1 year ago

Verified 7d0e23d7

flake.lock: Update (#8342)

ggerganov committed 1 year ago

Verified 7fdb6f73

labeler : updated sycl to match docs and code refactor (#8373)

Alberto Cabrera Pérez committed 1 year ago

Verified a130ecce

readme : fix web link error [no ci] (#8347)

b4b4o committed 1 year ago

Verified c4dd11d1

sycl : fix powf call in device code (#8368)

Alberto Cabrera Pérez committed 1 year ago

Verified 2ec846d5

scripts : fix sync for sycl

ggerganov committed 1 year ago

Verified 3f2d538b

ggerganov committed 1 year ago

2ee44c9a

tests : fix whitespace (#0)

ggerganov committed 1 year ago

6847d54c

feat: cuda implementation for `ggml_conv_transpose_1d` (ggml/854)

balisujohn committed 1 year ago

fde13b3b

common : preallocate sampling token data vector (#8363)

kevmo314 committed 1 year ago

Verified 470939d4

infill : assert prefix/suffix tokens + remove old space logic (#8351)

ggerganov committed 1 year ago

Verified 6f0dbf6a

common : avoid unnecessary logits fetch (#8358)

kevmo314 committed 1 year ago

Verified ffd00797
