ochafik/llama.cpp

Commits

Update utils.hpp

ochafik committed 1 year ago

333767de

Update test_chat_completion.py

ochafik committed 1 year ago

f8b3f8d4

Update test_chat_completion.py

ochafik committed 1 year ago

1159022d

Better compat w/ OAI json_schema response type

ochafik committed 1 year ago

7ad6687d

vulkan: im2col and matmul optimizations for stable diffusion (#10942)

jeffbolznv committed 1 year ago

Verified a813badb

vulkan: Use push constant offset to handle misaligned descriptors (#10987)

jeffbolznv committed 1 year ago

Verified fdd21889

server: added more docs for response_fields field (#10995)

isaac-mcfadyen committed 1 year ago

Verified f865ea14

server : fix token duplication when streaming with stop strings (#10997)

z80maniac committed 1 year ago

Verified 16cdce7b

vulkan: multi-row k quants (#10846)

netrunnereve committed 1 year ago

Verified d79d8f39

examples, ggml : fix GCC compiler warnings (#10983)

peter277 committed 1 year ago

Verified d283d02b

server : add support for "encoding_format": "base64" to the */embeddings endpoints (#10967)

elk-cloner committed 1 year ago

Verified 9ba399df

ggml : more perfo with llamafile tinyblas on x86_64 (#10714)

Djip007 committed 1 year ago

Verified 2cd43f49

server: allow filtering llama server response fields (#10940)

nvrxq committed 1 year ago

Verified 09fe2e76

llama : the WPM vocabs use the CLS token as BOS (#10930)

ggerganov committed 1 year ago

Verified 30caac3a

ggml : use wstring for backend search paths (#10960)

slaren committed 1 year ago

Verified 60cfa728

ggml : fix arm enabled features check (#10961)

slaren committed 1 year ago

Verified 3327bb0f

ggml : fix const usage in SSE path (#10962)

slaren committed 1 year ago

Verified 32d6ee63

server : fix missing model id in /model endpoint (#10957)

ngxson committed 1 year ago

Verified 14b699ec

server : add system_fingerprint to chat/completion (#10917)

ngxson committed 1 year ago

Verified 485dc012

rpc-server : add support for the SYCL backend (#10934)

rgerganov committed 1 year ago

Verified 86bf31cf

llama : support InfiniAI Megrez 3b (#10893)

dixyes committed 1 year ago

Verified b92a14a8

llama : support for Llama-3_1-Nemotron-51B (#10669)

ymcki committed 1 year ago

Verified 6f0c9e03

llama-run : include temperature option (#10899)

ericcurtin committed 1 year ago

Verified dab76c92

ggml : fix run-time on FreeBSD in get_executable_path() (#10948)

yurivict committed 1 year ago

Verified 7024d59e

devops : add docker-multi-stage builds (#10832)

rudiservo committed 1 year ago

Verified 7c0e2858

llama : add Falcon3 support (#10883)

mokeddembillel committed 1 year ago

Verified 7ae33a61

vulkan: build fixes for 32b (#10927)

jeffbolznv committed 1 year ago

Verified ebdee947

convert : add BertForMaskedLM (#10919)

ggerganov committed 1 year ago

Verified 5cd85b5e

vulkan: optimize coopmat2 dequant functions (#10855)

jeffbolznv committed 1 year ago

Verified a91a4136

ggml-cpu: replace NEON asm with intrinsics in ggml_gemv_q4_0_4x8_q8_0() (#10874)

angt committed 1 year ago

Verified e34c5af4
