ochafik/llama.cpp

generate more tokens in test_completion_with_required_tool_tiny_fast to avoid truncation

ochafik committed 1 year ago

5fb51053

Merge branch 'date' of github.com:ochafik/llama.cpp into date

ochafik committed 1 year ago

da8e9e23

Merge remote-tracking branch 'origin/master' into date

ochafik committed 1 year ago

e7003aa7

Update common/chat.cpp

ochafik committed 1 year ago

Verified d200b7c9

CUDA: faster Deepseek FA, add Turing support (#13435)

JohannesGaessler committed 1 year ago

Verified 6da34fa2

fix: Move build_inp_pos to the top of the graph section for build_granite (#13538)

gabe-l-hart committed 1 year ago

Verified 5e7d95e2

server : passthrough the /models endpoint during loading (#13535)

ggerganov committed 1 year ago

Verified 05317443

server : fix cache_tokens bug with no cache_prompt (#13533)

ngxson committed 1 year ago

Verified 360a9c98

cmake: simplify vulkan shader test logic (#13263)

mtmcp committed 1 year ago

Verified 09d13d94

vulkan: KHR_coopmat flash attention (#13506)

jeffbolznv committed 1 year ago

Verified 24e86cae

webui : use fflate for more deterministic gzip compress (#13525)

ngxson committed 1 year ago

Verified bb1681fb

webui: Allow pasting file from clipboard (#13526)

luca020400 committed 1 year ago

Verified d486dd3e

docs: Update link to ggml-org in multimodal.md (#13513)

ddpasa committed 1 year ago

Verified 21ca987f

scripts : fix compare-llama-bench.py show parameter (#13514)

CISC committed 1 year ago

Verified be1d4a13

vulkan: workaround FA compile failures on macos (#13517)

jeffbolznv committed 1 year ago

Verified ab3971f2

quantize : improve tensor-type pattern matching (#13033)

EAddario committed 1 year ago

Verified e5c834f7

clip : clip.h become private API (⚠️ breaking change) (#13510)

ngxson committed 1 year ago

Verified 71bdbdb5

metal : use FA-vec kernel up to batch size 20 (#13496)

ggerganov committed 1 year ago

Verified f0995d28

metal : optimize multi-sequence FA vec kernel (#13493)

ggerganov committed 1 year ago

Verified c252e0c4

ggml-cpu: Update KleidiAI to v1.6 and fix include directives (#13509)

eddnjjn committed 1 year ago

Verified 4f711afe

batched-bench : fix pp batch contents (#13492)

ggerganov committed 1 year ago

Verified b89d605a

mtmd : remove libllava, remove clip-quantize-cli (⚠️ breaking change) (#13460)

ngxson committed 1 year ago

Verified b4726345

scripts : support arbitrary input file formats in compare-llama-bench.py (#13455)

CISC committed 1 year ago

Verified bf793711

model : Granite MoE shared (#13269)

gabe-l-hart committed 1 year ago

Verified d590cd4c

ggerganov committed 1 year ago

1e2809bc

llama-bench : add defrag-thold, check for invalid ranges (#13487)

slaren committed 1 year ago

Verified cf0a43bb

opencl: remove unnecessary assert for `add` (#13257)

lhez committed 1 year ago

Verified f0d46ef1

clip : cap max image size 1024 for qwen vl model (#13478)

ngxson committed 1 year ago

Verified de4c07f9

llama/ggml: add LLM training support (#10544)

JohannesGaessler committed 1 year ago

Verified 10d2af0e

context : fix state io for memory-less contexts (#13470)

ggerganov committed 1 year ago

Verified 064cc596
