ochafik/llama.cpp

Commits

Merge remote-tracking branch 'SlyEcho/hipblas' into skip-unused-hipblas

Olivier Chafik committed 2 years ago

cb6fd320

skip-unused: disable skipping on ROCm / when LLAMA_USE_HIPBLAS

ochafik committed 2 years ago

7ec7ef94

Skip computation of unused logits during batch prompt eval (drop other batch positions after writing their kv to cache)

ochafik committed 2 years ago

2cf4f62e

fix -nommq help for non CUDA/HIP

SlyEcho committed 2 years ago

Verified 238335f5

fix llama-bench

SlyEcho committed 2 years ago

Verified 81ecaa4b

Add Dockerfiles

SlyEcho committed 2 years ago

Verified a60231f7

ignore all build dirs

SlyEcho committed 2 years ago

Verified 058f905e

Merge 'origin/master' into hipblas

SlyEcho committed 2 years ago

Verified 7b842170

llama : escape all U+2581 in a string (#2750)

ggerganov committed 2 years ago

Verified c3e53b42

llama : fix grammar sometimes generating null char (#2756)

ejones committed 2 years ago

Verified 6e91a1b0

readme : fix link

ggerganov committed 2 years ago

Verified 44d5462b

minor : fix trailing whitespace

ggerganov committed 2 years ago

Verified c7868b07

readme : update hot topics

ggerganov committed 2 years ago

Verified 79da24b5

llm : add Falcon support (#2717)

ggerganov committed 2 years ago

Verified cf658adc

minor : fix trailing whitespace

ggerganov committed 2 years ago

Verified a192860c

examples : restore the functionality to import llama2.c models (#2685)

ochafik committed 2 years ago

Verified 95385241

fix convert-lora-to-ggml.py (#2738)

slaren committed 2 years ago

Verified 335acd2f

main : insert bos if no tokens (#2727)

klosax committed 2 years ago

Verified 5290c38e

gitignore : fix for windows (#2729)

akawrykow committed 2 years ago

Verified cc34dbda

chmod : make scripts executable (#2675)

cebtenzzre committed 2 years ago

Verified 7c2227a1

devops : RPM Specs (#2723)

jboero committed 2 years ago

Verified f19dca04

Fix values shown in the quantize tool help (#2735)

ikawrakow committed 2 years ago

Verified 8207214b

Strided perplexity (#2714)

ikawrakow committed 2 years ago

Verified 62959e74

Fix ggml to gguf conversion on Windows (#2733)

IgnacioFDM committed 2 years ago

Verified 7f7ddd50

server : allow json array in prompt or content for direct token input (#2306)

jxy committed 2 years ago

Verified b8ad1b66

docs : add grammar docs (#2701)

ejones committed 2 years ago

Verified f5fe98d1

Improve handling of special tokens in GGML to GGUF converter (#2725)

KerfuffleV2 committed 2 years ago

Verified 777f42ba

llama : fix whitespace escaping in tokenizer (#2724)

goerch committed 2 years ago

Verified 46ef5b5f

CUDA: use mul_mat_q kernels by default (#2683)

JohannesGaessler committed 2 years ago

Verified c63bb1d1

convert.py : clarifying error message (#2718)

apetenchea committed 2 years ago

Verified 3b6cfe7c
