ochafik/llama.cpp

Merge branch 'json-fixes' into json-fixes-refact

ochafik committed 2 years ago

25c756c4

Create regex-to-grammar.py

ochafik committed 2 years ago

add8fee0

Update json-schema-to-grammar.py

ochafik committed 2 years ago

140deb39

json: experimental refactoring w/ intermediate schema & rule objects

ochafik committed 2 years ago

09dd3a76

Update json-schema-to-grammar.mjs

ochafik committed 2 years ago

660e8321

json: handle pattern repetitions

ochafik committed 2 years ago

4e7c26c3

json: merge lit sequences and handle negatives

ochafik committed 2 years ago

d5ef412f

json: fix _format_literal (json.dumps already escapes quotes)

ochafik committed 2 years ago

a78eb4a0

Create ts-type-to-grammar.sh

ochafik committed 2 years ago

21ac451d

ochafik committed 2 years ago

06b04e93

Merge remote-tracking branch 'origin/master' into json-fixes

ochafik committed 2 years ago

be132472

cuda : fix data race in soft max (#5853)

slaren committed 2 years ago

Verified 67be2ce1

readme : add API changes section

ggerganov committed 2 years ago

Verified 231ae28f

llama : allow for user specified embedding pooling type (#5849)

iamlemec committed 2 years ago

Verified 475df1d6

gguf-dump : support i-quants (#5841)

Nindaleth committed 2 years ago

Verified 87c2e8b2

llama : fix llama_copy_state_data with fragmented KV cache (#5840)

compilade committed 2 years ago

Verified de9692a7

ci : schedule slow server tests only on Release or on demand (#5839)

phymbert committed 2 years ago

Verified e6029348

server : init http requests thread pool with --parallel if set (#5836)

phymbert committed 2 years ago

Verified 8ef969af

flake.lock: Update (#5842)

ggerganov committed 2 years ago

Verified fa974646

server: tests: passkey challenge / self-extend with context shift demo (#5832)

phymbert committed 2 years ago

Verified 97311342

llama : add abort_callback to interrupt computation (#5409)

Xarbirus committed 2 years ago

Verified 4a6e2d61

ggml : fix IQ3_S AVX implementation (#5834)

ggerganov committed 2 years ago

Verified 494c8703

convert : automatically fall back to HfVocab if tokenizer.model doesn't exist (#5821)

cebtenzzre committed 2 years ago

Verified 4d4d2366

convert-hf : make model class definitions self-contained (#5825)

cebtenzzre committed 2 years ago

Verified c7a0ad8e

ggml : IQ3_S improvements (#5829)

ikawrakow committed 2 years ago

Verified bbde6eb2

scripts : add pod-llama.sh

ggerganov committed 2 years ago

Verified ef2cd694

llama : refactor internal quantization functions (#5830)

ngxson committed 2 years ago

Verified 6c32d8c7

llama : fix segfault from unknown model arch name (#5820)

compilade committed 2 years ago

Verified 802da009

Support multiple GPUs (split mode) on SYCL backend (#5806)

NeoZhangJianyu committed 2 years ago

Verified 71564139

workflows : remove nocleanup arg for check-requirements.sh (#5826)

crasm committed 2 years ago

Verified 9bf297a0
