ggerganov/llama.cpp

Vaibhavs10 committed 360 days ago

99619529

Vaibhavs10 committed 363 days ago

97c64a09

Update the graph.

Vaibhavs10 committed 1 year ago

6201b438

fix errors in conversion.

Vaibhavs10 committed 1 year ago

02ff0850

Model -> ModelBase.

Vaibhavs10 committed 1 year ago

32ea9c5f

Init - first pass.

Vaibhavs10 committed 1 year ago

024bd294

common : suggest --jinja when autodetection fails (#14222)

CISC committed 1 year ago

Verified e434e691

server : fix incorrect usage of llama_get_embeddings() (#14225)

ggerganov committed 1 year ago

Verified 89fea80d

llama : add thread safety test (#14035)

slaren committed 1 year ago

Verified 6adc3c3e

cmake: clean up external project logic for vulkan-shaders-gen (#14179)

mtmcp committed 1 year ago

Verified 0dbcabde

model : add NeoBERT (#14164)

huydt84 committed 1 year ago

Verified ad590be9

HIP: disable rocwmma on gfx12 by default until rocm 7.0 (#14202)

IMbackK committed 1 year ago

Verified 7d6d91ba

llama : rework embeddings logic (#14208)

ggerganov committed 1 year ago

Verified d3e64b9f

ggml: Add Android support for GGML_CPU_ALL_VARIANTS (#14206)

chaxu01 committed 1 year ago

Verified 3ba0d843

convert : remove arcee change in convert_hf_to_gguf_update.py (#14207)

bartowski1182 committed 1 year ago

Verified 0bf49eb6

gguf-py : allow key override when adding value to GGUFWriter (#14194)

huydt84 committed 1 year ago

Verified 4ad24367

vulkan: mutex around vkQueueSubmit (#14127)

jeffbolznv committed 1 year ago

Verified c89c2d1a

ggml-cpu : rework weak alias on apple targets (#14146)

xctan committed 1 year ago

Verified 3555b300

model : Add support for Arcee AI's upcoming AFM model (#14185)

bartowski1182 committed 1 year ago

Verified d7da8dc8

server : When listening on a unix domain socket don't print http:// and port (#14180)

ericcurtin committed 1 year ago

Verified cd355eda

quantize : change int to unsigned int for KV overrides (#14197)

EAddario committed 1 year ago

Verified 30e5b01d

CUDA/HIP: fix ssm_scan on devices where warp size is not 32 (#14196)

IMbackK committed 1 year ago

Verified e54b3940

HIP: Replace usage of depricated preprocessor macro __AMDGCN_WAVEFRONT_SIZE__ (#14183)

IMbackK committed 1 year ago

Verified 2c2caa44

kv-cache : fix use-after-move of defrag info (#14189)

ggerganov committed 1 year ago

Verified 5fce5f94

model : add dots.llm1 architecture support (#14044) (#14118)

Noeda committed 1 year ago

Verified 9ae4143b

cparams : rename LLAMA_MAX_PARALLEL_SEQUENCES to LLAMA_MAX_SEQ (#14188)

ggerganov committed 1 year ago

Verified c311ac66

batch : auto-gen positions + verify multi-sequence input (#14177)

ggerganov committed 1 year ago

Verified b9912ac5

docs : remove WIP since PR has been merged (#13912)

pepijndevos committed 1 year ago

Verified 00ba7726

llama-chat : Do not throw when tool parsing fails (#14012)

Piotr committed 1 year ago

Verified 3cb203c8

compare-llama-bench: add option to plot (#14169)

am17an committed 1 year ago

Verified 2e42be42
