huggingface/text-generation-inference
Commits
feat-backend-llamacpp
20250708-ci-fixes
add_L4
add_api_key
add_batch_dimension
add_chunked_atn
add_chunked_attn
add_deepseekv3
add_gptq_docs
add_integration_test
add_readme_dashboard
add_tunable_prefill
add_vlm_chunking
add-chat-response-format
add-google-cloud-provider
add-quickstart-script
add-rotary-embed-tests
add-small-ttft-script
add-test-for-warmup-and-kvcache
adding_docs
adjust-mllama-test-output
adjust-where-request-max-tokens-is-defaulted
aiter_kernels
amd-ci-fx
auto_length
automodel-supports-flash-paged-attention
avoid-cuda-graph-during-warmup-if-oom
avoid-zero-seed
backends/trtllm
backends/trtllm-executor
baichuan2-13b
bnb4
bugfix/add_tools_prompt
bugfix/moe-kernels-imports
bugfix/phi-exl2
bump-client-0.6.2
bump-kernel-versions
bump-poetry-and-requirements
chunked_attn_l4
ci_amd
ci_amd2
ci_amd3
ci_amd4
ci2
ci-amihalik-update-chat-completion-messages
ci-new-cluster
ci-patch
ci-run-openai-function-calling-compatible-support
ci-update_xpu_image
ci-xpu
ci-xpu2
close_dl_thread
compat_logger
compile-grammar-in-router
cuda_ipc_allreduce
debug/gemma2
debug-gpt2
debug-request-id
debug-torch-23
debugging-timeouts
deploy/aml
dev
development-guide
dummy
enable_non_divisible_embeddings
enable-non-grammar-constrained-tools
enable-qwen2vl-video
enable-transformers-vlm
exl2
experiment/moe
explore-static-triton-kernels
explore-t4-gemma-issues
feat/add-load-test
feat/attention_sinks
feat/backend_abstraction
feat/backend_feature
feat/better_tokens
feat/cuda_12
feat/flash_decoding
feat/improve_max_tokens
feat/max_queue_size
feat/page_re_alloc
feat/parse_logs
feat/support_deepspeed
feat-backend-llamacpp
feature/machete
feature/moe-kernels
feature/no_repeat_ngram_size_ci
feature/no_repeat_ngram_size
feature/phi-3-small
feature/prefix
feature/radix-prefix-cache
feature/radix-prefix-cache-bench
feature/vlm-prefix-caching
fix/allow-top-p-0
fix/avoid_record_streams
fix_default_arg
fix_exl2
fix_fp8_llama3.2
fix_leak
fix_mistral2
fix_neox_rotary_emb
fix/op-trace-id
fix/parse-mamba-config
fix_phi3
fix-cudagraph-bug
fix-gemma-tokenization
fix-grammar-cleanup-bug
fix-grammar-fsm-batching
fix-mixtral-adapter-loading
fix-release-tests
fix-repack-for-marlin
fix-tool-call-def
fix-tp
fix-version-install
flashinfer
flashinfer-0.2.5
fp8_kvcache
fp8_rocm
gaudi_llama4_tmp
git_v2.1.0
git_v2.1.1
git_v2.2.0
git_v2.3.0
git_v2.3.1
git_v2.4.0
git_v2.4.1
git_v3.0.0
git_v3.0.1
git_v3.0.2
git_v3.1.0
git_v3.2.2
git_v3.2.3
git_v3.3.3
git_v3.3.4
git_v3.3.5
git_2.0.4
git_3.1.1
git_3.2.0
git_3.2.1
git_3.3.0
git_3.3.1
git_3.3.2
improve_defaults
improve_launcher_defaults
improve-docs
improve-dynamic-message-content
improve-json-schema-field
improve-tool-call-and-response-ids
inlcude-latest-release-on-commit-builds-tags
ipex-moe
kvrouter
kvrouter-endpoints
llama-fused-compiled-mlp
main
maintenance/docker-network
maintenance/merge-vlm-input-prep
mamba2
martinigoyanes-fix-frequency-penalty
medusa
megatron
message-more-info
mi300-temp
mllama
model_compat_log
more_logs
multi-lora
new_minor_version
nix/cargo-clippy
nix/docker2
nix_integration_tests
nix/pytorch-2.5.1
nix_test2
no_root_user
no_root_user2
op-compilation-benchmarking
origin/slind_window_fix
osanseviero-patch-1
pip-installable
pr-1869-ci-run
pr-2076-ci-run
pr-2290-ci-runner
pr-2366-ci-branch
pr-2444-ci-branch
pr-2517-ci-branch
pr-2711-ci-branch
pr-2784-ci-branch
pr-2840-ci-branch
pr-2954-ci-branch
pr-3002-ci-branch
pr-3004-ci-branch
pr-3018-ci-branch
precompile-kernels-workflow
prefix_chunk
prefix_default
proxy_sse_engine_state
quantization
quantization-0.1
refactor-lora-linear
release-3.2.4
remove_post_load_weights
response-header-metrics
revert
rocm_6.2_fixes
rocm-ci-build
router-grammar-compile
s3-cache
self-generating-docs
set-num-blocks
simpler_exllama
skip-mistral-test
speculative
streaming_conceptual
support-granite-vision
support-logit-bias-in-chat
support-phi3-small
support-phi-model
support-pre-compile-kernels
temp_work
test_docs
test_rocm
test-batch-speedup-amount
tmp_invariants
tmp_medusa
tmp_torch_compile
transformers-ci
triton_fix
trtllm/executor_stats
trtllm-stop-words
tuna
update_docs2
update_internal_version
update_peft
update_readme
update-flake-deps-and-logit-processor
update-jsonschema
upgrade_mlp_speculator
upgrade-outlines
use_g6
use_updated_kernels
vllm/setup
zstd
misc: use return Ok(())
mfuntowicz
committed
1 year ago
Verified
182ffaf0
feat(backend): use c++ defined types for llama.cpp
mfuntowicz
committed
1 year ago
e0dda9b6
feat(backend): better map exception throw on C++ side
mfuntowicz
committed
1 year ago
c9f6c3a8
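The exception-mapping commit above is about keeping C++ exceptions from unwinding across the FFI boundary into Rust frames. A minimal sketch of that general pattern follows, assuming illustrative names (`backend_status`, `guarded`) that are not the repository's actual types:

```cpp
// Sketch, assumption: backend_status and guarded are illustrative, not repo types.
// Catch C++ exceptions at the boundary and map them to a status code plus message
// instead of letting them propagate into foreign (Rust) stack frames.
#include <exception>
#include <stdexcept>
#include <string>

enum backend_status { BACKEND_OK = 0, BACKEND_INVALID_ARGUMENT = 1, BACKEND_INTERNAL_ERROR = 2 };

template <typename F>
backend_status guarded(F &&fn, std::string &error_out) noexcept {
    try {
        fn();
        return BACKEND_OK;
    } catch (const std::invalid_argument &e) {
        error_out = e.what();              // caller-supplied bad parameters
        return BACKEND_INVALID_ARGUMENT;
    } catch (const std::exception &e) {
        error_out = e.what();              // any other library failure
        return BACKEND_INTERNAL_ERROR;
    } catch (...) {
        error_out = "unknown C++ exception";
        return BACKEND_INTERNAL_ERROR;
    }
}
```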
feat(backend): add mimalloc memory allocator to the container
mfuntowicz
committed
1 year ago
db41776a
feat(backend): correctly link to all libraries
mfuntowicz
committed
1 year ago
f5c4cee3
feat: Fix Cmakelist to allow building on Darwin platform (#2785)
Hugoch
committed
1 year ago
Verified
59b0ef30
feat(backend): use new batch API to generate tokens
mfuntowicz
committed
1 year ago
b10eaab9
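For context on the batch-API commit above, here is a hedged sketch of feeding a prompt through llama.cpp's `llama_batch` interface. The function names come from llama.cpp's public header around the versions referenced in this branch; `ctx` and `prompt_tokens` are placeholders, and this is not the backend's actual code:

```cpp
// Sketch: build one llama_batch for the whole prompt and decode it in a single call.
#include <vector>
#include <llama.h>

int decode_prompt(llama_context *ctx, const std::vector<llama_token> &prompt_tokens) {
    const int32_t n = static_cast<int32_t>(prompt_tokens.size());
    // One sequence, token IDs only (no embeddings).
    llama_batch batch = llama_batch_init(n, /*embd=*/0, /*n_seq_max=*/1);

    for (int32_t i = 0; i < n; ++i) {
        batch.token[i]     = prompt_tokens[i];
        batch.pos[i]       = i;
        batch.n_seq_id[i]  = 1;
        batch.seq_id[i][0] = 0;
        batch.logits[i]    = (i + 1 == n) ? 1 : 0;  // only the last position needs logits
    }
    batch.n_tokens = n;

    const int rc = llama_decode(ctx, batch);  // 0 on success
    llama_batch_free(batch);
    return rc;
}
```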
feat(backend): create llama_context_params with default factory
mfuntowicz
committed
1 year ago
dc6435e3
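A hedged sketch of the default-factory pattern named in the commit above: start from llama.cpp's `llama_context_default_params()` and override only selected fields. The numeric values are placeholders, not the backend's settings:

```cpp
#include <llama.h>

llama_context *make_context(llama_model *model, int32_t n_threads, uint32_t n_ubatch) {
    llama_context_params params = llama_context_default_params();  // default factory
    params.n_ctx           = 4096;       // placeholder context window
    params.n_batch         = 1024;       // logical batch size
    params.n_ubatch        = n_ubatch;   // physical micro-batch size
    params.n_threads       = n_threads;  // generation threads
    params.n_threads_batch = n_threads;  // prompt-processing threads
    return llama_new_context_with_model(model, params);
}
```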
feat(backend): update llama.cpp to 4215
mfuntowicz
committed
1 year ago
b1ebc8f7
misc(offline): update model creation as std::shared_ptr
mfuntowicz
committed
1 year ago
6c5a75b5
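A hedged sketch of the `std::shared_ptr` ownership mentioned above, with llama.cpp's free function installed as the deleter (illustrative only; the repository's wrapper is not reproduced here):

```cpp
#include <memory>
#include <llama.h>

std::shared_ptr<llama_model> load_shared_model(const char *gguf_path) {
    llama_model_params params = llama_model_default_params();
    llama_model *raw = llama_load_model_from_file(gguf_path, params);
    if (raw == nullptr) {
        return nullptr;  // loading failed; caller decides how to surface the error
    }
    // llama_free_model runs once the last shared owner releases its reference.
    return std::shared_ptr<llama_model>(raw, llama_free_model);
}
```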
feat(backend): add missing temperature parameter
mfuntowicz
committed
1 year ago
9d659f1e
feat(backend): add guard in case top_k = 0
mfuntowicz
committed
1 year ago
df72c56b
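The top_k guard referenced above matters because a top-k of 0 should mean "no cut-off" rather than an empty candidate set. A hedged sketch against llama.cpp's sampler-chain API (not the backend's exact sampling code; `make_sampler` is an illustrative name):

```cpp
#include <llama.h>

llama_sampler *make_sampler(const llama_model *model, int32_t top_k, float temperature, uint32_t seed) {
    if (top_k <= 0) {
        top_k = llama_n_vocab(model);  // guard: treat 0 as "keep the whole vocabulary"
    }
    llama_sampler *chain = llama_sampler_chain_init(llama_sampler_chain_default_params());
    llama_sampler_chain_add(chain, llama_sampler_init_top_k(top_k));
    llama_sampler_chain_add(chain, llama_sampler_init_temp(temperature));
    llama_sampler_chain_add(chain, llama_sampler_init_dist(seed));
    return chain;
}
```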
feat(backend): add some test to the backend for core allocation
mfuntowicz
committed
1 year ago
929a2fc7
feat(backend): fix the size of the generated core allocation when num_cores_per_instance equals zero
mfuntowicz
committed
1 year ago
298367cd
feat(backend): use the new batch api from llama
mfuntowicz
committed
1 year ago
8e897935
feat(backend): remove core overriding in the Rust backend
mfuntowicz
committed
1 year ago
274cfce4
Update Dockerfile.llamacpp as per review
mfuntowicz
committed
1 year ago
Verified
d918e6a1
Update Dockerfile.llamacpp as per review
mfuntowicz
committed
1 year ago
Verified
bbe95ca9
chore: remove unrelated change to trtllm
mfuntowicz
committed
1 year ago
9025a26c
misc(doc): rust documentation
mfuntowicz
committed
1 year ago
862a519f
misc(doc): c++ documentation
mfuntowicz
committed
1 year ago
b9c04b9c
misc(license): update LICENSE
mfuntowicz
committed
1 year ago
4ee2ee58
misc(backend): allow rebinding numa core affinity
mfuntowicz
committed
1 year ago
2d9465d1
misc(docker): add numa lib as dependency
mfuntowicz
committed
1 year ago
30ae9963
feat(backend): rely on multi-consumer queue to schedule workers
mfuntowicz
committed
1 year ago
5a856616
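As a reference for the multi-consumer queue mentioned above, here is a minimal sketch of a mutex/condition-variable work queue that several worker threads can pop from; it is illustrative, not the backend's actual queue implementation:

```cpp
#include <condition_variable>
#include <mutex>
#include <optional>
#include <queue>
#include <utility>

template <typename T>
class mpmc_queue {
public:
    void push(T item) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(item));
        }
        cv_.notify_one();  // wake one waiting worker
    }

    // Blocks until an item is available; returns std::nullopt once the queue is
    // shut down and drained.
    std::optional<T> pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [&] { return stopped_ || !items_.empty(); });
        if (items_.empty()) return std::nullopt;
        T item = std::move(items_.front());
        items_.pop();
        return item;
    }

    void shutdown() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopped_ = true;
        }
        cv_.notify_all();
    }

private:
    std::queue<T> items_;
    std::mutex mutex_;
    std::condition_variable cv_;
    bool stopped_ = false;
};
```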
feat(backend): correctly setup llama_context providing n_threads and n_ubatch
mfuntowicz
committed
1 year ago
84eead21
feat(backend): bind thread and memory affinity for thread
mfuntowicz
committed
1 year ago
50c37661
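For the thread and memory affinity commit above (and the libnuma dependency added earlier in this branch), here is a hedged sketch of binding the calling thread to one core and preferring memory from that core's NUMA node; `bind_current_thread` is an illustrative name, not the backend's function:

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // pthread_setaffinity_np and CPU_SET are GNU extensions
#endif
#include <numa.h>      // libnuma: per-node memory policy
#include <pthread.h>
#include <sched.h>

bool bind_current_thread(int cpu) {
    cpu_set_t cpuset;
    CPU_ZERO(&cpuset);
    CPU_SET(cpu, &cpuset);

    // Restrict the calling thread to the chosen core.
    if (pthread_setaffinity_np(pthread_self(), sizeof(cpuset), &cpuset) != 0) {
        return false;
    }

    // Prefer allocations from the NUMA node that owns that core, if NUMA is available.
    if (numa_available() != -1) {
        const int node = numa_node_of_cpu(cpu);
        if (node >= 0) {
            numa_set_preferred(node);
        }
    }
    return true;
}
```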
feat(backend): multistream inference on CPU
mfuntowicz
committed
1 year ago
5335bf97
misc(build): improve build process
mfuntowicz
committed
1 year ago
23d2bcf2
feat(backend): update llamacpp to 4077
mfuntowicz
committed
1 year ago
70c90ad9