llama_model_loader: support multiple split/shard GGUFs #6187
split: support in llama_model_loader
7c64fef9
ngxson
commented
on 2024-03-20
slaren
commented
on 2024-03-21
Avoid copying the entire vector
b8feff41
split: move llama_tensor_offset to llama_model_loader
18ff6ca8
Merge branch 'master' into hp/split/load-model
60a87ae0
llama_model_loader: PR feedbacks:
1892ae7e
avoid copying the entire vector
00381b07
Simplify this by making these optional, switch some layer creation te…
c34a5dee
Handle optional tensors
1c931f3d
llama_model_loader: fail if backend cannot allocate buffer
d8b567d2
ngxson
commented
on 2024-03-21
ngxson
commented
on 2024-03-21
fix mmap buffer management
02020b04
ngxson
commented
on 2024-03-21
llama_model_loader: map file to backend buffer if the allocation succ…
078a1aca
llama_model_loader: only map tensors included in the context
69bdee93
llama_model_loader: minor, use same variable name for consistency, fi…
6df9757a
slaren
commented
on 2024-03-21
llama_model_loader: fail if any of backend buffer cannot be allocated
f9a29735
slaren
commented
on 2024-03-21
spacing
0fd652eb
fix loop over pointer
1a179bfc
slaren
commented
on 2024-03-21
llama_model_loader: if n_tensors declared not equals to loaded tensor…
7cbe1eac
llama_model_loader: ensure mappings vector has the expected size
9940df4f
llama_model_loader: use at instead of operator[] if this should neve…
ec372c66
llama_model_loader: immediately add the backend buffer to the model b…
a9e88c6e
llama_model_loader: be sure the model mappings has enough capacity be…
b19af364
llama_model_loader: fix map -> unordered map
4c044009
phymbert
changed the title from "llama_model_loader: support multiple split GGUFs" to "llama_model_loader: support multiple split/shard GGUFs" 2 years ago
llama_split_prefix: use a clearer version, not pass split path len bu…
e474e456
llama : minor
8326607c
llama : introduce some typedef helpers
dbc35acf
ggerganov
approved these changes
on 2024-03-22
docs: add model shard in hot topic
f616b38b
ngxson
approved these changes
on 2024-03-22
slaren
commented
on 2024-03-22
llama_model_loader: put mapping in a unique_ptr from the moment it is…
1f387599
fix llama_split_prefix
764c7afe
slaren
approved these changes
on 2024-03-22
phymbert
merged
dba1af61
into master 2 years ago
phymbert
deleted the hp/split/load-model branch 2 years ago