llama : refactor graph build code #3837
llama : factor out ggml-alloc from graph build functions
8b2420d2
metal : disable kernel load log
5946d98f
llama : factor out tensor offloading outside the build call (wip)
38aca9e1
llama : offload rest of the models
83d2c437
llama : update offload log messages to print node index
3af87713
llama : comments
51c4f9ee
slaren commented on 2023-10-29
llama : support offloading result_norm + comments
4e98897e
llama : factor graph input into a function
0dc05b84
llama : do tensor offload only with CUDA
e14aa461
llama : fix res_norm offloading
79617902
llama : try to optimize offloading code
b4ad03b3
ggerganov force pushed from 66a54bfe to b4ad03b3 2 years ago
llama : fix non-CUDA build
25cfbf67
llama : try to fix build
739b85c9
llama : move refact in correct place + optimize graph input
da936188
llama : refactor tensor offloading as callback
1e9c5443
llama : add layer index to all tensor names
8925cf9e
llama : add functional header
76108793
llama : comment
79ad7344
llama : remove obsolete map for layer counting
210e6e5d
llama : add llm_build helper functions (#3848)
5baefef4
ggerganov marked this pull request as ready for review 2 years ago
Merge branch 'master' into llama-refactor
afb39292
ggerganov merged 71e3718a into master 2 years ago