llama.cpp
metal : make the backend async v2 (#15906, merged)
Commits (17)
- metal : make the backend async (ggerganov, committed 141 days ago)
- cont : add comments, extend op offload, clean up (ggerganov, committed 141 days ago)
- metal : fix batch size for MUL_MAT_ID (ggerganov, committed 141 days ago)
- metal : remove deprecated ggml_backend_metal_buffer_from_ptr (ggerganov, committed 141 days ago)
- metal : create only Metal buffers, no wrapping of host memory (ggerganov, committed 140 days ago)
- metal : restore .alloc_buffer for buffer_from_ptr_type (ggerganov, committed 140 days ago)
- metal : remove broken implementation of GGML_OP_SET (ggerganov, committed 140 days ago)
- metal : clean up loose ends, ready for tests (ggerganov, committed 140 days ago)
- metal : support both private and shared buffers (ggerganov, committed 140 days ago)
- metal : enable private buffers + add global device queue (ggerganov, committed 140 days ago)
- metal : disable host buffer to prevent races (ggerganov, committed 140 days ago)
- metal : avoid extra copy during set_tensor (ggerganov, committed 140 days ago)
- metal : use separate buffer types for shared and private Metal buffers (ggerganov, committed 140 days ago)
- metal : simplify synchronization logic (ggerganov, committed 140 days ago)
- metal : fix build (ggerganov, committed 139 days ago)
- metal : do not implement cpy_tensor (ggerganov, committed 139 days ago)
- metal : separate implementations for shared and private buffers (ggerganov, committed 139 days ago)