PR #15906 metal : make the backend async v2 - SemanticDiff

Commits

metal : make the backend async

ggerganov committed 141 days ago
cont : add comments, extend op offload, clean up

ggerganov committed 141 days ago
metal : fix batch size for MUL_MAT_ID

ggerganov committed 141 days ago
metal : remove deprecated ggml_backend_metal_buffer_from_ptr

ggerganov committed 141 days ago
metal : create only metal buffers, no wrapping of host memory

ggerganov committed 140 days ago
metal : restore .alloc_buffer for buffer_from_ptr_type

ggerganov committed 140 days ago
metal : remove broken implementation of GGML_OP_SET

ggerganov committed 140 days ago
metal : clean-up loose ends, ready for tests

ggerganov committed 140 days ago
metal : support both private and shared buffers

ggerganov committed 140 days ago
metal : enable private buffers + add global device queue

ggerganov committed 140 days ago
metal : disable host buffer to prevent races

ggerganov committed 140 days ago
metal : avoid extra copy during set_tensor

ggerganov committed 140 days ago
metal : use separate buffer types for shared and private Metal buffers

ggerganov committed 140 days ago
metal : simplify synchronization logic

ggerganov committed 140 days ago
metal : fix build

ggerganov committed 139 days ago
metal : do not implement cpy_tensor

ggerganov committed 139 days ago
metal : separate implementations for shared and private buffers

ggerganov committed 139 days ago
