llama.cpp
SYCL: Implement few same quantized type copy kernels
#13739
Merged

qnixsynapse merged 6 commits into master from sycl/same_q_cpy
qnixsynapse marked this pull request as draft 1 year ago
github-actions added ggml
github-actions added SYCL
qnixsynapse marked this pull request as ready for review 1 year ago
Rbiessy commented on 2025-05-26
qnixsynapse marked this pull request as draft 1 year ago
qnixsynapse force pushed to 3cdc64b9 1 year ago
qnixsynapse force pushed from 3cdc64b9 to c8c22786 1 year ago
qnixsynapse marked this pull request as ready for review 1 year ago
qnixsynapse requested a review from Alcpz 1 year ago
qnixsynapse SYCL: Implement few same quantized type copy kernels
c26934dd
qnixsynapse Use memcpy for copying contiguous tensors
608e8811
qnixsynapse feat(sycl): add contiguous tensor copy support and device checks
faeb7f34
qnixsynapse refactor: replace specific block copy functions with template
b36c550d
qnixsynapse Exclude BF16 support for COPY tensors for now
b6db0056
qnixsynapse force pushed to b6db0056 1 year ago
Rbiessy commented on 2025-06-03
qnixsynapse perf: adjust SYCL copy kernel block sizes for efficiency
4aa261af
Rbiessy approved these changes on 2025-06-06
qnixsynapse merged 228f34c9 into master 1 year ago
qnixsynapse deleted the sycl/same_q_cpy branch 1 year ago
