llama.cpp
cd1fce6d - SYCL: Add set_rows support for quantized types (#14883)

Commit

43 days ago

SYCL: Add set_rows support for quantized types (#14883)

* SYCL: Add set_rows support for quantized types

This commit adds support for GGML_OP_SET_ROWS operation for various quantized tensor types (Q8_0, Q5_1, Q5_0, Q4_1, Q4_0, IQ4_NL) and BF16 type in the SYCL backend.

The quantization/dequantization copy kernels were moved from cpy.cpp to cpy.hpp to make them available for set_rows.cpp.

This addresses part of the TODOs mentioned in the code.

* Use get_global_linear_id() instead

ggml-ci

* Fix formatting

ggml-ci

* Use const for ne11 and size_t variables in set_rows_sycl_q

ggml-ci

* Increase block size for q kernel to 256

ggml-ci

* Cleanup imports

* Add float.h to cpy.hpp
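As a rough illustration of what a set_rows operation with a quantized destination involves, here is a minimal CPU-only C++ sketch for a Q8_0-style layout. It is not the SYCL kernel from this commit. The names BlockQ8_0Sketch, quantize_block_q8_0 and set_rows_q8_0_sketch are hypothetical, the per-block scale is kept as a float for simplicity (ggml stores it as fp16), and the row indices are assumed to be 64-bit integers.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Block size used by Q8_0: 32 values share one scale.
constexpr int QK8_0 = 32;

// Hypothetical stand-in for ggml's block_q8_0 (which stores d as fp16).
struct BlockQ8_0Sketch {
    float  d;          // per-block scale (assumption: float instead of fp16)
    int8_t qs[QK8_0];  // quantized values in [-127, 127]
};

// Quantize one block of 32 floats: scale = max|x| / 127, then round to int8.
inline BlockQ8_0Sketch quantize_block_q8_0(const float * x) {
    float amax = 0.0f;
    for (int i = 0; i < QK8_0; ++i) {
        amax = std::fmax(amax, std::fabs(x[i]));
    }
    BlockQ8_0Sketch b{};
    b.d = amax / 127.0f;
    const float id = b.d != 0.0f ? 1.0f / b.d : 0.0f;  // guard all-zero blocks
    for (int i = 0; i < QK8_0; ++i) {
        b.qs[i] = static_cast<int8_t>(std::lround(x[i] * id));
    }
    return b;
}

// set_rows sketch: source row i is quantized and written to dst row row_ids[i].
// src is row-major float data with n_cols columns; dst holds quantized rows.
inline void set_rows_q8_0_sketch(const std::vector<float> &     src,
                                 const std::vector<int64_t> &   row_ids,
                                 size_t                         n_cols,
                                 std::vector<BlockQ8_0Sketch> & dst) {
    assert(n_cols % QK8_0 == 0);
    assert(src.size() == row_ids.size() * n_cols);
    const size_t blocks_per_row = n_cols / QK8_0;
    for (size_t i = 0; i < row_ids.size(); ++i) {
        const size_t dst_row = static_cast<size_t>(row_ids[i]);
        assert((dst_row + 1) * blocks_per_row <= dst.size());
        for (size_t b = 0; b < blocks_per_row; ++b) {
            dst[dst_row * blocks_per_row + b] =
                quantize_block_q8_0(&src[i * n_cols + b * QK8_0]);
        }
    }
}
```

In a GPU backend, each work-item would typically handle one block, deriving its source row and block offset from a flat global index. The loops above only spell out that mapping sequentially.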

References

#14883 - SYCL: Add set_rows support for quantized types

Author

qnixsynapse

Parents

00fa15fe
