PR #1459 OpenCL dequant_mul_mat

SlyEcho merged 18 commits into ggml-org:master from 0cc4m:opencl-dev

github-actions commented on 2023-05-14

JohannesGaessler commented on 2023-05-14

github-actions commented on 2023-05-16

JohannesGaessler commented on 2023-05-16

0cc4m marked this pull request as ready for review 2 years ago

github-actions commented on 2023-05-17

github-actions commented on 2023-05-18

github-actions commented on 2023-05-19

Move back to C++ for OpenCL

a7e3bee4

Refactor OpenCL code to work more like the CUDA code, add missing fun…

17e53dbb

Fix bugs in dequant_mul_mat code

5f610c90

Fix dequant_mul_mat kernel

8c7a7cea

Add remaining dequant_mul_mat functions

cb588e2a

Fix CMakeLists.txt

19683803

Generate dequant_mul_mat kernels from simple templates

915d0d11

Fix error in convert f16 to f32 kernel call

cda2d488

Fix tensor load to device

42e1a2ba

Deduplicate dequant kernels

457eff92

Fix convert_row_f16 kernel issue

e41a7ae4

Add OpenCL compile options

a1657d02

Use compile args for preprocessing constants

b6b39960

0cc4m force pushed from fb638fa8 to b6b39960 2 years ago

github-actions commented on 2023-05-21

Explicitely set GEMM type

18e9dd87

Only copy f16/f32 buffer if not already on GPU

4a559514

github-actions commented on 2023-05-22

SlyEcho requested changes on 2023-05-21

change to fprintf

e1ee2810

SlyEcho approved these changes on 2023-05-22

Restore default platform + device selection by id behavior

4dfd4fe1

github-actions commented on 2023-05-22

SlyEcho requested changes on 2023-05-22

Small compiler warning fixes

cb28080a

SlyEcho approved these changes on 2023-05-22

SlyEcho merged 2e6cd4b0 into master 2 years ago

0cc4m deleted the opencl-dev branch 2 years ago

Reviewers

SlyEcho

github-actions

JohannesGaessler
