llama.cpp
Clblast fixes + enhancements to save VRAM and offload more layers
#1675

Merged

0cc4m merged 12 commits into ggml-org:master from LostRuins:concedo-opencl-dev

Use events instead of clFinish, where possible

ebc5d065

OpenCL: Don't load gpu layers into RAM, add mul_f32 kernel

97c5cca4

Reduce queueing overhead for contiguous tensors by using single mul k…

ac6b49ed

Merge remote-tracking branch 'origin/master' into opencl-dev

49aaf083

Adapt to #1612 cl_mem malloc changes

5e1eecfe

Reduce code duplication between cuda and opencl branches

457aaf5b

Improve implementation

24239f0d

Clblast fixes + enhancements to save VRAM:

59fe1687

github-actions commented on 2023-06-02

Merge branch 'master' into concedo-opencl-dev

2b700749

change max value size_t to use limits

64e3e745

github-actions commented on 2023-06-04

0cc4m commented on 2023-06-04

removed flags from the CL pool malloc, apply code tidying suggestions.

f6431ded

LostRuins requested a review from 0cc4m 3 years ago

0cc4m requested changes on 2023-06-06

Update ggml-opencl.cpp

b6dd367b

0cc4m requested a review from 0cc4m 3 years ago

0cc4m approved these changes on 2023-06-06

0cc4m merged d5b111f5 into master 3 years ago

LostRuins deleted the concedo-opencl-dev branch 3 years ago

Reviewers

0cc4m

github-actions
