PR #4456 Nomic Vulkan backend

Nomic vulkan backend licensed under the Software for Open Models License (SOM), version 1.0.

cebtenzzre committed 2 years ago

Remove warning which fails on Windows.

cebtenzzre committed 2 years ago

remove dynamic deps from kompute build

cebtenzzre committed 2 years ago

Switch to a dynamic dispatch table instead of linking hard against libvulkan.

cebtenzzre committed 2 years ago

Completely revamp how we do object management with the vulkan backend and

cebtenzzre committed 2 years ago

Make kompute actually include external SDK headers when requested

cebtenzzre committed 2 years ago

Throw an exception when allocation fails for vulkan.

cebtenzzre committed 2 years ago

vulkan: disambiguate gpus with the same name

cebtenzzre committed 2 years ago

Don't try and install kompute artifacts.

cebtenzzre committed 2 years ago

Sync from device back to host at the beginning of a new prompt.

cebtenzzre committed 2 years ago

Only use Vulkan with quant types known to work.

cebtenzzre committed 2 years ago

Set the singleton to nullptr here.

cebtenzzre committed 2 years ago

Don't crash on available devices if we can't even create an instance.

cebtenzzre committed 2 years ago

Support for gguf.

cebtenzzre committed 2 years ago

kompute : don't fail build because of -Warray-bounds

cebtenzzre committed 2 years ago

Upload immediately to device.

cebtenzzre committed 2 years ago

Add common boilerplate code via include and eliminate copy-paste

cebtenzzre committed 2 years ago

Consolidate code for mat x vec kernels and use subgroups more extensively.

cebtenzzre committed 2 years ago

Move the subgroups and printf into common.

cebtenzzre committed 2 years ago

Minor cleanup.

cebtenzzre committed 2 years ago

Refactor getrows to use common code and get ready for q6_k.

cebtenzzre committed 2 years ago

Add q6_k getrows and mul*vec kernel.

cebtenzzre committed 2 years ago

Fix offset into the qh and now we have working Vulkan acceleration for gguf'd llama.

cebtenzzre committed 2 years ago

Fixes for norm.

cebtenzzre committed 2 years ago

Fix up the upstream CMakeLists.txt so we can build just llama.cpp with our branch.

cebtenzzre committed 2 years ago

Change this back to be in agreement with metal and our previous softmax kernel.

cebtenzzre committed 2 years ago

Fixes for subgroup size to bring AMD and NVIDIA in line with each other for all kernels.

cebtenzzre committed 2 years ago

kompute : only try to use Vulkan for LLaMA itself

cebtenzzre committed 2 years ago

kompute : remove Q6_K from list of supported quant types

cebtenzzre committed 2 years ago

f16 mv broadcasting fix (gqa fix)

cebtenzzre committed 2 years ago

q8 mat*vec

cebtenzzre committed 2 years ago

vulkan: implement neox mode for rope

cebtenzzre committed 2 years ago

falcon h2d + reenable vulkan

cebtenzzre committed 2 years ago

Delete TODO now that we have q8_0.

cebtenzzre committed 2 years ago

add mat*mat ops

cebtenzzre committed 2 years ago

misc vulkan cleanup

cebtenzzre committed 2 years ago

perf: use bigger threadgroups in mm

cebtenzzre committed 2 years ago

use op param epsilon for norms

cebtenzzre committed 2 years ago

q6k mm works

cebtenzzre committed 2 years ago

rm commented dbg print

cebtenzzre committed 2 years ago

q4_1 mat*mat

cebtenzzre committed 2 years ago

clean up vulkan/cpu switch

cebtenzzre committed 2 years ago

attempted speedups

cebtenzzre committed 2 years ago

attempted speedups 2

cebtenzzre committed 2 years ago

use mat*vec shaders for mat*mat

cebtenzzre committed 2 years ago

kompute : enable kp_logger and make it static (#8)

cebtenzzre committed 2 years ago

kompute : make scripts executable

cebtenzzre committed 2 years ago

Don't try an allocation on a heap that is smaller than the size we require.

cebtenzzre committed 2 years ago

Remove unused push constant that was giving validation errors.

cebtenzzre committed 2 years ago

Lower the workgroup count for some shaders by providing a loop that processes

cebtenzzre committed 2 years ago

Fix synchronization problem for AMD Radeon with amdvlk driver or Windows

cebtenzzre committed 2 years ago

vulkan : fix missing break in matmul selection (#9)

cebtenzzre committed 2 years ago

llama : decide to disable Vulkan before loading tensors (#7)

cebtenzzre committed 2 years ago

Scale the workgroup count down to allow correct generation for falcon with

cebtenzzre committed 2 years ago

Revert the prompt processing on gpu for now.

cebtenzzre committed 2 years ago

Remove this debug code.

cebtenzzre committed 2 years ago

llama : fix Vulkan whitelist (#11)

cebtenzzre committed 2 years ago

kompute : fix issues with debug layers

cebtenzzre committed 2 years ago

fix build with external fmtlib (v10)

cebtenzzre committed 2 years ago

Merge commit 'ec893798b7a2a803466cc8f063051499ec3d96f7' into HEAD

cebtenzzre committed 2 years ago

vulkan : replace ggml_diag_mask_inf with ggml_add (custom -inf mask)

cebtenzzre committed 2 years ago

vulkan : rope n_past is now KQ_pos, f16 rope kernel

cebtenzzre committed 2 years ago

vulkan : optimize workgroup sizes

cebtenzzre committed 2 years ago

Merge commit 'fcca0a700487999d52a525c96d6661e9f6a8703a' into nomic-vulkan

cebtenzzre committed 2 years ago

vulkan : assert various kernel requirements

cebtenzzre committed 2 years ago

Merge commit '469c9addef75893e6be12edda852d12e840bf064' into nomic-vulkan

cebtenzzre committed 2 years ago

vulkan : handle ggml_scale for n%8 != 0

cebtenzzre committed 2 years ago

Merge commit 'e16b9fa4baa8a09c6619b116159830e898050942' into nomic-vulkan

cebtenzzre committed 2 years ago

mention skipped change

cebtenzzre committed 2 years ago

merge fixup (e16b9fa4baa8a09c6619b116159830e898050942)

cebtenzzre committed 2 years ago

Merge commit '4760e7cc0b68570d58f55e8dda469805d1759d0d~' into nomic-vulkan

cebtenzzre committed 2 years ago

vulkan : implement YaRN RoPE scaling (#2268)

cebtenzzre committed 2 years ago

Merge commit '4760e7cc0b68570d58f55e8dda469805d1759d0d' into nomic-vulkan

cebtenzzre committed 2 years ago

vulkan : sync with "migrate to dynamic graphs"

cebtenzzre committed 2 years ago

Merge remote-tracking branch 'upstream/master' into nomic-vulkan-redo

cebtenzzre committed 2 years ago

relicense Vulkan backend as MIT

cebtenzzre committed 2 years ago

rename ggml-vulkan -> ggml-kompute

cebtenzzre committed 2 years ago

separate shaders from kompute itself

cebtenzzre committed 2 years ago

Merge commit '81bc9214a389362010f7a57f4cbc30e5f83a2d28' into nomic-vulkan

cebtenzzre committed 2 years ago

kompute : fix compile warnings

cebtenzzre committed 2 years ago

move kompute to a submodule

cebtenzzre committed 2 years ago

remove script with unclear purpose

cebtenzzre committed 2 years ago

ggml : restore 'static' specifiers

cebtenzzre committed 2 years ago

refactor llama.cpp modifications

cebtenzzre committed 2 years ago

vulkan : fix free of stack addr in llama_buffer

cebtenzzre committed 2 years ago

kompute : always destroy Manager via the destructor

cebtenzzre committed 2 years ago

kompute : fix -Wunused-private-field warnings from clang

cebtenzzre committed 2 years ago

Merge commit 'bcc0eb4591bec5ec02fad3f2bdcb1b265052ea56' into ceb/nomic-vulkan

cebtenzzre committed 2 years ago

Merge commit '31f27758faf4a4bd08101a57c7ec3a473f771f86' into ceb/nomic-vulkan

cebtenzzre committed 2 years ago

sync xxd commands with GPT4All llama.cpp.cmake

cebtenzzre committed 2 years ago

Merge commit 'd232aca5a73b290e218a2e48b91023d5e994203f' into ceb/nomic-vulkan

cebtenzzre committed 2 years ago

Merge branch 'master' of https://github.com/ggerganov/llama.cpp into ceb/nomic-vulkan

cebtenzzre committed 2 years ago

Merge commit 'e7e4df031b9e29d4b55a4e0b0295187f6b213db1' into HEAD

cebtenzzre committed 2 years ago

kompute : initial attempt at ggml-backend v2 support

cebtenzzre committed 2 years ago

fix assertion failure

cebtenzzre committed 2 years ago

attempt to get test-backend-ops working

cebtenzzre committed 2 years ago

add sanity check and fix kompute teardown order

cebtenzzre committed 2 years ago

kompute : ignore exceptions in ggml_vk_available_devices (#12)

cebtenzzre committed 2 years ago

kompute : fix rope_f32 and scale ops (#5008)

cebtenzzre committed 2 years ago

clean up old backend code

cebtenzzre committed 2 years ago

llama.cpp Nomic Vulkan backend #4456 Merged
