llama.cpp
Nomic Vulkan backend
#4456
Merged
Commits
155
Nomic vulkan backend licensed under the Software for Open Models License (SOM), version 1.0.
cebtenzzre
committed
2 years ago
Remove warning which fails on windows.
cebtenzzre
committed
2 years ago
remove dynamic deps from kompute build
cebtenzzre
committed
2 years ago
Switch to a dynamic dispatch table instead of linking hard against libvulkan.
cebtenzzre
committed
2 years ago
Completely revamp how we do object management with the vulkan backend and
cebtenzzre
committed
2 years ago
Make kompute actually include external SDK headers when requested
cebtenzzre
committed
2 years ago
Throw an exception when allocation fails for vulkan.
cebtenzzre
committed
2 years ago
vulkan: disambiguate gpus with the same name
cebtenzzre
committed
2 years ago
Don't try and install kompute artifacts.
cebtenzzre
committed
2 years ago
Sync from device back to host at begin of new prompt.
cebtenzzre
committed
2 years ago
Only use vulkan with known quant that work.
cebtenzzre
committed
2 years ago
Set the singleton to nullptr here.
cebtenzzre
committed
2 years ago
Don't crash on available devices if we can't even create an instance.
cebtenzzre
committed
2 years ago
Support for gguf.
cebtenzzre
committed
2 years ago
kompute : don't fail build because of -Warray-bounds
cebtenzzre
committed
2 years ago
Upload immediately to device.
cebtenzzre
committed
2 years ago
Add a common boilerplate code via include and elim copy pasta
cebtenzzre
committed
2 years ago
Consolidate code for mat x vec kernels and use subgroups more extensively.
cebtenzzre
committed
2 years ago
Move the subgroups and printf into common.
cebtenzzre
committed
2 years ago
Minor cleanup.
cebtenzzre
committed
2 years ago
Refactor getrows to use common code and get ready for q6_k.
cebtenzzre
committed
2 years ago
Add q6_k getrows and mul*vec kernel.
cebtenzzre
committed
2 years ago
Fix offset into the qh and now we have working vulkan accelerated for gguff'd llama.
cebtenzzre
committed
2 years ago
Fixes for norm.
cebtenzzre
committed
2 years ago
Fixup the upstream CMakelists.txt so we can build just llama.cpp with our branch.
cebtenzzre
committed
2 years ago
Change this back to be in agreement with metal and our previous softmax kernel.
cebtenzzre
committed
2 years ago
Fixes for subgroup size to bring AMD and NVIDIA inline with eachother for all kernels.
cebtenzzre
committed
2 years ago
kompute : only try to use Vulkan for LLaMA itself
cebtenzzre
committed
2 years ago
kompute : remove Q6_K from list of supported quant types
cebtenzzre
committed
2 years ago
f16 mv broadcasting fix (gqa fix)
cebtenzzre
committed
2 years ago
q8 mat*vec
cebtenzzre
committed
2 years ago
vulkan: implement neox mode for rope
cebtenzzre
committed
2 years ago
falcon h2d + reenable vulkan
cebtenzzre
committed
2 years ago
Delete TODO now that we have q8_0.
cebtenzzre
committed
2 years ago
add mat*mat ops
cebtenzzre
committed
2 years ago
misc vulkan cleanup
cebtenzzre
committed
2 years ago
perf: use bigger threadgroups in mm
cebtenzzre
committed
2 years ago
use op param epsilon for norms
cebtenzzre
committed
2 years ago
q6k mm works
cebtenzzre
committed
2 years ago
rm commented dbg print
cebtenzzre
committed
2 years ago
q4_1 mat*mat
cebtenzzre
committed
2 years ago
clean up vulkan/cpu switch
cebtenzzre
committed
2 years ago
attempted speedups
cebtenzzre
committed
2 years ago
attempted speedups 2
cebtenzzre
committed
2 years ago
use mat*vec shaders for mat*mat
cebtenzzre
committed
2 years ago
kompute : enable kp_logger and make it static (#8)
cebtenzzre
committed
2 years ago
kompute : make scripts executable
cebtenzzre
committed
2 years ago
Don't try an allocation on a heap that is smaller than the size we require.
cebtenzzre
committed
2 years ago
Remove unused push constant that was giving validation errors.
cebtenzzre
committed
2 years ago
Lower the workgroup count for some shaders by providing a loop that processes
cebtenzzre
committed
2 years ago
Fix synchronization problem for AMD Radeon with amdvlk driver or windows
cebtenzzre
committed
2 years ago
vulkan : fix missing break in matmul selection (#9)
cebtenzzre
committed
2 years ago
llama : decide to disable Vulkan before loading tensors (#7)
cebtenzzre
committed
2 years ago
Scale the workgroup count down to allow correct generation for falcon with
cebtenzzre
committed
2 years ago
Revert the prompt processing on gpu for now.
cebtenzzre
committed
2 years ago
Remove this debug code.
cebtenzzre
committed
2 years ago
llama : fix Vulkan whitelist (#11)
cebtenzzre
committed
2 years ago
kompute : fix issues with debug layers
cebtenzzre
committed
2 years ago
fix build with external fmtlib (v10)
cebtenzzre
committed
2 years ago
Merge commit 'ec893798b7a2a803466cc8f063051499ec3d96f7' into HEAD
cebtenzzre
committed
2 years ago
vulkan : replace ggml_diag_mask_inf with ggml_add (custom -inf mask)
cebtenzzre
committed
2 years ago
vulkan : rope n_past is now KQ_pos, f16 rope kernel
cebtenzzre
committed
2 years ago
vulkan : optimize workgroup sizes
cebtenzzre
committed
2 years ago
Merge commit 'fcca0a700487999d52a525c96d6661e9f6a8703a' into nomic-vulkan
cebtenzzre
committed
2 years ago
vulkan : assert various kernel requirements
cebtenzzre
committed
2 years ago
Merge commit '469c9addef75893e6be12edda852d12e840bf064' into nomic-vulkan
cebtenzzre
committed
2 years ago
vulkan : handle ggml_scale for n%8 != 0
cebtenzzre
committed
2 years ago
Merge commit 'e16b9fa4baa8a09c6619b116159830e898050942' into nomic-vulkan
cebtenzzre
committed
2 years ago
mention skipped change
cebtenzzre
committed
2 years ago
merge fixup (e16b9fa4baa8a09c6619b116159830e898050942)
cebtenzzre
committed
2 years ago
Merge commit '4760e7cc0b68570d58f55e8dda469805d1759d0d~' into nomic-vulkan
cebtenzzre
committed
2 years ago
vulkan : implement YaRN RoPE scaling (#2268)
cebtenzzre
committed
2 years ago
Merge commit '4760e7cc0b68570d58f55e8dda469805d1759d0d' into nomic-vulkan
cebtenzzre
committed
2 years ago
vulkan : sync with "migrate to dynamic graphs"
cebtenzzre
committed
2 years ago
Merge remote-tracking branch 'upstream/master' into nomic-vulkan-redo
cebtenzzre
committed
2 years ago
relicense Vulkan backend as MIT
cebtenzzre
committed
2 years ago
rename ggml-vulkan -> ggml-kompute
cebtenzzre
committed
2 years ago
separate shaders from kompute itself
cebtenzzre
committed
2 years ago
Merge commit '81bc9214a389362010f7a57f4cbc30e5f83a2d28' into nomic-vulkan
cebtenzzre
committed
2 years ago
kompute : fix compile warnings
cebtenzzre
committed
2 years ago
move kompute to a submodule
cebtenzzre
committed
2 years ago
remove script with unclear purpose
cebtenzzre
committed
2 years ago
ggml : restore 'static' specifiers
cebtenzzre
committed
2 years ago
refactor llama.cpp modifications
cebtenzzre
committed
2 years ago
vulkan : fix free of stack addr in llama_buffer
cebtenzzre
committed
2 years ago
kompute : always destroy Manager via the destructor
cebtenzzre
committed
2 years ago
kompute : fix -Wunused-private-field warnings from clang
cebtenzzre
committed
2 years ago
Merge commit 'bcc0eb4591bec5ec02fad3f2bdcb1b265052ea56' into ceb/nomic-vulkan
cebtenzzre
committed
2 years ago
Merge commit '31f27758faf4a4bd08101a57c7ec3a473f771f86' into ceb/nomic-vulkan
cebtenzzre
committed
2 years ago
sync xxd commands with GPT4All llama.cpp.cmake
cebtenzzre
committed
2 years ago
Merge commit 'd232aca5a73b290e218a2e48b91023d5e994203f' into ceb/nomic-vulkan
cebtenzzre
committed
2 years ago
Merge branch 'master' of https://github.com/ggerganov/llama.cpp into ceb/nomic-vulkan
cebtenzzre
committed
2 years ago
Merge commit 'e7e4df031b9e29d4b55a4e0b0295187f6b213db1' into HEAD
cebtenzzre
committed
2 years ago
kompute : initial attempt at ggml-backend v2 support
cebtenzzre
committed
2 years ago
fix assertion failure
cebtenzzre
committed
2 years ago
attempt to get test-backend-ops working
cebtenzzre
committed
2 years ago
add sanity check and fix kompute teardown order
cebtenzzre
committed
2 years ago
kompute : ignore exceptions in ggml_vk_available_devices (#12)
cebtenzzre
committed
2 years ago
kompute : fix rope_f32 and scale ops (#5008)
cebtenzzre
committed
2 years ago
clean up old backend code
cebtenzzre
committed
2 years ago
+ more commits ...