llama.cpp
Nomic Vulkan backend
#4456
Merged

Commits
  • Nomic vulkan backend licensed under the Software for Open Models License (SOM), version 1.0.
    cebtenzzre committed 2 years ago
  • Remove warning which fails on windows.
    cebtenzzre committed 2 years ago
  • remove dynamic deps from kompute build
    cebtenzzre committed 2 years ago
  • Switch to a dynamic dispatch table instead of linking hard against libvulkan.
    cebtenzzre committed 2 years ago
  • Completely revamp how we do object management with the vulkan backend and
    cebtenzzre committed 2 years ago
  • Make kompute actually include external SDK headers when requested
    cebtenzzre committed 2 years ago
  • Throw an exception when allocation fails for vulkan.
    cebtenzzre committed 2 years ago
  • vulkan: disambiguate gpus with the same name
    cebtenzzre committed 2 years ago
  • Don't try and install kompute artifacts.
    cebtenzzre committed 2 years ago
  • Sync from device back to host at begin of new prompt.
    cebtenzzre committed 2 years ago
  • Only use vulkan with known quant that work.
    cebtenzzre committed 2 years ago
  • Set the singleton to nullptr here.
    cebtenzzre committed 2 years ago
  • Don't crash on available devices if we can't even create an instance.
    cebtenzzre committed 2 years ago
  • Support for gguf.
    cebtenzzre committed 2 years ago
  • kompute : don't fail build because of -Warray-bounds
    cebtenzzre committed 2 years ago
  • Upload immediately to device.
    cebtenzzre committed 2 years ago
  • Add a common boilerplate code via include and elim copy pasta
    cebtenzzre committed 2 years ago
  • Consolidate code for mat x vec kernels and use subgroups more extensively.
    cebtenzzre committed 2 years ago
  • Move the subgroups and printf into common.
    cebtenzzre committed 2 years ago
  • Minor cleanup.
    cebtenzzre committed 2 years ago
  • Refactor getrows to use common code and get ready for q6_k.
    cebtenzzre committed 2 years ago
  • Add q6_k getrows and mul*vec kernel.
    cebtenzzre committed 2 years ago
  • Fix offset into the qh and now we have working vulkan accelerated for gguff'd llama.
    cebtenzzre committed 2 years ago
  • Fixes for norm.
    cebtenzzre committed 2 years ago
  • Fixup the upstream CMakelists.txt so we can build just llama.cpp with our branch.
    cebtenzzre committed 2 years ago
  • Change this back to be in agreement with metal and our previous softmax kernel.
    cebtenzzre committed 2 years ago
  • Fixes for subgroup size to bring AMD and NVIDIA inline with eachother for all kernels.
    cebtenzzre committed 2 years ago
  • kompute : only try to use Vulkan for LLaMA itself
    cebtenzzre committed 2 years ago
  • kompute : remove Q6_K from list of supported quant types
    cebtenzzre committed 2 years ago
  • f16 mv broadcasting fix (gqa fix)
    cebtenzzre committed 2 years ago
  • q8 mat*vec
    cebtenzzre committed 2 years ago
  • vulkan: implement neox mode for rope
    cebtenzzre committed 2 years ago
  • falcon h2d + reenable vulkan
    cebtenzzre committed 2 years ago
  • Delete TODO now that we have q8_0.
    cebtenzzre committed 2 years ago
  • add mat*mat ops
    cebtenzzre committed 2 years ago
  • misc vulkan cleanup
    cebtenzzre committed 2 years ago
  • perf: use bigger threadgroups in mm
    cebtenzzre committed 2 years ago
  • use op param epsilon for norms
    cebtenzzre committed 2 years ago
  • q6k mm works
    cebtenzzre committed 2 years ago
  • rm commented dbg print
    cebtenzzre committed 2 years ago
  • q4_1 mat*mat
    cebtenzzre committed 2 years ago
  • clean up vulkan/cpu switch
    cebtenzzre committed 2 years ago
  • attempted speedups
    cebtenzzre committed 2 years ago
  • attempted speedups 2
    cebtenzzre committed 2 years ago
  • use mat*vec shaders for mat*mat
    cebtenzzre committed 2 years ago
  • kompute : enable kp_logger and make it static (#8)
    cebtenzzre committed 2 years ago
  • kompute : make scripts executable
    cebtenzzre committed 2 years ago
  • Don't try an allocation on a heap that is smaller than the size we require.
    cebtenzzre committed 2 years ago
  • Remove unused push constant that was giving validation errors.
    cebtenzzre committed 2 years ago
  • Lower the workgroup count for some shaders by providing a loop that processes
    cebtenzzre committed 2 years ago
  • Fix synchronization problem for AMD Radeon with amdvlk driver or windows
    cebtenzzre committed 2 years ago
  • vulkan : fix missing break in matmul selection (#9)
    cebtenzzre committed 2 years ago
  • llama : decide to disable Vulkan before loading tensors (#7)
    cebtenzzre committed 2 years ago
  • Scale the workgroup count down to allow correct generation for falcon with
    cebtenzzre committed 2 years ago
  • Revert the prompt processing on gpu for now.
    cebtenzzre committed 2 years ago
  • Remove this debug code.
    cebtenzzre committed 2 years ago
  • llama : fix Vulkan whitelist (#11)
    cebtenzzre committed 2 years ago
  • kompute : fix issues with debug layers
    cebtenzzre committed 2 years ago
  • fix build with external fmtlib (v10)
    cebtenzzre committed 2 years ago
  • Merge commit 'ec893798b7a2a803466cc8f063051499ec3d96f7' into HEAD
    cebtenzzre committed 2 years ago
  • vulkan : replace ggml_diag_mask_inf with ggml_add (custom -inf mask)
    cebtenzzre committed 2 years ago
  • vulkan : rope n_past is now KQ_pos, f16 rope kernel
    cebtenzzre committed 2 years ago
  • vulkan : optimize workgroup sizes
    cebtenzzre committed 2 years ago
  • Merge commit 'fcca0a700487999d52a525c96d6661e9f6a8703a' into nomic-vulkan
    cebtenzzre committed 2 years ago
  • vulkan : assert various kernel requirements
    cebtenzzre committed 2 years ago
  • Merge commit '469c9addef75893e6be12edda852d12e840bf064' into nomic-vulkan
    cebtenzzre committed 2 years ago
  • vulkan : handle ggml_scale for n%8 != 0
    cebtenzzre committed 2 years ago
  • Merge commit 'e16b9fa4baa8a09c6619b116159830e898050942' into nomic-vulkan
    cebtenzzre committed 2 years ago
  • mention skipped change
    cebtenzzre committed 2 years ago
  • merge fixup (e16b9fa4baa8a09c6619b116159830e898050942)
    cebtenzzre committed 2 years ago
  • Merge commit '4760e7cc0b68570d58f55e8dda469805d1759d0d~' into nomic-vulkan
    cebtenzzre committed 2 years ago
  • vulkan : implement YaRN RoPE scaling (#2268)
    cebtenzzre committed 2 years ago
  • Merge commit '4760e7cc0b68570d58f55e8dda469805d1759d0d' into nomic-vulkan
    cebtenzzre committed 2 years ago
  • vulkan : sync with "migrate to dynamic graphs"
    cebtenzzre committed 2 years ago
  • Merge remote-tracking branch 'upstream/master' into nomic-vulkan-redo
    cebtenzzre committed 2 years ago
  • relicense Vulkan backend as MIT
    cebtenzzre committed 2 years ago
  • rename ggml-vulkan -> ggml-kompute
    cebtenzzre committed 2 years ago
  • separate shaders from kompute itself
    cebtenzzre committed 2 years ago
  • Merge commit '81bc9214a389362010f7a57f4cbc30e5f83a2d28' into nomic-vulkan
    cebtenzzre committed 2 years ago
  • kompute : fix compile warnings
    cebtenzzre committed 2 years ago
  • move kompute to a submodule
    cebtenzzre committed 2 years ago
  • remove script with unclear purpose
    cebtenzzre committed 2 years ago
  • ggml : restore 'static' specifiers
    cebtenzzre committed 2 years ago
  • refactor llama.cpp modifications
    cebtenzzre committed 2 years ago
  • vulkan : fix free of stack addr in llama_buffer
    cebtenzzre committed 2 years ago
  • kompute : always destroy Manager via the destructor
    cebtenzzre committed 2 years ago
  • kompute : fix -Wunused-private-field warnings from clang
    cebtenzzre committed 2 years ago
  • Merge commit 'bcc0eb4591bec5ec02fad3f2bdcb1b265052ea56' into ceb/nomic-vulkan
    cebtenzzre committed 2 years ago
  • Merge commit '31f27758faf4a4bd08101a57c7ec3a473f771f86' into ceb/nomic-vulkan
    cebtenzzre committed 2 years ago
  • sync xxd commands with GPT4All llama.cpp.cmake
    cebtenzzre committed 2 years ago
  • Merge commit 'd232aca5a73b290e218a2e48b91023d5e994203f' into ceb/nomic-vulkan
    cebtenzzre committed 2 years ago
  • Merge branch 'master' of https://github.com/ggerganov/llama.cpp into ceb/nomic-vulkan
    cebtenzzre committed 2 years ago
  • Merge commit 'e7e4df031b9e29d4b55a4e0b0295187f6b213db1' into HEAD
    cebtenzzre committed 2 years ago
  • kompute : initial attempt at ggml-backend v2 support
    cebtenzzre committed 2 years ago
  • fix assertion failure
    cebtenzzre committed 2 years ago
  • attempt to get test-backend-ops working
    cebtenzzre committed 2 years ago
  • add sanity check and fix kompute teardown order
    cebtenzzre committed 2 years ago
  • kompute : ignore exceptions in ggml_vk_available_devices (#12)
    cebtenzzre committed 2 years ago
  • kompute : fix rope_f32 and scale ops (#5008)
    cebtenzzre committed 2 years ago
  • clean up old backend code
    cebtenzzre committed 2 years ago
  • + more commits ...