llama.cpp
cuda : improve cuda pool efficiency using virtual memory
#4606
Merged

Commits
  • cuda : improve cuda pool efficiency using virtual memory
    slaren committed 2 years ago
  • fix mixtral
    slaren committed 2 years ago
  • fix cmake build
    slaren committed 2 years ago
  • check for vmm support, disable for hip
    slaren committed 2 years ago
  • fix hip build
    slaren committed 2 years ago
  • clarify granularity
    slaren committed 2 years ago
  • move all caps to g_device_caps
    slaren committed 2 years ago
  • refactor error checking
    slaren committed 2 years ago
  • add cuda_pool_alloc, refactor most pool allocations
    slaren committed 2 years ago
  • fix hip build
    slaren committed 2 years ago
  • CUBLAS_TF32_TENSOR_OP_MATH is not a macro
    slaren committed 2 years ago
  • more hip crap
    slaren committed 2 years ago
  • llama : fix msvc warnings
    slaren committed 2 years ago
  • ggml : fix msvc warnings
    slaren committed 2 years ago
  • minor
    slaren committed 2 years ago
  • Merge remote-tracking branch 'origin/master' into sl/cuda-virt-pool
    slaren committed 2 years ago
  • minor
    slaren committed 2 years ago
  • cuda : fallback to CPU on host buffer alloc fail
    slaren committed 2 years ago
  • Update ggml-cuda.cu
    slaren committed 2 years ago
  • Update ggml-cuda.cu
    slaren committed 2 years ago
  • ensure allocations are always aligned
    slaren committed 2 years ago
  • act_size -> actual_size
    slaren committed 2 years ago