Commits
  • sycl: addressing non-contiguous src1 mul_mats (nc and batched) (llama/13343)
    ggerganov committed 1 year ago
  • vulkan: Allow up to 4096 elements for mul_mat_id row_ids (llama/13326)
    ggerganov committed 1 year ago
  • rpc : add rpc_msg_set_tensor_hash_req (llama/13353)
    ggerganov committed 1 year ago
  • CUDA: fix crash on large batch size for MoE models (llama/13384)
    ggerganov committed 1 year ago
  • CUDA: FA support for Deepseek (Ampere or newer) (llama/13306)
    ggerganov committed 1 year ago
  • sycl : implementation of reordered Q4_0 MMVQ for Intel GPUs (llama/12858)
    ggerganov committed 1 year ago
  • vulkan: scalar flash attention implementation (llama/13324)
    ggerganov committed 1 year ago
  • CUDA: fix FlashAttention on Turing (llama/13415)
    ggerganov committed 1 year ago
  • CUDA: fix race conditions FlashAttention kernels (llama/13438)
    ggerganov committed 1 year ago
  • Add `--no-op-offload` to improve `-ot` pp perf in MoE models like llama4 400B (llama/13386)
    ggerganov committed 1 year ago
  • CUDA: fix crash with partial offloading of MoE (llama/13439)
    ggerganov committed 1 year ago
  • enable dpcpp nightly builds with libraries (llama/13406)
    ggerganov committed 1 year ago
  • CUDA: fix misaligned synchronization in FA (llama/13469)
    ggerganov committed 1 year ago
  • ggml-cpu: Integrate fp32=bf16xbf16 SME KleidiAI kernel (llama/13053)
    ggerganov committed 1 year ago
  • llama/ggml: add LLM training support (llama/10544)
    ggerganov committed 1 year ago
  • opencl: remove unnecessary assert for `add` (llama/13257)
    ggerganov committed 1 year ago
  • metal : optimize MoE for large batches (llama/13388)
    ggerganov committed 1 year ago
  • ggml : add mrope kernel for metal (llama/13457)
    ggerganov committed 1 year ago
  • sync : ggml
    ggerganov committed 1 year ago
  • whisper : update to ggml-backend changes (#0)
    ggerganov committed 1 year ago
  • talk-llama : sync llama.cpp
    ggerganov committed 1 year ago