ggerganov/llama.cpp


Vectorize q load

Aidan committed 339 days ago

a235b7c5

Store scales in local mem

Aidan committed 339 days ago

604ef6bf

Single load for half2

Aidan committed 339 days ago

cb3fb420

Remove double lines

Aidan committed 339 days ago

4a481556

Merge pull request #7920 from ggerganov/codeplay/revert-host-alloc

joeatodd committed 342 days ago

Verified ff076b88

Merge pull request #7919 from ggerganov/codeplay/unify-rope-sycl

joeatodd committed 342 days ago

Verified b2c8c831

Replace powf with sycl::pow in ggml-sycl.cpp

joeatodd committed 342 days ago

ded54b5d

Revert "use the correct SYCL context for host USM allocations"

joeatodd committed 343 days ago

18133cab

joeatodd committed 343 days ago

abd7c7b8

[SYCL] Update unsupported ops

joeatodd committed 343 days ago

0c0f3f00

[SYCL] unify rope norm/neox

joeatodd committed 343 days ago

9b81b572

tests : add non-cont unary tests (#7857)

ggerganov committed 344 days ago

Verified a9cae480

ggml : improve ggml_is_contiguous logic (#7856)

ggerganov committed 344 days ago

Verified bfaa676b

server : restore numeric prompts (#7883)

ggerganov committed 344 days ago

Verified 704a35b1

update intel docker oneapi-basekit to 2024.1.1-devel-ubuntu22.04 (#7894)

airMeng committed 344 days ago

Verified dcf75270

Fix a typo and add Fedora 40 pacakge to install for Vulkan (#7794) [no ci]

metal3d committed 345 days ago

Verified f2b5764b

vulkan: select only one device for single gpu with multiple drivers (#7582)

Adriankhl committed 345 days ago

Verified 73bac2b1

Update Vulkan RoPE implementation (#7818)

0cc4m committed 345 days ago

Verified ef52d1d1

fix broken link in pr template (#7880) [no ci]

deven367 committed 345 days ago

Verified 14f83526

github: move PR template to .github/ root (#7868)

mofosyne committed 345 days ago

Verified 6fe42d07

llama-bench: more compact markdown tables (#7879)

JohannesGaessler committed 345 days ago

Verified 148995e5

tests : check the Python version (#7872)

ggerganov committed 345 days ago

Verified 4bfe50f7

CUDA: int8 tensor cores for MMQ (q4_K, q5_K, q6_K) (#7860)

JohannesGaessler committed 345 days ago

Verified bdcb8f42

fix CUDA CI by using a windows-2019 image (#7861)

slaren committed 345 days ago

Verified c2ce6c47

json: refine constraint for whitespace to avoid runaways yet allow pretty print (#7866)

ochafik committed 346 days ago

Verified b61eb964

`json`: document schema conversion in GBNF readme, align manual grammar examples & converters (#7841)

ochafik committed 346 days ago

Verified 396b18df

cmake : fix CMake requirement for CUDA (#7821)

cebtenzzre committed 346 days ago

Verified 864a99e7

ci : try win-2019 on server windows test (#7854)

slaren committed 346 days ago

Verified fd5ea0f8

examples : remove --instruct remnants (#7846)

ggerganov committed 346 days ago

Verified c28a8390

server : improve "prompt" handling (#7847)

ggerganov committed 346 days ago

Verified d9da0e49
