Use kernels from the kernel hub #2988
Narsil commented on 2025-02-03
danieldk force pushed from dd890e7d to fb750344 321 days ago
danieldk force pushed from fb750344 to e01578d2 320 days ago
danieldk force pushed from e01578d2 to 27decc55 320 days ago
danieldk force pushed from fac14af6 to 3726ab75 319 days ago
danieldk marked this pull request as ready for review 319 days ago
Use Hub kernels for Marlin and cutlass quantization kernels
aab6141b
Use hub kernels for MoE/GPTQ-Marlin MoE
758ff3c5
Use attention kernels from the Hub
b267caa5
Cache the kernels in the Docker image
c9191f3f
Update moe kernels
b35ab54f
Support loading local kernels for development
d39f896c
Support latest moe kernels
c1a564e7
Update to moe 0.1.1
dcb37316
CI: download locked kernels for server tests
a60d1e61
Fixup some imports
f25a7aad
CI: activate venv
00af6ef7
Fix unused imports
ca1067f9
Nix: add attention/moe/quantization kernels
4c8ced28
Update hf-kernels to 0.1.5
e0384974
Update kernels
520420a2
Update tgi-nix flake for hf-kernels
371668ee
Fix EOF
875ce6d5
Take `load_kernel` out of a frequently-called function
f74a50d4
Hoist another case of kernel loading out of a somewhat hot function
8aecc59e
marlin-kernels -> quantization
8ad383c7
attention -> paged-attention
96a4d4d0
EOF fix
219b8b16
Update hf-kernels, fixup Docker
8ae7bc38
danieldk force pushed from 44eb8257 to 8ae7bc38 317 days ago
ipex fix
df582a18
Remove outdated TODO
fc3ac807
Narsil approved these changes on 2025-02-10
danieldk merged 571ac9b5 into main 314 days ago
danieldk deleted the kernel-hub branch 314 days ago