Use kernels from the kernel hub #2988
Narsil commented on 2025-02-03
danieldk force pushed from dd890e7d to fb750344 1 year ago
danieldk force pushed from fb750344 to e01578d2 1 year ago
danieldk force pushed from e01578d2 to 27decc55 1 year ago
danieldk force pushed from fac14af6 to 3726ab75 1 year ago
danieldk marked this pull request as ready for review 1 year ago
Use Hub kernels for Marlin and cutlass quantization kernels
aab6141b
Use hub kernels for MoE/GPTQ-Marlin MoE
758ff3c5
Use attention kernels from the Hub
b267caa5
Cache the kernels in the Docker image
c9191f3f
Update moe kernels
b35ab54f
Support loading local kernels for development
d39f896c
Support latest moe kernels
c1a564e7
Update to moe 0.1.1
dcb37316
CI: download locked kernels for server tests
a60d1e61
Fixup some imports
f25a7aad
CI: activate venv
00af6ef7
Fix unused imports
ca1067f9
Nix: add attention/moe/quantization kernels
4c8ced28
Update hf-kernels to 0.1.5
e0384974
Update kernels
520420a2
Update tgi-nix flake for hf-kernels
371668ee
Fix EOF
875ce6d5
Take `load_kernel` out of a frequently-called function
f74a50d4
Hoist another case of kernel loading out of a somewhat hot function
8aecc59e
marlin-kernels -> quantization
8ad383c7
attention -> paged-attention
96a4d4d0
EOF fix
219b8b16
Update hf-kernels, fixup Docker
8ae7bc38
danieldk force pushed from 44eb8257 to 8ae7bc38 1 year ago
ipex fix
df582a18
Remove outdated TODO
fc3ac807
Narsil approved these changes on 2025-02-10
danieldk merged 571ac9b5 into main 1 year ago
danieldk deleted the kernel-hub branch 1 year ago