text-generation-inference
Use kernels from the kernel hub
#2988
Merged

danieldk merged 25 commits into main from kernel-hub
Narsil commented on 2025-02-03
danieldk force pushed from dd890e7d to fb750344 321 days ago
danieldk force pushed from fb750344 to e01578d2 320 days ago
danieldk force pushed from e01578d2 to 27decc55 320 days ago
danieldk force pushed from fac14af6 to 3726ab75 319 days ago
danieldk commented on 2025-02-05
danieldk commented on 2025-02-05
danieldk commented on 2025-02-05
danieldk marked this pull request as ready for review 319 days ago
danieldk Use Hub kernels for Marlin and cutlass quantization kernels
aab6141b
danieldk Use hub kernels for MoE/GPTQ-Marlin MoE
758ff3c5
danieldk Use attention kernels from the Hub
b267caa5
danieldk Cache the kernels in the Docker image
c9191f3f
danieldk Update moe kernels
b35ab54f
danieldk Support loading local kernels for development
d39f896c
danieldk Support latest moe kernels
c1a564e7
danieldk Update to moe 0.1.1
dcb37316
danieldk CI: download locked kernels for server tests
a60d1e61
danieldk Fixup some imports
f25a7aad
danieldk CI: activate venv
00af6ef7
danieldk Fix unused imports
ca1067f9
danieldk Nix: add attention/moe/quantization kernels
4c8ced28
danieldk Update hf-kernels to 0.1.5
e0384974
danieldk Update kernels
520420a2
danieldk Update tgi-nix flake for hf-kernels
371668ee
danieldk Fix EOF
875ce6d5
danieldk Take `load_kernel` out of a frequently-called function
f74a50d4
danieldk Hoist another case of kernel loading out of a somewhat hot function
8aecc59e
danieldk marlin-kernels -> quantization
8ad383c7
danieldk attention -> paged-attention
96a4d4d0
danieldk EOF fix
219b8b16
danieldk Update hf-kernels, fixup Docker
8ae7bc38
danieldk force pushed from 44eb8257 to 8ae7bc38 317 days ago
danieldk ipex fix
df582a18
danieldk Remove outdated TODO
fc3ac807
danieldk requested a review from Narsil 316 days ago
Narsil approved these changes on 2025-02-10
danieldk merged 571ac9b5 into main 314 days ago
danieldk deleted the kernel-hub branch 314 days ago
