CUDA: MoE helper in device code, better tile sizes #15525
CUDA: MoE helper in device code, better tile sizes
8bb55de6
reduce superfluous CUDA blocks
07c814b5
IMbackK requested changes on 2025-08-23
try AMD fix
e7b884da
IMbackK approved these changes on 2025-08-23
raise shared memory limit
57249900
reduce shared memory use
a2f702a9
add assert
1d609235
Assignees
No one assigned