vllm
[Kernel] Marlin_24: Ensure the mma.sp instruction is using the ::ordered_metadata modifier (introduced with PTX 8.5)
#5136
Merged

alexm-redhat commented 1 year ago

PTX 8.5, released recently (on May 9, 2024), introduced a new modifier, ordered_metadata, for the mma.sp 2:4 warp-level sparse instruction. This modifier requires that the indices in the sparsity metadata be sorted in increasing order, starting from the LSB. In our case, format_24.py (lines 96-102) already meets this requirement because of the following encodings:

    # Encoding quadruples of True/False values as follows:
    #     [True,  True,  False, False] -> 0b0100
    #     [True,  False, True,  False] -> 0b1000
    #     [False, True,  True,  False] -> 0b1001
    #     [True,  False, False, True ] -> 0b1100
    #     [False, True,  False, True ] -> 0b1101
    #     [False, False, True,  True ] -> 0b1110 

Therefore, we can simply add the new modifier and nothing else is required.
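
To make the ordering claim concrete, here is a minimal sketch that rebuilds the table above and checks it. It is not the code from format_24.py: the names encode_quad, is_ordered and EXPECTED are hypothetical, and the bit layout (index of the first kept element in the low two bits, index of the second in the high two bits) is an assumption read off the table.

```python
def encode_quad(mask):
    """Pack the positions of the two True entries of a quadruple into 4 bits.

    Assumed layout (inferred from the table in the comment above):
    bits 0-1 hold the first kept position, bits 2-3 hold the second.
    """
    kept = [i for i, keep in enumerate(mask) if keep]
    if len(mask) != 4 or len(kept) != 2:
        raise ValueError("expected a quadruple with exactly two True values")
    lo, hi = kept  # enumerate yields ascending positions, so lo < hi
    return lo | (hi << 2)


def is_ordered(code):
    """True if the low 2-bit index is strictly smaller than the high one."""
    return (code & 0b11) < (code >> 2)


# The six encodings quoted from the format_24.py comment.
EXPECTED = {
    (True, True, False, False): 0b0100,
    (True, False, True, False): 0b1000,
    (False, True, True, False): 0b1001,
    (True, False, False, True): 0b1100,
    (False, True, False, True): 0b1101,
    (False, False, True, True): 0b1110,
}

for mask, code in EXPECTED.items():
    assert encode_quad(mask) == code
    assert is_ordered(code)
```

Every entry stores the smaller index in the low bits, which is the increasing-from-the-LSB order the modifier asks for.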

alexm-redhat marlin_24: Ensure the mma.sp instruction is using the ::ordered_metad…
71dc27c5
alexm-redhat clang-format
071bb64d
robertgshaw2-redhat approved these changes on 2024-05-30
robertgshaw2-redhat enabled auto-merge (squash) 1 year ago
tlrmchlsmth approved these changes on 2024-05-30
tlrmchlsmth commented 1 year ago

Thanks for the fix!

disabled auto-merge 1 year ago
Manually disabled by user
simon-mo merged 6d21fa1c into main 1 year ago
simon-mo commented 1 year ago

Hi @alexm-neuralmagic, I'm going to revert this PR because it doesn't compile on CUDA 12.1 and 11.8 and our release pipeline is currently stuck there. Sorry!

https://github.com/vllm-project/vllm/actions/runs/9311899829/job/25631836467#step:8:1873

ptxas /tmp/tmpxft_0000a2c3_00000000-9_marlin_24_cuda_kernel.compute_80.ptx, line 557; fatal   : Parsing error near ':': syntax error
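
The parsing error is consistent with an assembler that predates PTX 8.5 and so does not recognise the ::ordered_metadata suffix. One way to keep older toolchains building is to emit the modifier only when the toolkit is new enough. The sketch below illustrates that decision as a build-time helper. It is not what this PR or the revert does: mma_sp_opcode and parse_cuda_version are hypothetical names, and the threshold assumes PTX 8.5 first shipped with CUDA 12.5.

```python
# Assumption: PTX ISA 8.5 corresponds to CUDA toolkit 12.5.
PTX_8_5_MIN_CUDA = (12, 5)


def parse_cuda_version(text):
    """Turn a version string such as '12.1' into a (major, minor) tuple."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def mma_sp_opcode(cuda_version):
    """Return the mma.sp opcode prefix that the given toolkit can assemble."""
    if parse_cuda_version(cuda_version) >= PTX_8_5_MIN_CUDA:
        return "mma.sp::ordered_metadata"
    return "mma.sp"
```

With this helper, the two toolkits named in the comment above (12.1 and 11.8) fall back to the plain mma.sp form, and 12.5 or later gets the modifier.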
