Add fp8-fused gemm kernel (#5764)

1 year ago

This PR adds a new fused kernel for dense GEMM with fp8-quantized weights.

Co-authored-by: Jeff Rasley <jeffra45@gmail.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>