Selective dequantization (#5375)

Commit

2 years ago

Selective dequantization (#5375)

This PR adds a new function to the dequantizer, `selective_dequantize`, which dequantizes only part of a 3-dimensional tensor when not all of the data needs to be converted from a lower-bit format (such as fp8/fp6) to bf16. I also added a unit test to check its functionality.

Co-authored-by: Reza Yazdani <reza.yazdani@snowflake.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
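To illustrate the idea of partial dequantization described in the commit message, here is a minimal pure-Python sketch. It is not the DeepSpeed implementation or its API: the function names `quantize_slices` and `selective_dequantize_sketch`, the `indexes` parameter, and the choice to select slices along the first dimension are all assumptions made for illustration. It also swaps in simple int8 linear quantization with one scale per slice in place of the fp8/fp6 formats, and returns Python floats rather than bf16.

```python
from typing import List, Sequence, Tuple

Tensor3D = List[List[List[float]]]
Codes3D = List[List[List[int]]]


def quantize_slices(tensor: Tensor3D, num_bits: int = 8) -> Tuple[Codes3D, List[float]]:
    """Hypothetical helper: symmetric linear quantization, one scale per slice."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    codes: Codes3D = []
    scales: List[float] = []
    for matrix in tensor:
        max_abs = max(abs(v) for row in matrix for v in row)
        scale = max_abs / qmax if max_abs > 0 else 1.0
        codes.append([[round(v / scale) for v in row] for row in matrix])
        scales.append(scale)
    return codes, scales


def selective_dequantize_sketch(
    codes: Codes3D, scales: Sequence[float], indexes: Sequence[int]
) -> Tensor3D:
    """Dequantize only the slices named in `indexes` (assumed: first dimension).

    Slices that are not requested are never touched, which is the point of
    doing the dequantization selectively.
    """
    return [
        [[code * scales[i] for code in row] for row in codes[i]]
        for i in indexes
    ]


# Example data: three 2x2 slices, of which only two are dequantized.
tensor = [
    [[1.0, -2.0], [0.5, 0.25]],
    [[10.0, 20.0], [30.0, 40.0]],
    [[-0.1, 0.2], [0.3, -0.4]],
]
codes, scales = quantize_slices(tensor)
selected = selective_dequantize_sketch(codes, scales, indexes=[2, 0])
```

In this sketch the result holds one dequantized slice per requested index, in the order the indexes were given, so the cost scales with the number of selected slices rather than with the full tensor.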

References

#5375 - Selective dequantization

Author

RezaYazdaniAminabadi

Parents

64defe65

DeepSpeed c632ea09 - Selective dequantization (#5375)