Support Linear operation with fp16 weights in ATen (#22023)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22023
This diff implements a Linear operation with fp16 weights based on FBGEMM. At a high level, we want to perform the following operation:
Y = X * W + B, with dtypes (Y, X, W, B) = (fp32, fp32, fp16, fp32)
To do that, three steps are needed:
1. Quantize the weights from fp32 to fp16; this is done using `PackedGemmMatrixFP16` in `fbgemm_pack_gemm_matrix_fp16`.
2. Perform the matrix multiplication with the quantized weights using `cblas_gemm_compute` in `fbgemm_linear_fp16_weight`.
3. Add the bias to the result from step 2 and return the final Y (see the sketch after this list).
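A minimal usage sketch of the two ops, assuming they are exposed on the `torch` namespace with `(input)` and `(input, packed_weight, bias)` style signatures (the exact Python bindings are not shown in this summary, and a PyTorch build with FBGEMM support on x86 is required):

```python
import torch

# Shapes follow nn.Linear: X is (M, K), W is (N, K), B is (N,).
X = torch.randn(4, 8)    # fp32 activations
W = torch.randn(16, 8)   # fp32 weights; quantized to fp16 when packed
B = torch.randn(16)      # fp32 bias

# Step 1: quantize W to fp16 and pack it into FBGEMM's GEMM layout.
packed_W = torch.fbgemm_pack_gemm_matrix_fp16(W)

# Steps 2-3: GEMM with the fp16-packed weights plus bias; Y is fp32.
Y = torch.fbgemm_linear_fp16_weight(X, packed_W, B)

# Pure-fp32 reference; results should match up to fp16 rounding of W.
Y_ref = torch.nn.functional.linear(X, W, B)
print(torch.allclose(Y, Y_ref, rtol=1e-2, atol=1e-2))
```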
Reviewed By: jianyuh
Differential Revision: D15921768
fbshipit-source-id: dc4e5b366f846ce9d58975876940a9b3372b8b8d