pytorch
790763b0 - Add an option to disable reduced precision reductions for FP16 GEMM (#67946)

Commit

2 years ago

Add an option to disable reduced precision reductions for FP16 GEMM (#67946)

Summary:
https://github.com/pytorch/pytorch/issues/67578 disabled reduced precision reductions for FP16 GEMMs. After benchmarking, we've found that this has substantial performance impacts for common GEMM shapes (e.g., those found in popular instantiations of multiheaded-attention) on architectures such as Volta. As these performance regressions may come as a surprise to current users, this PR adds a toggle, `torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction`, to disable reduced precision reductions rather than making that the default behavior.

CC ngimel ptrblck stas00

Note that the behavior after the previous PR can be replicated with `torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction = False`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67946

Reviewed By: zou3519

Differential Revision: D32289896

Pulled By: ngimel

fbshipit-source-id: a1ea2918b77e27a7d9b391e030417802a0174abe
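As a rough illustration of how the toggle might be used, the sketch below wraps the flag named in the commit message in a context manager that restores the previous value on exit. The helper name `fp16_reduced_precision_reduction` is hypothetical and not part of PyTorch; only the `torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction` attribute comes from this commit. The matmul at the end is skipped when no CUDA device is present.

```python
import contextlib

import torch


@contextlib.contextmanager
def fp16_reduced_precision_reduction(enabled: bool):
    """Temporarily set the FP16 GEMM reduced-precision-reduction flag.

    Hypothetical helper for illustration; it only reads and writes the
    flag introduced by this commit and restores the old value on exit.
    """
    matmul = torch.backends.cuda.matmul
    previous = matmul.allow_fp16_reduced_precision_reduction
    matmul.allow_fp16_reduced_precision_reduction = enabled
    try:
        yield
    finally:
        # Restore whatever the caller had configured before.
        matmul.allow_fp16_reduced_precision_reduction = previous


if torch.cuda.is_available():
    a = torch.randn(1024, 1024, dtype=torch.float16, device="cuda")
    b = torch.randn(1024, 1024, dtype=torch.float16, device="cuda")
    # Disable reduced precision reductions only for this FP16 GEMM.
    with fp16_reduced_precision_reduction(False):
        c = a @ b
```

Scoping the change this way keeps the process-wide default untouched, so only the GEMMs that need the stricter reduction give up the faster path.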

References

#68130 - Merge master

Author

eqy

Committer

facebook-github-bot

Parents

078c6559
