pytorch
8b87f9a5 - Add fused layer norm impl on CUDA in PyTorch (#27634)

Add fused layer norm impl on CUDA in PyTorch (#27634)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27634

Add a fused layer norm implementation on CUDA in PyTorch.

Performance benchmark compared to apex.FusedLayerNorm on a V100 machine (times in ms):

Shape            curr fwd   apex fwd   curr bwd   apex bwd
(128, 2097152)   7.253      10.367     15.568     20.870
(256, 1048576)   5.186      6.387      13.942     15.470
(512, 524288)    4.672      4.718      13.464     14.048
(1024, 262144)   4.547      5.378      13.425     14.235
(2048, 131072)   4.526      4.775      13.223     13.596
(4096, 65536)    4.288      4.489      13.027     13.571
(8192, 32768)    4.244      4.346      13.141     13.499
(16384, 16384)   4.181      4.269      13.036     13.463
(32768, 8192)    4.098      4.109      13.041     13.586

Test Plan: buck test mode/dev-nosan caffe2/test:nn -- "LayerNorm"

Reviewed By: houseroad

Differential Revision: D17462420

fbshipit-source-id: d4a67d160bb4eff73ffac64af46c56c3845cf211
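For context on what the fused kernel computes: layer normalization normalizes each row of the input to zero mean and unit variance, then applies a learned per-element scale (gamma) and shift (beta). The sketch below is a minimal pure-Python reference of the forward pass only, not the fused CUDA kernel from this commit; the function name and list-based interface are illustrative assumptions.

```python
def layer_norm(x, gamma, beta, eps=1e-5):
    """Reference layer norm forward over one row.

    x, gamma, beta are equal-length lists of floats; eps matches the
    default epsilon used by torch.nn.LayerNorm.
    """
    n = len(x)
    mean = sum(x) / n
    # Biased (population) variance, as in layer norm.
    var = sum((v - mean) ** 2 for v in x) / n
    inv_std = (var + eps) ** -0.5
    return [gamma[i] * (x[i] - mean) * inv_std + beta[i] for i in range(n)]
```

The fused implementation computes the mean and variance, normalization, and affine transform in a single kernel launch instead of separate passes, which is where the speedup over unfused code comes from.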