[cuDNN v8 API] cuDNN benchmark, convolution bwd / transposed convolution fwd, `bfloat16`, conv-bias-activation fusion (#60755)
Summary:
Addresses https://github.com/pytorch/pytorch/issues/58414, https://github.com/pytorch/pytorch/issues/58859, https://github.com/pytorch/pytorch/issues/58858, https://github.com/pytorch/pytorch/issues/58860, and https://github.com/pytorch/pytorch/issues/58861
We are currently testing performance with both the "find" and "get" heuristics as part of this PR.
CC zasdfgbnm ptrblck ngimel puririshi98
In addition to the `USE_EXPERIMENTAL_CUDNN_V8_API` build flag, we've added a `CUDNN_V8_API_ENABLED` runtime feature flag.
`USE_EXPERIMENTAL_CUDNN_V8_API=1` will build with v8 API support while keeping all v7 functionality, with v8 usage disabled by default.
`CUDNN_V8_API_ENABLED=1` at runtime on a `USE_EXPERIMENTAL_CUDNN_V8_API=1` build uses the v8 API.
A debug flag `CUDNN_V8_API_DEBUG=1` can be used to verify which API is used when dispatching convolutions.
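As a sketch of the flags described above (the script names here are illustrative; `setup.py` is the standard PyTorch source build entry point, but your build invocation may differ):

```shell
# Build with cuDNN v8 API support compiled in; v7 remains the
# default dispatch path until opted in at runtime.
USE_EXPERIMENTAL_CUDNN_V8_API=1 python setup.py develop

# On such a build, opt in to v8 dispatch at runtime, with the debug
# flag enabled to verify which API handles each convolution.
# (my_training_script.py is a placeholder for your own workload.)
CUDNN_V8_API_ENABLED=1 CUDNN_V8_API_DEBUG=1 python my_training_script.py
```

Without `CUDNN_V8_API_ENABLED=1`, the same binary keeps using the v7 code path, so the two can be compared on identical builds.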
Note that with the v7 API, `bfloat16` convolutions dispatch to a native PyTorch implementation, whereas a fully v8-enabled build dispatches them to cuDNN implementations.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60755
Reviewed By: mruberry
Differential Revision: D34393940
Pulled By: ngimel
fbshipit-source-id: 5c317d3aad63336ea416a51a43cf8b7d27aaca21
(cherry picked from commit 3bfc549ce57cee691f83dc894ac7adb4b7882459)