pytorch
38da54e9 - Split rnn primitive for inference and training (#96736)

2 years ago
## Description

Currently, both inference and training use `forward_training` in the rnn primitive, which degrades inference performance (the slowdown comes from the rnn primitive itself and from the unnecessary creation of `pd` and `workspace`). This PR splits the two paths into separate `forward_inference` and `forward_training` calls.

## Performance

With this fix, RNN-T inference time is reduced by 167 ms, an improvement of about `3.7%` in end-to-end (E2E) time.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96736
Approved by: https://github.com/jgong5
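The split described above can be illustrated with a small, self-contained sketch. This is not the code from the PR: the toy recurrence `_cell`, the `RnnResult` container, and the `rnn_forward` dispatcher are hypothetical names chosen for illustration, and only the names `forward_inference`, `forward_training`, and `workspace` come from the commit message. The idea shown is simply that the inference path can skip allocating state that only the backward pass needs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RnnResult:
    output: List[float]
    # Hypothetical stand-in for the primitive's workspace; it is only
    # needed when a backward pass will follow.
    workspace: Optional[List[float]]


def _cell(x: float, h: float) -> float:
    # Toy recurrence used as a placeholder for the real RNN computation.
    return 0.5 * x + 0.5 * h


def forward_training(inputs: List[float]) -> RnnResult:
    # Training path: record intermediate states in a workspace.
    h = 0.0
    output: List[float] = []
    workspace: List[float] = []
    for x in inputs:
        h = _cell(x, h)
        output.append(h)
        workspace.append(h)
    return RnnResult(output, workspace)


def forward_inference(inputs: List[float]) -> RnnResult:
    # Inference path: same outputs, but no workspace is created.
    h = 0.0
    output: List[float] = []
    for x in inputs:
        h = _cell(x, h)
        output.append(h)
    return RnnResult(output, None)


def rnn_forward(inputs: List[float], training: bool) -> RnnResult:
    # Dispatch on the mode instead of always taking the training path.
    if training:
        return forward_training(inputs)
    return forward_inference(inputs)
```

In this sketch both paths return identical outputs; the only difference is that the inference path avoids the extra allocation, which is the kind of overhead the commit message attributes to using `forward_training` for inference.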