Don't zero out buffers in dynamic linear (#27002)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27002
Zeroing out these buffers was taking a significant amount of time in my benchmarks with larger output sizes (e.g. the final output projection in a language classification model).
Test Plan: Imported from OSS
Differential Revision: D17641765
Pulled By: jamesr66a
fbshipit-source-id: b0ef30767eec9774fc503bb51fed039222026bba