Add channels-last support to bundled_inputs (#36764)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36764
This allows bundled_inputs to bundle large uniform buffers (tensors
whose elements all share a single value) that are stored in
channels-last memory format.
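
For illustration only, and not the code from this change: a minimal
pure-Python sketch of why a uniform buffer can be bundled cheaply in
either memory format. Only the fill value, the shape, and the layout
need to be recorded, since the layout just determines the strides. The
helper names (contiguous_strides, channels_last_strides,
describe_uniform) are hypothetical and are not part of bundled_inputs.

```python
from typing import Dict, Tuple


def contiguous_strides(shape: Tuple[int, ...]) -> Tuple[int, ...]:
    """Row-major strides (in elements), e.g. the default NCHW layout."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def channels_last_strides(shape: Tuple[int, int, int, int]) -> Tuple[int, ...]:
    """Strides for a logical (N, C, H, W) shape stored physically as NHWC."""
    n, c, h, w = shape
    return (h * w * c, 1, w * c, c)


def describe_uniform(value: float, shape: Tuple[int, int, int, int],
                     channels_last: bool) -> Dict[str, object]:
    """Hypothetical compact record for a buffer whose elements all equal `value`.

    The record holds one scalar plus layout metadata instead of every element.
    """
    strides = channels_last_strides(shape) if channels_last else contiguous_strides(shape)
    return {"value": value, "shape": shape, "strides": strides}


record = describe_uniform(0.0, (2, 3, 4, 5), channels_last=True)
print(record["strides"])  # (60, 1, 15, 3)
```

In channels-last the channel dimension has stride 1, so the record must
keep the layout flag in order to reproduce the same strides when the
buffer is rebuilt.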
Test Plan: Unit test.
Differential Revision: D21142660
Pulled By: dreiss
fbshipit-source-id: 31bbea6586d07c1fd0bcad4cb36ed2b8bb88a7e4