[pytorch/ops] Concat fast path w/ zero tensor (#46805)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46805
The current implementation falls back to the slow path whenever the input list contains an empty (zero-element) tensor, which is inefficient. Use the fast path for torch.cat even if some of the inputs are empty. This wastes one thread block on the empty tensor, but is still much better than taking the slow path.
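A rough pure-Python model of the change, for illustration only. It is not the actual CUDA kernel or dispatch code: `can_use_fast_path`, `allow_empty`, and `batched_concat` are hypothetical names, each loop iteration stands in for one thread block, and the real fast path has other eligibility conditions that this sketch ignores.

```python
def can_use_fast_path(numels, allow_empty):
    """Hypothetical stand-in for the path selection in torch.cat.

    numels: element count of each input tensor.
    allow_empty=False models the old behavior (any empty input forces
    the slow path); allow_empty=True models the behavior after this change.
    """
    if allow_empty:
        return True
    return all(n > 0 for n in numels)


def batched_concat(inputs):
    """Toy 1-D concat: one slot (standing in for a thread block) per input.

    Returns the concatenated list and the number of slots that had
    nothing to copy because their input was empty.
    """
    out = []
    idle_slots = 0
    for chunk in inputs:
        if len(chunk) == 0:
            # The slot is still scheduled but does no work.
            idle_slots += 1
            continue
        out.extend(chunk)
    return out, idle_slots


inputs = [[1, 2], [], [3]]
numels = [len(x) for x in inputs]
old_fast = can_use_fast_path(numels, allow_empty=False)
new_fast = can_use_fast_path(numels, allow_empty=True)
result, idle = batched_concat(inputs)
```

In this model the old check rejects the list because of the single empty input, while the new check accepts it at the cost of one idle slot.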
Test Plan: CI + sandcastle
Reviewed By: ngimel
Differential Revision: D24524441
fbshipit-source-id: 529c8af51ecf8374621deee3a9d16cacbd214741