[Vulkan] Fix issues in GRU and LSTM (#83722)
Summary:
This diff fixes several issues in the GRU and LSTM Vulkan ops:
- Add create_gru_context and create_lstm_context to vulkanFoldPrePackingOps
- Add a filter to insertPrePackedGruOp and insertPrePackedLstmOp to avoid matching usages of gru.data and lstm.data
- Fix the output dimension of GRU and LSTM
- Allow batch_first to be false when batch=1 and seq=1
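The relaxed batch_first condition from the last item can be illustrated with a minimal sketch. This is not the actual Vulkan C++ check; the helper name `layout_is_acceptable` and its parameters are hypothetical and exist only for illustration. The idea it shows: when batch=1 and seq=1, a (batch, seq, features) input and a (seq, batch, features) input both have shape (1, 1, features), so the value of batch_first makes no difference.

```python
def layout_is_acceptable(batch_first: bool, batch: int, seq: int) -> bool:
    """Hypothetical helper, not part of PyTorch.

    Assumes the op only handles batch-first input, and that non-batch-first
    input is tolerated only when both leading dimensions are 1, because the
    two layouts then describe the same (1, 1, features) tensor.
    """
    return batch_first or (batch == 1 and seq == 1)


def shape_for(batch_first: bool, batch: int, seq: int, features: int) -> tuple:
    """Return the input shape implied by the batch_first flag."""
    if batch_first:
        return (batch, seq, features)
    return (seq, batch, features)


# With batch=1 and seq=1 both layouts give the same shape.
same_shape = shape_for(True, 1, 1, 8) == shape_for(False, 1, 1, 8)
```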
Test Plan:
Check that optimize_for_mobile runs and correctly folds the create context ops
```
buck run :export_for_mobile ~/ferraris/ferraris.ptl ~/ferraris
```
Check that the Vulkan API tests are still passing
```
buck run //xplat/caffe2:pt_vulkan_api_test_binAppleMac\#macosx-arm64
```
Reviewed By: SS-JIA
Differential Revision: D38811967
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83722
Approved by: https://github.com/SS-JIA