[vulkan] Do not populate unpacked args of PackedContexts when deserializing (#83587)
Vulkan ops that use `PackedContext` objects currently maintain two lists storing the parameters of the op:
1. `unpacked_`, which stores the original arguments passed to the op.
2. `packed_`, which stores the pre-processed arguments used for inference.
The `unpacked_` list is needed only for serialization. During inference, when the model is not expected to be saved, there is no reason to keep the `unpacked_` list in memory.
This diff introduces a flag `fill_unpacked`, by default set to `true`, that is passed into the `*PackedContext()` constructors. `unpacked_` is populated only if `fill_unpacked = true`.
The `create_*_context()` functions will call the constructor with `fill_unpacked = true`, which ensures that `unpacked_` is populated for serialization.
However, when a model is loaded, the `*PackedContext` objects are deserialized by calling `*PackedContext::pack()`, which calls the constructor with `fill_unpacked = false`. The original tensors are therefore discarded after packing, which saves a significant amount of CPU memory during model inference.
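The pattern described above can be sketched in plain C++ as follows. This is a minimal, self-contained illustration under stated assumptions, not the actual PyTorch code: `ExamplePackedContext`, `FakeTensor`, `prepack()`, `create_example_context()`, and `example_usage()` are hypothetical names, the signature of `pack()` is simplified, and the zero-padding step merely stands in for the real Vulkan packing logic.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tensor: a flat buffer of floats.
using FakeTensor = std::vector<float>;

// Hypothetical context class; the real *PackedContext classes differ.
class ExamplePackedContext {
 public:
  explicit ExamplePackedContext(const FakeTensor& weight,
                                bool fill_unpacked = true) {
    // Always build the pre-processed form used at inference time.
    packed_.push_back(prepack(weight));
    // Keep the original argument only if it may be needed for serialization.
    if (fill_unpacked) {
      unpacked_.push_back(weight);
    }
  }

  // Deserialization path: rebuild from saved arguments without
  // retaining them afterwards.
  static ExamplePackedContext pack(const std::vector<FakeTensor>& unpacked) {
    return ExamplePackedContext(unpacked.at(0), /*fill_unpacked=*/false);
  }

  const std::vector<FakeTensor>& unpacked() const { return unpacked_; }
  const std::vector<FakeTensor>& packed() const { return packed_; }

 private:
  // Placeholder "packing": zero-pad the buffer to a multiple of 4 elements.
  static FakeTensor prepack(const FakeTensor& weight) {
    FakeTensor out = weight;
    const std::size_t padded = (weight.size() + 3) / 4 * 4;
    out.resize(padded, 0.0f);
    return out;
  }

  std::vector<FakeTensor> unpacked_;  // original arguments (serialization only)
  std::vector<FakeTensor> packed_;    // pre-processed arguments (inference)
};

// Creation path: keeps the originals so the context can be serialized later.
ExamplePackedContext create_example_context(const FakeTensor& weight) {
  return ExamplePackedContext(weight, /*fill_unpacked=*/true);
}

// Usage: a created context keeps its originals; a deserialized one does not.
void example_usage() {
  const FakeTensor weight = {1.0f, 2.0f, 3.0f};
  const ExamplePackedContext created = create_example_context(weight);
  assert(created.unpacked().size() == 1);

  const ExamplePackedContext loaded =
      ExamplePackedContext::pack(created.unpacked());
  assert(loaded.unpacked().empty());
  assert(loaded.packed() == created.packed());
}
```

In this sketch, a context built through `create_example_context()` can still be serialized from `unpacked_`, while one rebuilt through `pack()` holds only the packed data.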
Differential Revision: [D38761645](https://our.internmc.facebook.com/intern/diff/D38761645/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83587
Approved by: https://github.com/kimishpatel