[ONNX] Support constant tensors in FakeMode exporting (#107836)
Fixes https://github.com/pytorch/pytorch/issues/107475
- Constant tensors were wrongly recognized as weights and buffers, and were then detached from their default values during `to_model_proto`. This PR fixes the bug and re-enables the Bloom CI test, which now passes. NOTE: non-persistent buffers and weights are a different case and are not fixed by this PR.
- Reduce the size of the transformers models by modifying their config parameters to speed up CI tests. (Unrelated to the PR title.)
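
For context on the first item, the sketch below shows one way to tell parameters, registered buffers, and plain constant tensor attributes apart on an `nn.Module`. It is illustrative only and is not the exporter's code: `WithConstant` and `classify_tensor_attributes` are hypothetical names, and the helper only inspects top-level attributes of a single module.

```python
import torch


class WithConstant(torch.nn.Module):
    """Hypothetical module holding one tensor of each kind."""

    def __init__(self):
        super().__init__()
        # Learnable weight: stored in module._parameters.
        self.weight = torch.nn.Parameter(torch.ones(2))
        # Persistent buffer: stored in module._buffers and in state_dict().
        self.register_buffer("running", torch.zeros(2))
        # Plain tensor attribute: a constant, neither a parameter nor a buffer.
        self.const = torch.tensor([1.0, 2.0])

    def forward(self, x):
        return x * self.weight + self.running + self.const


def classify_tensor_attributes(module):
    """Split a module's top-level tensors into parameters, buffers, constants."""
    params = {name for name, _ in module.named_parameters(recurse=False)}
    buffers = {name for name, _ in module.named_buffers(recurse=False)}
    # Plain tensor attributes live directly in the instance __dict__.
    constants = {
        name
        for name, value in vars(module).items()
        if isinstance(value, torch.Tensor)
        and name not in params
        and name not in buffers
    }
    return params, buffers, constants
```

A constant such as `const` does not appear in `state_dict()`, so its value has to come from the module itself rather than from a checkpoint.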
Corresponding change in onnxscript: https://github.com/microsoft/onnxscript/pull/1023
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107836
Approved by: https://github.com/BowenBao, https://github.com/justinchuby