Add methods to write image tensor content to buffer (#27359)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27359
Adding methods to TensorImageUtils:
```
bitmapToFloatBuffer(..., FloatBuffer outBuffer, int outBufferOffset)
imageYUV420CenterCropToFloat32Tensor(..., FloatBuffer outBuffer, int outBufferOffset)
```
These make it possible to:
- reuse a FloatBuffer across inference runs
- create a batch Tensor (containing several images/bitmaps)
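A minimal sketch of the batching idea, using only `java.nio` (the `TensorImageUtils` call is shown as a comment; the sizes and the stubbed pixel write are illustrative assumptions, not from this PR):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

public class BatchBufferSketch {
    // Hypothetical sizes for illustration: a batch of 2 RGB images, 224x224.
    static final int BATCH = 2, CHANNELS = 3, H = 224, W = 224;
    static final int IMAGE_FLOATS = CHANNELS * H * W;

    // Direct, native-order buffer so a native tensor can wrap it without
    // copying (the layout org.pytorch.Tensor.fromBlob expects).
    static FloatBuffer allocateBatchBuffer() {
        return ByteBuffer.allocateDirect(BATCH * IMAGE_FLOATS * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
    }

    public static void main(String[] args) {
        FloatBuffer outBuffer = allocateBatchBuffer();
        for (int i = 0; i < BATCH; i++) {
            int outBufferOffset = i * IMAGE_FLOATS;
            // With the new API, each slot would be filled by e.g.
            // TensorImageUtils.bitmapToFloatBuffer(bitmap, ..., outBuffer, outBufferOffset);
            // Here the write is stubbed to show only the offset arithmetic.
            for (int j = 0; j < IMAGE_FLOATS; j++) {
                outBuffer.put(outBufferOffset + j, i + 0.5f);
            }
        }
        System.out.println(outBuffer.capacity());
    }
}
```

The single buffer then backs one batched input tensor instead of one freshly allocated buffer per image.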
When the FloatBuffer is reused in the example demo app (image classification), the profiler shows fewer memory allocations (previously every run created a new input tensor with a freshly allocated FloatBuffer) and about 20 ms less per run on my Pixel XL.
Known open question:
At the moment every tensor element is written separately via `outBuffer.put()`, which is a native call crossing language boundaries.
As an alternative, we could allocate a `float[]` on the Java side, fill it, and put it into `outBuffer` with one call, reducing native calls but increasing memory allocation on the Java side.
Tested locally, just eyeballing durations: I did not notice a big difference, so I decided to go with fewer memory allocations.
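The two strategies above can be sketched with plain `java.nio` (the buffer size and fill values are illustrative assumptions; neither function is from the PR itself):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

public class PutStrategies {
    static final int N = 3 * 224 * 224; // floats per image, illustrative

    static FloatBuffer newDirectBuffer() {
        return ByteBuffer.allocateDirect(N * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
    }

    // Strategy chosen in the PR: write each element straight into the
    // direct FloatBuffer. No extra Java-side allocation, but every
    // element is an individual put() into the native buffer.
    static void perElementPut(FloatBuffer out) {
        for (int i = 0; i < N; i++) {
            out.put(i, i * 0.001f);
        }
    }

    // Alternative considered: stage values in a Java float[] and copy
    // once with a single bulk put(). Fewer buffer calls, but one extra
    // N-float allocation per converted image.
    static void bulkPut(FloatBuffer out) {
        float[] staging = new float[N];
        for (int i = 0; i < N; i++) {
            staging[i] = i * 0.001f;
        }
        out.put(staging);
    }

    public static void main(String[] args) {
        FloatBuffer a = newDirectBuffer();
        FloatBuffer b = newDirectBuffer();
        perElementPut(a);
        bulkPut(b);
        // Both strategies produce identical buffer contents.
        System.out.println(a.get(100) == b.get(100));
    }
}
```

Both fill the buffer with the same contents; the trade-off is purely per-call overhead versus per-image garbage.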
It would be good to merge this into 1.3.0; if not, the demo app can use snapshot dependencies with this change.
PR with integration to demo app:
https://github.com/pytorch/android-demo-app/pull/6
Test Plan: Imported from OSS
Differential Revision: D17758621
Pulled By: IvanKobzarev
fbshipit-source-id: b4f1a068789279002d7ecc0bc680111f781bf980