Direct use python numpy array's memory if already contiguous. (#2355)
* Direct use python numpy array's memory if already contiguous. This
could greatly improve performance for session with large input,
like big image 1920x1080 fastrcnn, 30~40% speed up could be achieved.
* Add test case enforce contiguous/non-contiguos numpy array as inputs.