[UR][Offload] Fixes for enqueue UR CTS tests (#19926)
A small selection of fixes to increase the pass rate of the enqueue CTS
unit tests:
* Blocking memory reads/writes now properly wait on the queue.
* `urKernelSetArgMemObj` added to the function table.
* Debug print removed.
* Layout of kernel arguments now matches the HIP target when Offload is
running on an AMD device.
* `urEnqueueEventsWaitWithBarrierExt` has been implemented (it simply
forwards to the non-Ext version).
* `UR_DEVICE_INFO_TIMESTAMP_RECORDING_SUPPORT_EXP` set to false.
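
For reviewers, the forwarding described for `urEnqueueEventsWaitWithBarrierExt`
can be sketched roughly as below. This is a minimal illustration and not the
adapter code itself: every type and function name in it (`QueueStub`,
`EventStub`, `ExtProperties`, `Result`, `enqueueEventsWaitWithBarrier`,
`enqueueEventsWaitWithBarrierExt`) is a hypothetical stand-in so that the
snippet compiles without the UR headers, and the body of the non-Ext function
is a placeholder.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the UR handle, property and result types.
struct QueueStub { int id; };
struct EventStub { int id; };
struct ExtProperties { uint32_t flags; };
enum Result { RESULT_SUCCESS = 0, RESULT_ERROR_INVALID_NULL_HANDLE = 1 };

// Placeholder for the existing non-Ext entry point (assumed behaviour:
// rejects a null queue, otherwise reports success and clears the out event).
Result enqueueEventsWaitWithBarrier(QueueStub *queue, uint32_t numEvents,
                                    EventStub *const *waitList,
                                    EventStub **outEvent) {
  if (queue == nullptr)
    return RESULT_ERROR_INVALID_NULL_HANDLE;
  (void)numEvents;
  (void)waitList;
  if (outEvent != nullptr)
    *outEvent = nullptr;
  return RESULT_SUCCESS;
}

// Ext variant: ignores the extra properties and forwards everything else
// to the non-Ext version unchanged.
Result enqueueEventsWaitWithBarrierExt(QueueStub *queue,
                                       const ExtProperties * /*properties*/,
                                       uint32_t numEvents,
                                       EventStub *const *waitList,
                                       EventStub **outEvent) {
  return enqueueEventsWaitWithBarrier(queue, numEvents, waitList, outEvent);
}
```

Because the properties argument is dropped in this sketch, the Ext call
returns exactly what the non-Ext call would for the same queue and wait list.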