Apply review suggestions

Replace CU_MEMHOSTALLOC_DEVICEMAP with CU_MEMHOSTREGISTER_DEVICEMAP in
the CUDA adapter's allocateMemObjOnDeviceIfNeeded. The former is a
cuMemHostAlloc flag; the call here is cuMemHostRegister, which takes
the CU_MEMHOSTREGISTER_* flags.

Add a HostPtrRegisteredByUR flag to the HIP BufferMem to record whether
UR made the hipHostRegister call. clear() now calls hipHostUnregister
only when UR owns the registration, so host memory that the user had
already registered is no longer unregistered by UR.
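
For reviewers, a minimal sketch of the ownership rule the new flag
enforces. It is not the adapter code: BufferSketch, useHostPtr,
fakeHostRegister, fakeHostUnregister and the AlreadyRegistered
parameter are hypothetical names invented for illustration, and the two
fake functions only count calls so the sketch builds without the HIP
runtime. How the adapter actually detects a pre-registered pointer is
not shown here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for hipHostRegister / hipHostUnregister.
// They only count calls; no memory is actually pinned.
static int RegisterCalls = 0;
static int UnregisterCalls = 0;
inline int fakeHostRegister(void *, std::size_t) { ++RegisterCalls; return 0; }
inline int fakeHostUnregister(void *) { ++UnregisterCalls; return 0; }

// Simplified model of a buffer; not the adapter's real BufferMem layout.
struct BufferSketch {
  void *HostPtr = nullptr;
  std::size_t Size = 0;
  bool HostPtrRegisteredByUR = false; // true only if we did the registration

  // Wrap a host pointer. AlreadyRegistered is an assumed input that models
  // the case where the user registered the memory before handing it over.
  void useHostPtr(void *Ptr, std::size_t Bytes, bool AlreadyRegistered) {
    assert(HostPtr == nullptr && "buffer already wraps a host pointer");
    HostPtr = Ptr;
    Size = Bytes;
    if (!AlreadyRegistered)
      HostPtrRegisteredByUR = (fakeHostRegister(Ptr, Bytes) == 0);
  }

  // Unregister only what we registered; user-owned registrations are kept.
  void clear() {
    if (HostPtr && HostPtrRegisteredByUR)
      fakeHostUnregister(HostPtr);
    HostPtr = nullptr;
    Size = 0;
    HostPtrRegisteredByUR = false;
  }
};
```

With this rule, the party that registered the memory is the only one
that unregisters it, so a user's own registration outlives the buffer.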