Improve ScopedContext nullptr handling and re-enable CUDA multi-GPU test

- Fix ScopedContext to handle nullptr Device without throwing
- Restore the context in the destructor only if the original context was non-null
- Remove CUDA skip from urEnqueueKernelLaunchIncrementMultiDeviceTest
- Remove unnecessary P2P support check (P2P is an optimization, not a requirement)
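
For reviewers, the intended ScopedContext behavior can be sketched roughly as below. This is a minimal illustration, not the adapter code: FakeContext, FakeDevice, and CurrentContext are hypothetical stand-ins for the native context handle, the device object, and the driver's per-thread current context (which the real adapter would query and set through the CUDA driver API, e.g. cuCtxGetCurrent / cuCtxSetCurrent). The NeedsRestore flag is also an assumption of this sketch.

```cpp
#include <cassert>

// Hypothetical stand-ins for the native context handle and the device object.
struct FakeContext {
  int Id;
};
struct FakeDevice {
  FakeContext *Native;
};

// Stand-in for the driver's per-thread "current context".
thread_local FakeContext *CurrentContext = nullptr;

class ScopedContext {
public:
  explicit ScopedContext(FakeDevice *Device) : Original(CurrentContext) {
    // A null device means "nothing to activate"; do not throw.
    if (Device == nullptr)
      return;
    // Switch only when the device's context is not already current.
    if (Device->Native != Original) {
      CurrentContext = Device->Native;
      NeedsRestore = true;
    }
  }

  ~ScopedContext() {
    // Restore only if we switched and there was a context to return to.
    if (NeedsRestore && Original != nullptr)
      CurrentContext = Original;
  }

  ScopedContext(const ScopedContext &) = delete;
  ScopedContext &operator=(const ScopedContext &) = delete;

private:
  FakeContext *Original;     // Context that was current on entry (may be null).
  bool NeedsRestore = false; // True only if the constructor switched contexts.
};
```

In this sketch, constructing the guard with a null device leaves the current context untouched, and a guard created while no context was current leaves the newly activated context in place when it goes out of scope.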