Add check that tensor sizes match in DataTransferManager::CopyTensors (#27008)
### Description
Add check that tensor sizes match in DataTransferManager::CopyTensors
before calling the IDataTransfer implementation so that the check is
done in one place.
DataTransferManager::CopyTensor[Async] already checks that the source
and destination sizes match, so this makes the batched copy path
consistent with the single-tensor path.
The check is not required for DataTransferManager::CopySparseTensors.
The default implementation of IDataTransfer::CopySparseTensors is not
overridden by any EP, so all sparse tensor copies (single or batched)
end up going via SparseTensor::Copy, which has size checks.
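
As an illustration only, the batched check described above can be
sketched as follows. This is not the ONNX Runtime code: `FakeTensor`,
`SrcDstPair`, `CopyStatus` and `CopyTensorsChecked` are hypothetical
stand-ins, and a plain `memcpy` takes the place of the IDataTransfer
implementation. The sketch assumes the whole batch is validated before
any bytes are copied.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a tensor: just an owned byte buffer.
struct FakeTensor {
  std::vector<unsigned char> bytes;
  std::size_t SizeInBytes() const { return bytes.size(); }
};

// Hypothetical source/destination pair for a batched copy.
struct SrcDstPair {
  const FakeTensor* src;
  FakeTensor* dst;
};

// Hypothetical status type; the real code would return its own status object.
struct CopyStatus {
  bool ok;
  std::string message;
};

// Validate every pair up front, then hand the batch to the copy routine.
// Doing the check here means individual copy implementations do not need
// to repeat it.
CopyStatus CopyTensorsChecked(const std::vector<SrcDstPair>& pairs) {
  for (std::size_t i = 0; i < pairs.size(); ++i) {
    if (pairs[i].src->SizeInBytes() != pairs[i].dst->SizeInBytes()) {
      return {false, "Tensor size mismatch at pair " + std::to_string(i)};
    }
  }

  // Stand-in for the device-specific batched copy.
  for (const auto& p : pairs) {
    const std::size_t n = p.src->SizeInBytes();
    if (n > 0) {
      std::memcpy(p.dst->bytes.data(), p.src->bytes.data(), n);
    }
  }
  return {true, ""};
}
```

With this shape, a destination that is smaller than its source is
rejected with an error instead of being written past its end.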
### Motivation and Context
TRT RTX had a bug and was returning an output value of an incorrect
size. When pre-allocated outputs on a different device were provided,
we hit DataTransferManager::CopyTensors, which had no check that the
sizes matched, leading to a heap checker violation.