Improve Windows ETW callback registration and fix issues (#24877)
### Description
- `EtwRegistrationManager`: make sure all fields are initialized by the
constructor.
- Register a callback object instead of a pointer to it. Store it in the
map under a session-unique key.
- Register `ML_Ort_Provider_Etw_Callback`, the callback that logs all
sessions, once for all sessions. The first session registers it, and the
last one to go away removes it. To support this, callbacks are ref-counted
inside the map they are stored in. This prevents a deadlock where
`active_sessions_mutex_` and `callback_mutex_` are acquired from
different threads in a different order.
- Create a registration guard to remove callbacks in case the
`InferenceSession` constructor does not finish.
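
The ref-counting and guard ideas above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `CallbackRegistry`, `RegistrationGuard`, `Dismiss`, and the simplified `EtwCallback` signature are all hypothetical names chosen for the example, and the real ETW callback takes more parameters.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical, simplified callback signature (assumption for this sketch).
using EtwCallback = std::function<void(bool is_enabled)>;

// Hypothetical registry: callbacks are stored by value under a string key and
// ref-counted, so a shared callback is added once and erased only when its
// last user unregisters.
class CallbackRegistry {
 public:
  void Register(const std::string& key, EtwCallback callback) {
    std::lock_guard<std::mutex> lock(mutex_);
    auto it = callbacks_.find(key);
    if (it == callbacks_.end()) {
      callbacks_.emplace(key, Entry{std::move(callback), 1});
    } else {
      ++it->second.ref_count;  // Already registered: only bump the count.
    }
  }

  void Unregister(const std::string& key) {
    std::lock_guard<std::mutex> lock(mutex_);
    auto it = callbacks_.find(key);
    if (it != callbacks_.end() && --it->second.ref_count == 0) {
      callbacks_.erase(it);  // Last user is gone: drop the callback.
    }
  }

  void InvokeAll(bool is_enabled) {
    std::lock_guard<std::mutex> lock(mutex_);
    for (auto& kv : callbacks_) kv.second.callback(is_enabled);
  }

  std::size_t Size() {
    std::lock_guard<std::mutex> lock(mutex_);
    return callbacks_.size();
  }

 private:
  struct Entry {
    EtwCallback callback;
    std::size_t ref_count;
  };
  std::mutex mutex_;
  std::unordered_map<std::string, Entry> callbacks_;
};

// Hypothetical RAII guard: registers on construction and unregisters on
// destruction unless Dismiss() was called, so a constructor that throws
// halfway through does not leave a callback behind.
class RegistrationGuard {
 public:
  RegistrationGuard(CallbackRegistry& registry, std::string key, EtwCallback cb)
      : registry_(&registry), key_(std::move(key)) {
    registry_->Register(key_, std::move(cb));
  }
  ~RegistrationGuard() {
    if (registry_ != nullptr) registry_->Unregister(key_);
  }
  RegistrationGuard(const RegistrationGuard&) = delete;
  RegistrationGuard& operator=(const RegistrationGuard&) = delete;

  // Call once construction has succeeded; the owner then unregisters itself.
  void Dismiss() { registry_ = nullptr; }

 private:
  CallbackRegistry* registry_;
  std::string key_;
};
```

In this sketch a session constructor would create a `RegistrationGuard` early and call `Dismiss()` as its last statement, leaving the matching `Unregister` call to the session destructor.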
### Motivation and Context
This PR is inspired by
https://github.com/microsoft/onnxruntime/issues/24773?reload=1.
The current code exhibits multiple issues:
- The `EtwRegistrationManager` constructor does not initialize all of its
fields, including `InitializationStatus`.
- The global callback object is destroyed, re-created, and re-registered
every time a session is created. Customers sometimes run thousands of
models in the same sessions, which results in quadratic ETW costs.
- There is a chance that the `InferenceSession` constructor does not finish,
in which case the callback would remain registered. This may result in
intermittent, hard-to-diagnose bugs.
- `active_sessions_mutex_` and `callback_mutex_` are not acquired and
released in the same order by different threads, which is a classic
deadlock scenario.
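
To illustrate the lock-ordering hazard in the last bullet, here is a small generic sketch. It is not how this PR fixes the problem (the PR relies on ref-counted callbacks, as described above); it only shows two code paths naming the same two mutexes in opposite order, made safe here with `std::scoped_lock`, which acquires multiple mutexes using a deadlock-avoidance algorithm. The free-standing mutexes, `SessionPath`, `CallbackPath`, and `RunBothPaths` are hypothetical names for the example.

```cpp
#include <mutex>
#include <thread>

// Hypothetical stand-ins for the two locks named above.
std::mutex active_sessions_mutex;
std::mutex callback_mutex;
int shared_counter = 0;

// One path names the locks in one order...
void SessionPath() {
  // With two nested std::lock_guard objects, this order combined with the
  // opposite order in CallbackPath could deadlock. std::scoped_lock locks
  // both mutexes together without that risk.
  std::scoped_lock lock(active_sessions_mutex, callback_mutex);
  ++shared_counter;
}

// ...and the other path names them in the opposite order.
void CallbackPath() {
  std::scoped_lock lock(callback_mutex, active_sessions_mutex);
  ++shared_counter;
}

// Runs both paths concurrently and returns the final counter value.
int RunBothPaths(int iterations) {
  shared_counter = 0;
  std::thread a([iterations] {
    for (int i = 0; i < iterations; ++i) SessionPath();
  });
  std::thread b([iterations] {
    for (int i = 0; i < iterations; ++i) CallbackPath();
  });
  a.join();
  b.join();
  return shared_counter;
}
```

Avoiding the nested acquisition altogether, as this PR aims to do, removes the ordering question entirely and is usually preferable when the design allows it.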