Fix lint and test failures for empty tensor reductions

- Fix clang-format: remove extra spaces before trailing comments
- Fix WebGPU shader: replace bitcast<f32>(0xff800000u) with float
literals (-3.4028234663852886e+38f / +3.4028234663852886e+38f).
The bitcast always yields an f32, but output_value_t may be f16,
which caused a WGSL type mismatch in the shader module.
- Add DML and NNAPI to kEmptyTensorExcludedEps: these EPs don't
support empty tensor reductions.

Signed-off-by: Justin Chu <justinchu@microsoft.com>