Broadcast binary ops involving strided triangular (#55798)
Currently, we evaluate expressions like `(A::UpperTriangular) +
(B::UpperTriangular)` using broadcasting if both `A` and `B` have
strided parents, and forward the summation to the parents otherwise.
This PR changes this to use broadcasting if either of the two has a
strided parent. This avoids reading the parent's entries at the
structural-zero positions, which might not be initialized.
Fixes https://github.com/JuliaLang/julia/issues/55590
This isn't a general fix, as we still sum the parents if neither is
strided. However, it will address common cases.
This also improves performance, as we only need to loop over one half:
```julia
julia> using LinearAlgebra, BenchmarkTools
julia> U = UpperTriangular(zeros(100,100));
julia> B = Bidiagonal(zeros(100), zeros(99), :U);
julia> @btime $U + $B;
35.530 μs (4 allocations: 78.22 KiB) # nightly
13.441 μs (4 allocations: 78.22 KiB) # This PR
```
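For illustration, here is a minimal sketch of why restricting the work to the stored half helps. It is not the implementation in this PR: `add_upper_half` is a hypothetical helper written for this example, and the parent `P` is constructed with its lower triangle deliberately left uninitialized.

```julia
using LinearAlgebra

# Hypothetical helper (not part of LinearAlgebra): add two upper-triangular
# matrices by visiting only the stored half, i.e. indices with i <= j.
function add_upper_half(A::UpperTriangular, B::UpperTriangular)
    size(A) == size(B) || throw(DimensionMismatch("sizes must match"))
    T = typeof(zero(eltype(A)) + zero(eltype(B)))
    dest = zeros(T, size(A))
    for j in axes(A, 2), i in 1:j
        # Only upper-triangular positions of the parents are ever read.
        dest[i, j] = A[i, j] + B[i, j]
    end
    return UpperTriangular(dest)
end

# Parent whose lower triangle is left uninitialized.
P = Matrix{BigFloat}(undef, 2, 2)
P[1, 1] = 1; P[1, 2] = 2; P[2, 2] = 3
U = UpperTriangular(P)

# P[2, 1] is unassigned, so any code path that reads it would throw.
# The loop above never touches it.
S = add_upper_half(U, U)
```

Summing the parents directly (`P + P`) would read the unassigned entry `P[2, 1]`, whereas the loop above reads only positions with `i <= j`, which is also why roughly half as many elements are visited.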