Fix fusion for two LayerNorm sharing same input but with different weights (#15919)
In gpt_j_residual (https://arxiv.org/pdf/2204.06745.pdf), two LayerNorm nodes share the same input, and ORT runs CSE graph optimization before LayerNorm fusion. CSE rewrites the shared subgraph, so the expected LayerNorm pattern no longer matches and the fusion fails.
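For illustration, here is a minimal numpy sketch (function and variable names are hypothetical, not ORT code) of why CSE interferes: in the exported graph each LayerNorm is decomposed into primitive ops, and when two LayerNorms consume the same input, their leading mean/centering subexpressions are identical. CSE merges those shared nodes, while only the trailing scale/bias ops differ per LayerNorm, so neither LayerNorm still matches the full fusion pattern on its own.

```python
import numpy as np

def decomposed_layernorm(x, gamma, beta, eps=1e-5):
    # Primitive-op decomposition of LayerNorm as it appears in an exported
    # graph (roughly ReduceMean -> Sub -> Pow -> ReduceMean -> Add -> Sqrt
    # -> Div -> Mul -> Add).
    mu = x.mean(axis=-1, keepdims=True)   # ReduceMean: identical for both LNs
    centered = x - mu                     # Sub: identical for both LNs
    var = (centered ** 2).mean(axis=-1, keepdims=True)
    return centered / np.sqrt(var + eps) * gamma + beta  # only gamma/beta differ

x = np.random.randn(2, 4).astype(np.float32)
g1, b1 = np.ones(4, np.float32), np.zeros(4, np.float32)
g2, b2 = np.full(4, 2.0, np.float32), np.ones(4, np.float32)

# Two LN nodes over one input: CSE can merge the shared mean/centering
# subgraph, leaving each LN with only its distinct Mul/Add tail.
y1 = decomposed_layernorm(x, g1, b1)
y2 = decomposed_layernorm(x, g2, b2)
```

Because the two LayerNorms differ only in gamma/beta, `y2` equals `2 * y1 + 1` here, which shows the upstream computation they share.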
