[FX lowering] Modify replace_all_uses_with to allowing filtering of nodes to update; use it to (#73763)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73763
The test that is enabled generates a graph as such:
```
linear_25 --> sigmoid_14 --> output_1
\--> output_2
```
Before this diff, (unpadding) layout_transform nodes would be added as follows:
```
linear_25 --> layout_xform1 --> sigmoid_14 --> layout_xform2--> output_1
\--> output_2
```
This causes an assertion to fail for the sigmoid node where the input and output types
don't match due to padding differences.
This diff modifies the replacement algorithm to not affect users of an output's parent node
when the user requires padded inputs. This yields the following graph instead:
```
linear_25 --> sigmoid_14 --> layout_xform2--> output_1
\--> layout_xform1 --> output_2
```
Test Plan: Manually and CI
Reviewed By: jfix71, dborkovic
Differential Revision: D34623590
fbshipit-source-id: 3834b06c95fc5626eccc282216cbe039ac5a3242
(cherry picked from commit af012372ae1a6bb654b0ed9b765993960d5251e4)