[Functionalization] Lower `masked_fill.Tensor` and `masked_fill.Scalar` ops (#4616)
* Lower masked_fill.Scalar and masked_fill.Tensor to fix related cpp tests
* Remove in-place versions for masked_fill
* Clean up some code
* Update tensor_methods::masked_fill to expand input tensor if needed
* Add a check to expand the input tensor only if its rank is less than that of the mask tensor
* Update the `if` condition for the tensor rank comparison
* Enable KlDivBackward cpp test
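
The rank check described in the bullets above (expand the input only when its rank is lower than the mask's) can be sketched on plain shape tuples. This is an illustration only, not the code in `tensor_methods::masked_fill`: the helper names `broadcast_shape` and `shape_for_masked_fill` are hypothetical, and the assumption that the expanded input takes the usual trailing-dimension broadcast of the two shapes is mine, not something this change states.

```python
from typing import Sequence, Tuple


def broadcast_shape(a: Sequence[int], b: Sequence[int]) -> Tuple[int, ...]:
    """Broadcast two shapes by aligning trailing dimensions; raise if incompatible."""
    result = []
    # Walk both shapes from the last dimension, padding the shorter one with 1s.
    for i in range(1, max(len(a), len(b)) + 1):
        da = a[-i] if i <= len(a) else 1
        db = b[-i] if i <= len(b) else 1
        if da != db and da != 1 and db != 1:
            raise ValueError(f"shapes {tuple(a)} and {tuple(b)} do not broadcast")
        # A size-1 dimension stretches to match the other side.
        result.append(db if da == 1 else da)
    return tuple(reversed(result))


def shape_for_masked_fill(
    input_shape: Sequence[int], mask_shape: Sequence[int]
) -> Tuple[int, ...]:
    """Hypothetical helper: shape the input would have before applying the mask."""
    # Assumption: expand only when the input has strictly lower rank than the mask.
    if len(input_shape) < len(mask_shape):
        return broadcast_shape(input_shape, mask_shape)
    # Equal or higher rank: leave the input shape untouched.
    return tuple(input_shape)
```

For example, an input of shape `(3,)` with a mask of shape `(2, 3)` would be expanded to `(2, 3)`, while an input of shape `(2, 3)` with a mask of shape `(3,)` keeps its shape.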