[`FSMT`] Make it compatible with `xxxForConditionalGeneration` models (#20825)
* add `get_encoder` and `get_decoder`
* add additional kwargs support
* fix condition
* add better checks
* better checks
* fix embed positions
* better test to consider padding
* fix debug statement
* Apply suggestions from code review
* add arguments to docstring
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>