Move `original_max_position_embeddings` to rope params (#42513)
* move `original_max_position_embeddings` to the rope param dict and resolve TODOs from Joao (see the first sketch after this list)
* bring back `truncate` in YaRN (see the second sketch after this list)
* move the patch into the `standardize` helper, which gets called every time we init the rope compute fn
* my bad
* silly typo, I should read the code I write!
* force the tester to use specific layer types, because rope is built with these types
* revert; why/how did it get deleted?!
* `factor` isn't guaranteed to exist in the rope params in the end
* tiny test issue: the test needs to standardize first
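
A minimal sketch of the back-compat patch described above, assuming a `rope_parameters` dict on the config; the helper name `standardize_rope_params` and the attribute names are assumptions based on the commit notes, not necessarily the PR's actual code:

```python
# Hypothetical sketch, not the actual transformers implementation: moves a
# top-level `original_max_position_embeddings` into the rope param dict inside
# a `standardize` helper that runs every time the rope compute fn is built.
def standardize_rope_params(config):
    """Normalize rope params in place; safe to call on every rope init."""
    rope_params = dict(getattr(config, "rope_parameters", None) or {})

    # Back-compat patch: if the value still lives at the top level of the
    # config, move it into the rope param dict so downstream code only has
    # to look in one place.
    legacy = getattr(config, "original_max_position_embeddings", None)
    if legacy is not None:
        rope_params.setdefault("original_max_position_embeddings", legacy)

    config.rope_parameters = rope_params
    return rope_params
```

Since `factor` isn't guaranteed to exist in the standardized dict, consumers would read it defensively with something like `rope_params.get("factor", 1.0)` rather than indexing it directly.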
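And a sketch of the correction-range truncation that "bring back `truncate` in YaRN" likely refers to, following the published YaRN formula; the function names mirror the YaRN reference code and are illustrative rather than a copy of this PR's diff:

```python
import math

def find_correction_dim(num_rotations, dim, base, max_position_embeddings):
    # Inverse of the rotation count: the head dimension whose wavelength
    # completes `num_rotations` full turns over the original context length.
    return (dim * math.log(max_position_embeddings / (num_rotations * 2 * math.pi))) / (
        2 * math.log(base)
    )

def find_correction_range(beta_fast, beta_slow, dim, base, max_position_embeddings, truncate=True):
    low = find_correction_dim(beta_fast, dim, base, max_position_embeddings)
    high = find_correction_dim(beta_slow, dim, base, max_position_embeddings)
    if truncate:
        # The `truncate` behavior referenced above: snap the range to whole
        # dimensions, as in the original YaRN reference implementation.
        low = math.floor(low)
        high = math.ceil(high)
    return max(low, 0), min(high, dim - 1)

# Example: with the common YaRN defaults beta_fast=32, beta_slow=1 this yields
# an integer range of dimensions to interpolate over, e.g. (20, 46) for
# dim=128, base=10000, max_position_embeddings=4096.
low, high = find_correction_range(32, 1, dim=128, base=10000.0, max_position_embeddings=4096)
```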