:rotating_light: Unify 3D position ids (#43972)
* draft before I lose it
* dump and come back after position ids PR is merged
* fix fast tests
* let qwen-vl always return `mm_token_type_ids`!
* fix repo
* fix fast tests
* this should be it!
* fix style
* oops, fix: this one was swapped
* glm uses the same token id for images and videos; work around it by checking the start/end tokens
* oh no, modular deleted my fix
* add docs
* fix processor
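
For reviewers, the GLM workaround noted above (one placeholder id shared by images and videos, disambiguated by the enclosing start/end tokens) can be sketched roughly as follows. This is only an illustration under stated assumptions: the token ids, the `mm_token_types` helper, and the 0/1/2 labelling are hypothetical and are not taken from the actual model code.

```python
# Hypothetical token ids, for illustration only (not GLM's real vocabulary).
IMAGE_START, IMAGE_END = 101, 102
VIDEO_START, VIDEO_END = 103, 104
PLACEHOLDER = 200  # assumed to be shared by image and video placeholders

TEXT, IMAGE, VIDEO = 0, 1, 2


def mm_token_types(input_ids):
    """Label each token as text (0), image (1) or video (2).

    Because the placeholder id is the same for both modalities, the
    modality is inferred from the most recent start token that has not
    yet been closed by an end token.
    """
    types = []
    current = TEXT  # modality of the span we are currently inside
    for tok in input_ids:
        if tok == IMAGE_START:
            current = IMAGE
        elif tok == VIDEO_START:
            current = VIDEO
        elif tok in (IMAGE_END, VIDEO_END):
            current = TEXT
        # Only placeholder tokens receive a modality label;
        # start/end markers and ordinary text stay labelled as text.
        types.append(current if tok == PLACEHOLDER else TEXT)
    return types
```

A single left-to-right pass is enough here because the sketch assumes spans are not nested; a nested layout would need a stack instead of one `current` variable.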