[`GPT OSS`] Refactor the tests as they were not properly checking the outputs (#40288)
* it was long overdue!
* use the official kernel
* more permissive
* update the kernel as well
* mmm should it be this?
* up pu
* fixup
* Update test_modeling_gpt_oss.py
* style
* start with 20b