43054: Add Siglip2Tokenizer to enforce training-time text preprocessing defaults (#43101)
* Add Siglip2Tokenizer with lowercase normalizer and tests
* updated to the correct GemmaTokenizer reference
* Updated lowercase logic to keep it simple
* added integration test and removed _unk_id from tokenization
* ruff fixes
* updated the doc
* Updated doc and conversion script
* fix modular to skip type hinting when inheriting
* style
---------
Co-authored-by: ArthurZucker <arthur.zucker@gmail.com>