Update special token handling in conversion scripts for gpt2 derived tokenizers (#3746)
The heads-up in `README.md` regarding `bpe` tokenizers still applies, and this patch is needed for:
- a couple of tokenizer tests
- further handling of `special` and `non-special` added tokens (as far as I understand it)
* Update special token handling
* Add mpt