py : fix missing added_tokens_dict for SPM and BPE vocabs (#4971)
* py : fix missing added_tokens_dict for SPM vocab
* py : pad with unknown tokens when data is missing
ggml-ci
* py : fix BPE vocab conversion
ggml-ci
* py : fix padded dummy tokens (I hope)
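
For reference, a minimal sketch of the padding idea mentioned above. This is not the code from this change: `pad_vocab` and the `<dummy00001>`-style placeholder names are hypothetical, chosen only for illustration, and it assumes the expected vocab size is known from the model hyperparameters.

```python
from __future__ import annotations


def pad_vocab(tokens: list[str], n_vocab: int) -> list[str]:
    """Return a copy of `tokens` extended with placeholder entries up to `n_vocab`.

    Hypothetical helper: the real converter may name and score its
    placeholder tokens differently.
    """
    if len(tokens) > n_vocab:
        # More tokens than the model expects cannot be fixed by padding.
        raise ValueError(
            f"vocab has {len(tokens)} tokens, but the model expects {n_vocab}"
        )
    pad_count = n_vocab - len(tokens)
    # Placeholder names are numbered from 1 and zero-padded to five digits.
    padding = [f"<dummy{i:05}>" for i in range(1, pad_count + 1)]
    return tokens + padding
```

Padding applies only when the tokenizer data has fewer entries than the model's declared vocab size; the opposite case is treated as an error here rather than truncated silently.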