llama : implement Unigram tokenizer needed by T5 and FLAN-T5 model families (#5763)
* llama : add T5 model architecture, tensors and model header parameters
* llama : add implementation of Unigram tokenizer with SentencePiece-like text normalization using precompiled charsmap
---------
Co-authored-by: Stanisław Szymczyk <sszymczy@gmail.com>