llama.cpp
afcda09d
- vocab : fix HybridDNA tokenizer (#23466)
Commit
44 days ago
vocab : fix HybridDNA tokenizer (#23466)

* vocab : mark hybriddna k-mers to avoid BPE token collisions
* improved loop

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
References
#23466 - vocab : keep DNA k-mer ids distinct from colliding BPE tokens
Author
kashif
Parents
bbce619a