[CB] Minor perf improvements and ty compatibility (#43521)
* Better output
* Better name
* Better seqlen_k
* Add a TMP token when scheduling request
* Index selection of logits is in CG
* ty part 1/n
* ty part 2/n
* ty part 3/n
* ty part 4/n
* ty part 5/n
* ty part 6/6
* style
* ty final touches
* No more _
* Make sure the TMP token is never output
* More accurate message
* Fix nit in overall script
* Nit
* style
* Removed useless attribute