whisper : make beam candidate sort more stable (#1943)
All else being equal, this encourages beam candidate selection to
re-use the same decoder, which slightly reduces the cache size.
I wouldn't expect it to make much of a performance difference,
but it helps when debug printing the cache and beam.
Added as part of understanding #1941.
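
For illustration only, a minimal sketch of the kind of tie-break described
above. It assumes candidates are ordered by a score and that ties are broken
by the index of the decoder they came from. The names `beam_candidate`,
`decoder_idx`, `score`, and `sort_candidates` are hypothetical here and may
not match the actual code.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical candidate record; field names are illustrative only.
struct beam_candidate {
    int    decoder_idx; // decoder this candidate was generated from
    double score;       // assumed: higher is better (e.g. a summed log-probability)
};

// Sort best-first. When two scores are equal, prefer the lower decoder index,
// so the order of ties no longer depends on how std::sort arranges them.
inline void sort_candidates(std::vector<beam_candidate> & cands) {
    std::sort(cands.begin(), cands.end(),
        [](const beam_candidate & a, const beam_candidate & b) {
            if (a.score != b.score) {
                return a.score > b.score;
            }
            return a.decoder_idx < b.decoder_idx;
        });
}
```

With a comparator like this, repeated runs over the same candidates give the
same order, which is what makes debug output of the cache and beam easier to
compare.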