Fix softmax backward (#2709)
* Reset the KV-cache at the beginning of text generation
* Add a new backward kernel to handle large softmax lengths
* Remove unrelated changes
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Connor Holmes <connorholmes@microsoft.com>