Enhance query APIs for text generation (#4965)
This PR improves efficiency when using DeepSpeed-FastGen.
DeepSpeed-FastGen queries the state of the KV cache very frequently, so
this PR adds an API for that query and makes some of the existing query
APIs more efficient.
This PR also allows the schedulability check to be skipped when the
caller has already performed it.
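
A minimal sketch of the idea, not the code in this PR: the class, method
names, and the `do_checks` flag below are hypothetical and only
illustrate a cheap KV-cache query plus an opt-out for a check the caller
has already done.

```python
class ToyKVCache:
    """Hypothetical stand-in for a KV-cache block pool (not the DeepSpeed API)."""

    def __init__(self, total_blocks: int, block_size: int) -> None:
        self.block_size = block_size
        # Kept up to date on every allocation, so querying it is O(1).
        self._free_blocks = total_blocks

    @property
    def free_blocks(self) -> int:
        """Cheap state query that a scheduler can call very frequently."""
        return self._free_blocks

    def blocks_needed(self, n_tokens: int) -> int:
        # Ceiling division: a partially filled block still occupies a block.
        return -(-n_tokens // self.block_size)

    def can_schedule(self, n_tokens: int) -> bool:
        return self.blocks_needed(n_tokens) <= self._free_blocks

    def put(self, n_tokens: int, do_checks: bool = True) -> None:
        # Callers that already ran can_schedule() can pass do_checks=False
        # to avoid repeating the same check.
        if do_checks and not self.can_schedule(n_tokens):
            raise RuntimeError("not enough free KV blocks")
        self._free_blocks -= self.blocks_needed(n_tokens)


cache = ToyKVCache(total_blocks=8, block_size=16)
if cache.can_schedule(40):           # the check happens once, here
    cache.put(40, do_checks=False)   # so it is skipped inside put()
```

In this sketch, skipping the check is only safe when nothing else
allocates blocks between the `can_schedule` call and the `put` call.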
---------
Co-authored-by: Heyang Qin <heyangqin@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>