rollout: add FlashInfer kernel manager and vLLM compat shim

- Add FlashInferKernelManager context manager for swapping attention
  and sampling kernels during decode (framework ready; activation
  pending FlashInfer support on Blackwell)
- Add _vllm_compat/ sitecustomize shim to work around duplicate
  template name errors from vLLM 0.22+ Pydantic validation
- Add bench_flashinfer.py benchmark script with --graph-capture flag
- Refactor HybridEngineRollout.generate() to extract _dispatch_generate()
  for a cleaner FlashInfer integration path

Signed-off-by: Guokai Ma <guokai.ma@intel.com>