Add local attention for GPT-Neo model architecture (#1114)
* Fix links for inference tutorial
* Fix automatic injection and add local attention for GPT-Neo
* Fix inference when generating long sequences (>1K and <32K)
* Fix formatting
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>