DeepSpeed
3d097bb8 - Extend scratch buffer for long prompts (#2212)

Commit
2 years ago
Extend scratch buffer for long prompts (#2212)

Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
  • csrc/transformer/inference
    • csrc
      • apply_rotary_pos_emb.cu
      • dequantize.cu
      • gelu.cu
      • pt_binding.cpp
      • transform.cu
    • includes
      • inference_context.h
      • inference_cuda_layers.h
  • deepspeed/ops/transformer/inference
    • transformer_inference.py