DeepSpeed 3d097bb8 - Extend scratch buffer for long prompts (#2212)
Commit
2 years ago
Extend scratch buffer for long prompts (#2212)

Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
References
#2212 - Extend scratch buffer for long prompts
Author
cmikeh2
Parents
b76e0f4f
Files
8
csrc/transformer/inference
  csrc
    apply_rotary_pos_emb.cu
    dequantize.cu
    gelu.cu
    pt_binding.cpp
    transform.cu
  includes
    inference_context.h
    inference_cuda_layers.h
deepspeed/ops/transformer/inference
  transformer_inference.py