Add SDPA pattern for BF16 HuggingFace models (#121202)
### Description
- Add a pattern for bf16 inputs with an fp32 attention mask. (Example model: ElectraForCausalLM; a sketch of the matched computation follows this list.)
- Add a pattern for batch_size=1 to avoid some clones in the graph. (Example model: text-classification+prajjwal1-bert-tiny)
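For context, here is a minimal sketch (not taken from the PR; shapes, names, and the counter check are assumptions) of the eager-mode attention computation that the new bf16-with-fp32-mask pattern is meant to match and rewrite into `scaled_dot_product_attention`:

```python
import math
import torch
from torch._dynamo.utils import counters

def eager_attention(q, k, v, attn_mask):
    # q, k, v: [batch, heads, seq, head_dim] in bf16; attn_mask: additive, fp32
    scores = torch.matmul(q, k.transpose(-2, -1)) / math.sqrt(q.size(-1))
    scores = scores + attn_mask  # the fp32 mask upcasts the scores to fp32
    probs = torch.softmax(scores, dim=-1).to(q.dtype)
    return torch.matmul(probs, v)

# batch_size=1 also exercises the second new pattern (fewer clones in the graph)
q, k, v = (torch.randn(1, 4, 16, 8, dtype=torch.bfloat16) for _ in range(3))
mask = torch.zeros(1, 1, 16, 16, dtype=torch.float32)

out = torch.compile(eager_attention)(q, k, v, mask)
# Inductor's fused-attention tests inspect this counter after compilation;
# a nonzero value indicates the graph was rewritten to use SDPA.
print(counters["inductor"]["fuse_attention"])
```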
### Newly matched models
Dtype: bf16; machine: SPR (Intel Sapphire Rapids)
#### Dynamo HuggingFace models
- ElectraForCausalLM (speedup=2.09x)
- ElectraForQuestionAnswering (speedup=4.22x)
- AlbertForQuestionAnswering (speedup=1.36x)
- AlbertForMaskedLM (speedup=1.39x)
#### OOB HuggingFace models
- multiple-choice+google-electra-base-discriminator
- text-classification+prajjwal1-bert-tiny
- text-classification+prajjwal1-bert-mini
- text-classification+google-electra-base-generator
- text-classification+bert-large-cased
- causal-language-modeling+xlm-roberta-base
- text-classification+roberta-base
- text-classification+xlm-roberta-base
- text-classification+albert-base-v2
- token-classification+google-electra-base-generator
- masked-language-modeling+bert-base-cased
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121202
Approved by: https://github.com/jgong5, https://github.com/jansel