benchmark
788869b8 - support unbacked-batch-only in torchbench (#172719)

Commit

12 days ago

support unbacked-batch-only in torchbench (#172719)

Summary:
Support an unbacked batch size in torchbench via the `unbacked-batch-only` flag. This is the same as `dynamic-batch-only`, but uses unbacked dynamic shapes.

Summary of results:

### Inference: Unbacked vs Backed - Compile & Runtime Comparison

Model | Runtime Speedup (Unbacked vs Backed)
-- | --
DistillGPT2 | 0.98x
GPT2ForSequenceClassification | 0.97x
AlbertForMaskedLM | 0.93x
AllenaiLongformerBase | 0.93x
OPTForCausalLM | 0.93x
BlenderbotForCausalLM | 0.89x
MegatronBertForCausalLM | 0.86x
MT5ForConditionalGeneration | 0.83x
M2M100ForConditionalGeneration | 0.81x
BartForCausalLM | 0.80x
GoogleFnet | 0.78x
ElectraForCausalLM | 0.77x
MBartForCausalLM | 0.77x
DebertaV2ForMaskedLM | 0.75x
BertForMaskedLM | 0.75x
XGLMForCausalLM | 0.75x
YituTechConvBert | 0.73x
T5ForConditionalGeneration | 0.73x
T5Small | 0.72x
PLBartForCausalLM | 0.71x
RobertaForCausalLM | 0.71x
LayoutLMForMaskedLM | 0.66x
XLNetLMHeadModel | 0.63x
TrOCRForCausalLM | 0.61x
PegasusForCausalLM | 0.60x
DistilBertForMaskedLM | 0.58x
MobileBertForMaskedLM | 0.53x

### Training: Unbacked vs Backed - Compile & Runtime Comparison

Model | Runtime Change (Unbacked vs Backed)
-- | --
GoogleFnet | ❌ FAILED
M2M100ForConditionalGeneration | ❌ FAILED
TrOCRForCausalLM | ❌ FAILED
XGLMForCausalLM | ❌ FAILED
XLNetLMHeadModel | ❌ FAILED
MobileBertForMaskedLM | 29.5% slower
DistilBertForMaskedLM | 27.9% slower
ElectraForCausalLM | 23.2% slower
LayoutLMForMaskedLM | 22.7% slower
T5ForConditionalGeneration | 20.4% slower
T5Small | 20.2% slower
PegasusForCausalLM | 19.8% slower
RobertaForCausalLM | 15.3% slower
BertForMaskedLM | 12.8% slower
MT5ForConditionalGeneration | 10.8% slower
YituTechConvBert | 10.6% slower
DebertaV2ForMaskedLM | 9.8% slower
MBartForCausalLM | 8.9% slower
PLBartForCausalLM | 8.7% slower
BartForCausalLM | 8.6% slower
AllenaiLongformerBase | 5.5% slower
MegatronBertForCausalLM | 3.6% slower
AlbertForMaskedLM | 2.9% slower
DistillGPT2 | 1.3% slower
BlenderbotForCausalLM | 0.5% slower
OPTForCausalLM | 3.6% faster

Compile speedup is probably due to fewer recompilations.

X-link: https://github.com/pytorch/pytorch/pull/172719
Approved by: https://github.com/aorenste

Reviewed By: izaitsevfb

Differential Revision: D92304453

fbshipit-source-id: 000a02e4ad7d2c2ca64097a44ca177932f719014
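The inference results are reported as speedup ratios while the training results are reported as percent slower or faster, which makes the two hard to compare directly. The sketch below converts between the two forms. It assumes that a speedup is defined as `backed_time / unbacked_time` and that "N% slower" means the unbacked run took N% more time; the commit message does not state which definitions were used, so treat the conversion as illustrative only. The function names are hypothetical and are not part of torchbench.

```python
def speedup_to_percent_change(speedup: float) -> float:
    """Convert a speedup ratio into a signed percent change in run time.

    Assumption: speedup = backed_time / unbacked_time.
    A positive result means the unbacked run is slower, negative means faster.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return (1.0 / speedup - 1.0) * 100.0


def percent_change_to_speedup(percent_slower: float) -> float:
    """Inverse conversion: percent extra run time back to a speedup ratio.

    Pass a negative value for a run that is faster (e.g. -3.6 for "3.6% faster").
    """
    if percent_slower <= -100.0:
        raise ValueError("percent_slower must be greater than -100")
    return 1.0 / (1.0 + percent_slower / 100.0)


if __name__ == "__main__":
    # Values taken from the tables in the commit message.
    for name, speedup in [("BartForCausalLM", 0.80), ("MobileBertForMaskedLM", 0.53)]:
        print(f"{name}: {speedup:.2f}x -> {speedup_to_percent_change(speedup):.1f}% more run time")
    for name, pct in [("MobileBertForMaskedLM", 29.5), ("OPTForCausalLM", -3.6)]:
        print(f"{name}: {pct:+.1f}% run time -> {percent_change_to_speedup(pct):.3f}x")
```

Under this reading, a 0.80x speedup corresponds to 25% more run time, so the inference slowdowns are larger than the ratios suggest at first glance.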

Author

laithsakka

Committer

meta-codesync[bot]
