Inductor cpp wrapper: fix codegen of positional args with default value (#108552)
Fixes https://github.com/pytorch/pytorch/issues/108323.
The cpp wrapper has a functional regression on `llama` and `tnt_s_patch16_224`, caused by the recently added support for scaled dot product flash attention in inductor.
The schema of this op is as follows:
```
- func: _scaled_dot_product_flash_attention(Tensor query, Tensor key, Tensor value, float dropout_p=0.0, bool is_causal=False, bool return_debug_mask=False, *, float? scale=None) -> (Tensor output, Tensor logsumexp, Tensor cum_seq_q, Tensor cum_seq_k, int max_q, int max_k, Tensor philox_seed, Tensor philox_offset, Tensor debug_attn_mask)
```
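The defaults can also be inspected from Python via the op overload's `_schema` attribute (an internal but commonly used detail; treat the exact attributes used here as an assumption):
```python
import torch

op = torch.ops.aten._scaled_dot_product_flash_attention.default
for arg in op._schema.arguments:
    if arg.has_default_value():
        # dropout_p = 0.0, is_causal = False, return_debug_mask = False,
        # plus the kwarg-only scale = None
        print(arg.name, "=", arg.default_value)
```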
In `llama` and `tnt_s_patch16_224`, the op is called as shown below, omitting the three positional args that have default values (`dropout_p=0.0`, `is_causal=False`, `return_debug_mask=False`) while passing the keyword-only `scale`:
```python
y = torch.ops.aten._scaled_dot_product_flash_attention.default(x0, x1, x2, scale = 0.125)
```
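A minimal sketch that exercises this call pattern under the cpp wrapper (the shapes and the `cpp_wrapper` option key are assumptions made for illustration, not taken from the original models):
```python
import torch

def fn(q, k, v):
    return torch.ops.aten._scaled_dot_product_flash_attention.default(
        q, k, v, scale=0.125
    )

# (batch, num_heads, seq_len, head_dim) -- arbitrary illustrative shapes
q = k = v = torch.randn(2, 8, 128, 64)
compiled = torch.compile(fn, options={"cpp_wrapper": True})
out = compiled(q, k, v)
```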
This PR fixes the cpp wrapper codegen for this case.
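Conceptually, the codegen needs to fill the non-provided positional args from the schema defaults when emitting the C++ call, so that `scale` lands in its keyword-only slot. A minimal Python sketch of that idea, assuming the `kwarg_only` / `default_value` fields of `torch._C.Argument` (the helper name is hypothetical; this is not the actual inductor code):
```python
import torch

def fill_default_positional_args(op, passed_args):
    """Hypothetical helper: pad positional args that were omitted at the
    call site with their schema defaults, so the generated C++ call has
    a value for every positional parameter."""
    positional = [a for a in op._schema.arguments if not a.kwarg_only]
    filled = list(passed_args)
    for arg in positional[len(passed_args):]:
        # For this op the padded values are 0.0, False, False.
        filled.append(arg.default_value)
    return filled

op = torch.ops.aten._scaled_dot_product_flash_attention.default
print(fill_default_positional_args(op, ["query", "key", "value"]))
# ['query', 'key', 'value', 0.0, False, False]
```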
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108552
Approved by: https://github.com/jgong5, https://github.com/desertfire, https://github.com/jansel