pytorch
61dcde88 - Jiterator with Python Registration (#77121)


Commit

2 years ago

Jiterator with Python Registration (#77121)

You can now redefine the behavior of an operator from Python in all sorts of ways, and it still runs fast on CUDA.

Example 1: swapping where's branches

```
import torch

code_string = "template <typename T> T inverted_where(bool cond, T a, T b){ return !cond ? a : b; }"
jitted_fn = torch.cuda.jiterator._create_jit_fn(code_string)
my_lib = torch.library.Library("aten", "IMPL")
my_lib.impl('aten::where.self', jitted_fn, "CUDA")

# torch.where is now overridden
```

Example 2: approximate gelu with relu

```
import torch

code_string = "template <typename T> T fast_gelu(T a){ return a > 0 ? a : 0;}"
jitted_fn = torch.cuda.jiterator._create_jit_fn(code_string)
my_lib = torch.library.Library("aten", "IMPL")
my_lib.impl('aten::gelu', jitted_fn, "CUDA")

# torch.nn.GELU and torch.nn.functional.gelu are now overridden
```

Example 3: clipping output for numerically unstable kernels

```
import torch

# T(22026.4657948) is exp(10), so the output saturates for a > 10
code_string = "template <typename T> T clipped_exp(T a){ return a > T(10.0) ? T(22026.4657948) : exp(a); }"
jitted_fn = torch.cuda.jiterator._create_jit_fn(code_string)
my_lib = torch.library.Library("aten", "IMPL")
my_lib.impl('aten::exp', jitted_fn, "CUDA")

# torch.exp(x) and x.exp() are now overridden
```

Example 4: simulate buggy hardware behaviors

```
import torch

code_string = "template <typename T> T buggy_add(T a, T b){ return a + b + T(1); }"
jitted_fn = torch.cuda.jiterator._create_jit_fn(code_string)
my_lib = torch.library.Library("aten", "IMPL")
my_lib.impl('aten::add.Tensor', jitted_fn, "CUDA")

# torch.add(x, y), "x + y" and x.add(y) are now overridden
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77121
Approved by: https://github.com/anjali411
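To make the expected elementwise results of the four example kernels concrete, here is a minimal pure-Python sketch of what each C++ snippet computes for a single element. It is only a scalar model for reasoning about or spot-checking an overridden operator, not part of the commit: the `*_ref` function names and `gelu_exact` are hypothetical helpers introduced for illustration, and no CUDA or PyTorch is involved.

```python
import math

# Scalar reference models of the four example kernels.
# All names here are hypothetical helpers, not PyTorch APIs.

def inverted_where_ref(cond, a, b):
    # Mirrors `!cond ? a : b`: the branches of where are swapped.
    return a if not cond else b

def relu_gelu_ref(a):
    # Mirrors `a > 0 ? a : 0`: relu standing in for gelu.
    return a if a > 0 else 0

def gelu_exact(a):
    # Exact (erf-based) GELU, for comparison with the relu stand-in.
    return 0.5 * a * (1.0 + math.erf(a / math.sqrt(2.0)))

def clipped_exp_ref(a):
    # Mirrors `a > 10 ? 22026.4657948 : exp(a)`: output saturates near exp(10).
    return 22026.4657948 if a > 10.0 else math.exp(a)

def buggy_add_ref(a, b):
    # Mirrors `a + b + 1`: a deliberate off-by-one to simulate faulty hardware.
    return a + b + 1

if __name__ == "__main__":
    for x in (-3.0, -0.5, 0.5, 3.0):
        # Show how far the relu stand-in is from exact GELU at a few points.
        print(x, relu_gelu_ref(x), gelu_exact(x))
```

With an override registered as in the examples, comparing a few tensor elements against these scalar models is one simple way to confirm that the replacement kernel, rather than the stock one, is being dispatched.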

Author

SherlockNoMad

Committer

pytorchmergebot

Parents

00fb8282
