llama.cpp
bde188d6 - metal: TRI, FILL, EXPM1, SOFTPLUS (#16623)

Commit
10 days ago
metal: TRI, FILL, EXPM1, SOFTPLUS (#16623)

* feat(wip): Port initial TRI impl from previous work

The kernel does not work and is not optimized, but the code compiles and runs, so this will be the starting point now that the core op has been merged.

Branch: ggml-cumsum-tri

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* fix: Remove argument for constant val override

This was added in the original draft, but later removed. With this, the kernel now passes tests.

Branch: ggml-cumsum-tri

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* feat: Move the ttype conditional to templating to avoid conditional in kernel

Branch: ggml-cumsum-tri

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* fix: Type fixes

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* feat: Add softplus for metal

Branch: ggml-cumsum-tri

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* feat: Add EXPM1 for metal

Branch: ggml-cumsum-tri

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* feat: Add FILL for metal

Branch: ggml-cumsum-tri

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* refactor: Branchless version of tri using _ggml_vec_tri_cmp as a mask

Branch: ggml-cumsum-tri

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* fix: Remove unused arguments

Branch: ggml-cumsum-tri

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* refactor: Use select instead of branch for softplus non-vec

Branch: ggml-cumsum-tri

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

---------

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
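
For readers unfamiliar with the ops named above, the following is a rough CPU-side sketch in C++ of the ideas the commit message mentions: choosing the triangle comparison through a template parameter, applying it as a multiplicative mask instead of branching per element, and computing softplus with a select-style expression. It is not the Metal kernel code from this commit. The names `TriKind`, `tri_keep`, `tri`, `softplus`, `expm1_op`, and `fill` are hypothetical, and the softplus cutoff of 20 is an assumption made for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical enum for illustration only; not the ggml definition.
enum class TriKind { Lower, LowerDiag, Upper, UpperDiag };

// The triangle kind is a template parameter, so the comparison is chosen at
// compile time and the per-element loop never branches on the kind.
template <TriKind K>
constexpr bool tri_keep(std::size_t row, std::size_t col) {
    if constexpr (K == TriKind::Lower)          return col <  row;
    else if constexpr (K == TriKind::LowerDiag) return col <= row;
    else if constexpr (K == TriKind::Upper)     return col >  row;
    else                                        return col >= row;
}

// Keep one triangle of a row-major rows x cols matrix and zero the rest.
// The comparison result is converted to 0.0f or 1.0f and used as a mask.
template <TriKind K>
std::vector<float> tri(const std::vector<float>& src,
                       std::size_t rows, std::size_t cols) {
    std::vector<float> dst(src.size());
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            const float mask = static_cast<float>(tri_keep<K>(r, c));
            dst[r * cols + c] = src[r * cols + c] * mask;
        }
    }
    return dst;
}

// softplus(x) = log(1 + exp(x)). Both candidates are computed and one is
// selected; the cutoff of 20 is an assumed value, not taken from the commit.
inline float softplus(float x) {
    const float soft = std::log1p(std::exp(x));
    return x > 20.0f ? x : soft;
}

// expm1(x) = exp(x) - 1, computed accurately for small x by the standard library.
inline float expm1_op(float x) {
    return std::expm1(x);
}

// Produce n elements that all hold the same constant value.
inline std::vector<float> fill(std::size_t n, float value) {
    return std::vector<float>(n, value);
}
```

With these definitions, `tri<TriKind::Lower>` applied to the 2x2 matrix `{1, 2, 3, 4}` keeps only the element strictly below the diagonal. One caveat of the mask-by-multiplication form in this sketch is that a NaN or infinity in a masked-out position would propagate as NaN rather than become zero.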