llama.cpp
CUDA: add fused rms norm
#14800
Merged

Commits
  • CUDA: add fused rms norm
    am17an committed 251 days ago
  • assume mul_ptr is not null when calling fused ops, formatting changes
    am17an committed 251 days ago
  • Replace mul_ptr with mul
    am17an committed 251 days ago
  • Use mul tensor for broadcast
    am17an committed 251 days ago
  • Add testcase about the broadcast
    am17an committed 251 days ago
  • Fix test print
    am17an committed 251 days ago
  • Fix condition for broadcast
    am17an committed 251 days ago