openvino
85e9d29a - [GPU] Add a fused kernel groupnorm implementation (#33738)

Commit
32 days ago
[GPU] Add a fused kernel groupnorm implementation (#33738)

### Details:
- Add a new OCL implementation for fsv16 group normalization
- The new implementation is used if each group contains fewer than fsv=16 features
- A single fused kernel handles all stages of the reduction, avoiding excessive loading of shared values and reusing cache in cases of small inputs

### Tickets:
- CVS-177816

---------

Co-authored-by: Roman Lyamin <Roman.Lyamin@intel.com>
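For readers unfamiliar with the operation, the following is a minimal reference sketch of group normalization in plain Python. It is not the OCL kernel from this commit, whose source is not shown here. The names `uses_fused_path` and `group_norm`, the `eps` default, and the input layout (one batch item, a list of channels) are assumptions made for illustration. The predicate only restates the selection condition from the commit message, and the single accumulation loop per group loosely mirrors the idea of doing the whole reduction in one place.

```python
import math

FSV = 16  # feature slice size named in the commit message (fsv16)


def uses_fused_path(channels, num_groups):
    """Hypothetical predicate: restates 'each group contains fewer than fsv=16 features'."""
    return channels // num_groups < FSV


def group_norm(x, num_groups, gamma, beta, eps=1e-5):
    """Reference group normalization for one batch item.

    x is a list of channels; each channel is a flat list of spatial values.
    gamma and beta are per-channel scale and shift.
    """
    channels = len(x)
    if channels % num_groups != 0:
        raise ValueError("channels must be divisible by num_groups")
    per_group = channels // num_groups
    out = []
    for g in range(num_groups):
        first = g * per_group
        group = x[first:first + per_group]
        # One pass accumulates both the sum and the sum of squares for the group.
        count, total, total_sq = 0, 0.0, 0.0
        for channel in group:
            for v in channel:
                total += v
                total_sq += v * v
                count += 1
        mean = total / count
        # Clamp at zero to guard against small negative values from rounding.
        var = max(total_sq / count - mean * mean, 0.0)
        inv_std = 1.0 / math.sqrt(var + eps)
        # Normalize, then apply the per-channel scale and shift.
        for offset, channel in enumerate(group):
            c = first + offset
            out.append([(v - mean) * inv_std * gamma[c] + beta[c] for v in channel])
    return out
```

With 32 channels split into 4 groups, each group holds 8 features, so `uses_fused_path(32, 4)` is true under the stated condition; with 64 channels in 2 groups it is false.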