[LTC] Lower convolution_overrideable to _convolution
Currently convolution_overrideable is implemented as a special fallback
because of limitations in the Eager convolution implementation. This
causes two problems: 1) graph fragmentation, and 2) the fallback
convolution kernel may not be the most efficient one for the device.
This PR fixes forward convolution by lowering it to _convolution. The
change has been verified to work on CPU and CUDA.
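
To illustrate the graph fragmentation problem, here is a toy sketch in plain
Python. It is not LTC code: split_into_fragments and the op list are
hypothetical, and the model simply assumes that every op handled by a fallback
ends the current traced graph, so the ops around it land in separate fragments.

```python
from typing import List, Set


def split_into_fragments(ops: List[str], fallback_ops: Set[str]) -> List[List[str]]:
    """Group consecutive lowered ops into fragments.

    Toy assumption: an op in `fallback_ops` runs outside the traced graph,
    so it closes the fragment built so far and is not part of any fragment.
    """
    fragments: List[List[str]] = []
    current: List[str] = []
    for op in ops:
        if op in fallback_ops:
            # The fallback forces the pending graph to be cut here.
            if current:
                fragments.append(current)
                current = []
        else:
            current.append(op)
    if current:
        fragments.append(current)
    return fragments


# Hypothetical op sequence for a small model.
ops = ["relu", "convolution_overrideable", "add", "convolution_overrideable", "relu"]

# Convolution handled by a fallback: the trace is cut around each convolution.
before = split_into_fragments(ops, {"convolution_overrideable"})

# Convolution lowered like any other op: the whole sequence stays in one graph.
after = split_into_fragments(ops, set())
```

In this toy model the same op sequence yields several small fragments while
convolution is a fallback, and a single fragment once it is lowered.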