[CostModel] Mark all TTIImpls as final. NFC (#143404)
In the AArch64 version this helps reduce the number of blr instruction
(indirect jumps) in from 325 to 87, and reduces the size of the object
file by 4%. It seems to help make the code more efficient even if it
doesn't greatly affect compile time.
The AMDGPU variants are already marked as final.