gh-37195: general LU: gather several calls to field inversion into a single call
The same k-th diagonal element was inverted many times in a loop over
some index j. These inversions can be grouped just before the loop.
Some rough measurements, see below, show a time gain from about nothing
up to about 20%.
```
import time
fields = [QQ, GF(next_prime(2**20)), GF(next_prime(2**1000)), GF(5**10),
GF(17**30)]
fieldname = list(range(len(fields)))
for i in range(len(fields)):
for dim in [5,20,100,200]:
t = 0.0
nb_iter = ceil(1000000/(dim*dim*dim))
for k in range(nb_iter):
mat = matrix.random(fields[i], dim, dim)
tt = time.time()
mat.LU()
t += time.time() - tt
print(f"field{fieldname[i]},{dim} --> {t/nb_iter}")
## BEFORE CHANGES:
# field0,5 --> 0.0000980921089649200
# field0,20 --> 0.00224924087524414
# field0,100 --> 0.434456348419189
# field0,200 --> 6.88138055801392
# field1,5 --> 0.0000831986367702484
# field1,20 --> 0.000737508773803711
# field1,100 --> 0.0733184814453125
# field1,200 --> 0.595556974411011
# field2,5 --> 0.000171837747097015
# field2,20 --> 0.00316344833374023
# field2,100 --> 0.278676033020020
# field2,200 --> 2.28821945190430
# field3,5 --> 0.000185962766408920
# field3,20 --> 0.00463068389892578
# field3,100 --> 0.592973470687866
# field3,200 --> 5.39039492607117
# field4,5 --> 0.000374413549900055
# field4,20 --> 0.0115888252258301
# field4,100 --> 1.32473826408386
# field4,200 --> 10.2965865135193
## AFTER CHANGES:
#field0,5 --> 0.000100919753313065
#field0,20 --> 0.00213212776184082
#field0,100 --> 0.436735391616821
#field0,200 --> 6.47587561607361
#field1,5 --> 0.0000834324955940247
#field1,20 --> 0.000731405258178711
#field1,100 --> 0.0689191818237305
#field1,200 --> 0.586312770843506
#field2,5 --> 0.000148962318897247
#field2,20 --> 0.00227311897277832
#field2,100 --> 0.248369216918945
#field2,200 --> 1.96490478515625
#field3,5 --> 0.000177360057830811
#field3,20 --> 0.00400590515136719
#field3,100 --> 0.602026939392090
#field3,200 --> 4.80513262748718
#field4,5 --> 0.000303080528974533
#field4,20 --> 0.00903463745117188
#field4,100 --> 1.11252427101135
#field4,200 --> 9.27905511856079
```
URL: https://github.com/sagemath/sage/pull/37195
Reported by: Vincent Neiger
Reviewer(s): Frédéric Chapoton, Vincent Delecroix, Vincent Neiger