gptq_benchmark_update (#1420)

Commit

2 years ago

gptq_benchmark_update (#1420)

* add_exllamav2
* style
* fix doc
* simplify script
* style
* update perplexity measure
* Revert "Merge branch 'add_exllamav2' into update-benchmark-gptq"
  This reverts commit f2dbdc2ea13183c353dfa22135d2a7f401a3dbbb, reversing changes made to 216213e46e094de9d72614c09b058dceb1b35020.
* Merge branch 'add_exllamav2' into update-benchmark-gptq
* fix arg in llama attention
* flash_attention arg
* Revert "Merge branch 'add_exllamav2' into update-benchmark-gptq"
  This reverts commit 97a7c62b0cf09ad4671a4198958977143a1191cf.
* update benchmark prefill and generate
* replace by use_exllama_v2
* update benchmark arg
* switch to a config_dict instead of disable_exllamav2
* Apply suggestions from code review
  Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
* better tests
* style
* style

---------

Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>

References

#1420 - gptq_benchmark_update

Author

SunMarc

Parents

9f5ab619

optimum 87fcf9f0 - gptq_benchmark_update (#1420)
