Check if the buffers fit GPU memory after device map auto inferred (#2412)
* Check if the buffers fit GPU memory after device map auto inferred
* Some models, such as TheBloke/WizardCoder-33B-V1.1-GPTQ, contain a
huge buffer that may cause an OOM on the GPU when offload_buffers is
not used. This commit adds a check for that case.
* Minor refactors.
* Add missing assertions.
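
For illustration only, the kind of check described above can be sketched in plain Python. This is not the code added by this commit: the function name `offloaded_buffers_fit`, its parameters, and the byte-size dictionaries are all hypothetical. The sketch assumes that buffers of modules offloaded to "cpu" or "disk" stay on a GPU when offload_buffers is not used, so at least one GPU needs enough free memory left to hold them.

```python
from typing import Dict, Union

Device = Union[int, str]  # GPU index, or "cpu" / "disk" (assumed convention)


def offloaded_buffers_fit(
    device_map: Dict[str, Device],
    module_param_bytes: Dict[str, int],
    module_buffer_bytes: Dict[str, int],
    max_memory: Dict[Device, int],
) -> bool:
    """Hypothetical helper: True if some GPU can still hold the buffers
    of every module that the device map offloads to CPU or disk."""
    used: Dict[int, int] = {}  # bytes already placed on each GPU
    offloaded_buffers = 0      # buffer bytes belonging to offloaded modules

    for name, dev in device_map.items():
        params = module_param_bytes.get(name, 0)
        buffers = module_buffer_bytes.get(name, 0)
        if isinstance(dev, int):
            # Module lives on a GPU: its parameters and buffers count there.
            used[dev] = used.get(dev, 0) + params + buffers
        else:
            # Module is offloaded: only its buffers would remain on a GPU.
            offloaded_buffers += buffers

    if offloaded_buffers == 0:
        return True

    gpus = [dev for dev in max_memory if isinstance(dev, int)]
    return any(
        max_memory[dev] - used.get(dev, 0) >= offloaded_buffers for dev in gpus
    )
```

A caller could emit a warning suggesting offload_buffers=True when this returns False, rather than failing outright, since the estimate is only approximate.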