GGUF: expose header metadata without materializing tensors

`load_gguf_checkpoint` now computes `tensor_quant_types` ({name: quant_type}) and
`weight_mapping` unconditionally. Both are read straight off the GGUF header
(no tensor data), so a `return_tensors=False` call returns them cheaply. Only the
eager `np.copy` of tensor bytes stays behind `return_tensors=True`.
This lets the module-swap plan be built from metadata and renamings alone (pure
name resolution, with no tensor load and no conversion).

Verified: `return_tensors=False` yields 291 quant types and 12 rules with no
`tensors` entry; full load and AutoConfig via gguf are unchanged; 63 fast tests
pass.
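
Rough sketch of the name-resolution step described above, for illustration
only. The sample data, the (regex, replacement) rule format, and the helpers
`rename` and `build_swap_plan` are all assumptions made for this sketch; the
actual shapes of `tensor_quant_types` and `weight_mapping` may differ.

```python
import re

# Hypothetical stand-ins for the two header-derived values. The real
# structures returned by load_gguf_checkpoint may look different.
tensor_quant_types = {
    "blk.0.attn_q.weight": "Q4_K",
    "blk.0.ffn_down.weight": "Q6_K",
    "token_embd.weight": "F16",
}

# Assumed rule format: ordered (regex, replacement) pairs.
weight_mapping = [
    (r"^blk\.(\d+)\.attn_q\.", r"model.layers.\1.self_attn.q_proj."),
    (r"^blk\.(\d+)\.ffn_down\.", r"model.layers.\1.mlp.down_proj."),
    (r"^token_embd\.", "model.embed_tokens."),
]


def rename(name, rules):
    """Apply the first matching rule; return the name unchanged if none match."""
    for pattern, replacement in rules:
        renamed, count = re.subn(pattern, replacement, name)
        if count:
            return renamed
    return name


def build_swap_plan(quant_types, rules, skip=("F32", "F16")):
    """Map target module paths to quant types using names only (no tensor data)."""
    plan = {}
    for gguf_name, quant_type in quant_types.items():
        if quant_type in skip:
            continue  # unquantized tensors need no module swap in this sketch
        target_name = rename(gguf_name, rules)
        module_path = target_name.rsplit(".", 1)[0]  # drop the ".weight" suffix
        plan[module_path] = quant_type
    return plan


plan = build_swap_plan(tensor_quant_types, weight_mapping)
```

The plan depends only on tensor names and quant types, which is why a
`return_tensors=False` call is enough to build it.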