clip : Add Qwen2.5VL support (#12402)

Commit

1 year ago

clip : Add Qwen2.5VL support (#12402)

* implement vision model architecture, gguf converter
* handle window attention inputs
* add debug utils
* fix a few incorrect tensor memory layouts
* move position id remap out of ggml to avoid int32 cuda operations
* cleaning up
* ignore transformers Qwen2_5_xxx type check
* remove rarely used `qwen2vl-cli` debug functions
* remove commented-out code blocks
* fix attn weight scaling after rebase
* add `PROJECTOR_TYPE_QWEN2_5_VL`
* remove `KEY_USE_GLU_MLP`, `KEY_USE_RMS_NORM`
* replace `KEY_FULLATTN_BLK_IDX` with `KEY_WIN_ATTN_PATTERN`
* remove `attn_window_size` from gguf
* fix model conversion
* clean up
* fix merging problem
* add test

---------

Co-authored-by: Xuan Son Nguyen <son@huggingface.co>

References

#12402 - Add Qwen2.5VL support

Author

HimariO

Parents

2d451c80

llama.cpp ca2bb89e - clip : Add Qwen2.5VL support (#12402)
