llama : support attention bias on LLaMA architecture (#4283)
* Support attention_bias on the LLaMA architecture
Adds QKVO bias (bias terms on the Q, K, V and output attention projections). This should fix InternLM (https://github.com/ggerganov/llama.cpp/issues/3133) and also works for LLaMAfied Qwen models (https://github.com/ggerganov/llama.cpp/pull/3743#issuecomment-1825923608).
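At the graph-build level the change boils down to an optional ggml_add after each of the four attention projections. A minimal sketch, assuming per-layer bias tensors named bq/bk/bv/bo alongside the existing wq/wk/wv/wo weights (ctx0, model, il and cur come from the surrounding build function); this illustrates the idea rather than quoting the exact diff:

```cpp
// Q/K/V projections: multiply by the weight, then add the bias if present.
struct ggml_tensor * Qcur = ggml_mul_mat(ctx0, model.layers[il].wq, cur);
if (model.layers[il].bq) {
    Qcur = ggml_add(ctx0, Qcur, model.layers[il].bq); // Q projection bias
}

struct ggml_tensor * Kcur = ggml_mul_mat(ctx0, model.layers[il].wk, cur);
if (model.layers[il].bk) {
    Kcur = ggml_add(ctx0, Kcur, model.layers[il].bk); // K projection bias
}

struct ggml_tensor * Vcur = ggml_mul_mat(ctx0, model.layers[il].wv, cur);
if (model.layers[il].bv) {
    Vcur = ggml_add(ctx0, Vcur, model.layers[il].bv); // V projection bias
}

// ... attention over Qcur/Kcur/Vcur ...

// Output projection, again with an optional bias.
cur = ggml_mul_mat(ctx0, model.layers[il].wo, cur);
if (model.layers[il].bo) {
    cur = ggml_add(ctx0, cur, model.layers[il].bo); // output projection bias
}
```

When the bias tensors are absent the adds are skipped entirely, so the graph for bias-free LLaMA models is unchanged.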
* Check for the existence of the QKVO bias tensors while loading LLaMA models
Tested on LLaMA 2 with both the CUDA and CPU backends.
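On the loading side the bias tensors have to be treated as optional, so that plain LLaMA checkpoints exported without them keep loading as before. A minimal sketch of that check, assuming GGUF tensor names of the form blk.%d.attn_{q,k,v,output}.bias and a ggml context that holds the model tensors (the helper below is illustrative, not the actual loader API):

```cpp
#include <cstdio>
#include "ggml.h"

// Illustrative helper: ggml_get_tensor returns nullptr when no tensor with the
// given name exists in the context, so models exported without attention bias
// simply end up with bq/bk/bv/bo == nullptr.
static struct ggml_tensor * get_optional_bias(struct ggml_context * ctx, const char * base, int il) {
    char name[128];
    snprintf(name, sizeof(name), "blk.%d.%s.bias", il, base);
    return ggml_get_tensor(ctx, name);
}

// Usage inside the per-layer loading loop (layer, ctx and il come from the
// surrounding loader code):
//   layer.bq = get_optional_bias(ctx, "attn_q",      il);
//   layer.bk = get_optional_bias(ctx, "attn_k",      il);
//   layer.bv = get_optional_bias(ctx, "attn_v",      il);
//   layer.bo = get_optional_bias(ctx, "attn_output", il);
```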
* Update llama.cpp