[GPU] Fix perf drop of 4bit-KV-cache of VLM (#35831)
Some VLMs showed a performance drop when the 4-bit KV cache is enabled
with the PA backend.
### Details:
When KV_CACHE_PRECISION=u4 is set globally, all non-PA SDPA nodes —
including Vision Encoder self-attention — are blocked from using the
fast sdpa_micro__prefill kernel and fall back to the slower
sdpa_opt__multi_reg kernel.
This happens because the check in sdpa_opt.cpp reads the global config
via get_kv_cache_precision() without verifying whether the SDPA node
actually uses a compressed KV cache.
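
The difference between a global-only check and a per-node check can be sketched as follows. This is a minimal illustration, not the actual sdpa_opt.cpp code: `Precision`, `GlobalConfig`, `SdpaNode`, `uses_compressed_kv`, and both helper functions are hypothetical names introduced here for clarity.

```cpp
// Hypothetical stand-ins for illustration; these are NOT OpenVINO GPU plugin types.
enum class Precision { f16, i8, u4 };

struct GlobalConfig {
    Precision kv_cache_precision = Precision::f16;  // models KV_CACHE_PRECISION
};

struct SdpaNode {
    // Assumed flag: true only when this node actually reads a compressed KV cache.
    bool uses_compressed_kv = false;
};

// Behaviour described above: the global setting alone blocks the micro kernel,
// so even a Vision Encoder self-attention node falls back to the slower path.
inline bool micro_kernel_allowed_global_only(const GlobalConfig& cfg, const SdpaNode&) {
    return cfg.kv_cache_precision != Precision::u4;
}

// One possible narrowing: block the micro kernel only when the global setting
// is u4 AND this particular node uses a compressed KV cache.
inline bool micro_kernel_allowed_per_node(const GlobalConfig& cfg, const SdpaNode& node) {
    const bool blocked = cfg.kv_cache_precision == Precision::u4 && node.uses_compressed_kv;
    return !blocked;
}
```

Under these assumptions, a node with `uses_compressed_kv == false` stays eligible for the fast kernel even when the global precision is u4, while a node that does read a compressed cache is still excluded.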
### Tickets:
- CVS-185922
### AI Assistance:
- *AI assistance used: yes*
- AI was used to analyze the validation report and summarize its table
of VLM performance issues.
---------
Signed-off-by: Min, Byung il <byungil.min@intel.com>