[Bugfix] Remove hardcoded `head_size=256` for Deepseek v2 and v3 #12067
fix deepseek v2 and v3 head dim
e3486379
mgoin
approved these changes
on 2025-01-15
update test head_sizes
fb21cf0e
remove comment
7856937d
fix test head_sizes
9c9dd006
Isotr0py
enabled auto-merge (squash) 1 year ago
Isotr0py
merged
dd7c9ad8
into main 1 year ago
Isotr0py
deleted the fix-deepseek-head branch 1 year ago