NVIDIA Nemotron Nano v2 (NemotronH) #15507
feat: Add NEMOTRONH to python arch enum
17fa9d5e
feat: Add NEMOTRONH to c++ arch enum
36c88f73
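A minimal sketch of what registering the new architecture on llama.cpp's C++ side typically looks like in llama-arch.h / llama-arch.cpp. The surrounding entries are illustrative, not the exact diff; the architecture string ends up as `nemotron_h` after the rename commit later in this PR:

```cpp
// llama-arch.h / llama-arch.cpp (abridged; neighboring entries illustrative)
enum llm_arch {
    // ... existing architectures ...
    LLM_ARCH_NEMOTRON,
    LLM_ARCH_NEMOTRON_H, // new hybrid Nemotron-H entry
    LLM_ARCH_UNKNOWN,
};

static const std::map<llm_arch, const char *> LLM_ARCH_NAMES = {
    // ... existing entries ...
    { LLM_ARCH_NEMOTRON,   "nemotron"   },
    { LLM_ARCH_NEMOTRON_H, "nemotron_h" }, // final name after the rename below
};
```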
feat: Add NEMOTRONH to llama-arch layer map
62e66c63
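The layer map ties the architecture to the GGUF tensor names the loader expects. A hedged, abridged sketch of the Nemotron-H entry; the real tensor list is longer, and the names shown merely follow llama.cpp's existing conventions:

```cpp
// llama-arch.cpp (abridged): per-architecture tensor-name map
static const std::map<llm_arch, std::map<llm_tensor, const char *>> LLM_TENSOR_NAMES = {
    // ...
    {
        LLM_ARCH_NEMOTRON_H,
        {
            { LLM_TENSOR_TOKEN_EMBD,  "token_embd" },
            { LLM_TENSOR_OUTPUT_NORM, "output_norm" },
            { LLM_TENSOR_ATTN_NORM,   "blk.%d.attn_norm" },
            { LLM_TENSOR_SSM_IN,      "blk.%d.ssm_in" },
            // ... remaining attention, SSM, and FFN tensors ...
        },
    },
};
```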
feat: First pass at conversion for nemotronh
abe1e892
feat: Add a verbose log for each tensor loaded
c25c149c
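A sketch of what a per-tensor load log can look like in the model loader. `LLAMA_LOG_DEBUG` and `ggml_type_name` exist in llama.cpp, but the loop shape and variable names here are assumptions, not the exact diff:

```cpp
// inside the weight-loading loop (illustrative names)
for (const auto & it : weights_map) {
    const std::string & name = it.first;
    const ggml_tensor * t    = it.second;
    LLAMA_LOG_DEBUG("%s: loaded tensor '%s' (type %s)\n",
            __func__, name.c_str(), ggml_type_name(t->type));
}
```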
feat: First (broken) pass at nemotronh model architecture
828176ed
fix: Explicitly enable add_bos_token during conversion
3191a8d1
fix: Use relu2 (LLM_FFN_RELU_SQR) for activation in FFN layers
37c42c95
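relu2 squares the ReLU output, f(x) = max(0, x)², which is what `LLM_FFN_RELU_SQR` computes. A minimal sketch built from two existing ggml ops:

```cpp
#include "ggml.h"

// relu2 / squared ReLU: f(x) = max(0, x)^2
static struct ggml_tensor * build_relu_sqr(struct ggml_context * ctx,
                                           struct ggml_tensor  * x) {
    x = ggml_relu(ctx, x); // max(0, x)
    x = ggml_sqr (ctx, x); // element-wise square
    return x;
}
```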
fix: Only allocate attention cache for attention layers (not non-recu…
9a9de40a
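A hedged sketch of the idea behind this fix: in a hybrid model, the attention KV cache should only reserve cells for attention layers, since the recurrent (Mamba) layers keep their state in a separate recurrent cache. The filter type and function below are illustrative names, not the exact llama.cpp API:

```cpp
#include <cstdint>
#include <functional>
#include <vector>

using layer_filter_t = std::function<bool(int32_t il)>;

// Allocate attention KV cells only where the layer actually does attention.
static layer_filter_t attn_layers_only(std::vector<bool> is_recurrent) {
    return [is_recurrent = std::move(is_recurrent)](int32_t il) {
        return !is_recurrent[static_cast<size_t>(il)];
    };
}
```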
fix: Move residual add to after every block
9d4e0d72
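After this fix, the per-layer loop adds the residual once per block, regardless of whether the block is attention, SSM, or MLP. A sketch using llama.cpp's usual graph-builder naming; `build_block` is a hypothetical stand-in for the per-type block builders:

```cpp
for (int il = 0; il < n_layer; ++il) {
    // build_block() (hypothetical) applies the norm plus the attn/ssm/mlp op
    ggml_tensor * cur = build_block(il, inpL);
    cur  = ggml_add(ctx0, cur, inpL); // residual add after every block
    inpL = cur;                       // becomes the input of the next block
}
```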
Merge remote-tracking branch 'origin/master' into gabe-l-hart/nvidia-…
cb03b4f8
fix: Use the correct norm tensor for the MLP blocks
3310f915
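A hedged one-line sketch of the norm fix: the MLP block must be normalized with its own layer's norm weights rather than a tensor belonging to another sub-block. `build_norm` and the field name mirror llama.cpp conventions, but the exact call is illustrative:

```cpp
// pre-MLP normalization with this layer's own norm weights (illustrative)
cur = build_norm(inpL, model.layers[il].attn_norm, NULL, LLM_NORM_RMS, il);
```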
Merge remote-tracking branch 'origin/master' into gabe-l-hart/nvidia-…
3132915e
gabe-l-hart marked this pull request as ready for review 136 days ago
Nemotron-H: MLP gate cleanup (pass NULL for unused gate)
ab53234d
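Since Nemotron-H's MLP has no gate projection, the cleanup passes NULL in the gate slots of the FFN builder instead of wiring a dummy tensor. An abridged, illustrative sketch of such a call (argument layout is an assumption):

```cpp
cur = build_ffn(cur,
        model.layers[il].ffn_up,   NULL, NULL,  // up projection
        NULL,                      NULL, NULL,  // gate: unused -> NULL
        model.layers[il].ffn_down, NULL, NULL,  // down projection
        NULL,                                   // no activation scales
        LLM_FFN_RELU_SQR, LLM_FFN_SEQ, il);
```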
SSM: respect ssm_dt_rank for dt_dim when provided
b3304da9
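A sketch of the intended behavior: use `ssm_dt_rank` from the model hyperparameters when it is set, otherwise fall back to the common Mamba default of ceil(n_embd / 16). The field names mirror llama.cpp's `hparams`, but the exact condition is an assumption:

```cpp
// honor an explicit dt rank; otherwise use the Mamba default ceil(n_embd/16)
const uint32_t dt_dim = hparams.ssm_dt_rank > 0
        ? hparams.ssm_dt_rank
        : (hparams.n_embd + 15) / 16;
```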
fix: plamo2 - revert dt_dim to default (remove ssm_dt_rank usage)
4223a1f8
Merge pull request #3 from DominguesM/nvidia-nemotron-nano-v2
7503535a
Rename nemotronh to nemotron_h for consistency
f2165dd0
Merge pull request #4 from jwjohns/nemotron-h-naming-update
3732916a
ggerganov approved these changes on 2025-08-28
feat: Support conversion for older NemotronH models
19f1dc60
gabe-l-hart deleted the gabe-l-hart/nvidia-nemotron-nano-15409 branch 135 days ago