llama.cpp
nvidia nemotron nano v2 (nemotronh)
#15507
Merged

github-actions added python
gabe-l-hart feat: Add NEMOTRONH to python arch enum
17fa9d5e
gabe-l-hart feat: Add NEMOTRONH to c++ arch enum
36c88f73
gabe-l-hart feat: Add NEMOTRONH to llama-arch layer map
62e66c63
gabe-l-hart feat: First pass at conversion for nemotronh
abe1e892
gabe-l-hart feat: Add a verbose log for each tensor loaded
c25c149c
gabe-l-hart feat: First (broken) pass at nemotronh model architecture
828176ed
gabe-l-hart force pushed to 828176ed 141 days ago
gabe-l-hart fix: Explicitly enable add_bos_token during conversion
3191a8d1
gabe-l-hart fix: Use relu2 (LLM_FFN_RELU_SQR) for activation in FFN layers
37c42c95
gabe-l-hart fix: Only allocate attention cache for attention layers (not non-recu…
9a9de40a
gabe-l-hart force pushed to 9a9de40a 138 days ago
gabe-l-hart fix: Move residual add to after every block
9d4e0d72
gabe-l-hart force pushed 137 days ago
github-actions added examples
gabe-l-hart Merge remote-tracking branch 'origin/master' into gabe-l-hart/nvidia-…
cb03b4f8
gabe-l-hart fix: Use the correct norm tensor for the MLP blocks
3310f915
gabe-l-hart Merge remote-tracking branch 'origin/master' into gabe-l-hart/nvidia-…
3132915e
gabe-l-hart force pushed to 3132915e 136 days ago
gabe-l-hart marked this pull request as ready for review 136 days ago
gabe-l-hart removed examples
gabe-l-hart added model
DominguesM Nemotron-H: MLP gate cleanup (pass NULL for unused gate)
ab53234d
DominguesM SSM: respect ssm_dt_rank for dt_dim when provided
b3304da9
DominguesM fix: plamo2 - revert dt_dim to default (remove ssm_dt_rank usage)
4223a1f8
gabe-l-hart Merge pull request #3 from DominguesM/nvidia-nemotron-nano-v2
7503535a
jwjohns Rename nemotronh to nemotron_h for consistency
f2165dd0
gabe-l-hart Merge pull request #4 from jwjohns/nemotron-h-naming-update
3732916a
ggerganov approved these changes on 2025-08-28
gabe-l-hart feat: Support conversion for older NemotronH models
19f1dc60
gabe-l-hart merged e8d99dd0 into master 135 days ago
gabe-l-hart deleted the gabe-l-hart/nvidia-nemotron-nano-15409 branch 135 days ago