llama.cpp
Add Nemotron/Minitron GGUF Conversion & Inference Support
#8922
Merged
slaren merged 10 commits into ggml-org:master from suhara:nemotron-support
github-actions added the python label
Vaibhavs10 commented on 2024-08-08
mofosyne added the Review Complexity : Medium label
ggerganov approved these changes on 2024-08-08
compilade commented on 2024-08-08
Vaibhavs10 commented on 2024-08-12
aa2f4a79 Add nemotron GGUF conversion & inference support
45e9d164 Fix formatting issues
147cdf64 Remove unnecessary write_tensors()
b841554d Update convert_hf_to_gguf.py
092382fe Update src/llama.cpp
6f369f3f Address comments by @compilade
ae86b5e3 Replace ggml_mul_mat()->llm_build_lora_mm()
bd761986 Remove mutable variable
suhara force pushed to bd761986 1 year ago
Vaibhavs10 requested a review from compilade 1 year ago
Vaibhavs10 commented on 2024-08-13
Vaibhavs10 approved these changes on 2024-08-13
slaren commented on 2024-08-13
e4bb91b0 Use for bias tensors
0645adc8 Cover corner case for role_scaling not in config.json
compilade approved these changes on 2024-08-14
slaren merged 2a24c8ca into master 1 year ago
Reviewers: compilade, ggerganov, Vaibhavs10, slaren
Assignees: No one assigned
Labels: python, Review Complexity : Medium
Milestone: No milestone