llama.cpp
Add Nemotron/Minitron GGUF Conversion & Inference Support
#8922
Merged

slaren merged 10 commits into ggml-org:master from suhara:nemotron-support
github-actions added python
Vaibhavs10 commented on 2024-08-08
mofosyne added Review Complexity : Medium
ggerganov approved these changes on 2024-08-08
compilade commented on 2024-08-08
Vaibhavs10 commented on 2024-08-12
suhara Add nemotron GGUF conversion & inference support
aa2f4a79
suhara Fix formatting issues
45e9d164
suhara Remove unnecessary write_tensors()
147cdf64
suhara Update convert_hf_to_gguf.py
b841554d
suhara Update src/llama.cpp
092382fe
suhara Address comments by @compilade
6f369f3f
suhara Replace ggml_mul_mat()->llm_build_lora_mm()
ae86b5e3
suhara Remove mutable variable
bd761986
suhara force pushed to bd761986 1 year ago
Vaibhavs10 requested a review from compilade 1 year ago
Vaibhavs10 commented on 2024-08-13
Vaibhavs10 approved these changes on 2024-08-13
slaren commented on 2024-08-13
suhara Use for bias tensors
e4bb91b0
suhara Cover corner case for role_scaling not in config.json
0645adc8
compilade approved these changes on 2024-08-14
slaren merged 2a24c8ca into master 1 year ago