llama.cpp
Add Nemotron/Minitron GGUF Conversion & Inference Support
#8922

Merged

slaren merged 10 commits into ggml-org:master from suhara:nemotron-support

github-actions added python

Vaibhavs10 commented on 2024-08-08

mofosyne added Review Complexity : Medium

ggerganov approved these changes on 2024-08-08

compilade commented on 2024-08-08

Vaibhavs10 commented on 2024-08-12

Add nemotron GGUF conversion & inference support

aa2f4a79

Fix formatting issues

45e9d164

Remove unnecessary write_tensors()

147cdf64

Update convert_hf_to_gguf.py

b841554d

Update src/llama.cpp

092382fe

Address comments by @compilade

6f369f3f

Replace ggml_mul_mat()->llm_build_lora_mm()

ae86b5e3

Remove mutable variable

bd761986

suhara force pushed to bd761986 1 year ago

Vaibhavs10 requested a review from compilade 1 year ago

Vaibhavs10 commented on 2024-08-13

Vaibhavs10 approved these changes on 2024-08-13

slaren commented on 2024-08-13

Use for bias tensors

e4bb91b0

Cover corner case for rope_scaling not in config.json

0645adc8

compilade approved these changes on 2024-08-14

slaren merged 2a24c8ca into master 1 year ago

Reviewers

ggerganov

compilade

Vaibhavs10

slaren

Assignees

No one assigned

Labels

python

Review Complexity : Medium

Milestone

No milestone
