llama.cpp
ggml : use dynamic thread scheduling for matrix multiplication
#6915
Merged


kunnis force pushed from afe52262 to 6eb46e26 1 year ago
kunnis force pushed from 6eb46e26 to 54c2460c 1 year ago
kunnis marked this pull request as ready for review 1 year ago
kunnis Just reordering some structs.
3024fd6b
kunnis Adding in the calls to mm_pause
5978b6eb
kunnis Passing around the state
e098171a
kunnis Renaming and moving a bunch of variables around.
a968553c
kunnis Extracting the logic to its own function.
7b932e49
kunnis Moving some variable definitions into the chunk function.
4f95478e
mofosyne added the label: enhancement
mofosyne added the label: Review Complexity : High
kunnis Moving some variables around
086e5a82
kunnis moving src1_cont inside
209922f5
kunnis Moving row_size
bb1b1d00
kunnis adding the current_chunk
daa87b18
kunnis Reorg the code.
700c782d
kunnis Formatting to match the orig patch
891d5837
kunnis starting to set up the chunking variables
9acaec58
kunnis Starting the buildup of the loop
c0557fa2
kunnis The yield shouldn't be necessary.
4762d79d
kunnis adding the looping structure based on the chunk configuration.
fc7dc515
kunnis Add in the re-chunking code.
807c8252
kunnis Making it much more likely to rechunk.
974e43be
kunnis disable resizing if numa is enabled.
1c68ea8d
kunnis force pushed from e2dcf468 to 1c68ea8d 1 year ago
kunnis Updating comments with what we've learned.
bd80601e
slaren commented on 2024-05-14
kunnis Fix formatting
d9ba30a2
kunnis Couple more formatting fixes.
163dbfdd
kunnis More style fixes.
6b0c90fc
slaren approved these changes on 2024-05-14
kunnis Fix Warnings
741a1981
kunnis Going with unused because there's conditional logic that needs it.
2dd9f017
slaren changed the title from "Draft Idea... CPU Inference... This seems to perform better?" to "ggml : use dynamic thread scheduling for matrix multiplication" 1 year ago
slaren Update ggml.c
f2aabab4
slaren Update ggml.c
14c104d1
slaren requested a review from ggerganov 1 year ago
ggerganov approved these changes on 2024-05-15
slaren merged e1b40ac3 into master 1 year ago
kunnis deleted the MMThreadingPerfChange branch 1 year ago
