whisper.cpp
vad : add initial Voice Activity Detection (VAD) support
#3065
Merged

danbev merged 49 commits into ggml-org:master from danbev:vad
danbev
danbev force pushed from 0f2fe06f to b10e6ddd 255 days ago
danbev force pushed from 4aea8b3a to 3cca1a23 254 days ago
danbev force pushed from 5758650f to 9f0ed3d8 252 days ago
tannisroot
danbev
danbev force pushed from 63d3fe24 to ebc79f8a 249 days ago
danbev force pushed from b59768bb to 798695fc 248 days ago
danbev marked this pull request as ready for review 248 days ago
ggerganov
danbev
danbev force pushed from 798695fc to 6b56b7df 244 days ago
danbev
mrfragger
ggerganov
ggerganov commented on 2025-05-07
ggerganov
TeslaKang
ggerganov force pushed from 5a6236e9 to 60d561b8 238 days ago
ggerganov
danbev
ggerganov
ggerganov commented on 2025-05-09
danbev
danbev commented on 2025-05-09
danbev vad : add initial Voice Activity Detection (VAD) support
871da0bf
danbev examples : add VAD parameters to CLI [no ci]
24901683
danbev ci : add job to test VAD
eb23253b
danbev vad : map timestamps to original audio
59252c2b
danbev squash! vad : add initial Voice Activity Detection (VAD) support [no ci]
37a36a33
danbev vad : extract VAD processing to a separate function
033c0ce2
danbev vad : add TODOs to optimize segment access [no ci]
028481e9
danbev vad : only use CPU backend for VAD processing [no ci]
fc7ebf20
danbev tests : fix strcmp assert and use beam search
3276232e
danbev vad : dont reshape stft_forward_basis tensor
abc05c5c
danbev vad : use ggml_row_size() and rename hdim_bytes to hdim_size
0e18ceba
danbev vad : remove unnecessary ggml_cont
9bf1b4b3
danbev vad : fix typo in log message
dc529950
danbev vad : don't use left leaning ref for segment
2b057733
danbev vad : use std::vector<float> instead float pointers
44bdef1b
danbev vad : enable GPU support for VAD but default to false
27eb59bd
danbev vad : use kebab-case and not snake_case for VAD options
643a91bf
danbev vad : add h_state and c_state to whisper_vad_state
e4d43072
danbev vad : always initialize filtered_n_samples to 0
94c3aba8
danbev vad : use orig timestamp for first segment
e70e4861
ggerganov vad : fix buffers and enable GPU support by default
436baeb7
danbev vad : fix use_gpu assert in test-vad.cpp
eb2c83ee
danbev vad : remove unnecessary reserve [no ci]
47c8f02d
danbev vad : add probs to whisper_vad_state
327cdaee
danbev vad : add timing of vad processing [no ci]
bf2b0df9
danbev force pushed from 50337e26 to bf2b0df9 237 days ago
danbev
enesgrahovac
ggerganov vad : force GPU off for now
243e0dba
ggerganov vad : minor style and naming changes
65c421de
ggerganov vad : minor style
cae38fda
ggerganov vad : remove obsolete whisper_vad_free_speech
cd953ebf
ggerganov vad : refactor whisper_vad_params API
f42e6e47
ggerganov vad : simplify whisper_vad_timestamps_from_probs()
4ff858ba
ggerganov vad : refactor whisper_vad_timestamps_from_probs to use C++
13a75177
ggerganov force pushed from 66bc585e to 13a75177 237 days ago
ggerganov
danbev vad : make whisper_vad_timestamps opaque in API
3bcc44c2
danbev vad : rename whisper_vad_speech to whisper_vad_probs
5543c80c
ggerganov
ggerganov commented on 2025-05-10
danbev vad : move whisper_vad_segment to whisper.cpp
8b6f19cf
danbev vad : make segments vector a std::vector
7625ba16
danbev vad : use std::vector for segments in whisper_vad_timestamps_from_probs
b0b2f9b4
danbev vad : rename pcmf32 parameters to samples [no ci]
20fe0b35
ggerganov
ggerganov commented on 2025-05-10
ggerganov
ggerganov commented on 2025-05-10
ggerganov
ggerganov commented on 2025-05-10
danbev vad : remove n_segments from struct whisper_vad_timestamps
f2123105
MahmoudAshraf97
danbev vad : rename whisper_vad_timestamps to whisper_vad_segments [no ci]
4c7fe00c
danbev
danbev vad : remove whisper_vad_probs struct [no ci]
dc541f93
danbev vad : remove whisper_vad_state struct
163ad538
TeslaKang
danbev vad : remove window_size_samples from VAD params
810981f4
danbev vad : clarify VAD CLI options [no ci]
050038ca
danbev docs : add VAD section to README.md [no ci]
3cff6587
danbev squash! docs : add VAD section to README.md [no ci]
acc8747d
ggerganov vad : minor rename
7aac6eca
ggerganov
ggerganov approved these changes on 2025-05-12
ggerganov
ggerganov commented on 2025-05-12
danbev squash! docs : add VAD section to README.md [no ci]
41c20100
danbev vad : fix cli option names [no ci]
67f0fd40
danbev merged e41bc5c6 into master 234 days ago
vrs
danbev
mdestagnol
danbev
ggerganov
mdestagnol
danbev
WilliamTambellini
stephantucker992-prog
stephantucker992-prog commented on 2025-10-01

