whisper
Expose punctuation options in cli and transcribe()
#973
Merged


ryanheise 2 years ago

Allows the prepend_punctuations and append_punctuations options to be set from the CLI or from a Python program that calls transcribe().
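As a rough illustration of what "exposing" the options means, the sketch below forwards two command-line flags to a function as keyword arguments. It is not the code from this PR: transcribe_stub, build_parser and run are hypothetical names, the stub only echoes its options, and the flag names are assumed to mirror the option names. The default strings are copied from the transcribe() signature quoted later in this thread.

```python
import argparse

# Defaults copied from the transcribe() signature quoted in this thread.
DEFAULT_PREPEND = "\"'“¿([{-"
DEFAULT_APPEND = "\"'.。,,!!??::”)]}、"


def transcribe_stub(audio, *, word_timestamps=False,
                    prepend_punctuations=DEFAULT_PREPEND,
                    append_punctuations=DEFAULT_APPEND,
                    **decode_options):
    """Hypothetical stand-in for transcribe(); it only echoes its options."""
    return {
        "audio": audio,
        "word_timestamps": word_timestamps,
        "prepend_punctuations": prepend_punctuations,
        "append_punctuations": append_punctuations,
    }


def build_parser():
    """Build a parser whose flag names are assumed to mirror the option names."""
    parser = argparse.ArgumentParser()
    parser.add_argument("audio")
    parser.add_argument("--word_timestamps", action="store_true")
    parser.add_argument("--prepend_punctuations", type=str, default=DEFAULT_PREPEND)
    parser.add_argument("--append_punctuations", type=str, default=DEFAULT_APPEND)
    return parser


def run(argv):
    """Parse argv and forward every remaining option as a keyword argument."""
    args = vars(build_parser().parse_args(argv))
    audio = args.pop("audio")
    return transcribe_stub(audio, **args)
```

A Python caller would pass the same options directly, for example transcribe_stub("a.wav", word_timestamps=True, append_punctuations=".,!?").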

ryanheise Expose punctuation options in cli and transcribe()
d0e16b34
jongwook 2 years ago 👍 1

Thanks! I'll merge this first and move those magic strings into a global variable somewhere in #869.

jongwook merged 8eb29c3e into word-level-timestamps 2 years ago

ryanheise 2 years ago

I expected so, and of course there's more of it where that came from ;-)

def transcribe(
    model: "Whisper",
    audio: Union[str, np.ndarray, torch.Tensor],
    *,
    verbose: Optional[bool] = None,
    temperature: Union[float, Tuple[float, ...]] = (0.0, 0.2, 0.4, 0.6, 0.8, 1.0),
    compression_ratio_threshold: Optional[float] = 2.4,
    logprob_threshold: Optional[float] = -1.0,
    no_speech_threshold: Optional[float] = 0.6,
    condition_on_previous_text: bool = True,
    initial_prompt: Optional[str] = None,
    word_timestamps: bool = False,
    prepend_punctuations: str = "\"\'“¿([{-",
    append_punctuations: str = "\"\'.。,,!!??::”)]}、",
    **decode_options,
):

One of these magic values doesn't match the corresponding default parameter value in cli():

def cli():
    ...
    parser.add_argument("--temperature", type=float, default=0, help="temperature to use for sampling")

So when used from the command line, the temperature will default to 0, and via transcribe() the temperature will default to (0.0, 0.2, 0.4, 0.6, 0.8, 1.0).
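One way to catch this kind of drift is to compare the parser's defaults against the function's keyword defaults. The following is only a sketch under stated assumptions: transcribe_stub and build_parser are hypothetical stand-ins that reproduce just the temperature option quoted above, and mismatched_defaults is an illustrative helper, not part of whisper.

```python
import argparse
import inspect

# Default copied from the transcribe() signature quoted above.
TEMPERATURES = (0.0, 0.2, 0.4, 0.6, 0.8, 1.0)


def transcribe_stub(audio=None, *, temperature=TEMPERATURES):
    """Hypothetical stand-in for transcribe(); returns the temperature it received."""
    return temperature


def build_parser():
    """Reproduce only the --temperature argument quoted from cli()."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--temperature", type=float, default=0,
                        help="temperature to use for sampling")
    return parser


def mismatched_defaults(parser, func):
    """Return {name: (cli_default, func_default)} where the two defaults differ."""
    cli_defaults = vars(parser.parse_args([]))
    mismatches = {}
    for name, param in inspect.signature(func).parameters.items():
        if name not in cli_defaults:
            continue  # option is not exposed on the command line
        if param.default is inspect.Parameter.empty:
            continue  # parameter has no default to compare against
        if cli_defaults[name] != param.default:
            mismatches[name] = (cli_defaults[name], param.default)
    return mismatches


if __name__ == "__main__":
    print(mismatched_defaults(build_parser(), transcribe_stub))
```

Sharing a single module-level constant between the parser and the function signature would remove the need for such a check.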

ryanheise deleted the propagate-punctuation-options branch 2 years ago

