buzz/docs/docs/cli.md at 950e56ea6f2806717094c244dfc372b2c829ec03

mirror of https://github.com/chidiwilliams/buzz.git synced 2026-03-15 15:15:49 +01:00

Raivis Dejus 7f1fd31f43

Adding word level timestamps to CLI (#913 )

2024-09-28 12:30:28 +00:00

---
title: CLI
sidebar_position: 5
---

## Commands

### `add`

Start a new transcription task.

Usage: buzz add [options] [file file file...]

Options:
  -t, --task <task>              The task to perform. Allowed: translate,
                                 transcribe. Default: transcribe.
  -m, --model-type <model-type>  Model type. Allowed: whisper, whispercpp,
                                 huggingface, fasterwhisper, openaiapi. Default:
                                 whisper.
  -s, --model-size <model-size>  Model size. Use only when --model-type is
                                 whisper, whispercpp, or fasterwhisper. Allowed:
                                 tiny, base, small, medium, large. Default:
                                 tiny.
  --hfid <id>                    Hugging Face model ID. Use only when
                                 --model-type is huggingface. Example:
                                 "openai/whisper-tiny"
  -l, --language <code>          Language code. Allowed: af (Afrikaans), am
                                 (Amharic), ar (Arabic), as (Assamese), az
                                 (Azerbaijani), ba (Bashkir), be (Belarusian),
                                 bg (Bulgarian), bn (Bengali), bo (Tibetan), br
                                 (Breton), bs (Bosnian), ca (Catalan), cs
                                 (Czech), cy (Welsh), da (Danish), de (German),
                                 el (Greek), en (English), es (Spanish), et
                                 (Estonian), eu (Basque), fa (Persian), fi
                                 (Finnish), fo (Faroese), fr (French), gl
                                 (Galician), gu (Gujarati), ha (Hausa), haw
                                 (Hawaiian), he (Hebrew), hi (Hindi), hr
                                 (Croatian), ht (Haitian Creole), hu
                                 (Hungarian), hy (Armenian), id (Indonesian), is
                                 (Icelandic), it (Italian), ja (Japanese), jw
                                 (Javanese), ka (Georgian), kk (Kazakh), km
                                 (Khmer), kn (Kannada), ko (Korean), la (Latin),
                                 lb (Luxembourgish), ln (Lingala), lo (Lao), lt
                                 (Lithuanian), lv (Latvian), mg (Malagasy), mi
                                 (Maori), mk (Macedonian), ml (Malayalam), mn
                                 (Mongolian), mr (Marathi), ms (Malay), mt
                                 (Maltese), my (Myanmar), ne (Nepali), nl
                                 (Dutch), nn (Nynorsk), no (Norwegian), oc
                                 (Occitan), pa (Punjabi), pl (Polish), ps
                                 (Pashto), pt (Portuguese), ro (Romanian), ru
                                 (Russian), sa (Sanskrit), sd (Sindhi), si
                                 (Sinhala), sk (Slovak), sl (Slovenian), sn
                                 (Shona), so (Somali), sq (Albanian), sr
                                 (Serbian), su (Sundanese), sv (Swedish), sw
                                 (Swahili), ta (Tamil), te (Telugu), tg (Tajik),
                                 th (Thai), tk (Turkmen), tl (Tagalog), tr
                                 (Turkish), tt (Tatar), uk (Ukrainian), ur
                                 (Urdu), uz (Uzbek), vi (Vietnamese), yi
                                 (Yiddish), yo (Yoruba), zh (Chinese). Leave
                                 empty to detect language.
  -p, --prompt <prompt>          Initial prompt.
  -wt, --word-timestamps         Generate word-level timestamps.
  --openai-token <token>         OpenAI access token. Use only when
                                 --model-type is openaiapi. Defaults to your
                                 previously saved access token, if one exists.
  --srt                          Output result in an SRT file.
  --vtt                          Output result in a VTT file.
  --txt                          Output result in a TXT file.
  -h, --help                     Displays help on commandline options.
  --help-all                     Displays help including Qt specific options.
  -v, --version                  Displays version information.

Arguments:
  files                          Input file paths

Examples:

# Translate two MP3 files from French to English using OpenAI Whisper API
buzz add --task translate --language fr --model-type openaiapi /Users/user/Downloads/1b3b03e4-8db5-ea2c-ace5-b71ff32e3304.mp3 /Users/user/Downloads/koaf9083k1lkpsfdi0.mp3

# Transcribe an MP4 using Whisper.cpp "small" model and immediately export to SRT and VTT files
buzz add --task transcribe --model-type whispercpp --model-size small --prompt "My initial prompt" --srt --vtt /Users/user/Downloads/buzz/1b3b03e4-8db5-ea2c-ace5-b71ff32e3304.mp4
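
None of the examples above use `--word-timestamps`. As a minimal sketch, assuming a POSIX shell and MP3 files in the current directory, the hypothetical helper `print_buzz_cmd` (not part of Buzz) prints one `buzz add` command per file instead of running it, so the flags can be reviewed first. The flag combination is an assumption assembled from the options listed above; whether word-level timestamps are available for every model type is not stated here.

```shell
# Hypothetical helper (not part of Buzz): print the `buzz add` command for one
# file instead of running it. The flags are taken from the option list above.
print_buzz_cmd() {
  printf 'buzz add --task transcribe --model-type whisper --model-size small --word-timestamps --srt "%s"\n' "$1"
}

# Print one command per MP3 file in the current directory.
for f in ./*.mp3; do
  [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
  print_buzz_cmd "$f"
done
```

Once the printed commands look right, they can be pasted into the shell to queue the tasks.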
