Class WhisperFullParams

java.lang.Object
com.sun.jna.Structure
io.github.ggerganov.whispercpp.params.WhisperFullParams

public class WhisperFullParams extends com.sun.jna.Structure
Parameters for the whisper_full() function. If you change the order of the fields or add new ones, also update the default values in whisper_full_default_params() in whisper.cpp.
  • Field Details

    • strategy

      public int strategy
      Sampling strategy for whisper_full() function.
    • n_threads

      public int n_threads
      Number of threads. (default = 4)
    • n_max_text_ctx

      public int n_max_text_ctx
      Maximum tokens to use from past text as a prompt for the decoder. (default = 16384)
    • offset_ms

      public int offset_ms
      Start offset in milliseconds. (default = 0)
    • duration_ms

      public int duration_ms
      Audio duration to process in milliseconds. (default = 0)
    • translate

      public CBool translate
      Translate flag. (default = false)
    • no_context

      public CBool no_context
      Flag to skip using past transcription (if any) as an initial prompt for the decoder. (default = true)
    • single_segment

      public CBool single_segment
      Flag to force single segment output (useful for streaming). (default = false)
    • token_timestamps

      public CBool token_timestamps
      [EXPERIMENTAL] Flag to enable token-level timestamps. (default = false)
    • thold_pt

      public float thold_pt
      [EXPERIMENTAL] Timestamp token probability threshold. (default = 0.01)
    • thold_ptsum

      public float thold_ptsum
      [EXPERIMENTAL] Timestamp token sum probability threshold (~0.01).
    • max_len

      public int max_len
      Maximum segment length in characters. (default = 0)
    • split_on_word

      public CBool split_on_word
      Flag to split on word rather than on token (when used with max_len). (default = false)
    • max_tokens

      public int max_tokens
      Maximum tokens per segment. (default = 0, meaning no limit)
    • speed_up

      public CBool speed_up
      Flag to speed up the audio by 2x using Phase Vocoder. (default = false)
    • audio_ctx

      public int audio_ctx
      Override the audio context size (0 = use default).
    • tdrz_enable

      public CBool tdrz_enable
      Enable tinydiarize (default = false)
    • initial_prompt

      public String initial_prompt
      Tokens to provide to the whisper decoder as an initial prompt. These are prepended to any existing text context from a previous call.
    • prompt_tokens

      public com.sun.jna.Pointer prompt_tokens
      Prompt tokens. (int*)
    • prompt_n_tokens

      public int prompt_n_tokens
      Number of prompt tokens.
    • language

      public String language
      Language of the audio. For auto-detection, set to `null`, `""`, or `"auto"`.
    • detect_language

      public CBool detect_language
      Flag to indicate whether to detect language automatically.
    • suppress_blank

      public CBool suppress_blank
      Flag to suppress blank tokens.
    • suppress_non_speech_tokens

      public CBool suppress_non_speech_tokens
      Flag to suppress non-speech tokens.
    • temperature

      public float temperature
      Initial decoding temperature.
    • max_initial_ts

      public float max_initial_ts
      Maximum initial timestamp.
    • length_penalty

      public float length_penalty
      Length penalty.
    • temperature_inc

      public float temperature_inc
      Temperature increment.
    • entropy_thold

      public float entropy_thold
      Entropy threshold (similar to OpenAI's "compression_ratio_threshold").
    • logprob_thold

      public float logprob_thold
      Log probability threshold.
    • no_speech_thold

      public float no_speech_thold
      No speech threshold.
    • greedy

      public GreedyParams greedy
      Greedy decoding parameters.
    • new_segment_callback

      public com.sun.jna.Pointer new_segment_callback
      Callback for every newly generated text segment. See WhisperNewSegmentCallback.
    • new_segment_callback_user_data

      public com.sun.jna.Pointer new_segment_callback_user_data
      User data for the new_segment_callback.
    • progress_callback

      public com.sun.jna.Pointer progress_callback
      Callback on each progress update. See WhisperProgressCallback.
    • progress_callback_user_data

      public com.sun.jna.Pointer progress_callback_user_data
      User data for the progress_callback.
    • encoder_begin_callback

      public com.sun.jna.Pointer encoder_begin_callback
      Callback invoked each time before the encoder starts. See WhisperEncoderBeginCallback.
    • encoder_begin_callback_user_data

      public com.sun.jna.Pointer encoder_begin_callback_user_data
      User data for the encoder_begin_callback.
    • logits_filter_callback

      public com.sun.jna.Pointer logits_filter_callback
      Callback invoked by each decoder to filter the obtained logits. See WhisperLogitsFilterCallback.
    • logits_filter_callback_user_data

      public com.sun.jna.Pointer logits_filter_callback_user_data
      User data for the logits_filter_callback.
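
Several of the fields above rely on value conventions rather than separate flags: `language` requests auto-detection when it is `null`, `""`, or `"auto"`; `offset_ms` and `duration_ms` are plain millisecond counts; and `max_tokens` treats 0 as "no limit". The following is a minimal sketch of those conventions in plain Java. The class `FieldConventions` and its methods are hypothetical helpers written for illustration and are not part of the binding.

```java
import java.time.Duration;

// Hypothetical helper for illustration; not part of the whispercpp binding.
final class FieldConventions {
    private FieldConventions() {}

    /** True when a value for `language` requests auto-detection (null, "" or "auto"). */
    static boolean isAutoDetect(String language) {
        return language == null || language.isEmpty() || language.equals("auto");
    }

    /** Converts a Duration to the int millisecond form used by offset_ms and duration_ms. */
    static int toMillis(Duration d) {
        // toIntExact throws ArithmeticException instead of silently overflowing the int field.
        return Math.toIntExact(d.toMillis());
    }

    /** Renders a max_tokens value, where 0 is documented as meaning no limit. */
    static String describeMaxTokens(int maxTokens) {
        return maxTokens == 0 ? "no limit" : Integer.toString(maxTokens);
    }
}
```

With helpers like these, a caller could compute `FieldConventions.toMillis(Duration.ofSeconds(90))` and assign the result to `offset_ms`, rather than writing the millisecond literal by hand.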
  • Constructor Details

    • WhisperFullParams

      public WhisperFullParams(com.sun.jna.Pointer p)
  • Method Details

    • transcribeMode

      public void transcribeMode()
      The complement of translateMode()
    • translateMode

      public void translateMode()
      The complement of transcribeMode()
    • enableContext

      public void enableContext(boolean enable)
      Sets whether to use past transcription (if any) as an initial prompt for the decoder. The underlying no_context field defaults to true.
    • singleSegment

      public void singleSegment(boolean single)
      Flag to force single segment output (useful for streaming). (default = false)
    • printSpecial

      public void printSpecial(boolean enable)
      Flag to print special tokens (e.g., <SOT>, <EOT>, <BEG>, etc.). (default = false)
    • printProgress

      public void printProgress(boolean enable)
      Flag to print progress information. (default = true)
    • printRealtime

      public void printRealtime(boolean enable)
      Flag to print results from within whisper.cpp (avoid it, use callback instead). (default = true)
    • printTimestamps

      public void printTimestamps(boolean enable)
      Flag to print timestamps for each text segment when printing realtime. (default = true)
    • tokenTimestamps

      public void tokenTimestamps(boolean enable)
      [EXPERIMENTAL] Flag to enable token-level timestamps. (default = false)
    • splitOnWord

      public void splitOnWord(boolean enable)
      Flag to split on word rather than on token (when used with max_len). (default = false)
    • speedUp

      public void speedUp(boolean enable)
      Flag to speed up the audio by 2x using Phase Vocoder. (default = false)
    • tdrzEnable

      public void tdrzEnable(boolean enable)
      Enable tinydiarize (default = false)
    • setPromptTokens

      public void setPromptTokens(int[] tokens)
      Sets the prompt tokens (prompt_tokens) and their count (prompt_n_tokens) from the given array.
    • detectLanguage

      public void detectLanguage(boolean enable)
      Flag to indicate whether to detect language automatically.
    • suppressBlanks

      public void suppressBlanks(boolean enable)
      Flag to suppress blank tokens.
    • suppressNonSpeechTokens

      public void suppressNonSpeechTokens(boolean enable)
      Flag to suppress non-speech tokens.
    • setBestOf

      public void setBestOf(int bestOf)
    • setBeamSize

      public void setBeamSize(int beamSize)
    • setBeamSizeAndPatience

      public void setBeamSizeAndPatience(int beamSize, float patience)
    • setNewSegmentCallback

      public void setNewSegmentCallback(WhisperNewSegmentCallback callback)
    • setProgressCallback

      public void setProgressCallback(WhisperProgressCallback callback)
    • setEncoderBeginCallbackeginCallbackCallback

      public void setEncoderBeginCallbackeginCallbackCallback(WhisperEncoderBeginCallback callback)
    • setLogitsFilterCallback

      public void setLogitsFilterCallback(WhisperLogitsFilterCallback callback)
    • getFieldOrder

      protected List<String> getFieldOrder()
      Overrides:
      getFieldOrder in class com.sun.jna.Structure
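
The mode methods above are described as complements of each other, and enableContext(boolean) controls a field whose name is negated (no_context). The sketch below models that relationship with a stand-in class so the inversion is easy to see. `ParamsSketch` is hypothetical: it uses plain booleans instead of the JNA-mapped CBool fields, and its method bodies are assumptions inferred from the descriptions on this page, not the binding's actual implementation.

```java
// Stand-in for illustration only: plain booleans replace the JNA-mapped CBool fields,
// and the method bodies are assumptions inferred from the field and method descriptions.
final class ParamsSketch {
    boolean translate = false;  // documented default for translate
    boolean noContext = true;   // documented default for no_context

    // transcribeMode() and translateMode() are complements: each undoes the other.
    void transcribeMode() { translate = false; }
    void translateMode()  { translate = true; }

    // Assumed inversion: enabling context clears the negated no_context flag.
    void enableContext(boolean enable) { noContext = !enable; }

    public static void main(String[] args) {
        ParamsSketch p = new ParamsSketch();
        p.translateMode();
        p.enableContext(true);
        System.out.println("translate=" + p.translate + " noContext=" + p.noContext);
    }
}
```

Under these assumptions, a caller who wants the decoder to see earlier output calls enableContext(true), which leaves no_context false; setting the field directly requires remembering the negation.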