Class WhisperFullParams
java.lang.Object
com.sun.jna.Structure
io.github.ggerganov.whispercpp.params.WhisperFullParams
public class WhisperFullParams
extends com.sun.jna.Structure
Parameters for the whisper_full() function.
If you change the order or add new parameters, make sure to update the default values in whisper.cpp:
whisper_full_default_params()
-
Nested Class Summary
Nested classes/interfaces inherited from class com.sun.jna.Structure
com.sun.jna.Structure.ByReference, com.sun.jna.Structure.ByValue, com.sun.jna.Structure.FieldOrder, com.sun.jna.Structure.StructField -
Field Summary
Fields
Modifier and Type | Field | Description
int | audio_ctx | Overwrite the audio context size (0 = use default).
 | beam_search | Beam search decoding parameters.
 | detect_language | Flag to indicate whether to detect language automatically.
int | duration_ms | Audio duration to process in milliseconds.
com.sun.jna.Pointer | encoder_begin_callback | Callback each time before the encoder starts.
com.sun.jna.Pointer | encoder_begin_callback_user_data | User data for the encoder_begin_callback.
float | entropy_thold | Entropy threshold (similar to OpenAI's "compression_ratio_threshold").
 | greedy | Greedy decoding parameters.
 | initial_prompt | Tokens to provide to the whisper decoder as an initial prompt.
 | language | Language for auto-detection.
float | length_penalty | Length penalty.
com.sun.jna.Pointer | logits_filter_callback | Callback by each decoder to filter obtained logits.
com.sun.jna.Pointer | logits_filter_callback_user_data | User data for the logits_filter_callback.
float | logprob_thold | Log probability threshold.
float | max_initial_ts | Maximum initial timestamp.
int | max_len | Maximum segment length in characters.
int | max_tokens | Maximum tokens per segment (0, default = no limit)
int | n_max_text_ctx | Maximum tokens to use from past text as a prompt for the decoder.
int | n_threads | Number of threads.
com.sun.jna.Pointer | new_segment_callback | Callback for every newly generated text segment.
com.sun.jna.Pointer | new_segment_callback_user_data | User data for the new_segment_callback.
 | no_context | Flag to indicate whether to use past transcription (if any) as an initial prompt for the decoder.
float | no_speech_thold | No speech threshold.
int | offset_ms | Start offset in milliseconds.
 | print_progress | Flag to print progress information.
 | print_realtime | Flag to print results from within whisper.cpp (avoid it, use callback instead).
 | print_special | Flag to print special tokens (e.g., <SOT>, <EOT>, <BEG>, etc.).
 | print_timestamps | Flag to print timestamps for each text segment when printing realtime.
com.sun.jna.Pointer | progress_callback | Callback on each progress update.
com.sun.jna.Pointer | progress_callback_user_data | User data for the progress_callback.
int | prompt_n_tokens | Number of prompt tokens.
com.sun.jna.Pointer | prompt_tokens | Prompt tokens.
 | single_segment | Flag to force single segment output (useful for streaming).
 | speed_up | Flag to speed up the audio by 2x using Phase Vocoder.
 | split_on_word | Flag to split on word rather than on token (when used with max_len).
int | strategy | Sampling strategy for whisper_full() function.
 | suppress_blank | Flag to suppress blank tokens.
 | suppress_non_speech_tokens | Flag to suppress non-speech tokens.
 | tdrz_enable | Enable tinydiarize (default = false)
float | temperature | Initial decoding temperature.
float | temperature_inc | Temperature increment.
float | thold_pt | [EXPERIMENTAL] Timestamp token probability threshold (~0.01).
float | thold_ptsum | [EXPERIMENTAL] Timestamp token sum probability threshold (~0.01).
 | token_timestamps | [EXPERIMENTAL] Flag to enable token-level timestamps.
 | translate | Translate flag.
Fields inherited from class com.sun.jna.Structure
ALIGN_DEFAULT, ALIGN_GNUC, ALIGN_MSVC, ALIGN_NONE, CALCULATE_SIZE -
Constructor Summary
Constructors -
Method Summary
Modifier and Type | Method | Description
void | detectLanguage(boolean enable) | Flag to indicate whether to detect language automatically.
void | enableContext(boolean enable) | Flag to indicate whether to use past transcription (if any) as an initial prompt for the decoder.
void | printProgress(boolean enable) | Flag to print progress information.
void | printRealtime(boolean enable) | Flag to print results from within whisper.cpp (avoid it, use callback instead).
void | printSpecial(boolean enable) | Flag to print special tokens (e.g., <SOT>, <EOT>, <BEG>, etc.).
void | printTimestamps(boolean enable) | Flag to print timestamps for each text segment when printing realtime.
void | setBeamSize(int beamSize) |
void | setBeamSizeAndPatience(int beamSize, float patience) |
void | setBestOf(int bestOf) |
void | setEncoderBeginCallback |
void | setLogitsFilterCallback |
void | setNewSegmentCallback |
void | setProgressCallback(WhisperProgressCallback callback) |
void | setPromptTokens(int[] tokens) |
void | singleSegment(boolean single) | Flag to force single segment output (useful for streaming).
void | speedUp(boolean enable) | Flag to speed up the audio by 2x using Phase Vocoder.
void | splitOnWord(boolean enable) | Flag to split on word rather than on token (when used with max_len).
void | suppressBlanks(boolean enable) |
void | suppressNonSpeechTokens(boolean enable) | Flag to suppress non-speech tokens.
void | tdrzEnable(boolean enable) | Enable tinydiarize (default = false)
void | tokenTimestamps(boolean enable) | [EXPERIMENTAL] Flag to enable token-level timestamps.
void | transcribeMode() | The complement of translateMode()
void | translateMode() | The complement of transcribeMode()
Methods inherited from class com.sun.jna.Structure
allocateMemory, allocateMemory, autoAllocate, autoRead, autoRead, autoWrite, autoWrite, cacheTypeInfo, calculateSize, clear, createFieldsOrder, createFieldsOrder, createFieldsOrder, createFieldsOrder, dataEquals, dataEquals, ensureAllocated, equals, fieldOffset, getAutoRead, getAutoWrite, getFieldList, getFields, getNativeAlignment, getNativeSize, getNativeSize, getPointer, getStringEncoding, getStructAlignment, hashCode, newInstance, newInstance, read, readField, readField, setAlignType, setAutoRead, setAutoSynch, setAutoWrite, setStringEncoding, size, sortFields, toArray, toArray, toString, toString, useMemory, useMemory, write, writeField, writeField, writeField
-
Field Details
-
strategy
public int strategy
Sampling strategy for whisper_full() function. -
n_threads
public int n_threads
Number of threads. (default = 4) -
n_max_text_ctx
public int n_max_text_ctx
Maximum tokens to use from past text as a prompt for the decoder. (default = 16384) -
offset_ms
public int offset_ms
Start offset in milliseconds. (default = 0) -
duration_ms
public int duration_ms
Audio duration to process in milliseconds. (default = 0) -
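To make the two millisecond fields concrete, here is a minimal sketch of how offset_ms and duration_ms could be mapped onto a range of audio samples. The class and method names are hypothetical and not part of this library. The sketch assumes 16 kHz input and that a duration_ms of 0 means "process to the end of the audio".

```java
// Hypothetical helper, not part of WhisperFullParams.
final class AudioWindowSketch {
    // Assumption: 16 kHz mono input.
    static final int SAMPLE_RATE = 16_000;

    /** Index of the first sample selected by offset_ms. */
    static int startSample(int offsetMs) {
        return (int) ((long) offsetMs * SAMPLE_RATE / 1000);
    }

    /** End sample (exclusive). Assumes durationMs == 0 means "until the end". */
    static int endSample(int offsetMs, int durationMs, int totalSamples) {
        if (durationMs <= 0) {
            return totalSamples;
        }
        long end = ((long) offsetMs + durationMs) * SAMPLE_RATE / 1000;
        return (int) Math.min(end, totalSamples);
    }

    public static void main(String[] args) {
        int total = 10 * SAMPLE_RATE; // ten seconds of audio
        System.out.println(startSample(1500) + " .. " + endSample(1500, 2000, total));
    }
}
```

With offset_ms = 1500 and duration_ms = 2000, the sketch selects the samples from 1.5 s up to 3.5 s of the input.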
translate
Translate flag. (default = false) -
no_context
Flag to indicate whether to use past transcription (if any) as an initial prompt for the decoder; setting no_context to true disables the past context. (default = true) -
single_segment
Flag to force single segment output (useful for streaming). (default = false) -
print_special
Flag to print special tokens (e.g., <SOT>, <EOT>, <BEG>, etc.). (default = false) -
print_progress
Flag to print progress information. (default = true) -
print_realtime
Flag to print results from within whisper.cpp (avoid it, use callback instead). (default = true) -
print_timestamps
Flag to print timestamps for each text segment when printing realtime. (default = true) -
token_timestamps
[EXPERIMENTAL] Flag to enable token-level timestamps. (default = false) -
thold_pt
public float thold_pt
[EXPERIMENTAL] Timestamp token probability threshold (~0.01). (default = 0.01) -
thold_ptsum
public float thold_ptsum
[EXPERIMENTAL] Timestamp token sum probability threshold (~0.01). -
max_len
public int max_len
Maximum segment length in characters. (default = 0) -
split_on_word
Flag to split on word rather than on token (when used with max_len). (default = false) -
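The interaction between max_len and split_on_word can be illustrated with a plain-Java sketch. This is not the algorithm whisper.cpp uses, which works on tokens. It only shows the idea of capping segment length in characters while breaking at word boundaries. The class name is hypothetical, and treating a limit of 0 as "no splitting" is an assumption based on the documented default.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not part of WhisperFullParams.
final class SegmentSplitSketch {
    /**
     * Greedily packs whole words into segments of at most maxLen characters.
     * Assumption: maxLen <= 0 disables splitting.
     */
    static List<String> splitOnWord(String text, int maxLen) {
        List<String> segments = new ArrayList<>();
        if (maxLen <= 0) {
            segments.add(text);
            return segments;
        }
        StringBuilder current = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            // Close the current segment if adding " word" would exceed the limit.
            if (current.length() > 0 && current.length() + 1 + word.length() > maxLen) {
                segments.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append(' ');
            }
            current.append(word); // a single word longer than maxLen is kept whole
        }
        if (current.length() > 0) {
            segments.add(current.toString());
        }
        return segments;
    }
}
```

For example, splitting "the quick brown fox" with a limit of 10 characters yields the two segments "the quick" and "brown fox".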
max_tokens
public int max_tokens
Maximum tokens per segment (0, default = no limit) -
speed_up
Flag to speed up the audio by 2x using Phase Vocoder. (default = false) -
audio_ctx
public int audio_ctx
Overwrite the audio context size (0 = use default). -
tdrz_enable
Enable tinydiarize (default = false) -
initial_prompt
Tokens to provide to the whisper decoder as an initial prompt. These are prepended to any existing text context from a previous call. -
prompt_tokens
public com.sun.jna.Pointer prompt_tokens
Prompt tokens. (int*) -
prompt_n_tokens
public int prompt_n_tokens
Number of prompt tokens. -
language
Language of the audio. For auto-detection, set to `null`, `""`, or `"auto"`. -
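The three values that request auto-detection can be checked with a small predicate. This is a sketch with a hypothetical class and method name, and it operates on a plain Java String rather than on the native field.

```java
// Hypothetical helper, not part of WhisperFullParams.
final class LanguageSettingSketch {
    /** Returns true for the values documented as requesting auto-detection. */
    static boolean requestsAutoDetection(String language) {
        return language == null || language.isEmpty() || "auto".equals(language);
    }

    public static void main(String[] args) {
        System.out.println(requestsAutoDetection("auto"));
        System.out.println(requestsAutoDetection("en"));
    }
}
```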
detect_language
Flag to indicate whether to detect language automatically. -
suppress_blank
Flag to suppress blank tokens. -
suppress_non_speech_tokens
Flag to suppress non-speech tokens. -
temperature
public float temperature
Initial decoding temperature. -
max_initial_ts
public float max_initial_ts
Maximum initial timestamp. -
length_penalty
public float length_penalty
Length penalty. -
temperature_inc
public float temperature_inc
Temperature increment. -
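As a rough illustration of how temperature and temperature_inc relate, the sketch below builds the list of temperatures a decoder might try in turn when earlier attempts are rejected. It assumes the schedule starts at temperature and rises in steps of temperature_inc up to 1.0, and that a non-positive increment means a single attempt. These assumptions follow the usual Whisper fallback scheme and are not confirmed by this page. The class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not part of WhisperFullParams.
final class TemperatureScheduleSketch {
    /** Temperatures to try in order, under the assumptions stated above. */
    static List<Float> schedule(float temperature, float temperatureInc) {
        List<Float> temps = new ArrayList<>();
        temps.add(temperature);
        if (temperatureInc <= 0f) {
            return temps; // assumption: no increment means no fallback attempts
        }
        // Multiply instead of accumulating to limit floating-point drift.
        for (int i = 1; temperature + i * temperatureInc < 1.0f + 1e-6f; i++) {
            temps.add(temperature + i * temperatureInc);
        }
        return temps;
    }
}
```

Under these assumptions, a starting temperature of 0.0 with an increment of 0.2 gives six attempts, ending at approximately 1.0.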
entropy_thold
public float entropy_thold
Entropy threshold (similar to OpenAI's "compression_ratio_threshold"). -
logprob_thold
public float logprob_thold
Log probability threshold. -
no_speech_thold
public float no_speech_thold
No speech threshold. -
greedy
Greedy decoding parameters. -
beam_search
Beam search decoding parameters. -
new_segment_callback
public com.sun.jna.Pointer new_segment_callback
Callback for every newly generated text segment. See WhisperNewSegmentCallback. -
new_segment_callback_user_data
public com.sun.jna.Pointer new_segment_callback_user_data
User data for the new_segment_callback. -
progress_callback
public com.sun.jna.Pointer progress_callback
Callback on each progress update. See WhisperProgressCallback. -
progress_callback_user_data
public com.sun.jna.Pointer progress_callback_user_data
User data for the progress_callback. -
encoder_begin_callback
public com.sun.jna.Pointer encoder_begin_callback
Callback each time before the encoder starts. See WhisperEncoderBeginCallback. -
encoder_begin_callback_user_data
public com.sun.jna.Pointer encoder_begin_callback_user_data
User data for the encoder_begin_callback. -
logits_filter_callback
public com.sun.jna.Pointer logits_filter_callback
Callback by each decoder to filter obtained logits. See WhisperLogitsFilterCallback. -
logits_filter_callback_user_data
public com.sun.jna.Pointer logits_filter_callback_user_data
User data for the logits_filter_callback.
-
-
Constructor Details
-
WhisperFullParams
public WhisperFullParams(com.sun.jna.Pointer p)
-
-
Method Details
-
transcribeMode
public void transcribeMode()
The complement of translateMode() -
translateMode
public void translateMode()
The complement of transcribeMode() -
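One way to read "complement" here is that the two methods switch the same flag in opposite directions. The stand-in class below sketches that reading. It assumes both methods act on the translate field, and it uses a plain boolean in place of the real JNA-mapped field, so it is an illustration rather than the library's implementation.

```java
// Hypothetical stand-in, not the real JNA-backed WhisperFullParams.
final class ModeSketch {
    // Stand-in for the translate field (assumed to be what both methods set).
    boolean translate;

    /** Assumed behaviour: keep output in the source language. */
    void transcribeMode() {
        translate = false;
    }

    /** Assumed behaviour: request translation of the output. */
    void translateMode() {
        translate = true;
    }
}
```

Under this reading, calling one method after the other leaves only the last call in effect.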
enableContext
public void enableContext(boolean enable) Flag to indicate whether to use past transcription (if any) as an initial prompt for the decoder. (default = true) -
singleSegment
public void singleSegment(boolean single) Flag to force single segment output (useful for streaming). (default = false) -
printSpecial
public void printSpecial(boolean enable) Flag to print special tokens (e.g., <SOT>, <EOT>, <BEG>, etc.). (default = false) -
printProgress
public void printProgress(boolean enable) Flag to print progress information. (default = true) -
printRealtime
public void printRealtime(boolean enable) Flag to print results from within whisper.cpp (avoid it, use callback instead). (default = true) -
printTimestamps
public void printTimestamps(boolean enable) Flag to print timestamps for each text segment when printing realtime. (default = true) -
tokenTimestamps
public void tokenTimestamps(boolean enable) [EXPERIMENTAL] Flag to enable token-level timestamps. (default = false) -
splitOnWord
public void splitOnWord(boolean enable) Flag to split on word rather than on token (when used with max_len). (default = false) -
speedUp
public void speedUp(boolean enable) Flag to speed up the audio by 2x using Phase Vocoder. (default = false) -
tdrzEnable
public void tdrzEnable(boolean enable) Enable tinydiarize (default = false) -
setPromptTokens
public void setPromptTokens(int[] tokens) -
detectLanguage
public void detectLanguage(boolean enable) Flag to indicate whether to detect language automatically. -
suppressBlanks
public void suppressBlanks(boolean enable) -
suppressNonSpeechTokens
public void suppressNonSpeechTokens(boolean enable) Flag to suppress non-speech tokens. -
setBestOf
public void setBestOf(int bestOf) -
setBeamSize
public void setBeamSize(int beamSize) -
setBeamSizeAndPatience
public void setBeamSizeAndPatience(int beamSize, float patience) -
setNewSegmentCallback
-
setProgressCallback
-
setEncoderBeginCallback
-
setLogitsFilterCallback
-
getFieldOrder
- Overrides:
getFieldOrder in class com.sun.jna.Structure
-