Class ShingleTokenFilter.Builder

All Implemented Interfaces:
WithJson<ShingleTokenFilter.Builder>, ObjectBuilder<ShingleTokenFilter>
Enclosing class:
ShingleTokenFilter

public static class ShingleTokenFilter.Builder extends TokenFilterBase.AbstractBuilder<ShingleTokenFilter.Builder> implements ObjectBuilder<ShingleTokenFilter>
Builder for ShingleTokenFilter.
  • Constructor Details

    • Builder

      public Builder()
  • Method Details

    • fillerToken

      public final ShingleTokenFilter.Builder fillerToken(@Nullable String value)
      String used in shingles as a replacement for empty positions that do not contain a token. This filler token is only used in shingles, not original unigrams. Defaults to an underscore (_).

      API name: filler_token
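
As a rough illustration of the filler token described above, the sketch below builds two-token shingles over a position list in which an empty position (for example, one left by a removed stop word) is represented as null. It is a simplified model, not the filter's actual implementation: the class FillerTokenSketch, the method bigrams, and the use of null for empty positions are all assumptions made for this example, and the separator is fixed to a single space.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FillerTokenSketch {
    // Hypothetical helper, not part of the client library.
    // A null entry stands for a position that holds no token.
    public static List<String> bigrams(List<String> positions, String filler) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + 1 < positions.size(); i++) {
            // Substitute the filler only while assembling the shingle.
            String left = positions.get(i) == null ? filler : positions.get(i);
            String right = positions.get(i + 1) == null ? filler : positions.get(i + 1);
            out.add(left + " " + right);
        }
        return out;
    }

    public static void main(String[] args) {
        // "a" was removed between "jumps" and "lazy", leaving an empty position.
        List<String> positions = Arrays.asList("fox", "jumps", null, "lazy", "dog");
        System.out.println(bigrams(positions, "_"));
        // prints [fox jumps, jumps _, _ lazy, lazy dog]
    }
}
```

The filler appears only inside shingles; the sketch never emits it as a standalone token, which mirrors the statement that it is not used in the original unigrams.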

    • maxShingleSize

      public final ShingleTokenFilter.Builder maxShingleSize(@Nullable Integer value)
      Maximum number of tokens to concatenate when creating shingles. Defaults to 2.

      API name: max_shingle_size

    • minShingleSize

      public final ShingleTokenFilter.Builder minShingleSize(@Nullable Integer value)
      Minimum number of tokens to concatenate when creating shingles. Defaults to 2.

      API name: min_shingle_size
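
To make the effect of the minimum and maximum shingle sizes concrete, here is a minimal sketch that enumerates the shingles for a token list. It is an illustration only: ShingleSizeSketch and shingles are hypothetical names, unigrams are left out for brevity, the separator is fixed to a space, and the output order (by start position, then by increasing size) is an assumption rather than a documented guarantee.

```java
import java.util.ArrayList;
import java.util.List;

public class ShingleSizeSketch {
    // Hypothetical helper, not part of the client library.
    // Emits every run of minSize..maxSize adjacent tokens, joined by a space.
    public static List<String> shingles(List<String> tokens, int minSize, int maxSize) {
        List<String> out = new ArrayList<>();
        for (int start = 0; start < tokens.size(); start++) {
            for (int size = minSize; size <= maxSize && start + size <= tokens.size(); size++) {
                out.add(String.join(" ", tokens.subList(start, start + size)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("quick", "brown", "fox");
        System.out.println(shingles(tokens, 2, 2)); // prints [quick brown, brown fox]
        System.out.println(shingles(tokens, 2, 3)); // prints [quick brown, quick brown fox, brown fox]
    }
}
```

With both sizes left at their default of 2, only two-token shingles are produced; raising the maximum to 3 adds the three-token shingle.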

    • outputUnigrams

      public final ShingleTokenFilter.Builder outputUnigrams(@Nullable Boolean value)
      If true, the output includes the original input tokens. If false, the output only includes shingles; the original input tokens are removed. Defaults to true.

      API name: output_unigrams

    • outputUnigramsIfNoShingles

      public final ShingleTokenFilter.Builder outputUnigramsIfNoShingles(@Nullable Boolean value)
      If true, the output includes the original input tokens only if no shingles are produced; if shingles are produced, the output only includes shingles. Defaults to false.

      API name: output_unigrams_if_no_shingles
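
The interaction between output_unigrams and output_unigrams_if_no_shingles can be sketched as a selection step over already computed unigrams and shingles. This is a simplified model under stated assumptions: UnigramOutputSketch and select are hypothetical names, the sketch assumes output_unigrams takes precedence when both flags are true, and it lists unigrams before shingles rather than interleaving them by position as a real token stream might.

```java
import java.util.ArrayList;
import java.util.List;

public class UnigramOutputSketch {
    // Hypothetical helper, not part of the client library.
    public static List<String> select(List<String> unigrams, List<String> shingles,
                                      boolean outputUnigrams,
                                      boolean outputUnigramsIfNoShingles) {
        // Unigrams are kept always (outputUnigrams) or only as a fallback
        // when no shingle could be formed (outputUnigramsIfNoShingles).
        boolean keepUnigrams = outputUnigrams
                || (outputUnigramsIfNoShingles && shingles.isEmpty());
        List<String> out = new ArrayList<>();
        if (keepUnigrams) {
            out.addAll(unigrams);
        }
        out.addAll(shingles);
        return out;
    }

    public static void main(String[] args) {
        // A single token cannot form a two-token shingle, so the fallback applies.
        System.out.println(select(List.of("fox"), List.of(), false, true));
        // prints [fox]
        // Once shingles exist, the fallback flag no longer adds unigrams.
        System.out.println(select(List.of("quick", "brown"), List.of("quick brown"), false, true));
        // prints [quick brown]
    }
}
```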

    • tokenSeparator

      public final ShingleTokenFilter.Builder tokenSeparator(@Nullable String value)
      Separator used to concatenate adjacent tokens to form a shingle. Defaults to a space (" ").

      API name: token_separator
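
As a small illustration of the separator, the sketch below joins the same pair of adjacent tokens with different separators. TokenSeparatorSketch and join are hypothetical names used only for this example; the sketch models the concatenation step alone, not the filter itself.

```java
import java.util.List;

public class TokenSeparatorSketch {
    // Hypothetical helper, not part of the client library.
    // Concatenates adjacent tokens into one shingle using the separator.
    public static String join(List<String> tokens, String separator) {
        return String.join(separator, tokens);
    }

    public static void main(String[] args) {
        List<String> pair = List.of("quick", "brown");
        System.out.println(join(pair, " ")); // prints quick brown
        System.out.println(join(pair, ""));  // prints quickbrown
        System.out.println(join(pair, "+")); // prints quick+brown
    }
}
```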

    • self

      protected ShingleTokenFilter.Builder self()
      Specified by:
      self in class TokenFilterBase.AbstractBuilder<ShingleTokenFilter.Builder>

    • build

      public ShingleTokenFilter build()
      Specified by:
      build in interface ObjectBuilder<ShingleTokenFilter>
      Throws:
      NullPointerException - if any required field is null.