Index
A
- addFeatureProcessor(FeatureProcessor) - Method in class org.tribuo.data.columnar.RowProcessor.Builder
-
Add a single feature processor to the builder.
- addFieldProcessor(FieldProcessor) - Method in class org.tribuo.data.columnar.RowProcessor.Builder
-
Add a single field processor to the builder.
- addMetadataExtractor(FieldExtractor<?>) - Method in class org.tribuo.data.columnar.RowProcessor.Builder
-
Add a single metadata extractor to the builder.
- addRegexMappingProcessor(String, FieldProcessor) - Method in class org.tribuo.data.columnar.RowProcessor.Builder
-
Add a single regex FieldProcessor mapping to the builder.
- aggregate(List<Feature>) - Method in interface org.tribuo.data.text.FeatureAggregator
-
Aggregates feature values with the same names.
- aggregate(List<Feature>) - Method in class org.tribuo.data.text.impl.AverageAggregator
- aggregate(List<Feature>) - Method in class org.tribuo.data.text.impl.SumAggregator
- aggregate(List<Feature>) - Method in class org.tribuo.data.text.impl.UniqueAggregator
- applyCase(String) - Method in enum org.tribuo.data.text.impl.CasingPreprocessor.CasingOperation
-
Apply the appropriate casing operation.
- AverageAggregator - Class in org.tribuo.data.text.impl
-
A feature aggregator that averages feature values across a feature list.
- AverageAggregator() - Constructor for class org.tribuo.data.text.impl.AverageAggregator
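The aggregator entries above describe combining feature values that share a name. The following is a minimal, self-contained sketch of that idea in plain Java. It does not use Tribuo's classes: `NamedValue` and `AggregationSketch` are hypothetical names introduced only for illustration, and the real AverageAggregator and SumAggregator may differ in ordering and other details.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a named feature; this is NOT Tribuo's Feature class.
final class NamedValue {
    final String name;
    final double value;

    NamedValue(String name, double value) {
        this.name = name;
        this.value = value;
    }
}

// Hypothetical helper illustrating aggregation of values with the same name.
final class AggregationSketch {

    // Averages the values of entries that share a name, keeping first-seen order.
    static List<NamedValue> average(List<NamedValue> input) {
        Map<String, double[]> acc = new LinkedHashMap<>(); // name -> {sum, count}
        for (NamedValue f : input) {
            double[] sumAndCount = acc.computeIfAbsent(f.name, k -> new double[2]);
            sumAndCount[0] += f.value;
            sumAndCount[1] += 1.0;
        }
        List<NamedValue> output = new ArrayList<>();
        for (Map.Entry<String, double[]> e : acc.entrySet()) {
            output.add(new NamedValue(e.getKey(), e.getValue()[0] / e.getValue()[1]));
        }
        return output;
    }

    // Sums the values of entries that share a name, keeping first-seen order.
    static List<NamedValue> sum(List<NamedValue> input) {
        Map<String, Double> acc = new LinkedHashMap<>();
        for (NamedValue f : input) {
            acc.merge(f.name, f.value, Double::sum);
        }
        List<NamedValue> output = new ArrayList<>();
        for (Map.Entry<String, Double> e : acc.entrySet()) {
            output.add(new NamedValue(e.getKey(), e.getValue()));
        }
        return output;
    }
}
```

Given the entries (a, 1.0), (b, 4.0), (a, 3.0), `average` yields a = 2.0 and b = 4.0, while `sum` yields a = 4.0 and b = 4.0.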
B
- BasicPipeline - Class in org.tribuo.data.text.impl
-
An example implementation of TextPipeline.
- BasicPipeline(Tokenizer, int) - Constructor for class org.tribuo.data.text.impl.BasicPipeline
-
Constructs a basic text pipeline which tokenizes the input and generates word n-gram features in the range 1 to ngram.
- BINARISED_CATEGORICAL - Enum constant in enum org.tribuo.data.columnar.FieldProcessor.GeneratedFeatureType
-
Categoricals binarised into separate features.
- BinaryResponseProcessor<T extends Output<T>> - Class in org.tribuo.data.columnar.processors.response
-
A ResponseProcessor that takes a single value of the field as the positive class and all other values as the negative class.
- BinaryResponseProcessor(String, String, OutputFactory<T>) - Constructor for class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
-
Constructs a binary response processor which emits a positive value for a single string and a negative value for all other field values.
- BinaryResponseProcessor(List<String>, String, OutputFactory<T>) - Constructor for class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
-
Constructs a binary response processor which emits a positive value for a single string and a negative value for all other field values.
- BinaryResponseProcessor(List<String>, List<String>, OutputFactory<T>) - Constructor for class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
-
Constructs a binary response processor which emits a positive value for a single string and a negative value for all other field values.
- BinaryResponseProcessor(List<String>, List<String>, OutputFactory<T>, boolean) - Constructor for class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
-
Constructs a binary response processor which emits a positive value for a single string and a negative value for all other field values.
- BinaryResponseProcessor(List<String>, List<String>, OutputFactory<T>, String, String, boolean) - Constructor for class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
-
Constructs a binary response processor which emits a positive value for a single string and a negative value for all other field values.
- build(ResponseProcessor<T>) - Method in class org.tribuo.data.columnar.RowProcessor.Builder
-
Construct the RowProcessor represented by this builder's state.
- Builder() - Constructor for class org.tribuo.data.columnar.RowProcessor.Builder
-
Builder for RowProcessor, see RowProcessor constructors for argument details.
C
- cacheProvenance() - Method in class org.tribuo.data.text.impl.SimpleStringDataSource
- cacheProvenance() - Method in class org.tribuo.data.text.impl.SimpleTextDataSource
-
Computes the provenance.
- CALENDAR_QUARTER - Enum constant in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
-
The calendar quarter of the year.
- CasingPreprocessor - Class in org.tribuo.data.text.impl
-
A document preprocessor which uppercases or lowercases the input.
- CasingPreprocessor(CasingPreprocessor.CasingOperation) - Constructor for class org.tribuo.data.text.impl.CasingPreprocessor
-
Construct a casing preprocessor.
- CasingPreprocessor.CasingOperation - Enum in org.tribuo.data.text.impl
-
The possible casing operations.
- CATEGORICAL - Enum constant in enum org.tribuo.data.columnar.FieldProcessor.GeneratedFeatureType
-
Unordered categorical features with the values converted into doubles.
- close() - Method in class org.tribuo.data.csv.CSVIterator
- close() - Method in class org.tribuo.data.sql.SQLDataSource
- COLUMNAR - Enum constant in enum org.tribuo.data.DataOptions.InputFormat
-
A CSV file parsed using a configured RowProcessor.
- ColumnarDataSource<T extends Output<T>> - Class in org.tribuo.data.columnar
-
A ConfigurableDataSource base class which takes columnar data (e.g., csv or DB table rows) and generates Examples.
- ColumnarDataSource() - Constructor for class org.tribuo.data.columnar.ColumnarDataSource
-
For OLCUT.
- ColumnarDataSource(OutputFactory<T>, RowProcessor<T>, boolean) - Constructor for class org.tribuo.data.columnar.ColumnarDataSource
-
Constructs a columnar data source with the specified parameters.
- ColumnarFeature - Class in org.tribuo.data.columnar
-
A Feature with extra bookkeeping for use inside the columnar package.
- ColumnarFeature(String, double) - Constructor for class org.tribuo.data.columnar.ColumnarFeature
-
Constructs a ColumnarFeature from the field name.
- ColumnarFeature(String, String, double) - Constructor for class org.tribuo.data.columnar.ColumnarFeature
-
Constructs a ColumnarFeature from the field name, column entry and value.
- ColumnarFeature(String, String, String, double) - Constructor for class org.tribuo.data.columnar.ColumnarFeature
-
Constructs a ColumnarFeature which is the conjunction of features from two fields.
- ColumnarIterator - Class in org.tribuo.data.columnar
-
An abstract class for iterators that read data into a columnar format, usually from a file of some kind.
- ColumnarIterator() - Constructor for class org.tribuo.data.columnar.ColumnarIterator
-
Constructs a ColumnarIterator wrapped around a buffering spliterator.
- ColumnarIterator(int, int, long) - Constructor for class org.tribuo.data.columnar.ColumnarIterator
-
Constructs a ColumnarIterator wrapped around a buffering spliterator.
- ColumnarIterator.Row - Class in org.tribuo.data.columnar
-
A representation of a row of untyped data from a columnar data source.
- COMMA - Enum constant in enum org.tribuo.data.DataOptions.Delimiter
-
Comma separator.
- CompletelyConfigurableTrainTest - Class in org.tribuo.data
-
Build and run a predictor for a standard dataset.
- CompletelyConfigurableTrainTest.ConfigurableTrainTestOptions - Class in org.tribuo.data
-
Command line options.
- ConfigurableTrainTest - Class in org.tribuo.data
-
Build and run a predictor for a standard dataset.
- ConfigurableTrainTest.ConfigurableTrainTestOptions - Class in org.tribuo.data
-
Command line options.
- ConfigurableTrainTestOptions() - Constructor for class org.tribuo.data.CompletelyConfigurableTrainTest.ConfigurableTrainTestOptions
- ConfigurableTrainTestOptions() - Constructor for class org.tribuo.data.ConfigurableTrainTest.ConfigurableTrainTestOptions
- configured - Variable in class org.tribuo.data.columnar.RowProcessor
-
Has this row processor been configured?
- CONJUNCTION - Static variable in class org.tribuo.data.columnar.ColumnarFeature
-
The string used as the field name of conjunction features.
- connString - Variable in class org.tribuo.data.sql.SQLToCSV.SQLToCSVOptions
-
Connection string to the SQL database.
- copy() - Method in class org.tribuo.data.columnar.RowProcessor
-
Deprecated.
- copy(String) - Method in interface org.tribuo.data.columnar.FieldProcessor
-
Returns a copy of this FieldProcessor bound to the supplied newFieldName.
- copy(String) - Method in class org.tribuo.data.columnar.processors.field.DateFieldProcessor
- copy(String) - Method in class org.tribuo.data.columnar.processors.field.DoubleFieldProcessor
- copy(String) - Method in class org.tribuo.data.columnar.processors.field.IdentityProcessor
- copy(String) - Method in class org.tribuo.data.columnar.processors.field.RegexFieldProcessor
- copy(String) - Method in class org.tribuo.data.columnar.processors.field.TextFieldProcessor
-
Note: the copy shares the text pipeline with the original.
- crossValidation - Variable in class org.tribuo.data.ConfigurableTrainTest.ConfigurableTrainTestOptions
-
Cross-validate the output metrics.
- CSV - Enum constant in enum org.tribuo.data.DataOptions.InputFormat
-
Simple numeric CSV file.
- CSVDataSource<T extends Output<T>> - Class in org.tribuo.data.csv
-
A DataSource for loading separable data from a text file (e.g., CSV, TSV) and applying FieldProcessors to it.
- CSVDataSource(URI, RowProcessor<T>, boolean) - Constructor for class org.tribuo.data.csv.CSVDataSource
-
Creates a CSVDataSource using the specified RowProcessor to process the data.
- CSVDataSource(URI, RowProcessor<T>, boolean, char) - Constructor for class org.tribuo.data.csv.CSVDataSource
-
Creates a CSVDataSource using the specified RowProcessor to process the data.
- CSVDataSource(URI, RowProcessor<T>, boolean, char, char) - Constructor for class org.tribuo.data.csv.CSVDataSource
-
Creates a CSVDataSource using the specified RowProcessor to process the data, and the supplied separator and quote characters to read the input data file.
- CSVDataSource(URI, RowProcessor<T>, boolean, char, char, List<String>) - Constructor for class org.tribuo.data.csv.CSVDataSource
-
Creates a CSVDataSource using the specified RowProcessor to process the data, and the supplied separator and quote characters to read the input data file.
- CSVDataSource(Path, RowProcessor<T>, boolean) - Constructor for class org.tribuo.data.csv.CSVDataSource
-
Creates a CSVDataSource using the specified RowProcessor to process the data.
- CSVDataSource(Path, RowProcessor<T>, boolean, char) - Constructor for class org.tribuo.data.csv.CSVDataSource
-
Creates a CSVDataSource using the specified RowProcessor to process the data.
- CSVDataSource(Path, RowProcessor<T>, boolean, char, char) - Constructor for class org.tribuo.data.csv.CSVDataSource
-
Creates a CSVDataSource using the specified RowProcessor to process the data, and the supplied separator and quote characters to read the input data file.
- CSVDataSource(Path, RowProcessor<T>, boolean, char, char, List<String>) - Constructor for class org.tribuo.data.csv.CSVDataSource
-
Creates a CSVDataSource using the specified RowProcessor to process the data, and the supplied separator and quote characters to read the input data file.
- CSVDataSource.CSVDataSourceProvenance - Class in org.tribuo.data.csv
-
Provenance for CSVDataSource.
- CSVDataSourceProvenance(Map<String, Provenance>) - Constructor for class org.tribuo.data.csv.CSVDataSource.CSVDataSourceProvenance
-
Deserialization constructor.
- CSVIterator - Class in org.tribuo.data.csv
-
An iterator over a CSV file.
- CSVIterator(Reader) - Constructor for class org.tribuo.data.csv.CSVIterator
-
Builds a CSVIterator for the supplied Reader.
- CSVIterator(Reader, char, char) - Constructor for class org.tribuo.data.csv.CSVIterator
-
Builds a CSVIterator for the supplied Reader.
- CSVIterator(Reader, char, char, String[]) - Constructor for class org.tribuo.data.csv.CSVIterator
-
Builds a CSVIterator for the supplied Reader.
- CSVIterator(Reader, char, char, List<String>) - Constructor for class org.tribuo.data.csv.CSVIterator
-
Builds a CSVIterator for the supplied Reader.
- CSVIterator(URI) - Constructor for class org.tribuo.data.csv.CSVIterator
-
Builds a CSVIterator for the supplied URI.
- CSVIterator(URI, char, char) - Constructor for class org.tribuo.data.csv.CSVIterator
-
Builds a CSVIterator for the supplied URI.
- CSVIterator(URI, char, char, String[]) - Constructor for class org.tribuo.data.csv.CSVIterator
-
Builds a CSVIterator for the supplied URI.
- CSVIterator(URI, char, char, List<String>) - Constructor for class org.tribuo.data.csv.CSVIterator
-
Builds a CSVIterator for the supplied URI.
- CSVLoader<T extends Output<T>> - Class in org.tribuo.data.csv
-
Load a DataSource/Dataset from a CSV file.
- CSVLoader(char, char, OutputFactory<T>) - Constructor for class org.tribuo.data.csv.CSVLoader
-
Creates a CSVLoader using the supplied separator, quote and output factory.
- CSVLoader(char, OutputFactory<T>) - Constructor for class org.tribuo.data.csv.CSVLoader
-
Creates a CSVLoader using the supplied separator and output factory.
- CSVLoader(OutputFactory<T>) - Constructor for class org.tribuo.data.csv.CSVLoader
-
Creates a CSVLoader using the supplied output factory.
- CSVLoader.CSVLoaderProvenance - Class in org.tribuo.data.csv
-
Deprecated. Deprecated in 4.2 as CSVLoader now returns a CSVDataSource. This provenance is kept so older models can still load correctly.
- CSVLoaderProvenance(Map<String, Provenance>) - Constructor for class org.tribuo.data.csv.CSVLoader.CSVLoaderProvenance
-
Deprecated. Deserialization constructor.
- csvQuoteChar - Variable in class org.tribuo.data.DataOptions
-
Quote character in the CSV file.
- csvResponseName - Variable in class org.tribuo.data.DataOptions
-
Response name in the csv file.
- CSVSaver - Class in org.tribuo.data.csv
-
Saves a Dataset in CSV format suitable for loading by CSVLoader.
- CSVSaver() - Constructor for class org.tribuo.data.csv.CSVSaver
-
Builds a CSV saver using the default separator and quote from CSVIterator.
- CSVSaver(char, char) - Constructor for class org.tribuo.data.csv.CSVSaver
-
Builds a CSV saver using the supplied separator and quote.
- currentRow - Variable in class org.tribuo.data.columnar.ColumnarIterator
-
The current row.
D
- data - Variable in class org.tribuo.data.text.TextDataSource
-
The actual data read out of the text file.
- DataOptions - Class in org.tribuo.data
-
Options for working with training and test data in a CLI.
- DataOptions() - Constructor for class org.tribuo.data.DataOptions
- DataOptions.Delimiter - Enum in org.tribuo.data
-
The delimiters supported by CSV files in this options object.
- DataOptions.InputFormat - Enum in org.tribuo.data
-
The input formats supported by this options object.
- DatasetExplorer - Class in org.tribuo.data
-
A CLI for exploring a serialised Dataset.
- DatasetExplorer() - Constructor for class org.tribuo.data.DatasetExplorer
-
Constructs a dataset explorer.
- DatasetExplorer.DatasetExplorerOptions - Class in org.tribuo.data
-
Command line options.
- DatasetExplorerOptions() - Constructor for class org.tribuo.data.DatasetExplorer.DatasetExplorerOptions
- dataSource - Variable in class org.tribuo.data.PreprocessAndSerialize.PreprocessAndSerializeOptions
-
Datasource to load from a config file.
- DateExtractor - Class in org.tribuo.data.columnar.extractors
-
Extracts the field value and translates it to a LocalDate based on the specified DateTimeFormatter.
- DateExtractor(String, String, String) - Constructor for class org.tribuo.data.columnar.extractors.DateExtractor
-
Constructs a date extractor that emits a LocalDate by applying the supplied format to the specified field.
- DateExtractor(String, String, String, String, String) - Constructor for class org.tribuo.data.columnar.extractors.DateExtractor
-
Constructs a date extractor that emits a LocalDate by applying the supplied format to the specified field.
- DateExtractor(String, String, DateTimeFormatter) - Constructor for class org.tribuo.data.columnar.extractors.DateExtractor
-
Deprecated.
- DateFieldProcessor - Class in org.tribuo.data.columnar.processors.field
-
Processes a column that contains a date value.
- DateFieldProcessor(String, EnumSet<DateFieldProcessor.DateFeatureType>, String) - Constructor for class org.tribuo.data.columnar.processors.field.DateFieldProcessor
-
Constructs a field processor which parses a date from the specified field name using the supplied format string then extracts date features according to the supplied EnumSet.
- DateFieldProcessor(String, EnumSet<DateFieldProcessor.DateFeatureType>, String, String, String) - Constructor for class org.tribuo.data.columnar.processors.field.DateFieldProcessor
-
Constructs a field processor which parses a date from the specified field name using the supplied format string then extracts date features according to the supplied EnumSet.
- DateFieldProcessor.DateFeatureType - Enum in org.tribuo.data.columnar.processors.field
-
The types of date features which can be extracted.
- DAY - Enum constant in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
-
The day.
- DAY_OF_QUARTER - Enum constant in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
-
The day of the quarter.
- DAY_OF_WEEK - Enum constant in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
-
The day of the week in ISO 8601.
- DAY_OF_YEAR - Enum constant in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
-
The day of the year.
- dbConfig - Variable in class org.tribuo.data.sql.SQLToCSV.SQLToCSVOptions
-
Name of the DBConfig to use.
- DEFAULT_HASH_SEED - Static variable in class org.tribuo.data.text.impl.FeatureHasher
-
Default value for the hash function seed.
- DEFAULT_RESPONSE - Static variable in class org.tribuo.data.csv.CSVSaver
-
The default response column name.
- DEFAULT_VALUE_HASH_SEED - Static variable in class org.tribuo.data.text.impl.FeatureHasher
-
Default value for the value hash function seed.
- delimiter - Variable in class org.tribuo.data.DataOptions
-
Delimiter
- DirectoryFileSource<T extends Output<T>> - Class in org.tribuo.data.text
-
A data source for a somewhat-common format for text classification datasets: a top level directory that contains a number of subdirectories.
- DirectoryFileSource() - Constructor for class org.tribuo.data.text.DirectoryFileSource
-
For OLCUT.
- DirectoryFileSource(Path, OutputFactory<T>, TextFeatureExtractor<T>, DocumentPreprocessor...) - Constructor for class org.tribuo.data.text.DirectoryFileSource
-
Creates a data source that will use the given feature extractor and document preprocessors on the data read from the files in the directories representing classes.
- DirectoryFileSource.DirectoryFileSourceProvenance - Class in org.tribuo.data.text
-
Provenance for DirectoryFileSource.
- DirectoryFileSourceProvenance(Map<String, Provenance>) - Constructor for class org.tribuo.data.text.DirectoryFileSource.DirectoryFileSourceProvenance
-
Deserialization constructor.
- DocumentPreprocessor - Interface in org.tribuo.data.text
-
An interface for things that can pre-process documents before they are broken into features.
- DoubleExtractor - Class in org.tribuo.data.columnar.extractors
-
Extracts the field value and converts it to a double.
- DoubleExtractor(String) - Constructor for class org.tribuo.data.columnar.extractors.DoubleExtractor
-
Extracts a double value from the supplied field name.
- DoubleExtractor(String, String) - Constructor for class org.tribuo.data.columnar.extractors.DoubleExtractor
-
Extracts a double value from the supplied field name.
- DoubleFieldProcessor - Class in org.tribuo.data.columnar.processors.field
-
Processes a column that contains a real value.
- DoubleFieldProcessor(String) - Constructor for class org.tribuo.data.columnar.processors.field.DoubleFieldProcessor
-
Constructs a field processor which extracts a single double valued feature from the specified field name.
- DoubleFieldProcessor(String, boolean) - Constructor for class org.tribuo.data.columnar.processors.field.DoubleFieldProcessor
-
Constructs a field processor which extracts a single double valued feature from the specified field name.
- DoubleFieldProcessor(String, boolean, boolean) - Constructor for class org.tribuo.data.columnar.processors.field.DoubleFieldProcessor
-
Constructs a field processor which extracts a single double valued feature from the specified field name.
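The date feature types listed in this index (CALENDAR_QUARTER, DAY_OF_QUARTER, DAY_OF_WEEK, DAY_OF_YEAR) correspond to quantities that `java.time` can compute directly from a `LocalDate`. The sketch below shows one way to derive them with the standard library alone. `DateFeatureSketch` is a hypothetical class, not part of Tribuo, and DateFieldProcessor.DateFeatureType may compute some of these values differently.

```java
import java.time.LocalDate;
import java.time.temporal.IsoFields;

// Hypothetical helper showing how date-derived features can be computed
// with java.time; it is not Tribuo's DateFeatureType implementation.
final class DateFeatureSketch {

    // ISO 8601 day of the week: 1 = Monday through 7 = Sunday.
    static int dayOfWeek(LocalDate date) {
        return date.getDayOfWeek().getValue();
    }

    // Day of the year, starting at 1.
    static int dayOfYear(LocalDate date) {
        return date.getDayOfYear();
    }

    // Calendar quarter of the year, 1 through 4.
    static int calendarQuarter(LocalDate date) {
        return date.get(IsoFields.QUARTER_OF_YEAR);
    }

    // Day within the quarter, starting at 1.
    static int dayOfQuarter(LocalDate date) {
        return date.get(IsoFields.DAY_OF_QUARTER);
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2021, 5, 17);
        System.out.println("dayOfWeek=" + dayOfWeek(date)
                + " dayOfYear=" + dayOfYear(date)
                + " quarter=" + calendarQuarter(date)
                + " dayOfQuarter=" + dayOfQuarter(date));
    }
}
```

For 17 May 2021, a Monday, these methods return day of week 1, day of year 137, quarter 2 and day of quarter 47.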
E
- EmptyResponseProcessor<T extends Output<T>> - Class in org.tribuo.data.columnar.processors.response
-
A ResponseProcessor that always emits an empty optional.
- EmptyResponseProcessor(OutputFactory<T>) - Constructor for class org.tribuo.data.columnar.processors.response.EmptyResponseProcessor
-
Constructs a response processor which never emits a response.
- equals(Object) - Method in class org.tribuo.data.csv.CSVDataSource.CSVDataSourceProvenance
- equals(Object) - Method in class org.tribuo.data.csv.CSVLoader.CSVLoaderProvenance
-
Deprecated.
- equals(Object) - Method in class org.tribuo.data.sql.SQLDataSource.SQLDataSourceProvenance
- equals(Object) - Method in class org.tribuo.data.text.DirectoryFileSource.DirectoryFileSourceProvenance
- equals(Object) - Method in class org.tribuo.data.text.impl.SimpleStringDataSource.SimpleStringDataSourceProvenance
- equals(Object) - Method in class org.tribuo.data.text.impl.SimpleTextDataSource.SimpleTextDataSourceProvenance
- EVEN_OR_ODD_DAY - Enum constant in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
-
The parity of the day of the year.
- EVEN_OR_ODD_MONTH - Enum constant in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
-
The parity of the month.
- EVEN_OR_ODD_WEEK - Enum constant in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
-
The parity of the week of the year as defined by ISO 8601.
- EVEN_OR_ODD_YEAR - Enum constant in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
-
The parity of the year.
- expandRegexMapping(Collection<String>) - Method in class org.tribuo.data.columnar.RowProcessor
-
Uses similar logic to TransformationMap.validateTransformations(org.tribuo.FeatureMap) to check the regexes against the supplied list of field names.
- expandRegexMapping(ImmutableFeatureMap) - Method in class org.tribuo.data.columnar.RowProcessor
-
Uses similar logic to TransformationMap.validateTransformations(org.tribuo.FeatureMap) to check the regexes against the supplied feature map.
- expandRegexMapping(Model<T>) - Method in class org.tribuo.data.columnar.RowProcessor
-
Uses similar logic to TransformationMap.validateTransformations(org.tribuo.FeatureMap) to check the regexes against the ImmutableFeatureMap contained in the supplied Model.
- extract(LocalDate) - Method in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
-
Applies this enum's extraction function to the supplied date.
- extract(ColumnarIterator.Row) - Method in class org.tribuo.data.columnar.extractors.IndexExtractor
- extract(ColumnarIterator.Row) - Method in class org.tribuo.data.columnar.extractors.SimpleFieldExtractor
- extract(ColumnarIterator.Row) - Method in interface org.tribuo.data.columnar.FieldExtractor
-
Returns Optional which is filled if extraction succeeded.
- extract(T, String) - Method in class org.tribuo.data.text.impl.TextFeatureExtractorImpl
- extract(T, String) - Method in interface org.tribuo.data.text.TextFeatureExtractor
-
Extracts an example from the supplied input text and output object.
- extractField(String) - Method in class org.tribuo.data.columnar.extractors.DateExtractor
- extractField(String) - Method in class org.tribuo.data.columnar.extractors.DoubleExtractor
- extractField(String) - Method in class org.tribuo.data.columnar.extractors.FloatExtractor
- extractField(String) - Method in class org.tribuo.data.columnar.extractors.IdentityExtractor
- extractField(String) - Method in class org.tribuo.data.columnar.extractors.IntExtractor
- extractField(String) - Method in class org.tribuo.data.columnar.extractors.OffsetDateTimeExtractor
- extractField(String) - Method in class org.tribuo.data.columnar.extractors.SimpleFieldExtractor
-
Extracts the field value, or returns Optional.empty() if it failed to parse.
- extractor - Variable in class org.tribuo.data.text.DirectoryFileSource
-
The extractor that we'll use to turn text into examples.
- extractor - Variable in class org.tribuo.data.text.TextDataSource
-
The extractor that we'll use to turn text into examples.
- extractProvenanceInfo(Map<String, Provenance>) - Static method in class org.tribuo.data.csv.CSVDataSource.CSVDataSourceProvenance
-
Separates this class's non-configurable fields from the configurable fields.
- extractProvenanceInfo(Map<String, Provenance>) - Static method in class org.tribuo.data.sql.SQLDataSource.SQLDataSourceProvenance
-
Separates out the configured and non-configured provenance values.
- extractProvenanceInfo(Map<String, Provenance>) - Static method in class org.tribuo.data.text.DirectoryFileSource.DirectoryFileSourceProvenance
-
Splits the provenance into configured and non-configured values.
- extractProvenanceInfo(Map<String, Provenance>) - Static method in class org.tribuo.data.text.impl.SimpleStringDataSource.SimpleStringDataSourceProvenance
-
Separates out the configured and non-configured provenance values.
- extractProvenanceInfo(Map<String, Provenance>) - Static method in class org.tribuo.data.text.impl.SimpleTextDataSource.SimpleTextDataSourceProvenance
-
Separates out the configured and non-configured provenance values.
F
- FEATURE_VALUE - Static variable in class org.tribuo.data.columnar.processors.field.IdentityProcessor
-
The value of the emitted features.
- FeatureAggregator - Interface in org.tribuo.data.text
-
An interface for aggregating feature values into other values.
- FeatureHasher - Class in org.tribuo.data.text.impl
-
Hashes the feature names to reduce the dimensionality.
- FeatureHasher(int) - Constructor for class org.tribuo.data.text.impl.FeatureHasher
-
Constructs a feature hasher using the supplied hash dimension.
- FeatureHasher(int, boolean) - Constructor for class org.tribuo.data.text.impl.FeatureHasher
-
Constructs a feature hasher using the supplied hash dimension.
- FeatureHasher(int, int, int, boolean) - Constructor for class org.tribuo.data.text.impl.FeatureHasher
-
Constructs a feature hasher using the supplied hash dimension and seed values.
- featureInfo(CommandInterpreter, String) - Method in class org.tribuo.data.DatasetExplorer
-
Shows information on a particular feature.
- FeatureProcessor - Interface in org.tribuo.data.columnar
-
Takes a list of columnar features and adds new features or removes existing features.
- FeatureTransformer - Interface in org.tribuo.data.text
-
A feature transformer maps a list of features to a new list of features.
- FIELD_NAME - Static variable in class org.tribuo.data.columnar.processors.response.EmptyResponseProcessor
-
The field name this response processor looks for, which is ignored anyway as this processor always returns Optional.empty().
- FieldExtractor<T> - Interface in org.tribuo.data.columnar
-
Extracts a value from a field to be placed in an Example's metadata field.
- fieldName - Variable in class org.tribuo.data.columnar.extractors.SimpleFieldExtractor
-
The field name to read.
- FieldProcessor - Interface in org.tribuo.data.columnar
-
An interface for things that process the columns in a data set.
- FieldProcessor.GeneratedFeatureType - Enum in org.tribuo.data.columnar
-
The types of generated features.
- fieldProcessorMap - Variable in class org.tribuo.data.columnar.RowProcessor
-
The map of field processors.
- FieldResponseProcessor<T extends Output<T>> - Class in org.tribuo.data.columnar.processors.response
-
A response processor that returns the value(s) in a given (set of) fields.
- FieldResponseProcessor(String, String, OutputFactory<T>) - Constructor for class org.tribuo.data.columnar.processors.response.FieldResponseProcessor
-
Constructs a response processor which passes the field value through the output factory.
- FieldResponseProcessor(List<String>, String, OutputFactory<T>) - Constructor for class org.tribuo.data.columnar.processors.response.FieldResponseProcessor
-
Constructs a response processor which passes the field value through the output factory.
- FieldResponseProcessor(List<String>, List<String>, OutputFactory<T>) - Constructor for class org.tribuo.data.columnar.processors.response.FieldResponseProcessor
-
Constructs a response processor which passes the field value through the output factory.
- FieldResponseProcessor(List<String>, List<String>, OutputFactory<T>, boolean) - Constructor for class org.tribuo.data.columnar.processors.response.FieldResponseProcessor
-
Constructs a response processor which passes the field value through the output factory.
- FieldResponseProcessor(List<String>, List<String>, OutputFactory<T>, boolean, boolean) - Constructor for class org.tribuo.data.columnar.processors.response.FieldResponseProcessor
-
Constructs a response processor which passes the field value through the output factory.
- fields - Variable in class org.tribuo.data.columnar.ColumnarIterator
-
The column headers for this iterator.
- fileCompleter() - Method in class org.tribuo.data.DatasetExplorer
-
The filename completer.
- FIRST - Enum constant in enum org.tribuo.data.columnar.processors.feature.UniqueProcessor.UniqueType
-
Select the first feature value in the list.
- FloatExtractor - Class in org.tribuo.data.columnar.extractors
-
Extracts the field value and converts it to a float.
- FloatExtractor(String) - Constructor for class org.tribuo.data.columnar.extractors.FloatExtractor
-
Extracts a float value from the supplied field name.
- FloatExtractor(String, String) - Constructor for class org.tribuo.data.columnar.extractors.FloatExtractor
-
Extracts a float value from the supplied field name.
- forEachRemaining(Consumer<? super ColumnarIterator.Row>) - Method in class org.tribuo.data.columnar.ColumnarIterator
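The FeatureHasher entry in this section describes hashing feature names to reduce dimensionality. A minimal sketch of that hashing trick follows, assuming `String.hashCode()` as a stand-in hash function. `HashingSketch` is a hypothetical class; the real FeatureHasher takes seed values (see DEFAULT_HASH_SEED and DEFAULT_VALUE_HASH_SEED), which this sketch ignores.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of the hashing trick; not Tribuo's FeatureHasher.
final class HashingSketch {

    // Maps a feature name to a bucket index in [0, dimension).
    static int bucket(String featureName, int dimension) {
        if (dimension <= 0) {
            throw new IllegalArgumentException("dimension must be positive");
        }
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(featureName.hashCode(), dimension);
    }

    // Hashes every named value into its bucket, summing values that collide.
    static Map<Integer, Double> hashFeatures(Map<String, Double> features, int dimension) {
        Map<Integer, Double> hashed = new TreeMap<>();
        for (Map.Entry<String, Double> e : features.entrySet()) {
            hashed.merge(bucket(e.getKey(), dimension), e.getValue(), Double::sum);
        }
        return hashed;
    }
}
```

Summing on collision is one possible design choice. It bounds the feature space at `dimension` entries at the cost of occasionally merging unrelated features.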
G
- general - Variable in class org.tribuo.data.ConfigurableTrainTest.ConfigurableTrainTestOptions
-
Data loading options.
- generateExample(long, Map<String, String>, boolean) - Method in class org.tribuo.data.columnar.RowProcessor
-
Generate an Example from the supplied row.
- generateExample(Map<String, String>, boolean) - Method in class org.tribuo.data.columnar.RowProcessor
-
Generate an Example from the supplied row.
- generateExample(ColumnarIterator.Row, boolean) - Method in class org.tribuo.data.columnar.RowProcessor
-
Generate an Example from the supplied row.
- generateFeatureName(String, String) - Static method in class org.tribuo.data.columnar.ColumnarFeature
-
Generates a feature name based on the field name and the name.
- generateFeatureName(String, String, String) - Static method in class org.tribuo.data.columnar.ColumnarFeature
-
Generates a feature name used for conjunction features.
- generateFeatures(Map<String, String>) - Method in class org.tribuo.data.columnar.RowProcessor
-
Generates the features from the supplied row.
- generateMetadata(ColumnarIterator.Row) - Method in class org.tribuo.data.columnar.RowProcessor
-
Generates the example metadata from the supplied row and index.
- getClassName() - Method in class org.tribuo.data.csv.CSVLoader.CSVLoaderProvenance
-
Deprecated.
- getColumnEntry() - Method in class org.tribuo.data.columnar.ColumnarFeature
-
Gets the columnEntry (i.e., the feature name produced by the FieldExtractor without the fieldName).
- getColumnNames() - Method in class org.tribuo.data.columnar.RowProcessor
-
The set of column names this will use for the feature processing.
- getConnection() - Method in class org.tribuo.data.sql.SQLDBConfig
-
Constructs a connection based on the object fields.
- getDescription() - Method in class org.tribuo.data.columnar.RowProcessor
-
Returns a description of the row processor and its fields.
- getDescription() - Method in class org.tribuo.data.DatasetExplorer
- getFeatureProcessors() - Method in class org.tribuo.data.columnar.RowProcessor
-
Returns the set of FeatureProcessors this RowProcessor uses.
- getFeatureType() - Method in interface org.tribuo.data.columnar.FieldProcessor
-
Returns the feature type this FieldProcessor generates.
- getFeatureType() - Method in class org.tribuo.data.columnar.processors.field.DateFieldProcessor
- getFeatureType() - Method in class org.tribuo.data.columnar.processors.field.DoubleFieldProcessor
- getFeatureType() - Method in class org.tribuo.data.columnar.processors.field.IdentityProcessor
- getFeatureType() - Method in class org.tribuo.data.columnar.processors.field.RegexFieldProcessor
- getFeatureType() - Method in class org.tribuo.data.columnar.processors.field.TextFieldProcessor
- getFieldName() - Method in class org.tribuo.data.columnar.ColumnarFeature
-
Gets the field name.
- getFieldName() - Method in class org.tribuo.data.columnar.extractors.SimpleFieldExtractor
-
Gets the field name this extractor operates on.
- getFieldName() - Method in interface org.tribuo.data.columnar.FieldProcessor
-
Gets the field name this FieldProcessor uses.
- getFieldName() - Method in class org.tribuo.data.columnar.processors.field.DateFieldProcessor
- getFieldName() - Method in class org.tribuo.data.columnar.processors.field.DoubleFieldProcessor
- getFieldName() - Method in class org.tribuo.data.columnar.processors.field.IdentityProcessor
- getFieldName() - Method in class org.tribuo.data.columnar.processors.field.RegexFieldProcessor
- getFieldName() - Method in class org.tribuo.data.columnar.processors.field.TextFieldProcessor
- getFieldName() - Method in class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
-
Deprecated.
- getFieldName() - Method in class org.tribuo.data.columnar.processors.response.EmptyResponseProcessor
-
Deprecated.
- getFieldName() - Method in class org.tribuo.data.columnar.processors.response.FieldResponseProcessor
-
Deprecated.
- getFieldName() - Method in class org.tribuo.data.columnar.processors.response.QuartileResponseProcessor
-
Deprecated.
- getFieldName() - Method in interface org.tribuo.data.columnar.ResponseProcessor
-
Deprecated. Use ResponseProcessor.getFieldNames() and support multiple values instead. Gets the field name this ResponseProcessor uses.
- getFieldNames() - Method in class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
- getFieldNames() - Method in class org.tribuo.data.columnar.processors.response.EmptyResponseProcessor
- getFieldNames() - Method in class org.tribuo.data.columnar.processors.response.FieldResponseProcessor
- getFieldNames() - Method in class org.tribuo.data.columnar.processors.response.QuartileResponseProcessor
- getFieldNames() - Method in interface org.tribuo.data.columnar.ResponseProcessor
-
Gets the field names this ResponseProcessor uses.
- getFieldProcessor(String) - Method in class org.tribuo.data.columnar.RowProcessor.Builder
-
Retrieves, if present, the fieldProcessor with the given name.
- getFieldProcessors() - Method in class org.tribuo.data.columnar.RowProcessor
-
Returns the map of FieldProcessors this RowProcessor uses.
- getFields() - Method in class org.tribuo.data.columnar.ColumnarIterator
-
The immutable list of field names.
- getFields() - Method in class org.tribuo.data.columnar.ColumnarIterator.Row
-
Gets the field headers.
- getFirstFieldName() - Method in class org.tribuo.data.columnar.ColumnarFeature
-
If it's a conjunction feature, return the first field name.
- getIndex() - Method in class org.tribuo.data.columnar.ColumnarIterator.Row
-
Gets the row index.
- getInstanceValues() - Method in class org.tribuo.data.csv.CSVDataSource.CSVDataSourceProvenance
- getInstanceValues() - Method in class org.tribuo.data.sql.SQLDataSource.SQLDataSourceProvenance
- getInstanceValues() - Method in class org.tribuo.data.text.DirectoryFileSource.DirectoryFileSourceProvenance
- getInstanceValues() - Method in class org.tribuo.data.text.impl.SimpleStringDataSource.SimpleStringDataSourceProvenance
- getInstanceValues() - Method in class org.tribuo.data.text.impl.SimpleTextDataSource.SimpleTextDataSourceProvenance
- getLowerMedian() - Method in class org.tribuo.data.columnar.processors.response.Quartile
-
Returns the lower quartile value.
- getMedian() - Method in class org.tribuo.data.columnar.processors.response.Quartile
-
Returns the median value.
- getMetadataName() - Method in class org.tribuo.data.columnar.extractors.IndexExtractor
- getMetadataName() - Method in class org.tribuo.data.columnar.extractors.SimpleFieldExtractor
-
Gets the metadata key name.
- getMetadataName() - Method in interface org.tribuo.data.columnar.FieldExtractor
-
Gets the metadata key name.
- getMetadataTypes() - Method in class org.tribuo.data.columnar.ColumnarDataSource
-
Returns the metadata keys and value types that are created by this DataSource.
- getMetadataTypes() - Method in class org.tribuo.data.columnar.RowProcessor
-
Returns the metadata keys and value types that are extracted by this RowProcessor.
- getName() - Method in class org.tribuo.data.DatasetExplorer
- getNumNamespaces() - Method in interface org.tribuo.data.columnar.FieldProcessor
-
Binarised categoricals can be namespaced, where the field name is appended with "#<non-negative-int>" to denote the namespace.
- getOptionsDescription() - Method in class org.tribuo.data.CompletelyConfigurableTrainTest.ConfigurableTrainTestOptions
- getOptionsDescription() - Method in class org.tribuo.data.ConfigurableTrainTest.ConfigurableTrainTestOptions
- getOptionsDescription() - Method in class org.tribuo.data.DataOptions
- getOptionsDescription() - Method in class org.tribuo.data.text.SplitTextData.TrainTestSplitOptions
- getOutputFactory() - Method in class org.tribuo.data.columnar.ColumnarDataSource
- getOutputFactory() - Method in class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
- getOutputFactory() - Method in class org.tribuo.data.columnar.processors.response.EmptyResponseProcessor
- getOutputFactory() - Method in class org.tribuo.data.columnar.processors.response.FieldResponseProcessor
- getOutputFactory() - Method in class org.tribuo.data.columnar.processors.response.QuartileResponseProcessor
- getOutputFactory() - Method in interface org.tribuo.data.columnar.ResponseProcessor
-
Gets the OutputFactory this ResponseProcessor uses.
- getOutputFactory() - Method in class org.tribuo.data.text.DirectoryFileSource
- getOutputFactory() - Method in class org.tribuo.data.text.TextDataSource
-
Returns the output factory used to convert the text input into an Output.
- getProvenance() - Method in class org.tribuo.data.columnar.extractors.DateExtractor
- getProvenance() - Method in class org.tribuo.data.columnar.extractors.DoubleExtractor
- getProvenance() - Method in class org.tribuo.data.columnar.extractors.FloatExtractor
- getProvenance() - Method in class org.tribuo.data.columnar.extractors.IdentityExtractor
- getProvenance() - Method in class org.tribuo.data.columnar.extractors.IndexExtractor
- getProvenance() - Method in class org.tribuo.data.columnar.extractors.IntExtractor
- getProvenance() - Method in class org.tribuo.data.columnar.extractors.OffsetDateTimeExtractor
- getProvenance() - Method in class org.tribuo.data.columnar.processors.feature.UniqueProcessor
- getProvenance() - Method in class org.tribuo.data.columnar.processors.field.DateFieldProcessor
- getProvenance() - Method in class org.tribuo.data.columnar.processors.field.DoubleFieldProcessor
- getProvenance() - Method in class org.tribuo.data.columnar.processors.field.IdentityProcessor
- getProvenance() - Method in class org.tribuo.data.columnar.processors.field.RegexFieldProcessor
- getProvenance() - Method in class org.tribuo.data.columnar.processors.field.TextFieldProcessor
- getProvenance() - Method in class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
- getProvenance() - Method in class org.tribuo.data.columnar.processors.response.EmptyResponseProcessor
- getProvenance() - Method in class org.tribuo.data.columnar.processors.response.FieldResponseProcessor
- getProvenance() - Method in class org.tribuo.data.columnar.processors.response.Quartile
- getProvenance() - Method in class org.tribuo.data.columnar.processors.response.QuartileResponseProcessor
- getProvenance() - Method in class org.tribuo.data.columnar.RowProcessor
- getProvenance() - Method in class org.tribuo.data.csv.CSVDataSource
- getProvenance() - Method in class org.tribuo.data.sql.SQLDataSource
- getProvenance() - Method in class org.tribuo.data.sql.SQLDBConfig
- getProvenance() - Method in class org.tribuo.data.text.DirectoryFileSource
- getProvenance() - Method in class org.tribuo.data.text.impl.AverageAggregator
- getProvenance() - Method in class org.tribuo.data.text.impl.BasicPipeline
- getProvenance() - Method in class org.tribuo.data.text.impl.CasingPreprocessor
- getProvenance() - Method in class org.tribuo.data.text.impl.FeatureHasher
- getProvenance() - Method in class org.tribuo.data.text.impl.NewsPreprocessor
- getProvenance() - Method in class org.tribuo.data.text.impl.NgramProcessor
- getProvenance() - Method in class org.tribuo.data.text.impl.RegexPreprocessor
- getProvenance() - Method in class org.tribuo.data.text.impl.SimpleTextDataSource
- getProvenance() - Method in class org.tribuo.data.text.impl.SumAggregator
- getProvenance() - Method in class org.tribuo.data.text.impl.TextFeatureExtractorImpl
- getProvenance() - Method in class org.tribuo.data.text.impl.TokenPipeline
- getProvenance() - Method in class org.tribuo.data.text.impl.UniqueAggregator
- getRegexFieldProcessor(String) - Method in class org.tribuo.data.columnar.RowProcessor.Builder
-
Retrieves, if present, the regexFieldProcessor with the given regex.
- getResponseProcessor() - Method in class org.tribuo.data.columnar.RowProcessor
-
Returns the response processor this RowProcessor uses.
- getRow() - Method in class org.tribuo.data.columnar.ColumnarIterator
-
Returns the next row of data based on internal state stored by the implementor, or Optional.empty() if there is no more data.
- getRow() - Method in class org.tribuo.data.csv.CSVIterator
- getRow() - Method in class org.tribuo.data.sql.ResultSetIterator
- getRowData() - Method in class org.tribuo.data.columnar.ColumnarIterator.Row
-
Gets the row data.
- getSecondFieldName() - Method in class org.tribuo.data.columnar.ColumnarFeature
-
If it's a conjunction feature, return the second field name.
- getStatement() - Method in class org.tribuo.data.sql.SQLDBConfig
-
Constructs a statement based on the object fields.
- getUpperMedian() - Method in class org.tribuo.data.columnar.processors.response.Quartile
-
The upper quartile value.
- getValueType() - Method in class org.tribuo.data.columnar.extractors.DateExtractor
- getValueType() - Method in class org.tribuo.data.columnar.extractors.DoubleExtractor
- getValueType() - Method in class org.tribuo.data.columnar.extractors.FloatExtractor
- getValueType() - Method in class org.tribuo.data.columnar.extractors.IdentityExtractor
- getValueType() - Method in class org.tribuo.data.columnar.extractors.IndexExtractor
- getValueType() - Method in class org.tribuo.data.columnar.extractors.IntExtractor
- getValueType() - Method in class org.tribuo.data.columnar.extractors.OffsetDateTimeExtractor
- getValueType() - Method in interface org.tribuo.data.columnar.FieldExtractor
-
Gets the class of the value produced by this extractor.
- GROUPS - Enum constant in enum org.tribuo.data.columnar.processors.field.RegexFieldProcessor.Mode
-
Triggers feature generation for each matching group in the string.
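The GROUPS constant above belongs to the same RegexFieldProcessor.Mode enum as MATCH_ALL and MATCH_CONTAINS. The following is a minimal sketch of how the three descriptions differ, written against java.util.regex only. The class RegexModeSketch, its Mode enum, and the featureNames method are hypothetical stand-ins and not Tribuo's implementation; in particular, reading the groups from the first match found is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration of the three matching modes; not Tribuo code. */
final class RegexModeSketch {
    enum Mode { MATCH_ALL, MATCH_CONTAINS, GROUPS }

    /** Returns the names that would be emitted for the value under the given mode. */
    static List<String> featureNames(Pattern pattern, Mode mode, String value) {
        List<String> names = new ArrayList<>();
        Matcher matcher = pattern.matcher(value);
        switch (mode) {
            case MATCH_ALL:
                // The whole string must match the pattern.
                if (matcher.matches()) {
                    names.add("MATCH_ALL");
                }
                break;
            case MATCH_CONTAINS:
                // Any substring matching the pattern is enough.
                if (matcher.find()) {
                    names.add("MATCH_CONTAINS");
                }
                break;
            case GROUPS:
                // Assumption: groups are read from the first match found.
                if (matcher.find()) {
                    for (int i = 1; i <= matcher.groupCount(); i++) {
                        if (matcher.group(i) != null) {
                            names.add(matcher.group(i));
                        }
                    }
                }
                break;
        }
        return names;
    }

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("(\\d+)-(\\w+)");
        for (Mode mode : Mode.values()) {
            System.out.println(mode + " -> " + featureNames(pattern, mode, "x 12-ab y"));
        }
    }
}
```

The practical difference is Matcher.matches() versus Matcher.find(): the former anchors the pattern to the entire input, the latter scans for a matching substring.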
H
- handleDoc(String) - Method in class org.tribuo.data.text.TextDataSource
-
A method that can be overridden to do different things to each document that we've read.
- hashCode() - Method in class org.tribuo.data.csv.CSVDataSource.CSVDataSourceProvenance
- hashCode() - Method in class org.tribuo.data.csv.CSVLoader.CSVLoaderProvenance
-
Deprecated.
- hashCode() - Method in class org.tribuo.data.sql.SQLDataSource.SQLDataSourceProvenance
- hashCode() - Method in class org.tribuo.data.text.DirectoryFileSource.DirectoryFileSourceProvenance
- hashCode() - Method in class org.tribuo.data.text.impl.SimpleStringDataSource.SimpleStringDataSourceProvenance
- hashCode() - Method in class org.tribuo.data.text.impl.SimpleTextDataSource.SimpleTextDataSourceProvenance
- hashDim - Variable in class org.tribuo.data.DataOptions
-
Hashing dimension used for standard text format.
- hasNext() - Method in class org.tribuo.data.columnar.ColumnarIterator
I
- IdentityExtractor - Class in org.tribuo.data.columnar.extractors
-
Extracts the field value and emits it as a String.
- IdentityExtractor(String) - Constructor for class org.tribuo.data.columnar.extractors.IdentityExtractor
-
Extracts the String value from the supplied field.
- IdentityExtractor(String, String) - Constructor for class org.tribuo.data.columnar.extractors.IdentityExtractor
-
Extracts the String value from the supplied field.
- IdentityProcessor - Class in org.tribuo.data.columnar.processors.field
-
A FieldProcessor which converts the field name and value into a feature with a value of IdentityProcessor.FEATURE_VALUE.
- IdentityProcessor(String) - Constructor for class org.tribuo.data.columnar.processors.field.IdentityProcessor
-
Constructs a field processor which emits a single feature with a specific value and uses the field name and field value as the feature name.
- IndexExtractor - Class in org.tribuo.data.columnar.extractors
-
An Extractor with special casing for loading the index from a Row.
- IndexExtractor() - Constructor for class org.tribuo.data.columnar.extractors.IndexExtractor
-
Extracts the index, writing to the default metadata field name Example.NAME.
- IndexExtractor(String) - Constructor for class org.tribuo.data.columnar.extractors.IndexExtractor
-
Extracts the index, writing to the supplied metadata field name.
- inputFormat - Variable in class org.tribuo.data.DataOptions
-
Loads the data using the specified format.
- inputPath - Variable in class org.tribuo.data.sql.SQLToCSV.SQLToCSVOptions
-
SQL File to run as a query, defaults to stdin
- inputPath - Variable in class org.tribuo.data.text.SplitTextData.TrainTestSplitOptions
-
Input data file in standard text format.
- INTEGER - Enum constant in enum org.tribuo.data.columnar.FieldProcessor.GeneratedFeatureType
-
Ordered integral feature values (e.g.
- IntExtractor - Class in org.tribuo.data.columnar.extractors
-
Extracts the field value and converts it to an int.
- IntExtractor(String) - Constructor for class org.tribuo.data.columnar.extractors.IntExtractor
-
Extracts an int value from the supplied field name.
- IntExtractor(String, String) - Constructor for class org.tribuo.data.columnar.extractors.IntExtractor
-
Extracts an int value from the supplied field name.
- isConfigured() - Method in class org.tribuo.data.columnar.RowProcessor
-
Returns true if the regexes have been expanded into field processors.
- iterator() - Method in class org.tribuo.data.columnar.ColumnarDataSource
- iterator() - Method in class org.tribuo.data.csv.CSVLoader.CSVLoaderProvenance
-
Deprecated.
- iterator() - Method in class org.tribuo.data.text.DirectoryFileSource
- iterator() - Method in class org.tribuo.data.text.TextDataSource
J
- JOINER - Static variable in class org.tribuo.data.columnar.ColumnarFeature
-
The joiner between the field name and feature name.
L
- LAST - Enum constant in enum org.tribuo.data.columnar.processors.feature.UniqueProcessor.UniqueType
-
Select the last feature value in the list.
- LIBSVM - Enum constant in enum org.tribuo.data.DataOptions.InputFormat
-
LibSVM/svm-light format data.
- load(Path, String) - Method in class org.tribuo.data.csv.CSVLoader
-
Loads a DataSource from the specified csv file then wraps it in a dataset.
- load(Path, String, String[]) - Method in class org.tribuo.data.csv.CSVLoader
-
Loads a DataSource from the specified csv file then wraps it in a dataset.
- load(Path, Set<String>) - Method in class org.tribuo.data.csv.CSVLoader
-
Loads a DataSource from the specified csv file then wraps it in a dataset.
- load(Path, Set<String>, String[]) - Method in class org.tribuo.data.csv.CSVLoader
-
Loads a DataSource from the specified csv file then wraps it in a dataset.
- load(OutputFactory<T>) - Method in class org.tribuo.data.DataOptions
-
Loads the training and testing data from DataOptions.trainingPath and DataOptions.testingPath according to the other parameters specified in this class.
- loadDataset(CommandInterpreter, File, boolean) - Method in class org.tribuo.data.DatasetExplorer
-
Loads a serialized dataset.
- loadDataSource(URL, String) - Method in class org.tribuo.data.csv.CSVLoader
-
Loads a DataSource from the specified csv path.
- loadDataSource(URL, String, String[]) - Method in class org.tribuo.data.csv.CSVLoader
-
Loads a DataSource from the specified csv path.
- loadDataSource(URL, Set<String>) - Method in class org.tribuo.data.csv.CSVLoader
-
Loads a DataSource from the specified csv path.
- loadDataSource(URL, Set<String>, String[]) - Method in class org.tribuo.data.csv.CSVLoader
-
Loads a DataSource from the specified csv path.
- loadDataSource(Path, String) - Method in class org.tribuo.data.csv.CSVLoader
-
Loads a DataSource from the specified csv path.
- loadDataSource(Path, String, String[]) - Method in class org.tribuo.data.csv.CSVLoader
-
Loads a DataSource from the specified csv path.
- loadDataSource(Path, Set<String>) - Method in class org.tribuo.data.csv.CSVLoader
-
Loads a DataSource from the specified csv path.
- loadDataSource(Path, Set<String>, String[]) - Method in class org.tribuo.data.csv.CSVLoader
-
Loads a DataSource from the specified csv path.
- LOWERCASE - Enum constant in enum org.tribuo.data.text.impl.CasingPreprocessor.CasingOperation
-
Lowercase the input text.
M
- main(String[]) - Static method in class org.tribuo.data.CompletelyConfigurableTrainTest
- main(String[]) - Static method in class org.tribuo.data.ConfigurableTrainTest
- main(String[]) - Static method in class org.tribuo.data.DatasetExplorer
-
Runs a dataset explorer.
- main(String[]) - Static method in class org.tribuo.data.PreprocessAndSerialize
-
Run the PreprocessAndSerialize CLI.
- main(String[]) - Static method in class org.tribuo.data.sql.SQLToCSV
-
Reads an SQL query from the standard input and writes the results of the query to the standard output.
- main(String[]) - Static method in class org.tribuo.data.text.SplitTextData
-
Runs the SplitTextData CLI.
- map(String, List<Feature>) - Method in interface org.tribuo.data.text.FeatureTransformer
-
Transforms features into a new list of features
- map(String, List<Feature>) - Method in class org.tribuo.data.text.impl.FeatureHasher
- MATCH_ALL - Enum constant in enum org.tribuo.data.columnar.processors.field.RegexFieldProcessor.Mode
-
Triggers feature generation if the whole string matches.
- MATCH_CONTAINS - Enum constant in enum org.tribuo.data.columnar.processors.field.RegexFieldProcessor.Mode
-
Triggers feature generation if the string contains a match.
- MAX - Enum constant in enum org.tribuo.data.columnar.processors.feature.UniqueProcessor.UniqueType
-
Select the maximum feature value in the list.
- metadataName - Variable in class org.tribuo.data.columnar.extractors.SimpleFieldExtractor
-
The metadata key to emit.
- MIN - Enum constant in enum org.tribuo.data.columnar.processors.feature.UniqueProcessor.UniqueType
-
Select the minimum feature value in the list.
- minCount - Variable in class org.tribuo.data.CompletelyConfigurableTrainTest.ConfigurableTrainTestOptions
-
Remove features which occur fewer than m times.
- minCount - Variable in class org.tribuo.data.DataOptions
-
Minimum cardinality of the features.
- minCount(CommandInterpreter, int) - Method in class org.tribuo.data.DatasetExplorer
-
Shows the number of features which occurred more than minCount times in the dataset.
- modelFilename - Variable in class org.tribuo.data.DatasetExplorer.DatasetExplorerOptions
-
Dataset file to load.
- modelOutputProtobuf - Variable in class org.tribuo.data.DataOptions
-
Write the model out as a protobuf.
- MONTH - Enum constant in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
-
The month.
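The UniqueProcessor.UniqueType constants LAST, MAX and MIN listed in this index each describe how one value is selected when several features share a name. The sketch below illustrates that selection in plain Java. UniqueTypeSketch, its NamedValue class, and the reduce method are hypothetical stand-ins for illustration and are not the Tribuo implementation.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of LAST / MAX / MIN selection; not Tribuo code. */
final class UniqueTypeSketch {
    enum UniqueType { LAST, MAX, MIN }

    /** Minimal named feature, standing in for a columnar feature. */
    static final class NamedValue {
        final String name;
        final double value;

        NamedValue(String name, double value) {
            this.name = name;
            this.value = value;
        }
    }

    /** Collapses features sharing a name down to one value per name. */
    static Map<String, Double> reduce(List<NamedValue> features, UniqueType type) {
        Map<String, Double> unique = new LinkedHashMap<>();
        for (NamedValue feature : features) {
            switch (type) {
                case LAST:
                    // Later values overwrite earlier ones.
                    unique.put(feature.name, feature.value);
                    break;
                case MAX:
                    unique.merge(feature.name, feature.value, Double::max);
                    break;
                case MIN:
                    unique.merge(feature.name, feature.value, Double::min);
                    break;
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        List<NamedValue> features = Arrays.asList(
                new NamedValue("a", 1.0),
                new NamedValue("b", 5.0),
                new NamedValue("a", 3.0),
                new NamedValue("a", 2.0));
        for (UniqueType type : UniqueType.values()) {
            System.out.println(type + " -> " + reduce(features, type));
        }
    }
}
```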
N
- NAMESPACE - Static variable in interface org.tribuo.data.columnar.FieldProcessor
-
The namespacing separator.
- NEGATIVE_NAME - Static variable in class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
-
The default negative response.
- NewsPreprocessor - Class in org.tribuo.data.text.impl
-
A document pre-processor for 20 newsgroup data.
- NewsPreprocessor() - Constructor for class org.tribuo.data.text.impl.NewsPreprocessor
-
Constructor.
- next() - Method in class org.tribuo.data.columnar.ColumnarIterator
- ngram - Variable in class org.tribuo.data.DataOptions
-
Ngram size to generate when using standard text format.
- NgramProcessor - Class in org.tribuo.data.text.impl
-
A text processor that will generate token ngrams of a particular size.
- NgramProcessor(Tokenizer, int, double) - Constructor for class org.tribuo.data.text.impl.NgramProcessor
-
Creates a processor that will generate token ngrams of size n.
- numExamples(CommandInterpreter) - Method in class org.tribuo.data.DatasetExplorer
-
Shows the number of examples in this dataset.
- numFeatures(CommandInterpreter) - Method in class org.tribuo.data.DatasetExplorer
-
Shows the number of features in this dataset.
- numFolds - Variable in class org.tribuo.data.ConfigurableTrainTest.ConfigurableTrainTestOptions
-
The number of cross validation folds.
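The NgramProcessor entry above describes generating token ngrams of a particular size. As a rough illustration of what that means, the sketch below builds the ngrams of size n from a string. It assumes whitespace tokenization and a single-space joiner, neither of which is taken from Tribuo; NgramSketch and its ngrams method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of token n-gram generation; not Tribuo code. */
final class NgramSketch {
    /** Returns every run of n consecutive whitespace-separated tokens. */
    static List<String> ngrams(String text, int n) {
        List<String> result = new ArrayList<>();
        String trimmed = text.trim();
        if (n < 1 || trimmed.isEmpty()) {
            return result;
        }
        // Assumption: tokens are separated by runs of whitespace.
        String[] tokens = trimmed.split("\\s+");
        for (int start = 0; start + n <= tokens.length; start++) {
            // Assumption: tokens within an ngram are joined by a single space.
            result.add(String.join(" ", Arrays.copyOfRange(tokens, start, start + n)));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(ngrams("the quick brown fox", 2));
    }
}
```

A pipeline producing ngrams "in the range 1 to ngram" would call such a routine once per size and concatenate the results.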
O
- OffsetDateTimeExtractor - Class in org.tribuo.data.columnar.extractors
-
Extracts the field value and translates it to an OffsetDateTime based on the specified DateTimeFormatter.
- OffsetDateTimeExtractor(String, String, String) - Constructor for class org.tribuo.data.columnar.extractors.OffsetDateTimeExtractor
-
Constructs a date time extractor that emits an OffsetDateTime by applying the supplied format to the specified field.
- OffsetDateTimeExtractor(String, String, String, String, String) - Constructor for class org.tribuo.data.columnar.extractors.OffsetDateTimeExtractor
-
Constructs a date time extractor that emits an OffsetDateTime by applying the supplied format to the specified field.
- org.tribuo.data - package org.tribuo.data
-
Provides classes for loading in data from disk, processing it into examples, and splitting datasets for things like cross-validation and train-test splits.
- org.tribuo.data.columnar - package org.tribuo.data.columnar
-
Provides classes for processing columnar data and generating Examples.
- org.tribuo.data.columnar.extractors - package org.tribuo.data.columnar.extractors
-
Provides implementations of FieldExtractor.
- org.tribuo.data.columnar.processors.feature - package org.tribuo.data.columnar.processors.feature
-
Provides implementations of FeatureProcessor.
- org.tribuo.data.columnar.processors.field - package org.tribuo.data.columnar.processors.field
-
Provides implementations of FieldProcessor.
- org.tribuo.data.columnar.processors.response - package org.tribuo.data.columnar.processors.response
-
Provides implementations of ResponseProcessor.
- org.tribuo.data.csv - package org.tribuo.data.csv
-
Provides classes which can load columnar data (using a RowProcessor) from a CSV (or other character delimited format) file.
- org.tribuo.data.sql - package org.tribuo.data.sql
-
Provides classes which can load columnar data (using a RowProcessor) from a SQL source.
- org.tribuo.data.text - package org.tribuo.data.text
- org.tribuo.data.text.impl - package org.tribuo.data.text.impl
-
Provides implementations of text data processors.
- output - Variable in class org.tribuo.data.PreprocessAndSerialize.PreprocessAndSerializeOptions
-
path to serialize the dataset
- outputFactory - Variable in class org.tribuo.data.ConfigurableTrainTest.ConfigurableTrainTestOptions
-
The output factory to construct.
- outputFactory - Variable in class org.tribuo.data.text.DirectoryFileSource
-
The factory that converts a String into an Output.
- outputFactory - Variable in class org.tribuo.data.text.TextDataSource
-
The factory that converts a String into an Output.
- outputInfo(CommandInterpreter) - Method in class org.tribuo.data.DatasetExplorer
-
Shows the output information.
- outputPath - Variable in class org.tribuo.data.CompletelyConfigurableTrainTest.ConfigurableTrainTestOptions
-
Path to serialize model to.
- outputPath - Variable in class org.tribuo.data.DataOptions
-
Path to serialize model to.
- outputPath - Variable in class org.tribuo.data.sql.SQLToCSV.SQLToCSVOptions
-
File to write query results as CSV, defaults to stdout
- outputRequired - Variable in class org.tribuo.data.columnar.ColumnarDataSource
-
Is an output required from each row?
P
- parseLine(String, int) - Method in class org.tribuo.data.text.impl.SimpleTextDataSource
-
Parses a line in Tribuo's default text format.
- partialExpandRegexMapping(Collection<String>) - Method in class org.tribuo.data.columnar.RowProcessor
-
Caveat Implementor! This method contains the logic of RowProcessor.expandRegexMapping(org.tribuo.Model<T>) without any of the checks that ensure the RowProcessor is in a valid state.
- password - Variable in class org.tribuo.data.sql.SQLToCSV.SQLToCSVOptions
-
Password for the SQL database
- path - Variable in class org.tribuo.data.text.TextDataSource
-
The path that data was read from.
- POSITIVE_NAME - Static variable in class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
-
The default positive response.
- postConfig() - Method in class org.tribuo.data.columnar.extractors.DateExtractor
-
Used by the OLCUT configuration system, and should not be called by external code.
- postConfig() - Method in class org.tribuo.data.columnar.extractors.OffsetDateTimeExtractor
-
Used by the OLCUT configuration system, and should not be called by external code.
- postConfig() - Method in class org.tribuo.data.columnar.extractors.SimpleFieldExtractor
-
Used by the OLCUT configuration system, and should not be called by external code.
- postConfig() - Method in class org.tribuo.data.columnar.processors.field.DateFieldProcessor
-
Used by the OLCUT configuration system, and should not be called by external code.
- postConfig() - Method in class org.tribuo.data.columnar.processors.field.RegexFieldProcessor
-
Used by the OLCUT configuration system, and should not be called by external code.
- postConfig() - Method in class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
- postConfig() - Method in class org.tribuo.data.columnar.processors.response.FieldResponseProcessor
- postConfig() - Method in class org.tribuo.data.columnar.processors.response.QuartileResponseProcessor
- postConfig() - Method in class org.tribuo.data.columnar.RowProcessor
-
Used by the OLCUT configuration system, and should not be called by external code.
- postConfig() - Method in class org.tribuo.data.csv.CSVDataSource
-
Used by the OLCUT configuration system, and should not be called by external code.
- postConfig() - Method in class org.tribuo.data.sql.SQLDBConfig
-
Used by the OLCUT configuration system, and should not be called by external code.
- postConfig() - Method in class org.tribuo.data.text.impl.BasicPipeline
-
Used by the OLCUT configuration system, and should not be called by external code.
- postConfig() - Method in class org.tribuo.data.text.impl.FeatureHasher
-
Used by the OLCUT configuration system, and should not be called by external code.
- postConfig() - Method in class org.tribuo.data.text.impl.NgramProcessor
-
Used by the OLCUT configuration system, and should not be called by external code.
- postConfig() - Method in class org.tribuo.data.text.impl.RegexPreprocessor
-
Used by the OLCUT configuration system, and should not be called by external code.
- postConfig() - Method in class org.tribuo.data.text.impl.SimpleStringDataSource
-
Used by the OLCUT configuration system, and should not be called by external code.
- postConfig() - Method in class org.tribuo.data.text.impl.SimpleTextDataSource
-
Used by the OLCUT configuration system, and should not be called by external code.
- postConfig() - Method in class org.tribuo.data.text.impl.TokenPipeline
-
Used by the OLCUT configuration system, and should not be called by external code.
- PreprocessAndSerialize - Class in org.tribuo.data
-
Reads in a DataSource, processes all the data, and writes it out as a serialized dataset.
- PreprocessAndSerialize.PreprocessAndSerializeOptions - Class in org.tribuo.data
-
Command line options.
- PreprocessAndSerializeOptions() - Constructor for class org.tribuo.data.PreprocessAndSerialize.PreprocessAndSerializeOptions
- preprocessors - Variable in class org.tribuo.data.text.DirectoryFileSource
-
Document preprocessors that should be run on the documents that make up this data set.
- preprocessors - Variable in class org.tribuo.data.text.TextDataSource
-
Document preprocessors that should be run on the documents that make up this data set.
- process(String) - Method in interface org.tribuo.data.columnar.FieldProcessor
-
Processes the field value and generates a (possibly empty) list of ColumnarFeatures.
- process(String) - Method in class org.tribuo.data.columnar.processors.field.DateFieldProcessor
- process(String) - Method in class org.tribuo.data.columnar.processors.field.DoubleFieldProcessor
- process(String) - Method in class org.tribuo.data.columnar.processors.field.IdentityProcessor
- process(String) - Method in class org.tribuo.data.columnar.processors.field.RegexFieldProcessor
- process(String) - Method in class org.tribuo.data.columnar.processors.field.TextFieldProcessor
- process(String) - Method in class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
-
Deprecated.
- process(String) - Method in class org.tribuo.data.columnar.processors.response.EmptyResponseProcessor
-
Deprecated.
- process(String) - Method in class org.tribuo.data.columnar.processors.response.FieldResponseProcessor
-
Deprecated.
- process(String) - Method in class org.tribuo.data.columnar.processors.response.QuartileResponseProcessor
-
Deprecated.
- process(String) - Method in interface org.tribuo.data.columnar.ResponseProcessor
-
Deprecated. Use ResponseProcessor.process(List) and support multiple values instead. Returns Optional.empty() if it failed to process out a response.
- process(String) - Method in class org.tribuo.data.text.impl.NgramProcessor
- process(String) - Method in interface org.tribuo.data.text.TextProcessor
-
Extracts features from the supplied text.
- process(String, String) - Method in class org.tribuo.data.text.impl.BasicPipeline
- process(String, String) - Method in class org.tribuo.data.text.impl.NgramProcessor
- process(String, String) - Method in class org.tribuo.data.text.impl.TokenPipeline
- process(String, String) - Method in interface org.tribuo.data.text.TextPipeline
-
Extracts a list of features from the supplied text, using the tag to prepend the feature names.
- process(String, String) - Method in interface org.tribuo.data.text.TextProcessor
-
Extracts features from the supplied text.
- process(List<String>) - Method in class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
- process(List<String>) - Method in class org.tribuo.data.columnar.processors.response.EmptyResponseProcessor
-
This method always returns Optional.empty().
- process(List<String>) - Method in class org.tribuo.data.columnar.processors.response.FieldResponseProcessor
- process(List<String>) - Method in class org.tribuo.data.columnar.processors.response.QuartileResponseProcessor
- process(List<String>) - Method in interface org.tribuo.data.columnar.ResponseProcessor
-
Returns Optional.empty() if it failed to process out a response. This method has a default implementation for backwards compatibility with Tribuo 4.0 and 4.1.
- process(List<ColumnarFeature>) - Method in interface org.tribuo.data.columnar.FeatureProcessor
-
Processes a list of ColumnarFeatures, transforming it by adding conjunctions or removing unnecessary features.
- process(List<ColumnarFeature>) - Method in class org.tribuo.data.columnar.processors.feature.UniqueProcessor
- processDoc(String) - Method in interface org.tribuo.data.text.DocumentPreprocessor
-
Processes the content of part of a document stored as a string, returning a new string.
- processDoc(String) - Method in class org.tribuo.data.text.impl.CasingPreprocessor
- processDoc(String) - Method in class org.tribuo.data.text.impl.NewsPreprocessor
- processDoc(String) - Method in class org.tribuo.data.text.impl.RegexPreprocessor
- protobufFormat - Variable in class org.tribuo.data.DatasetExplorer.DatasetExplorerOptions
-
Load the model from a protobuf.
- protobufFormat - Variable in class org.tribuo.data.PreprocessAndSerialize.PreprocessAndSerializeOptions
-
Save the dataset as a protobuf.
- provenance - Variable in class org.tribuo.data.text.impl.SimpleTextDataSource
-
The data source provenance.
Q
- Quartile - Class in org.tribuo.data.columnar.processors.response
-
A quartile to split data into 4 chunks.
- Quartile(double, double, double) - Constructor for class org.tribuo.data.columnar.processors.response.Quartile
-
Constructs a quartile with the specified values.
- QuartileResponseProcessor<T extends Output<T>> - Class in org.tribuo.data.columnar.processors.response
-
Processes the response into quartiles and emits them as classification outputs.
- QuartileResponseProcessor(String, String, Quartile, OutputFactory<T>) - Constructor for class org.tribuo.data.columnar.processors.response.QuartileResponseProcessor
-
Constructs a response processor which emits 4 distinct bins for the output factory to process.
- QuartileResponseProcessor(List<String>, List<Quartile>, OutputFactory<T>) - Constructor for class org.tribuo.data.columnar.processors.response.QuartileResponseProcessor
-
Constructs a response processor which emits 4 distinct bins for the output factory to process.
- QUOTE - Static variable in class org.tribuo.data.csv.CSVIterator
-
Default quote character.
R
- rawLines - Variable in class org.tribuo.data.text.impl.SimpleStringDataSource
-
Used because OLCUT doesn't support generic Iterables.
- read() - Method in class org.tribuo.data.text.impl.SimpleStringDataSource
- read() - Method in class org.tribuo.data.text.impl.SimpleTextDataSource
- read() - Method in class org.tribuo.data.text.TextDataSource
-
Reads the data from the Path.
- REAL - Enum constant in enum org.tribuo.data.columnar.FieldProcessor.GeneratedFeatureType
-
Real valued features.
- RegexFieldProcessor - Class in org.tribuo.data.columnar.processors.field
-
A FieldProcessor which applies a regex to a field and generates ColumnarFeatures based on the matches.
- RegexFieldProcessor(String, String, EnumSet<RegexFieldProcessor.Mode>) - Constructor for class org.tribuo.data.columnar.processors.field.RegexFieldProcessor
-
Constructs a field processor which emits features when the field value matches the supplied regex.
- RegexFieldProcessor(String, Pattern, EnumSet<RegexFieldProcessor.Mode>) - Constructor for class org.tribuo.data.columnar.processors.field.RegexFieldProcessor
-
Constructs a field processor which emits features when the field value matches the supplied regex.
- RegexFieldProcessor.Mode - Enum in org.tribuo.data.columnar.processors.field
-
Matching mode.
- regexMappingProcessors - Variable in class org.tribuo.data.columnar.RowProcessor
-
The map of regexes to field processors.
- RegexPreprocessor - Class in org.tribuo.data.text.impl
-
A simple document preprocessor which applies regular expressions to the input.
- RegexPreprocessor(List<String>, List<String>) - Constructor for class org.tribuo.data.text.impl.RegexPreprocessor
-
Construct a regex preprocessor.
- replaceNewlinesWithSpaces - Variable in class org.tribuo.data.columnar.RowProcessor
-
Should newlines be replaced with spaces before processing.
- responseProcessor - Variable in class org.tribuo.data.columnar.RowProcessor
-
The processor which extracts the response.
- ResponseProcessor<T extends Output<T>> - Interface in org.tribuo.data.columnar
-
An interface that will take the response field and produce an Output.
- ResultSetIterator - Class in org.tribuo.data.sql
-
An iterator over a ResultSet returned from JDBC.
- ResultSetIterator(ResultSet) - Constructor for class org.tribuo.data.sql.ResultSetIterator
-
Construct a result set iterator over the supplied result set.
- ResultSetIterator(ResultSet, int) - Constructor for class org.tribuo.data.sql.ResultSetIterator
-
Constructs a result set iterator over the supplied result set using the specified fetch buffer size.
- Row(long, List<String>, Map<String, String>) - Constructor for class org.tribuo.data.columnar.ColumnarIterator.Row
-
Constructs a row from a columnar source.
- rowIterator() - Method in class org.tribuo.data.columnar.ColumnarDataSource
-
The iterator that emits ColumnarIterator.Row objects from the underlying data source.
- rowIterator() - Method in class org.tribuo.data.csv.CSVDataSource
- rowIterator() - Method in class org.tribuo.data.sql.SQLDataSource
- rowProcessor - Variable in class org.tribuo.data.columnar.ColumnarDataSource
-
The RowProcessor to use.
- rowProcessor - Variable in class org.tribuo.data.DataOptions
-
The name of the row processor from the config file.
- RowProcessor<T extends Output<T>> - Class in org.tribuo.data.columnar
-
A processor which takes a Map of String to String and returns an Example.
- RowProcessor() - Constructor for class org.tribuo.data.columnar.RowProcessor
-
For olcut.
- RowProcessor(List<FieldExtractor<?>>, FieldExtractor<Float>, ResponseProcessor<T>, Map<String, FieldProcessor>, Map<String, FieldProcessor>, Set<FeatureProcessor>) - Constructor for class org.tribuo.data.columnar.RowProcessor
-
Deprecated. Prefer RowProcessor.Builder to many-argument constructors.
- RowProcessor(List<FieldExtractor<?>>, FieldExtractor<Float>, ResponseProcessor<T>, Map<String, FieldProcessor>, Map<String, FieldProcessor>, Set<FeatureProcessor>, boolean) - Constructor for class org.tribuo.data.columnar.RowProcessor
-
Deprecated. Prefer RowProcessor.Builder to many-argument constructors.
- RowProcessor(List<FieldExtractor<?>>, FieldExtractor<Float>, ResponseProcessor<T>, Map<String, FieldProcessor>, Set<FeatureProcessor>) - Constructor for class org.tribuo.data.columnar.RowProcessor
-
Deprecated. Prefer RowProcessor.Builder to many-argument constructors.
- RowProcessor(List<FieldExtractor<?>>, ResponseProcessor<T>, Map<String, FieldProcessor>) - Constructor for class org.tribuo.data.columnar.RowProcessor
-
Deprecated. Prefer RowProcessor.Builder to many-argument constructors.
- RowProcessor(ResponseProcessor<T>, Map<String, FieldProcessor>) - Constructor for class org.tribuo.data.columnar.RowProcessor
-
Constructs a RowProcessor using the supplied responseProcessor to extract the response variable, and the supplied fieldProcessorMap to control which fields are parsed and how they are parsed.
- RowProcessor(ResponseProcessor<T>, Map<String, FieldProcessor>, Set<FeatureProcessor>) - Constructor for class org.tribuo.data.columnar.RowProcessor
-
Constructs a RowProcessor using the supplied responseProcessor to extract the response variable, and the supplied fieldProcessorMap to control which fields are parsed and how they are parsed.
- RowProcessor.Builder<T extends Output<T>> - Class in org.tribuo.data.columnar
-
Builder for RowProcessor.
S
- save(Path, Dataset<T>, String) - Method in class org.tribuo.data.csv.CSVSaver
-
Saves the dataset to the specified path.
- save(Path, Dataset<T>, Set<String>) - Method in class org.tribuo.data.csv.CSVSaver
-
Saves the dataset to the specified path.
- saveCSV(CommandInterpreter, String) - Method in class org.tribuo.data.DatasetExplorer
-
Saves out the dataset as a CSV file.
- saveModel(Model<T>) - Method in class org.tribuo.data.DataOptions
-
Saves the model out to the path in DataOptions.outputPath.
- scaleFeatures - Variable in class org.tribuo.data.DataOptions
-
Scales the features to the range 0-1 independently.
- scaleIncZeros - Variable in class org.tribuo.data.DataOptions
-
Includes implicit zeros in the scale range calculation.
- seed - Variable in class org.tribuo.data.DataOptions
-
RNG seed.
- seed - Variable in class org.tribuo.data.text.SplitTextData.TrainTestSplitOptions
-
Seed for the RNG.
- SEMICOLON - Enum constant in enum org.tribuo.data.DataOptions.Delimiter
-
Semicolon separator.
- SEPARATOR - Static variable in class org.tribuo.data.csv.CSVIterator
-
Default separator character.
- SERIALIZED - Enum constant in enum org.tribuo.data.DataOptions.InputFormat
-
Serialized Tribuo datasets.
- SERIALIZED_PROTOBUF - Enum constant in enum org.tribuo.data.DataOptions.InputFormat
-
Protobuf serialized Tribuo datasets.
- setFeatureProcessors(Set<FeatureProcessor>) - Method in class org.tribuo.data.columnar.RowProcessor.Builder
-
The FeatureProcessors to apply to each extracted feature list.
- setFieldName(String) - Method in class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
-
Deprecated.
- setFieldName(String) - Method in class org.tribuo.data.columnar.processors.response.EmptyResponseProcessor
-
Deprecated.
- setFieldName(String) - Method in class org.tribuo.data.columnar.processors.response.FieldResponseProcessor
-
Deprecated.
- setFieldName(String) - Method in class org.tribuo.data.columnar.processors.response.QuartileResponseProcessor
-
Deprecated.
- setFieldName(String) - Method in interface org.tribuo.data.columnar.ResponseProcessor
-
Deprecated. Response processors should be immutable; downstream objects assume that they are. Set the field name this ResponseProcessor uses.
- setFieldProcessors(Iterable<FieldProcessor>) - Method in class org.tribuo.data.columnar.RowProcessor.Builder
-
The FieldProcessors to apply to each row.
- setMetadataExtractors(List<FieldExtractor<?>>) - Method in class org.tribuo.data.columnar.RowProcessor.Builder
-
If set, the supplied FieldExtractors will be run for each example, populating Example.getMetadata().
- setRegexMappingProcessors(Map<String, FieldProcessor>) - Method in class org.tribuo.data.columnar.RowProcessor.Builder
-
A map from strings (interpreted as regular expressions by Pattern.compile(String)) to FieldProcessors such that if a field name matches a regular expression, the corresponding FieldProcessor is used to process it.
- setReplaceNewLinesWithSpaces(boolean) - Method in class org.tribuo.data.columnar.RowProcessor.Builder
-
If true, replaces newlines in fields with spaces before passing them to FieldProcessors.
- setWeightExtractor(FieldExtractor<Float>) - Method in class org.tribuo.data.columnar.RowProcessor.Builder
-
If set, the constructed RowProcessor will pass each extracted float to Example.setWeight(float).
- showOutputStats(CommandInterpreter) - Method in class org.tribuo.data.DatasetExplorer
-
Shows the output statistics.
- showProvenance(CommandInterpreter) - Method in class org.tribuo.data.DatasetExplorer
-
Shows the dataset provenance.
- SimpleFieldExtractor<T> - Class in org.tribuo.data.columnar.extractors
-
Extracts a value from a single field to be placed in an Example's metadata field.
- SimpleFieldExtractor() - Constructor for class org.tribuo.data.columnar.extractors.SimpleFieldExtractor
-
For olcut.
- SimpleFieldExtractor(String) - Constructor for class org.tribuo.data.columnar.extractors.SimpleFieldExtractor
-
Constructs a simple field extractor which reads from the supplied field name and writes out to a metadata field with the same name.
- SimpleFieldExtractor(String, String) - Constructor for class org.tribuo.data.columnar.extractors.SimpleFieldExtractor
-
Constructs a simple field extractor with the supplied field name and metadata field name.
- SimpleStringDataSource<T extends Output<T>> - Class in org.tribuo.data.text.impl
-
A version of SimpleTextDataSource that accepts a List of Strings.
- SimpleStringDataSource(List<String>, OutputFactory<T>, TextFeatureExtractor<T>) - Constructor for class org.tribuo.data.text.impl.SimpleStringDataSource
-
Constructs a simple string data source from the supplied lines.
- SimpleStringDataSource.SimpleStringDataSourceProvenance - Class in org.tribuo.data.text.impl
-
Provenance for SimpleStringDataSource.
- SimpleStringDataSourceProvenance(Map<String, Provenance>) - Constructor for class org.tribuo.data.text.impl.SimpleStringDataSource.SimpleStringDataSourceProvenance
-
Deserialization constructor.
- SimpleTextDataSource<T extends Output<T>> - Class in org.tribuo.data.text.impl
-
A dataset for a simple data format for text classification experiments.
- SimpleTextDataSource() - Constructor for class org.tribuo.data.text.impl.SimpleTextDataSource
-
For olcut.
- SimpleTextDataSource(File, OutputFactory<T>, TextFeatureExtractor<T>) - Constructor for class org.tribuo.data.text.impl.SimpleTextDataSource
-
Constructs a simple text data source by reading lines from the supplied file.
- SimpleTextDataSource(Path, OutputFactory<T>, TextFeatureExtractor<T>) - Constructor for class org.tribuo.data.text.impl.SimpleTextDataSource
-
Constructs a simple text data source by reading lines from the supplied path.
- SimpleTextDataSource(OutputFactory<T>, TextFeatureExtractor<T>) - Constructor for class org.tribuo.data.text.impl.SimpleTextDataSource
-
Constructs a data source without a path.
- SimpleTextDataSource.SimpleTextDataSourceProvenance - Class in org.tribuo.data.text.impl
-
Provenance for SimpleTextDataSource.
- SimpleTextDataSourceProvenance(Map<String, Provenance>) - Constructor for class org.tribuo.data.text.impl.SimpleTextDataSource.SimpleTextDataSourceProvenance
-
Deserialization constructor.
- splitFraction - Variable in class org.tribuo.data.text.SplitTextData.TrainTestSplitOptions
-
Split fraction.
- SplitTextData - Class in org.tribuo.data.text
-
Splits data in our standard text format into training and testing portions.
- SplitTextData() - Constructor for class org.tribuo.data.text.SplitTextData
- SplitTextData.TrainTestSplitOptions - Class in org.tribuo.data.text
-
Command line options.
- SQLDataSource<T extends Output<T>> - Class in org.tribuo.data.sql
-
A DataSource for loading columnar data from a database and applying FieldProcessors to it.
- SQLDataSource(String, SQLDBConfig, OutputFactory<T>, RowProcessor<T>, boolean) - Constructor for class org.tribuo.data.sql.SQLDataSource
-
Constructs a SQLDataSource.
- SQLDataSource.SQLDataSourceProvenance - Class in org.tribuo.data.sql
-
Provenance for SQLDataSource.
- SQLDataSourceProvenance(Map<String, Provenance>) - Constructor for class org.tribuo.data.sql.SQLDataSource.SQLDataSourceProvenance
-
Deserialization constructor.
- SQLDBConfig - Class in org.tribuo.data.sql
-
N.B.
- SQLDBConfig(String, String, String, String, String, Map<String, String>) - Constructor for class org.tribuo.data.sql.SQLDBConfig
-
Constructs a SQL database configuration.
- SQLDBConfig(String, String, String, Map<String, String>) - Constructor for class org.tribuo.data.sql.SQLDBConfig
-
Constructs a SQL database configuration.
- SQLDBConfig(String, Map<String, String>) - Constructor for class org.tribuo.data.sql.SQLDBConfig
-
Constructs a SQL database configuration.
- SQLToCSV - Class in org.tribuo.data.sql
-
Reads an SQL query from the standard input and writes a CSV file containing the results to the standard output.
- SQLToCSV() - Constructor for class org.tribuo.data.sql.SQLToCSV
- SQLToCSV.SQLToCSVOptions - Class in org.tribuo.data.sql
-
Command line options.
- SQLToCSVOptions() - Constructor for class org.tribuo.data.sql.SQLToCSV.SQLToCSVOptions
- startShell() - Method in class org.tribuo.data.DatasetExplorer
-
Start the command shell.
- SUM - Enum constant in enum org.tribuo.data.columnar.processors.feature.UniqueProcessor.UniqueType
-
Add together all the feature values.
- SumAggregator - Class in org.tribuo.data.text.impl
-
A feature aggregator that aggregates occurrence counts across a number of feature lists.
- SumAggregator() - Constructor for class org.tribuo.data.text.impl.SumAggregator
T
- TAB - Enum constant in enum org.tribuo.data.DataOptions.Delimiter
-
Tab separator.
- termCounting - Variable in class org.tribuo.data.DataOptions
-
Use term counts instead of boolean when using the standard text format.
- testingPath - Variable in class org.tribuo.data.DataOptions
-
Path to the testing file.
- testSource - Variable in class org.tribuo.data.CompletelyConfigurableTrainTest.ConfigurableTrainTestOptions
-
Load the testing DataSource from the config file.
- TEXT - Enum constant in enum org.tribuo.data.columnar.FieldProcessor.GeneratedFeatureType
-
Text features.
- TEXT - Enum constant in enum org.tribuo.data.DataOptions.InputFormat
-
Text data in Tribuo's standard format (i.e., each line is "output ## text data").
- TextDataSource<T extends Output<T>> - Class in org.tribuo.data.text
-
A base class for textual data sets.
- TextDataSource() - Constructor for class org.tribuo.data.text.TextDataSource
-
For olcut.
- TextDataSource(File, OutputFactory<T>, TextFeatureExtractor<T>, DocumentPreprocessor...) - Constructor for class org.tribuo.data.text.TextDataSource
-
Creates a text data set by reading it from a file.
- TextDataSource(Path, OutputFactory<T>, TextFeatureExtractor<T>, DocumentPreprocessor...) - Constructor for class org.tribuo.data.text.TextDataSource
-
Creates a text data set by reading it from a path.
- TextFeatureExtractor<T extends Output<T>> - Interface in org.tribuo.data.text
-
An interface for things that take text and turn them into examples that we can use to train or evaluate a classifier.
- TextFeatureExtractorImpl<T extends Output<T>> - Class in org.tribuo.data.text.impl
- TextFeatureExtractorImpl(TextPipeline) - Constructor for class org.tribuo.data.text.impl.TextFeatureExtractorImpl
-
Constructs a text feature extractor wrapping the supplied text pipeline.
- TextFieldProcessor - Class in org.tribuo.data.columnar.processors.field
-
A FieldProcessor which takes a text field and runs a TextPipeline on it to generate features.
- TextFieldProcessor(String, TextPipeline) - Constructor for class org.tribuo.data.columnar.processors.field.TextFieldProcessor
-
Constructs a field processor which uses the supplied text pipeline to process the field value.
- TextPipeline - Interface in org.tribuo.data.text
-
A pipeline that takes a String and returns a List of Features.
- TextProcessingException - Exception in org.tribuo.data.text
-
An exception thrown by the text processing system.
- TextProcessingException(String) - Constructor for exception org.tribuo.data.text.TextProcessingException
-
Creates a TextProcessingException with the specified message.
- TextProcessingException(String, Throwable) - Constructor for exception org.tribuo.data.text.TextProcessingException
-
Creates a TextProcessingException wrapping the supplied throwable with the specified message.
- TextProcessingException(Throwable) - Constructor for exception org.tribuo.data.text.TextProcessingException
-
Creates a TextProcessingException wrapping the supplied throwable.
- TextProcessor - Interface in org.tribuo.data.text
-
A TextProcessor takes some text and optionally a feature tag and generates a list of Features from that text.
- TokenPipeline - Class in org.tribuo.data.text.impl
-
A pipeline for generating ngram features.
- TokenPipeline(Tokenizer, int, boolean) - Constructor for class org.tribuo.data.text.impl.TokenPipeline
-
Creates a new token pipeline.
- TokenPipeline(Tokenizer, int, boolean, int) - Constructor for class org.tribuo.data.text.impl.TokenPipeline
-
Creates a new token pipeline.
- TokenPipeline(Tokenizer, int, boolean, int, boolean) - Constructor for class org.tribuo.data.text.impl.TokenPipeline
-
Creates a new token pipeline.
- toString() - Method in class org.tribuo.data.columnar.ColumnarIterator.Row
- toString() - Method in class org.tribuo.data.columnar.extractors.DateExtractor
- toString() - Method in class org.tribuo.data.columnar.extractors.OffsetDateTimeExtractor
- toString() - Method in class org.tribuo.data.columnar.extractors.SimpleFieldExtractor
- toString() - Method in class org.tribuo.data.columnar.processors.field.DateFieldProcessor
- toString() - Method in class org.tribuo.data.columnar.processors.field.DoubleFieldProcessor
- toString() - Method in class org.tribuo.data.columnar.processors.field.IdentityProcessor
- toString() - Method in class org.tribuo.data.columnar.processors.field.RegexFieldProcessor
- toString() - Method in class org.tribuo.data.columnar.processors.field.TextFieldProcessor
- toString() - Method in class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
- toString() - Method in class org.tribuo.data.columnar.processors.response.EmptyResponseProcessor
- toString() - Method in class org.tribuo.data.columnar.processors.response.FieldResponseProcessor
- toString() - Method in class org.tribuo.data.columnar.processors.response.Quartile
- toString() - Method in class org.tribuo.data.columnar.processors.response.QuartileResponseProcessor
- toString() - Method in class org.tribuo.data.columnar.RowProcessor
- toString() - Method in class org.tribuo.data.csv.CSVDataSource
- toString() - Method in class org.tribuo.data.csv.CSVLoader.CSVLoaderProvenance
-
Deprecated.
- toString() - Method in class org.tribuo.data.sql.SQLDataSource
- toString() - Method in class org.tribuo.data.sql.SQLDBConfig
- toString() - Method in class org.tribuo.data.text.DirectoryFileSource
- toString() - Method in class org.tribuo.data.text.impl.BasicPipeline
- toString() - Method in class org.tribuo.data.text.impl.SimpleStringDataSource
- toString() - Method in class org.tribuo.data.text.impl.TextFeatureExtractorImpl
- toString() - Method in class org.tribuo.data.text.impl.TokenPipeline
- toString() - Method in class org.tribuo.data.text.TextDataSource
- trainer - Variable in class org.tribuo.data.CompletelyConfigurableTrainTest.ConfigurableTrainTestOptions
-
Load a trainer from the config file.
- trainer - Variable in class org.tribuo.data.ConfigurableTrainTest.ConfigurableTrainTestOptions
-
Load a trainer from the config file.
- trainingPath - Variable in class org.tribuo.data.DataOptions
-
Path to the training file.
- trainPath - Variable in class org.tribuo.data.text.SplitTextData.TrainTestSplitOptions
-
Output training data file.
- trainSource - Variable in class org.tribuo.data.CompletelyConfigurableTrainTest.ConfigurableTrainTestOptions
-
Load the training DataSource from the config file.
- TrainTestSplitOptions() - Constructor for class org.tribuo.data.text.SplitTextData.TrainTestSplitOptions
- transformationMap - Variable in class org.tribuo.data.CompletelyConfigurableTrainTest.ConfigurableTrainTestOptions
-
Load a transformation map from the config file.
- transformationMap - Variable in class org.tribuo.data.ConfigurableTrainTest.ConfigurableTrainTestOptions
-
Load a transformation map from the config file.
- tryAdvance(Consumer<? super ColumnarIterator.Row>) - Method in class org.tribuo.data.columnar.ColumnarIterator
U
- UniqueAggregator - Class in org.tribuo.data.text.impl
-
Aggregates feature tokens, generating unique features.
- UniqueAggregator() - Constructor for class org.tribuo.data.text.impl.UniqueAggregator
-
Constructs an aggregator that replaces all features with the same name with a single feature with the last observed value of that feature.
- UniqueAggregator(double) - Constructor for class org.tribuo.data.text.impl.UniqueAggregator
-
Constructs an aggregator that replaces all features with the same name with a single feature with the specified value.
- UniqueProcessor - Class in org.tribuo.data.columnar.processors.feature
-
Processes a feature list, aggregating all the feature values with the same name.
- UniqueProcessor(UniqueProcessor.UniqueType) - Constructor for class org.tribuo.data.columnar.processors.feature.UniqueProcessor
-
Creates a UniqueProcessor using the specified reduction operation.
- UniqueProcessor.UniqueType - Enum in org.tribuo.data.columnar.processors.feature
-
The type of reduction operation to perform.
- UPPERCASE - Enum constant in enum org.tribuo.data.text.impl.CasingPreprocessor.CasingOperation
-
Uppercase the input text.
- username - Variable in class org.tribuo.data.sql.SQLToCSV.SQLToCSVOptions
-
Username for the SQL database.
V
- validationPath - Variable in class org.tribuo.data.text.SplitTextData.TrainTestSplitOptions
-
Output validation data file.
- value - Variable in enum org.tribuo.data.DataOptions.Delimiter
-
The delimiter character.
- valueOf(String) - Static method in enum org.tribuo.data.columnar.FieldProcessor.GeneratedFeatureType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.tribuo.data.columnar.processors.feature.UniqueProcessor.UniqueType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.tribuo.data.columnar.processors.field.RegexFieldProcessor.Mode
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.tribuo.data.DataOptions.Delimiter
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.tribuo.data.DataOptions.InputFormat
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.tribuo.data.text.impl.CasingPreprocessor.CasingOperation
-
Returns the enum constant of this type with the specified name.
- values() - Static method in enum org.tribuo.data.columnar.FieldProcessor.GeneratedFeatureType
-
Returns an array containing the constants of this enum type, in the order they are declared.
- values() - Static method in enum org.tribuo.data.columnar.processors.feature.UniqueProcessor.UniqueType
-
Returns an array containing the constants of this enum type, in the order they are declared.
- values() - Static method in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
-
Returns an array containing the constants of this enum type, in the order they are declared.
- values() - Static method in enum org.tribuo.data.columnar.processors.field.RegexFieldProcessor.Mode
-
Returns an array containing the constants of this enum type, in the order they are declared.
- values() - Static method in enum org.tribuo.data.DataOptions.Delimiter
-
Returns an array containing the constants of this enum type, in the order they are declared.
- values() - Static method in enum org.tribuo.data.DataOptions.InputFormat
-
Returns an array containing the constants of this enum type, in the order they are declared.
- values() - Static method in enum org.tribuo.data.text.impl.CasingPreprocessor.CasingOperation
-
Returns an array containing the constants of this enum type, in the order they are declared.
W
- WEEK_OF_MONTH - Enum constant in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
-
The week of the month, as defined by ISO 8601 semantics for week of the year.
- WEEK_OF_YEAR - Enum constant in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
-
The week of the year in ISO 8601.
- weightExtractor - Variable in class org.tribuo.data.columnar.RowProcessor
-
The extractor for the example weight.
- wrapFeatures(String, List<Feature>) - Static method in class org.tribuo.data.columnar.processors.field.TextFieldProcessor
-
Convert the Features from a text pipeline into ColumnarFeatures with the right field name.
Y
- YEAR - Enum constant in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
-
The year.
When using regexMappingProcessors, RowProcessor is stateful in a way that can sometimes make it fail the second time it is used. Concretely:
    RowProcessor rp;
    Dataset ds1 = new MutableDataset(new CSVDataSource(csvfile1, rp));
    Dataset ds2 = new MutableDataset(new CSVDataSource(csvfile2, rp)); // this may fail due to state in rp
This method returns a RowProcessor with clean state and the same configuration as this row processor.
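The note about regex mappings making a row processor stateful can be illustrated with a small toy model. The sketch below uses only the Java standard library. ToyRowProcessor, expandRegexMappings, boundFields and the "double" processor name are hypothetical names invented for this illustration; they are not Tribuo's classes and do not reproduce its implementation. The assumption is simply that binding regexes to concrete field names on first use is what leaves state behind, and that a copy with the same configuration starts clean.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical stand-in for a row processor; NOT Tribuo's RowProcessor. */
final class ToyRowProcessor {
    // Configuration: regex -> processor name. Never mutated after construction.
    private final Map<String, String> regexMappings;
    // State: concrete field name -> processor name, filled in on first use.
    private final Map<String, String> expanded = new LinkedHashMap<>();
    private boolean configured = false;

    ToyRowProcessor(Map<String, String> regexMappings) {
        this.regexMappings = new LinkedHashMap<>(regexMappings);
    }

    /** Binds each regex to the headers it matches; allowed once per instance. */
    void expandRegexMappings(List<String> headers) {
        if (configured) {
            throw new IllegalStateException(
                "already configured for fields " + expanded.keySet());
        }
        for (Map.Entry<String, String> e : regexMappings.entrySet()) {
            Pattern p = Pattern.compile(e.getKey());
            for (String header : headers) {
                if (p.matcher(header).matches()) {
                    expanded.put(header, e.getValue());
                }
            }
        }
        configured = true;
    }

    /** The field names bound so far, in binding order. */
    List<String> boundFields() {
        return new ArrayList<>(expanded.keySet());
    }

    /** Same configuration, clean state. */
    ToyRowProcessor copy() {
        return new ToyRowProcessor(regexMappings);
    }
}

final class ToyRowProcessorDemo {
    public static void main(String[] args) {
        ToyRowProcessor rp = new ToyRowProcessor(Map.of("feat_.*", "double"));
        rp.expandRegexMappings(List.of("feat_a", "feat_b", "label"));

        // Reusing rp for a second source would throw in this toy model,
        // so the second source gets a clean copy instead.
        ToyRowProcessor rp2 = rp.copy();
        rp2.expandRegexMappings(List.of("feat_x", "label"));

        System.out.println(rp.boundFields() + " " + rp2.boundFields());
    }
}
```

In the toy model the configuration map is kept separate from the expanded bindings, which is what makes a clean copy cheap: only the configuration is carried over.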