Index


A

addFeatureProcessor(FeatureProcessor) - Method in class org.tribuo.data.columnar.RowProcessor.Builder
Add a single feature processor to the builder.
addFieldProcessor(FieldProcessor) - Method in class org.tribuo.data.columnar.RowProcessor.Builder
Add a single field processor to the builder.
addMetadataExtractor(FieldExtractor<?>) - Method in class org.tribuo.data.columnar.RowProcessor.Builder
Add a single metadata extractor to the builder.
addRegexMappingProcessor(String, FieldProcessor) - Method in class org.tribuo.data.columnar.RowProcessor.Builder
Add a single regex FieldProcessor mapping to the builder.
aggregate(List<Feature>) - Method in interface org.tribuo.data.text.FeatureAggregator
Aggregates feature values with the same names.
aggregate(List<Feature>) - Method in class org.tribuo.data.text.impl.AverageAggregator
 
aggregate(List<Feature>) - Method in class org.tribuo.data.text.impl.SumAggregator
 
aggregate(List<Feature>) - Method in class org.tribuo.data.text.impl.UniqueAggregator
 
applyCase(String) - Method in enum org.tribuo.data.text.impl.CasingPreprocessor.CasingOperation
Apply the appropriate casing operation.
AverageAggregator - Class in org.tribuo.data.text.impl
A feature aggregator that averages feature values across a feature list.
AverageAggregator() - Constructor for class org.tribuo.data.text.impl.AverageAggregator
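
The aggregate(List<Feature>) entries above combine feature values that share a name. The following is a minimal sketch of that idea in plain Java. It assumes a hypothetical NamedValue class in place of Tribuo's Feature, and it is not the SumAggregator or AverageAggregator implementation.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class AggregationSketch {
    // Hypothetical stand-in for a named feature; not Tribuo's Feature class.
    static final class NamedValue {
        final String name;
        final double value;
        NamedValue(String name, double value) {
            this.name = name;
            this.value = value;
        }
    }

    // Sums the values of entries that share a name, keeping first-seen order.
    static Map<String, Double> sumByName(List<NamedValue> features) {
        Map<String, Double> sums = new LinkedHashMap<>();
        for (NamedValue f : features) {
            sums.merge(f.name, f.value, Double::sum);
        }
        return sums;
    }

    // Averages the values of entries that share a name.
    static Map<String, Double> averageByName(List<NamedValue> features) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (NamedValue f : features) {
            counts.merge(f.name, 1, Integer::sum);
        }
        Map<String, Double> averages = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : sumByName(features).entrySet()) {
            averages.put(e.getKey(), e.getValue() / counts.get(e.getKey()));
        }
        return averages;
    }

    public static void main(String[] args) {
        List<NamedValue> fs = Arrays.asList(
            new NamedValue("a", 1.0), new NamedValue("b", 4.0), new NamedValue("a", 3.0));
        System.out.println(sumByName(fs));     // prints {a=4.0, b=4.0}
        System.out.println(averageByName(fs)); // prints {a=2.0, b=4.0}
    }
}
```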
 

B

BasicPipeline - Class in org.tribuo.data.text.impl
An example implementation of TextPipeline.
BasicPipeline(Tokenizer, int) - Constructor for class org.tribuo.data.text.impl.BasicPipeline
Constructs a basic text pipeline which tokenizes the input and generates word n-gram features in the range 1 to ngram.
BINARISED_CATEGORICAL - Enum constant in enum org.tribuo.data.columnar.FieldProcessor.GeneratedFeatureType
Categoricals binarised into separate features.
BinaryResponseProcessor<T extends Output<T>> - Class in org.tribuo.data.columnar.processors.response
A ResponseProcessor that takes a single value of the field as the positive class and all other values as the negative class.
BinaryResponseProcessor(String, String, OutputFactory<T>) - Constructor for class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
Constructs a binary response processor which emits a positive value for a single string and a negative value for all other field values.
BinaryResponseProcessor(List<String>, String, OutputFactory<T>) - Constructor for class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
Constructs a binary response processor which emits a positive value for a single string and a negative value for all other field values.
BinaryResponseProcessor(List<String>, List<String>, OutputFactory<T>) - Constructor for class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
Constructs a binary response processor which emits a positive value for a single string and a negative value for all other field values.
BinaryResponseProcessor(List<String>, List<String>, OutputFactory<T>, boolean) - Constructor for class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
Constructs a binary response processor which emits a positive value for a single string and a negative value for all other field values.
BinaryResponseProcessor(List<String>, List<String>, OutputFactory<T>, String, String, boolean) - Constructor for class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
Constructs a binary response processor which emits a positive value for a single string and a negative value for all other field values.
build(ResponseProcessor<T>) - Method in class org.tribuo.data.columnar.RowProcessor.Builder
Construct the RowProcessor represented by this builder's state.
Builder() - Constructor for class org.tribuo.data.columnar.RowProcessor.Builder
Builder for RowProcessor, see RowProcessor constructors for argument details.
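
The BinaryResponseProcessor entries above describe mapping one field value to the positive class and every other value to the negative class. A minimal sketch of that mapping in plain Java follows. The method name, parameter names and the "1"/"0" labels are illustrative assumptions, not part of the Tribuo API.

```java
class BinaryResponseSketch {
    // Returns positiveLabel when the field holds the designated positive value,
    // and negativeLabel for any other value (including a missing field).
    static String toBinaryLabel(String fieldValue, String positiveValue,
                                String positiveLabel, String negativeLabel) {
        return positiveValue.equals(fieldValue) ? positiveLabel : negativeLabel;
    }

    public static void main(String[] args) {
        System.out.println(toBinaryLabel("spam", "spam", "1", "0")); // prints 1
        System.out.println(toBinaryLabel("ham", "spam", "1", "0"));  // prints 0
    }
}
```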

C

cacheProvenance() - Method in class org.tribuo.data.text.impl.SimpleStringDataSource
 
cacheProvenance() - Method in class org.tribuo.data.text.impl.SimpleTextDataSource
Computes the provenance.
CALENDAR_QUARTER - Enum constant in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
The calendar quarter of the year.
CasingPreprocessor - Class in org.tribuo.data.text.impl
A document preprocessor which uppercases or lowercases the input.
CasingPreprocessor(CasingPreprocessor.CasingOperation) - Constructor for class org.tribuo.data.text.impl.CasingPreprocessor
Construct a casing preprocessor.
CasingPreprocessor.CasingOperation - Enum in org.tribuo.data.text.impl
The possible casing operations.
CATEGORICAL - Enum constant in enum org.tribuo.data.columnar.FieldProcessor.GeneratedFeatureType
Unordered categorical features with the values converted into doubles.
close() - Method in class org.tribuo.data.csv.CSVIterator
 
close() - Method in class org.tribuo.data.sql.SQLDataSource
 
COLUMNAR - Enum constant in enum org.tribuo.data.DataOptions.InputFormat
A CSV file parsed using a configured RowProcessor.
ColumnarDataSource<T extends Output<T>> - Class in org.tribuo.data.columnar
A ConfigurableDataSource base class which takes columnar data (e.g., csv or DB table rows) and generates Examples.
ColumnarDataSource() - Constructor for class org.tribuo.data.columnar.ColumnarDataSource
For OLCUT.
ColumnarDataSource(OutputFactory<T>, RowProcessor<T>, boolean) - Constructor for class org.tribuo.data.columnar.ColumnarDataSource
Constructs a columnar data source with the specified parameters.
ColumnarFeature - Class in org.tribuo.data.columnar
A Feature with extra bookkeeping for use inside the columnar package.
ColumnarFeature(String, double) - Constructor for class org.tribuo.data.columnar.ColumnarFeature
Constructs a ColumnarFeature from the field name.
ColumnarFeature(String, String, double) - Constructor for class org.tribuo.data.columnar.ColumnarFeature
Constructs a ColumnarFeature from the field name, column entry and value.
ColumnarFeature(String, String, String, double) - Constructor for class org.tribuo.data.columnar.ColumnarFeature
Constructs a ColumnarFeature which is the conjunction of features from two fields.
ColumnarIterator - Class in org.tribuo.data.columnar
An abstract class for iterators that read data into a columnar format, usually from a file of some kind.
ColumnarIterator() - Constructor for class org.tribuo.data.columnar.ColumnarIterator
Constructs a ColumnarIterator wrapped around a buffering spliterator.
ColumnarIterator(int, int, long) - Constructor for class org.tribuo.data.columnar.ColumnarIterator
Constructs a ColumnarIterator wrapped around a buffering spliterator.
ColumnarIterator.Row - Class in org.tribuo.data.columnar
A representation of a row of untyped data from a columnar data source.
COMMA - Enum constant in enum org.tribuo.data.DataOptions.Delimiter
Comma separator.
CompletelyConfigurableTrainTest - Class in org.tribuo.data
Build and run a predictor for a standard dataset.
CompletelyConfigurableTrainTest.ConfigurableTrainTestOptions - Class in org.tribuo.data
Command line options.
ConfigurableTrainTest - Class in org.tribuo.data
Build and run a predictor for a standard dataset.
ConfigurableTrainTest.ConfigurableTrainTestOptions - Class in org.tribuo.data
Command line options.
ConfigurableTrainTestOptions() - Constructor for class org.tribuo.data.CompletelyConfigurableTrainTest.ConfigurableTrainTestOptions
 
ConfigurableTrainTestOptions() - Constructor for class org.tribuo.data.ConfigurableTrainTest.ConfigurableTrainTestOptions
 
configured - Variable in class org.tribuo.data.columnar.RowProcessor
Has this row processor been configured?
CONJUNCTION - Static variable in class org.tribuo.data.columnar.ColumnarFeature
The string used as the field name of conjunction features.
connString - Variable in class org.tribuo.data.sql.SQLToCSV.SQLToCSVOptions
Connection string to the SQL database.
copy() - Method in class org.tribuo.data.columnar.RowProcessor
Deprecated.
This API will change in a future release. In the meantime, this is the correct way to get a row processor with clean state.

When using regexMappingProcessors, RowProcessor is stateful in a way that can sometimes make it fail the second time it is used. Concretely:

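     RowProcessor<T> rp; // configured elsewhere
     // The boolean is the third constructor argument listed for CSVDataSource in this index.
     Dataset<T> ds1 = new MutableDataset<>(new CSVDataSource<>(csvfile1, rp, true));
     Dataset<T> ds2 = new MutableDataset<>(new CSVDataSource<>(csvfile2, rp, true)); // this may fail due to state in rp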
 
This method returns a RowProcessor with clean state and the same configuration as this row processor.
copy(String) - Method in interface org.tribuo.data.columnar.FieldProcessor
Returns a copy of this FieldProcessor bound to the supplied newFieldName.
copy(String) - Method in class org.tribuo.data.columnar.processors.field.DateFieldProcessor
 
copy(String) - Method in class org.tribuo.data.columnar.processors.field.DoubleFieldProcessor
 
copy(String) - Method in class org.tribuo.data.columnar.processors.field.IdentityProcessor
 
copy(String) - Method in class org.tribuo.data.columnar.processors.field.RegexFieldProcessor
 
copy(String) - Method in class org.tribuo.data.columnar.processors.field.TextFieldProcessor
Note: the copy shares the text pipeline with the original.
crossValidation - Variable in class org.tribuo.data.ConfigurableTrainTest.ConfigurableTrainTestOptions
Cross-validate the output metrics.
CSV - Enum constant in enum org.tribuo.data.DataOptions.InputFormat
Simple numeric CSV file.
CSVDataSource<T extends Output<T>> - Class in org.tribuo.data.csv
A DataSource for loading separable data from a text file (e.g., CSV, TSV) and applying FieldProcessors to it.
CSVDataSource(URI, RowProcessor<T>, boolean) - Constructor for class org.tribuo.data.csv.CSVDataSource
Creates a CSVDataSource using the specified RowProcessor to process the data.
CSVDataSource(URI, RowProcessor<T>, boolean, char) - Constructor for class org.tribuo.data.csv.CSVDataSource
Creates a CSVDataSource using the specified RowProcessor to process the data.
CSVDataSource(URI, RowProcessor<T>, boolean, char, char) - Constructor for class org.tribuo.data.csv.CSVDataSource
Creates a CSVDataSource using the specified RowProcessor to process the data, and the supplied separator and quote characters to read the input data file.
CSVDataSource(URI, RowProcessor<T>, boolean, char, char, List<String>) - Constructor for class org.tribuo.data.csv.CSVDataSource
Creates a CSVDataSource using the specified RowProcessor to process the data, and the supplied separator and quote characters to read the input data file.
CSVDataSource(Path, RowProcessor<T>, boolean) - Constructor for class org.tribuo.data.csv.CSVDataSource
Creates a CSVDataSource using the specified RowProcessor to process the data.
CSVDataSource(Path, RowProcessor<T>, boolean, char) - Constructor for class org.tribuo.data.csv.CSVDataSource
Creates a CSVDataSource using the specified RowProcessor to process the data.
CSVDataSource(Path, RowProcessor<T>, boolean, char, char) - Constructor for class org.tribuo.data.csv.CSVDataSource
Creates a CSVDataSource using the specified RowProcessor to process the data, and the supplied separator and quote characters to read the input data file.
CSVDataSource(Path, RowProcessor<T>, boolean, char, char, List<String>) - Constructor for class org.tribuo.data.csv.CSVDataSource
Creates a CSVDataSource using the specified RowProcessor to process the data, and the supplied separator and quote characters to read the input data file.
CSVDataSource.CSVDataSourceProvenance - Class in org.tribuo.data.csv
Provenance for CSVDataSource.
CSVDataSourceProvenance(Map<String, Provenance>) - Constructor for class org.tribuo.data.csv.CSVDataSource.CSVDataSourceProvenance
Deserialization constructor.
CSVIterator - Class in org.tribuo.data.csv
An iterator over a CSV file.
CSVIterator(Reader) - Constructor for class org.tribuo.data.csv.CSVIterator
Builds a CSVIterator for the supplied Reader.
CSVIterator(Reader, char, char) - Constructor for class org.tribuo.data.csv.CSVIterator
Builds a CSVIterator for the supplied Reader.
CSVIterator(Reader, char, char, String[]) - Constructor for class org.tribuo.data.csv.CSVIterator
Builds a CSVIterator for the supplied Reader.
CSVIterator(Reader, char, char, List<String>) - Constructor for class org.tribuo.data.csv.CSVIterator
Builds a CSVIterator for the supplied Reader.
CSVIterator(URI) - Constructor for class org.tribuo.data.csv.CSVIterator
Builds a CSVIterator for the supplied URI.
CSVIterator(URI, char, char) - Constructor for class org.tribuo.data.csv.CSVIterator
Builds a CSVIterator for the supplied URI.
CSVIterator(URI, char, char, String[]) - Constructor for class org.tribuo.data.csv.CSVIterator
Builds a CSVIterator for the supplied URI.
CSVIterator(URI, char, char, List<String>) - Constructor for class org.tribuo.data.csv.CSVIterator
Builds a CSVIterator for the supplied URI.
CSVLoader<T extends Output<T>> - Class in org.tribuo.data.csv
Load a DataSource/Dataset from a CSV file.
CSVLoader(char, char, OutputFactory<T>) - Constructor for class org.tribuo.data.csv.CSVLoader
Creates a CSVLoader using the supplied separator, quote and output factory.
CSVLoader(char, OutputFactory<T>) - Constructor for class org.tribuo.data.csv.CSVLoader
Creates a CSVLoader using the supplied separator and output factory.
CSVLoader(OutputFactory<T>) - Constructor for class org.tribuo.data.csv.CSVLoader
Creates a CSVLoader using the supplied output factory.
CSVLoader.CSVLoaderProvenance - Class in org.tribuo.data.csv
Deprecated.
Deprecated in 4.2 as CSVLoader now returns a CSVDataSource. This provenance is kept so older models can still load correctly.
CSVLoaderProvenance(Map<String, Provenance>) - Constructor for class org.tribuo.data.csv.CSVLoader.CSVLoaderProvenance
Deprecated.
Deserialization constructor.
csvQuoteChar - Variable in class org.tribuo.data.DataOptions
Quote character in the CSV file.
csvResponseName - Variable in class org.tribuo.data.DataOptions
Response name in the CSV file.
CSVSaver - Class in org.tribuo.data.csv
Saves a Dataset in CSV format suitable for loading by CSVLoader.
CSVSaver() - Constructor for class org.tribuo.data.csv.CSVSaver
Builds a CSV saver using the default separator and quote from CSVIterator.
CSVSaver(char, char) - Constructor for class org.tribuo.data.csv.CSVSaver
Builds a CSV saver using the supplied separator and quote.
currentRow - Variable in class org.tribuo.data.columnar.ColumnarIterator
The current row.
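
Several CSVDataSource, CSVIterator and CSVLoader constructors above take a separator character and a quote character. The sketch below shows, under simplifying assumptions, what those two characters mean when splitting a single line. It is a hand-written illustration, not how CSVIterator reads files, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class DelimitedLineSketch {
    // Splits one line on `separator`; text between `quote` characters is taken
    // literally, and a doubled quote inside a quoted field yields one quote.
    static List<String> split(String line, char separator, char quote) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c != quote) {
                    current.append(c);
                } else if (i + 1 < line.length() && line.charAt(i + 1) == quote) {
                    current.append(quote); // escaped quote
                    i++;
                } else {
                    inQuotes = false;      // closing quote
                }
            } else if (c == quote) {
                inQuotes = true;           // opening quote
            } else if (c == separator) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(split("a,\"b,c\",d", ',', '"')); // prints [a, b,c, d]
        System.out.println(split("x\ty", '\t', '"'));       // prints [x, y]
    }
}
```

The sketch handles one line at a time, so quoted fields that span line breaks are out of scope.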

D

data - Variable in class org.tribuo.data.text.TextDataSource
The actual data read out of the text file.
DataOptions - Class in org.tribuo.data
Options for working with training and test data in a CLI.
DataOptions() - Constructor for class org.tribuo.data.DataOptions
 
DataOptions.Delimiter - Enum in org.tribuo.data
The delimiters supported by CSV files in this options object.
DataOptions.InputFormat - Enum in org.tribuo.data
The input formats supported by this options object.
DatasetExplorer - Class in org.tribuo.data
A CLI for exploring a serialised Dataset.
DatasetExplorer() - Constructor for class org.tribuo.data.DatasetExplorer
Constructs a dataset explorer.
DatasetExplorer.DatasetExplorerOptions - Class in org.tribuo.data
Command line options.
DatasetExplorerOptions() - Constructor for class org.tribuo.data.DatasetExplorer.DatasetExplorerOptions
 
dataSource - Variable in class org.tribuo.data.PreprocessAndSerialize.PreprocessAndSerializeOptions
Datasource to load from a config file.
DateExtractor - Class in org.tribuo.data.columnar.extractors
Extracts the field value and translates it to a LocalDate based on the specified DateTimeFormatter.
DateExtractor(String, String, String) - Constructor for class org.tribuo.data.columnar.extractors.DateExtractor
Constructs a date extractor that emits a LocalDate by applying the supplied format to the specified field.
DateExtractor(String, String, String, String, String) - Constructor for class org.tribuo.data.columnar.extractors.DateExtractor
Constructs a date extractor that emits a LocalDate by applying the supplied format to the specified field.
DateExtractor(String, String, DateTimeFormatter) - Constructor for class org.tribuo.data.columnar.extractors.DateExtractor
Deprecated.
DateFieldProcessor - Class in org.tribuo.data.columnar.processors.field
Processes a column that contains a date value.
DateFieldProcessor(String, EnumSet<DateFieldProcessor.DateFeatureType>, String) - Constructor for class org.tribuo.data.columnar.processors.field.DateFieldProcessor
Constructs a field processor which parses a date from the specified field name using the supplied format string then extracts date features according to the supplied EnumSet.
DateFieldProcessor(String, EnumSet<DateFieldProcessor.DateFeatureType>, String, String, String) - Constructor for class org.tribuo.data.columnar.processors.field.DateFieldProcessor
Constructs a field processor which parses a date from the specified field name using the supplied format string then extracts date features according to the supplied EnumSet.
DateFieldProcessor.DateFeatureType - Enum in org.tribuo.data.columnar.processors.field
The types of date features which can be extracted.
DAY - Enum constant in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
The day.
DAY_OF_QUARTER - Enum constant in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
The day of the quarter.
DAY_OF_WEEK - Enum constant in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
The day of the week in ISO 8601.
DAY_OF_YEAR - Enum constant in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
The day of the year.
dbConfig - Variable in class org.tribuo.data.sql.SQLToCSV.SQLToCSVOptions
Name of the DBConfig to use.
DEFAULT_HASH_SEED - Static variable in class org.tribuo.data.text.impl.FeatureHasher
Default value for the hash function seed.
DEFAULT_RESPONSE - Static variable in class org.tribuo.data.csv.CSVSaver
The default response column name.
DEFAULT_VALUE_HASH_SEED - Static variable in class org.tribuo.data.text.impl.FeatureHasher
Default value for the value hash function seed.
delimiter - Variable in class org.tribuo.data.DataOptions
Delimiter
DirectoryFileSource<T extends Output<T>> - Class in org.tribuo.data.text
A data source for a somewhat-common format for text classification datasets: a top level directory that contains a number of subdirectories.
DirectoryFileSource() - Constructor for class org.tribuo.data.text.DirectoryFileSource
For OLCUT.
DirectoryFileSource(Path, OutputFactory<T>, TextFeatureExtractor<T>, DocumentPreprocessor...) - Constructor for class org.tribuo.data.text.DirectoryFileSource
Creates a data source that will use the given feature extractor and document preprocessors on the data read from the files in the directories representing classes.
DirectoryFileSource.DirectoryFileSourceProvenance - Class in org.tribuo.data.text
Provenance for DirectoryFileSource.
DirectoryFileSourceProvenance(Map<String, Provenance>) - Constructor for class org.tribuo.data.text.DirectoryFileSource.DirectoryFileSourceProvenance
Deserialization constructor.
DocumentPreprocessor - Interface in org.tribuo.data.text
An interface for things that can pre-process documents before they are broken into features.
DoubleExtractor - Class in org.tribuo.data.columnar.extractors
Extracts the field value and converts it to a double.
DoubleExtractor(String) - Constructor for class org.tribuo.data.columnar.extractors.DoubleExtractor
Extracts a double value from the supplied field name.
DoubleExtractor(String, String) - Constructor for class org.tribuo.data.columnar.extractors.DoubleExtractor
Extracts a double value from the supplied field name.
DoubleFieldProcessor - Class in org.tribuo.data.columnar.processors.field
Processes a column that contains a real value.
DoubleFieldProcessor(String) - Constructor for class org.tribuo.data.columnar.processors.field.DoubleFieldProcessor
Constructs a field processor which extracts a single double valued feature from the specified field name.
DoubleFieldProcessor(String, boolean) - Constructor for class org.tribuo.data.columnar.processors.field.DoubleFieldProcessor
Constructs a field processor which extracts a single double valued feature from the specified field name.
DoubleFieldProcessor(String, boolean, boolean) - Constructor for class org.tribuo.data.columnar.processors.field.DoubleFieldProcessor
Constructs a field processor which extracts a single double valued feature from the specified field name.
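
The DateFieldProcessor.DateFeatureType constants above (DAY_OF_WEEK, DAY_OF_YEAR, CALENDAR_QUARTER, DAY_OF_QUARTER) name quantities that java.time can compute directly. The sketch below parses a date with a format string and reads those values using only the JDK. The class and method names are hypothetical, and how Tribuo turns such values into features is not shown in this index.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.IsoFields;

class DateFeatureSketch {
    // Parses a date field using the supplied format string.
    static LocalDate parse(String raw, String pattern) {
        return LocalDate.parse(raw, DateTimeFormatter.ofPattern(pattern));
    }

    // ISO 8601 day of week: Monday = 1 through Sunday = 7.
    static int dayOfWeek(LocalDate d) {
        return d.getDayOfWeek().getValue();
    }

    static int dayOfYear(LocalDate d) {
        return d.getDayOfYear();
    }

    // Calendar quarter, 1 through 4.
    static int calendarQuarter(LocalDate d) {
        return d.get(IsoFields.QUARTER_OF_YEAR);
    }

    // Day within the quarter, starting at 1.
    static int dayOfQuarter(LocalDate d) {
        return d.get(IsoFields.DAY_OF_QUARTER);
    }

    public static void main(String[] args) {
        LocalDate d = parse("2024-02-29", "yyyy-MM-dd");
        System.out.println(dayOfWeek(d));       // prints 4 (Thursday)
        System.out.println(dayOfYear(d));       // prints 60
        System.out.println(calendarQuarter(d)); // prints 1
        System.out.println(dayOfQuarter(d));    // prints 60
    }
}
```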

E

EmptyResponseProcessor<T extends Output<T>> - Class in org.tribuo.data.columnar.processors.response
A ResponseProcessor that always emits an empty optional.
EmptyResponseProcessor(OutputFactory<T>) - Constructor for class org.tribuo.data.columnar.processors.response.EmptyResponseProcessor
Constructs a response processor which never emits a response.
equals(Object) - Method in class org.tribuo.data.csv.CSVDataSource.CSVDataSourceProvenance
 
equals(Object) - Method in class org.tribuo.data.csv.CSVLoader.CSVLoaderProvenance
Deprecated.
 
equals(Object) - Method in class org.tribuo.data.sql.SQLDataSource.SQLDataSourceProvenance
 
equals(Object) - Method in class org.tribuo.data.text.DirectoryFileSource.DirectoryFileSourceProvenance
 
equals(Object) - Method in class org.tribuo.data.text.impl.SimpleStringDataSource.SimpleStringDataSourceProvenance
 
equals(Object) - Method in class org.tribuo.data.text.impl.SimpleTextDataSource.SimpleTextDataSourceProvenance
 
EVEN_OR_ODD_DAY - Enum constant in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
The parity of the day of the year.
EVEN_OR_ODD_MONTH - Enum constant in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
The parity of the month.
EVEN_OR_ODD_WEEK - Enum constant in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
The parity of the week of the year as defined by ISO 8601.
EVEN_OR_ODD_YEAR - Enum constant in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
The parity of the year.
expandRegexMapping(Collection<String>) - Method in class org.tribuo.data.columnar.RowProcessor
Uses similar logic to TransformationMap.validateTransformations(org.tribuo.FeatureMap) to check the regexes against the supplied list of field names.
expandRegexMapping(ImmutableFeatureMap) - Method in class org.tribuo.data.columnar.RowProcessor
Uses similar logic to TransformationMap.validateTransformations(org.tribuo.FeatureMap) to check the regexes against the supplied feature map.
expandRegexMapping(Model<T>) - Method in class org.tribuo.data.columnar.RowProcessor
Uses similar logic to TransformationMap.validateTransformations(org.tribuo.FeatureMap) to check the regexes against the ImmutableFeatureMap contained in the supplied Model.
extract(LocalDate) - Method in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
Applies this enum's extraction function to the supplied date.
extract(ColumnarIterator.Row) - Method in class org.tribuo.data.columnar.extractors.IndexExtractor
 
extract(ColumnarIterator.Row) - Method in class org.tribuo.data.columnar.extractors.SimpleFieldExtractor
 
extract(ColumnarIterator.Row) - Method in interface org.tribuo.data.columnar.FieldExtractor
Returns an Optional which is filled if extraction succeeded.
extract(T, String) - Method in class org.tribuo.data.text.impl.TextFeatureExtractorImpl
 
extract(T, String) - Method in interface org.tribuo.data.text.TextFeatureExtractor
Extracts an example from the supplied input text and output object.
extractField(String) - Method in class org.tribuo.data.columnar.extractors.DateExtractor
 
extractField(String) - Method in class org.tribuo.data.columnar.extractors.DoubleExtractor
 
extractField(String) - Method in class org.tribuo.data.columnar.extractors.FloatExtractor
 
extractField(String) - Method in class org.tribuo.data.columnar.extractors.IdentityExtractor
 
extractField(String) - Method in class org.tribuo.data.columnar.extractors.IntExtractor
 
extractField(String) - Method in class org.tribuo.data.columnar.extractors.OffsetDateTimeExtractor
 
extractField(String) - Method in class org.tribuo.data.columnar.extractors.SimpleFieldExtractor
Extracts the field value, or returns Optional.empty() if it failed to parse.
extractor - Variable in class org.tribuo.data.text.DirectoryFileSource
The extractor that we'll use to turn text into examples.
extractor - Variable in class org.tribuo.data.text.TextDataSource
The extractor that we'll use to turn text into examples.
extractProvenanceInfo(Map<String, Provenance>) - Static method in class org.tribuo.data.csv.CSVDataSource.CSVDataSourceProvenance
Separates this class's non-configurable fields from the configurable fields.
extractProvenanceInfo(Map<String, Provenance>) - Static method in class org.tribuo.data.sql.SQLDataSource.SQLDataSourceProvenance
Separates out the configured and non-configured provenance values.
extractProvenanceInfo(Map<String, Provenance>) - Static method in class org.tribuo.data.text.DirectoryFileSource.DirectoryFileSourceProvenance
Splits the provenance into configured and non-configured values.
extractProvenanceInfo(Map<String, Provenance>) - Static method in class org.tribuo.data.text.impl.SimpleStringDataSource.SimpleStringDataSourceProvenance
Separates out the configured and non-configured provenance values.
extractProvenanceInfo(Map<String, Provenance>) - Static method in class org.tribuo.data.text.impl.SimpleTextDataSource.SimpleTextDataSourceProvenance
Separates out the configured and non-configured provenance values.
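
The extract and extractField entries above return an Optional that is empty when a field fails to parse. A minimal sketch of that contract for a double-valued field follows. It uses only the JDK, the class and method names are hypothetical, and it is not the DoubleExtractor source.

```java
import java.util.Optional;

class OptionalExtractSketch {
    // Returns the parsed value, or Optional.empty() when the field is missing
    // or is not a valid double.
    static Optional<Double> extractDouble(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        try {
            return Optional.of(Double.parseDouble(raw.trim()));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(extractDouble("3.5")); // prints Optional[3.5]
        System.out.println(extractDouble("abc")); // prints Optional.empty
    }
}
```

Returning an empty Optional instead of throwing lets the caller decide whether a malformed field is an error or simply a missing value.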

F

FEATURE_VALUE - Static variable in class org.tribuo.data.columnar.processors.field.IdentityProcessor
The value of the emitted features.
FeatureAggregator - Interface in org.tribuo.data.text
An interface for aggregating feature values into other values.
FeatureHasher - Class in org.tribuo.data.text.impl
Hashes the feature names to reduce the dimensionality.
FeatureHasher(int) - Constructor for class org.tribuo.data.text.impl.FeatureHasher
Constructs a feature hasher using the supplied hash dimension.
FeatureHasher(int, boolean) - Constructor for class org.tribuo.data.text.impl.FeatureHasher
Constructs a feature hasher using the supplied hash dimension.
FeatureHasher(int, int, int, boolean) - Constructor for class org.tribuo.data.text.impl.FeatureHasher
Constructs a feature hasher using the supplied hash dimension and seed values.
featureInfo(CommandInterpreter, String) - Method in class org.tribuo.data.DatasetExplorer
Shows information on a particular feature.
FeatureProcessor - Interface in org.tribuo.data.columnar
Takes a list of columnar features and adds new features or removes existing features.
FeatureTransformer - Interface in org.tribuo.data.text
A feature transformer maps a list of features to a new list of features.
FIELD_NAME - Static variable in class org.tribuo.data.columnar.processors.response.EmptyResponseProcessor
The field name this response processor looks for, which is ignored anyway as this processor always returns Optional.empty().
FieldExtractor<T> - Interface in org.tribuo.data.columnar
Extracts a value from a field to be placed in an Example's metadata field.
fieldName - Variable in class org.tribuo.data.columnar.extractors.SimpleFieldExtractor
The field name to read.
FieldProcessor - Interface in org.tribuo.data.columnar
An interface for things that process the columns in a data set.
FieldProcessor.GeneratedFeatureType - Enum in org.tribuo.data.columnar
The types of generated features.
fieldProcessorMap - Variable in class org.tribuo.data.columnar.RowProcessor
The map of field processors.
FieldResponseProcessor<T extends Output<T>> - Class in org.tribuo.data.columnar.processors.response
A response processor that returns the value(s) in a given (set of) fields.
FieldResponseProcessor(String, String, OutputFactory<T>) - Constructor for class org.tribuo.data.columnar.processors.response.FieldResponseProcessor
Constructs a response processor which passes the field value through the output factory.
FieldResponseProcessor(List<String>, String, OutputFactory<T>) - Constructor for class org.tribuo.data.columnar.processors.response.FieldResponseProcessor
Constructs a response processor which passes the field value through the output factory.
FieldResponseProcessor(List<String>, List<String>, OutputFactory<T>) - Constructor for class org.tribuo.data.columnar.processors.response.FieldResponseProcessor
Constructs a response processor which passes the field value through the output factory.
FieldResponseProcessor(List<String>, List<String>, OutputFactory<T>, boolean) - Constructor for class org.tribuo.data.columnar.processors.response.FieldResponseProcessor
Constructs a response processor which passes the field value through the output factory.
FieldResponseProcessor(List<String>, List<String>, OutputFactory<T>, boolean, boolean) - Constructor for class org.tribuo.data.columnar.processors.response.FieldResponseProcessor
Constructs a response processor which passes the field value through the output factory.
fields - Variable in class org.tribuo.data.columnar.ColumnarIterator
The column headers for this iterator.
fileCompleter() - Method in class org.tribuo.data.DatasetExplorer
The filename completer.
FIRST - Enum constant in enum org.tribuo.data.columnar.processors.feature.UniqueProcessor.UniqueType
Select the first feature value in the list.
FloatExtractor - Class in org.tribuo.data.columnar.extractors
Extracts the field value and converts it to a float.
FloatExtractor(String) - Constructor for class org.tribuo.data.columnar.extractors.FloatExtractor
Extracts a float value from the supplied field name.
FloatExtractor(String, String) - Constructor for class org.tribuo.data.columnar.extractors.FloatExtractor
Extracts a float value from the supplied field name.
forEachRemaining(Consumer<? super ColumnarIterator.Row>) - Method in class org.tribuo.data.columnar.ColumnarIterator
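
FeatureHasher above is described as hashing feature names to reduce the dimensionality. The sketch below illustrates the general hashing trick: each name is mapped to one of a fixed number of buckets. It assumes String.hashCode() as the hash function and sums values that collide. Tribuo's own hash function, seeds and treatment of values are not shown in this index and may differ.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

class FeatureHashingSketch {
    // Maps a feature name to a bucket in [0, dimension).
    // floorMod keeps the result non-negative even for negative hash codes.
    static int bucket(String name, int dimension) {
        return Math.floorMod(name.hashCode(), dimension);
    }

    // Re-keys features by bucket; values landing in the same bucket are summed
    // (an assumption made for this sketch).
    static Map<Integer, Double> hashFeatures(Map<String, Double> features, int dimension) {
        Map<Integer, Double> hashed = new HashMap<>();
        for (Map.Entry<String, Double> e : features.entrySet()) {
            hashed.merge(bucket(e.getKey(), dimension), e.getValue(), Double::sum);
        }
        return hashed;
    }

    public static void main(String[] args) {
        Map<String, Double> features = new LinkedHashMap<>();
        features.put("Aa", 1.0); // "Aa" and "BB" share the same String.hashCode()
        features.put("BB", 2.0);
        System.out.println(hashFeatures(features, 10)); // prints {2=3.0}
    }
}
```

A smaller dimension saves memory at the cost of more collisions between unrelated feature names.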
 

G

general - Variable in class org.tribuo.data.ConfigurableTrainTest.ConfigurableTrainTestOptions
Data loading options.
generateExample(long, Map<String, String>, boolean) - Method in class org.tribuo.data.columnar.RowProcessor
Generate an Example from the supplied row.
generateExample(Map<String, String>, boolean) - Method in class org.tribuo.data.columnar.RowProcessor
Generate an Example from the supplied row.
generateExample(ColumnarIterator.Row, boolean) - Method in class org.tribuo.data.columnar.RowProcessor
Generate an Example from the supplied row.
generateFeatureName(String, String) - Static method in class org.tribuo.data.columnar.ColumnarFeature
Generates a feature name based on the field name and the name.
generateFeatureName(String, String, String) - Static method in class org.tribuo.data.columnar.ColumnarFeature
Generates a feature name used for conjunction features.
generateFeatures(Map<String, String>) - Method in class org.tribuo.data.columnar.RowProcessor
Generates the features from the supplied row.
generateMetadata(ColumnarIterator.Row) - Method in class org.tribuo.data.columnar.RowProcessor
Generates the example metadata from the supplied row and index.
getClassName() - Method in class org.tribuo.data.csv.CSVLoader.CSVLoaderProvenance
Deprecated.
 
getColumnEntry() - Method in class org.tribuo.data.columnar.ColumnarFeature
Gets the columnEntry (i.e., the feature name produced by the FieldExtractor without the fieldName).
getColumnNames() - Method in class org.tribuo.data.columnar.RowProcessor
The set of column names this will use for the feature processing.
getConnection() - Method in class org.tribuo.data.sql.SQLDBConfig
Constructs a connection based on the object fields.
getDescription() - Method in class org.tribuo.data.columnar.RowProcessor
Returns a description of the row processor and its fields.
getDescription() - Method in class org.tribuo.data.DatasetExplorer
 
getFeatureProcessors() - Method in class org.tribuo.data.columnar.RowProcessor
Returns the set of FeatureProcessors this RowProcessor uses.
getFeatureType() - Method in interface org.tribuo.data.columnar.FieldProcessor
Returns the feature type this FieldProcessor generates.
getFeatureType() - Method in class org.tribuo.data.columnar.processors.field.DateFieldProcessor
 
getFeatureType() - Method in class org.tribuo.data.columnar.processors.field.DoubleFieldProcessor
 
getFeatureType() - Method in class org.tribuo.data.columnar.processors.field.IdentityProcessor
 
getFeatureType() - Method in class org.tribuo.data.columnar.processors.field.RegexFieldProcessor
 
getFeatureType() - Method in class org.tribuo.data.columnar.processors.field.TextFieldProcessor
 
getFieldName() - Method in class org.tribuo.data.columnar.ColumnarFeature
Gets the field name.
getFieldName() - Method in class org.tribuo.data.columnar.extractors.SimpleFieldExtractor
Gets the field name this extractor operates on.
getFieldName() - Method in interface org.tribuo.data.columnar.FieldProcessor
Gets the field name this FieldProcessor uses.
getFieldName() - Method in class org.tribuo.data.columnar.processors.field.DateFieldProcessor
 
getFieldName() - Method in class org.tribuo.data.columnar.processors.field.DoubleFieldProcessor
 
getFieldName() - Method in class org.tribuo.data.columnar.processors.field.IdentityProcessor
 
getFieldName() - Method in class org.tribuo.data.columnar.processors.field.RegexFieldProcessor
 
getFieldName() - Method in class org.tribuo.data.columnar.processors.field.TextFieldProcessor
 
getFieldName() - Method in class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
Deprecated.
getFieldName() - Method in class org.tribuo.data.columnar.processors.response.EmptyResponseProcessor
Deprecated.
getFieldName() - Method in class org.tribuo.data.columnar.processors.response.FieldResponseProcessor
Deprecated.
getFieldName() - Method in class org.tribuo.data.columnar.processors.response.QuartileResponseProcessor
Deprecated.
getFieldName() - Method in interface org.tribuo.data.columnar.ResponseProcessor
Deprecated.
Use ResponseProcessor.getFieldNames() and support multiple values instead. Gets the field name this ResponseProcessor uses.
getFieldNames() - Method in class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
 
getFieldNames() - Method in class org.tribuo.data.columnar.processors.response.EmptyResponseProcessor
 
getFieldNames() - Method in class org.tribuo.data.columnar.processors.response.FieldResponseProcessor
 
getFieldNames() - Method in class org.tribuo.data.columnar.processors.response.QuartileResponseProcessor
 
getFieldNames() - Method in interface org.tribuo.data.columnar.ResponseProcessor
Gets the field names this ResponseProcessor uses.
getFieldProcessor(String) - Method in class org.tribuo.data.columnar.RowProcessor.Builder
Retrieves, if present, the fieldProcessor with the given name.
getFieldProcessors() - Method in class org.tribuo.data.columnar.RowProcessor
Returns the map of FieldProcessors this RowProcessor uses.
getFields() - Method in class org.tribuo.data.columnar.ColumnarIterator
The immutable list of field names.
getFields() - Method in class org.tribuo.data.columnar.ColumnarIterator.Row
Gets the field headers.
getFirstFieldName() - Method in class org.tribuo.data.columnar.ColumnarFeature
If it's a conjunction feature, return the first field name.
getIndex() - Method in class org.tribuo.data.columnar.ColumnarIterator.Row
Gets the row index.
getInstanceValues() - Method in class org.tribuo.data.csv.CSVDataSource.CSVDataSourceProvenance
 
getInstanceValues() - Method in class org.tribuo.data.sql.SQLDataSource.SQLDataSourceProvenance
 
getInstanceValues() - Method in class org.tribuo.data.text.DirectoryFileSource.DirectoryFileSourceProvenance
 
getInstanceValues() - Method in class org.tribuo.data.text.impl.SimpleStringDataSource.SimpleStringDataSourceProvenance
 
getInstanceValues() - Method in class org.tribuo.data.text.impl.SimpleTextDataSource.SimpleTextDataSourceProvenance
 
getLowerMedian() - Method in class org.tribuo.data.columnar.processors.response.Quartile
Returns the lower quartile value.
getMedian() - Method in class org.tribuo.data.columnar.processors.response.Quartile
Returns the median value.
getMetadataName() - Method in class org.tribuo.data.columnar.extractors.IndexExtractor
 
getMetadataName() - Method in class org.tribuo.data.columnar.extractors.SimpleFieldExtractor
Gets the metadata key name.
getMetadataName() - Method in interface org.tribuo.data.columnar.FieldExtractor
Gets the metadata key name.
getMetadataTypes() - Method in class org.tribuo.data.columnar.ColumnarDataSource
Returns the metadata keys and value types that are created by this DataSource.
getMetadataTypes() - Method in class org.tribuo.data.columnar.RowProcessor
Returns the metadata keys and value types that are extracted by this RowProcessor.
getName() - Method in class org.tribuo.data.DatasetExplorer
 
getNumNamespaces() - Method in interface org.tribuo.data.columnar.FieldProcessor
Binarised categoricals can be namespaced, where the field name is appended with "#<non-negative-int>" to denote the namespace.
getOptionsDescription() - Method in class org.tribuo.data.CompletelyConfigurableTrainTest.ConfigurableTrainTestOptions
 
getOptionsDescription() - Method in class org.tribuo.data.ConfigurableTrainTest.ConfigurableTrainTestOptions
 
getOptionsDescription() - Method in class org.tribuo.data.DataOptions
 
getOptionsDescription() - Method in class org.tribuo.data.text.SplitTextData.TrainTestSplitOptions
 
getOutputFactory() - Method in class org.tribuo.data.columnar.ColumnarDataSource
 
getOutputFactory() - Method in class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
 
getOutputFactory() - Method in class org.tribuo.data.columnar.processors.response.EmptyResponseProcessor
 
getOutputFactory() - Method in class org.tribuo.data.columnar.processors.response.FieldResponseProcessor
 
getOutputFactory() - Method in class org.tribuo.data.columnar.processors.response.QuartileResponseProcessor
 
getOutputFactory() - Method in interface org.tribuo.data.columnar.ResponseProcessor
Gets the OutputFactory this ResponseProcessor uses.
getOutputFactory() - Method in class org.tribuo.data.text.DirectoryFileSource
 
getOutputFactory() - Method in class org.tribuo.data.text.TextDataSource
Returns the output factory used to convert the text input into an Output.
getProvenance() - Method in class org.tribuo.data.columnar.extractors.DateExtractor
 
getProvenance() - Method in class org.tribuo.data.columnar.extractors.DoubleExtractor
 
getProvenance() - Method in class org.tribuo.data.columnar.extractors.FloatExtractor
 
getProvenance() - Method in class org.tribuo.data.columnar.extractors.IdentityExtractor
 
getProvenance() - Method in class org.tribuo.data.columnar.extractors.IndexExtractor
 
getProvenance() - Method in class org.tribuo.data.columnar.extractors.IntExtractor
 
getProvenance() - Method in class org.tribuo.data.columnar.extractors.OffsetDateTimeExtractor
 
getProvenance() - Method in class org.tribuo.data.columnar.processors.feature.UniqueProcessor
 
getProvenance() - Method in class org.tribuo.data.columnar.processors.field.DateFieldProcessor
 
getProvenance() - Method in class org.tribuo.data.columnar.processors.field.DoubleFieldProcessor
 
getProvenance() - Method in class org.tribuo.data.columnar.processors.field.IdentityProcessor
 
getProvenance() - Method in class org.tribuo.data.columnar.processors.field.RegexFieldProcessor
 
getProvenance() - Method in class org.tribuo.data.columnar.processors.field.TextFieldProcessor
 
getProvenance() - Method in class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
 
getProvenance() - Method in class org.tribuo.data.columnar.processors.response.EmptyResponseProcessor
 
getProvenance() - Method in class org.tribuo.data.columnar.processors.response.FieldResponseProcessor
 
getProvenance() - Method in class org.tribuo.data.columnar.processors.response.Quartile
 
getProvenance() - Method in class org.tribuo.data.columnar.processors.response.QuartileResponseProcessor
 
getProvenance() - Method in class org.tribuo.data.columnar.RowProcessor
 
getProvenance() - Method in class org.tribuo.data.csv.CSVDataSource
 
getProvenance() - Method in class org.tribuo.data.sql.SQLDataSource
 
getProvenance() - Method in class org.tribuo.data.sql.SQLDBConfig
 
getProvenance() - Method in class org.tribuo.data.text.DirectoryFileSource
 
getProvenance() - Method in class org.tribuo.data.text.impl.AverageAggregator
 
getProvenance() - Method in class org.tribuo.data.text.impl.BasicPipeline
 
getProvenance() - Method in class org.tribuo.data.text.impl.CasingPreprocessor
 
getProvenance() - Method in class org.tribuo.data.text.impl.FeatureHasher
 
getProvenance() - Method in class org.tribuo.data.text.impl.NewsPreprocessor
 
getProvenance() - Method in class org.tribuo.data.text.impl.NgramProcessor
 
getProvenance() - Method in class org.tribuo.data.text.impl.RegexPreprocessor
 
getProvenance() - Method in class org.tribuo.data.text.impl.SimpleTextDataSource
 
getProvenance() - Method in class org.tribuo.data.text.impl.SumAggregator
 
getProvenance() - Method in class org.tribuo.data.text.impl.TextFeatureExtractorImpl
 
getProvenance() - Method in class org.tribuo.data.text.impl.TokenPipeline
 
getProvenance() - Method in class org.tribuo.data.text.impl.UniqueAggregator
 
getRegexFieldProcessor(String) - Method in class org.tribuo.data.columnar.RowProcessor.Builder
Retrieves, if present, the regexFieldProcessor with the given regex.
getResponseProcessor() - Method in class org.tribuo.data.columnar.RowProcessor
Returns the response processor this RowProcessor uses.
getRow() - Method in class org.tribuo.data.columnar.ColumnarIterator
Returns the next row of data based on internal state stored by the implementor, or Optional.empty() if there is no more data.
getRow() - Method in class org.tribuo.data.csv.CSVIterator
 
getRow() - Method in class org.tribuo.data.sql.ResultSetIterator
 
getRowData() - Method in class org.tribuo.data.columnar.ColumnarIterator.Row
Gets the row data.
getSecondFieldName() - Method in class org.tribuo.data.columnar.ColumnarFeature
If it's a conjunction feature, return the second field name.
getStatement() - Method in class org.tribuo.data.sql.SQLDBConfig
Constructs a statement based on the object fields.
getUpperMedian() - Method in class org.tribuo.data.columnar.processors.response.Quartile
Returns the upper quartile value.
getValueType() - Method in class org.tribuo.data.columnar.extractors.DateExtractor
 
getValueType() - Method in class org.tribuo.data.columnar.extractors.DoubleExtractor
 
getValueType() - Method in class org.tribuo.data.columnar.extractors.FloatExtractor
 
getValueType() - Method in class org.tribuo.data.columnar.extractors.IdentityExtractor
 
getValueType() - Method in class org.tribuo.data.columnar.extractors.IndexExtractor
 
getValueType() - Method in class org.tribuo.data.columnar.extractors.IntExtractor
 
getValueType() - Method in class org.tribuo.data.columnar.extractors.OffsetDateTimeExtractor
 
getValueType() - Method in interface org.tribuo.data.columnar.FieldExtractor
Gets the class of the value produced by this extractor.
GROUPS - Enum constant in enum org.tribuo.data.columnar.processors.field.RegexFieldProcessor.Mode
Triggers feature generation for each matching group in the string.
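
The RegexFieldProcessor.Mode constants in this index (GROUPS, MATCH_ALL, MATCH_CONTAINS) describe when a regex triggers feature generation. The sketch below is one plausible reading of those descriptions using only java.util.regex. It is not Tribuo's implementation: the class RegexModeSketch, its nested Mode enum, and the triggers method are hypothetical names introduced for illustration, and returning the matched text is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: mirrors the three mode names with plain java.util.regex calls. */
final class RegexModeSketch {
    /** Hypothetical stand-in for the mode enum; not the Tribuo type. */
    enum Mode { MATCH_ALL, MATCH_CONTAINS, GROUPS }

    /** Returns the strings that would trigger feature generation under the given mode. */
    static List<String> triggers(Pattern pattern, String value, Mode mode) {
        List<String> out = new ArrayList<>();
        Matcher m = pattern.matcher(value);
        switch (mode) {
            case MATCH_ALL:
                // The whole string must match the pattern.
                if (m.matches()) {
                    out.add(value);
                }
                break;
            case MATCH_CONTAINS:
                // Any substring match is enough.
                if (m.find()) {
                    out.add(m.group());
                }
                break;
            case GROUPS:
                // Emit one entry per capturing group that took part in the first match.
                if (m.find()) {
                    for (int i = 1; i <= m.groupCount(); i++) {
                        if (m.group(i) != null) {
                            out.add(m.group(i));
                        }
                    }
                }
                break;
        }
        return out;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("(\\d+)-(\\d+)");
        System.out.println(triggers(p, "id 10-20", Mode.MATCH_ALL));
        System.out.println(triggers(p, "id 10-20", Mode.MATCH_CONTAINS));
        System.out.println(triggers(p, "id 10-20", Mode.GROUPS));
    }
}
```

For the input "id 10-20", the whole-string mode yields nothing because of the "id " prefix, the contains mode yields the matched substring, and the groups mode yields the two captured numbers.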

H

handleDoc(String) - Method in class org.tribuo.data.text.TextDataSource
A method that can be overridden to do different things to each document that we've read.
hashCode() - Method in class org.tribuo.data.csv.CSVDataSource.CSVDataSourceProvenance
 
hashCode() - Method in class org.tribuo.data.csv.CSVLoader.CSVLoaderProvenance
Deprecated.
 
hashCode() - Method in class org.tribuo.data.sql.SQLDataSource.SQLDataSourceProvenance
 
hashCode() - Method in class org.tribuo.data.text.DirectoryFileSource.DirectoryFileSourceProvenance
 
hashCode() - Method in class org.tribuo.data.text.impl.SimpleStringDataSource.SimpleStringDataSourceProvenance
 
hashCode() - Method in class org.tribuo.data.text.impl.SimpleTextDataSource.SimpleTextDataSourceProvenance
 
hashDim - Variable in class org.tribuo.data.DataOptions
Hashing dimension used for standard text format.
hasNext() - Method in class org.tribuo.data.columnar.ColumnarIterator
 

I

IdentityExtractor - Class in org.tribuo.data.columnar.extractors
Extracts the field value and emits it as a String.
IdentityExtractor(String) - Constructor for class org.tribuo.data.columnar.extractors.IdentityExtractor
Extracts the String value from the supplied field.
IdentityExtractor(String, String) - Constructor for class org.tribuo.data.columnar.extractors.IdentityExtractor
Extracts the String value from the supplied field.
IdentityProcessor - Class in org.tribuo.data.columnar.processors.field
A FieldProcessor which converts the field name and value into a feature with a value of IdentityProcessor.FEATURE_VALUE.
IdentityProcessor(String) - Constructor for class org.tribuo.data.columnar.processors.field.IdentityProcessor
Constructs a field processor which emits a single feature with a specific value and uses the field name and field value as the feature name.
IndexExtractor - Class in org.tribuo.data.columnar.extractors
An Extractor with special casing for loading the index from a Row.
IndexExtractor() - Constructor for class org.tribuo.data.columnar.extractors.IndexExtractor
Extracts the index writing to the default metadata field name Example.NAME.
IndexExtractor(String) - Constructor for class org.tribuo.data.columnar.extractors.IndexExtractor
Extracts the index, writing to the supplied metadata field name.
inputFormat - Variable in class org.tribuo.data.DataOptions
Loads the data using the specified format.
inputPath - Variable in class org.tribuo.data.sql.SQLToCSV.SQLToCSVOptions
SQL file to run as a query, defaults to stdin.
inputPath - Variable in class org.tribuo.data.text.SplitTextData.TrainTestSplitOptions
Input data file in standard text format.
INTEGER - Enum constant in enum org.tribuo.data.columnar.FieldProcessor.GeneratedFeatureType
Ordered integral feature values (e.g.
IntExtractor - Class in org.tribuo.data.columnar.extractors
Extracts the field value and converts it to an int.
IntExtractor(String) - Constructor for class org.tribuo.data.columnar.extractors.IntExtractor
Extracts an int value from the supplied field name.
IntExtractor(String, String) - Constructor for class org.tribuo.data.columnar.extractors.IntExtractor
Extracts an int value from the supplied field name.
isConfigured() - Method in class org.tribuo.data.columnar.RowProcessor
Returns true if the regexes have been expanded into field processors.
iterator() - Method in class org.tribuo.data.columnar.ColumnarDataSource
 
iterator() - Method in class org.tribuo.data.csv.CSVLoader.CSVLoaderProvenance
Deprecated.
 
iterator() - Method in class org.tribuo.data.text.DirectoryFileSource
 
iterator() - Method in class org.tribuo.data.text.TextDataSource
 

J

JOINER - Static variable in class org.tribuo.data.columnar.ColumnarFeature
The joiner between the field name and feature name.
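
JOINER sits between the field name and the feature name when ColumnarFeature.generateFeatureName(String, String) builds a name. A minimal sketch of that composition follows. The joiner is passed in as a parameter because this index does not state its value, and FeatureNameSketch and featureName are hypothetical names, not part of Tribuo.

```java
/** Illustrative only: composes a feature name from a field name and a per-field name. */
final class FeatureNameSketch {
    /**
     * Joins the field name and the feature name with the supplied joiner.
     * The joiner value is a placeholder chosen by the caller.
     */
    static String featureName(String joiner, String fieldName, String name) {
        return fieldName + joiner + name;
    }

    public static void main(String[] args) {
        // "@" is an arbitrary placeholder joiner for this demonstration.
        System.out.println(featureName("@", "height", "tall"));
    }
}
```

Keeping the field name as a prefix means features produced from different columns cannot collide even when the per-field names are identical.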

L

LAST - Enum constant in enum org.tribuo.data.columnar.processors.feature.UniqueProcessor.UniqueType
Select the last feature value in the list.
LIBSVM - Enum constant in enum org.tribuo.data.DataOptions.InputFormat
LibSVM/svm-light format data.
load(Path, String) - Method in class org.tribuo.data.csv.CSVLoader
Loads a DataSource from the specified csv file then wraps it in a dataset.
load(Path, String, String[]) - Method in class org.tribuo.data.csv.CSVLoader
Loads a DataSource from the specified csv file then wraps it in a dataset.
load(Path, Set<String>) - Method in class org.tribuo.data.csv.CSVLoader
Loads a DataSource from the specified csv file then wraps it in a dataset.
load(Path, Set<String>, String[]) - Method in class org.tribuo.data.csv.CSVLoader
Loads a DataSource from the specified csv file then wraps it in a dataset.
load(OutputFactory<T>) - Method in class org.tribuo.data.DataOptions
Loads the training and testing data from DataOptions.trainingPath and DataOptions.testingPath according to the other parameters specified in this class.
loadDataset(CommandInterpreter, File, boolean) - Method in class org.tribuo.data.DatasetExplorer
Loads a serialized dataset.
loadDataSource(URL, String) - Method in class org.tribuo.data.csv.CSVLoader
Loads a DataSource from the specified csv path.
loadDataSource(URL, String, String[]) - Method in class org.tribuo.data.csv.CSVLoader
Loads a DataSource from the specified csv path.
loadDataSource(URL, Set<String>) - Method in class org.tribuo.data.csv.CSVLoader
Loads a DataSource from the specified csv path.
loadDataSource(URL, Set<String>, String[]) - Method in class org.tribuo.data.csv.CSVLoader
Loads a DataSource from the specified csv path.
loadDataSource(Path, String) - Method in class org.tribuo.data.csv.CSVLoader
Loads a DataSource from the specified csv path.
loadDataSource(Path, String, String[]) - Method in class org.tribuo.data.csv.CSVLoader
Loads a DataSource from the specified csv path.
loadDataSource(Path, Set<String>) - Method in class org.tribuo.data.csv.CSVLoader
Loads a DataSource from the specified csv path.
loadDataSource(Path, Set<String>, String[]) - Method in class org.tribuo.data.csv.CSVLoader
Loads a DataSource from the specified csv path.
LOWERCASE - Enum constant in enum org.tribuo.data.text.impl.CasingPreprocessor.CasingOperation
Lowercase the input text.

M

main(String[]) - Static method in class org.tribuo.data.CompletelyConfigurableTrainTest
 
main(String[]) - Static method in class org.tribuo.data.ConfigurableTrainTest
 
main(String[]) - Static method in class org.tribuo.data.DatasetExplorer
Runs a dataset explorer.
main(String[]) - Static method in class org.tribuo.data.PreprocessAndSerialize
Run the PreprocessAndSerialize CLI.
main(String[]) - Static method in class org.tribuo.data.sql.SQLToCSV
Reads an SQL query from the standard input and writes the results of the query to the standard output.
main(String[]) - Static method in class org.tribuo.data.text.SplitTextData
Runs the SplitTextData CLI.
map(String, List<Feature>) - Method in interface org.tribuo.data.text.FeatureTransformer
Transforms features into a new list of features.
map(String, List<Feature>) - Method in class org.tribuo.data.text.impl.FeatureHasher
 
MATCH_ALL - Enum constant in enum org.tribuo.data.columnar.processors.field.RegexFieldProcessor.Mode
Triggers feature generation if the whole string matches.
MATCH_CONTAINS - Enum constant in enum org.tribuo.data.columnar.processors.field.RegexFieldProcessor.Mode
Triggers feature generation if the string contains a match.
MAX - Enum constant in enum org.tribuo.data.columnar.processors.feature.UniqueProcessor.UniqueType
Select the maximum feature value in the list.
metadataName - Variable in class org.tribuo.data.columnar.extractors.SimpleFieldExtractor
The metadata key to emit.
MIN - Enum constant in enum org.tribuo.data.columnar.processors.feature.UniqueProcessor.UniqueType
Select the minimum feature value in the list.
minCount - Variable in class org.tribuo.data.CompletelyConfigurableTrainTest.ConfigurableTrainTestOptions
Remove features which occur fewer than m times.
minCount - Variable in class org.tribuo.data.DataOptions
Minimum cardinality of the features.
minCount(CommandInterpreter, int) - Method in class org.tribuo.data.DatasetExplorer
Shows the number of features which occurred more than minCount times in the dataset.
modelFilename - Variable in class org.tribuo.data.DatasetExplorer.DatasetExplorerOptions
Dataset file to load.
modelOutputProtobuf - Variable in class org.tribuo.data.DataOptions
Write the model out as a protobuf.
MONTH - Enum constant in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
The month.
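
The UniqueProcessor.UniqueType constants listed in this index (LAST, MAX, MIN) each pick a single value from a list of feature values that share a name. The sketch below shows that selection on plain doubles. It is a simplified illustration under the assumption that only the values matter: UniqueTypeSketch, its nested enum, and select are hypothetical names, and the real processor operates on ColumnarFeature objects.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only: picks one value from a list according to a selection rule. */
final class UniqueTypeSketch {
    /** Hypothetical stand-in for the selection enum; not the Tribuo type. */
    enum UniqueType { LAST, MAX, MIN }

    /** Selects a single value from a non-empty list. */
    static double select(List<Double> values, UniqueType type) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("values must not be empty");
        }
        switch (type) {
            case LAST:
                // The value that appears last in list order.
                return values.get(values.size() - 1);
            case MAX: {
                double best = values.get(0);
                for (double v : values) {
                    best = Math.max(best, v);
                }
                return best;
            }
            case MIN: {
                double best = values.get(0);
                for (double v : values) {
                    best = Math.min(best, v);
                }
                return best;
            }
            default:
                throw new AssertionError("Unknown type " + type);
        }
    }

    public static void main(String[] args) {
        List<Double> values = Arrays.asList(3.0, 7.0, 5.0);
        System.out.println(select(values, UniqueType.LAST));
        System.out.println(select(values, UniqueType.MAX));
        System.out.println(select(values, UniqueType.MIN));
    }
}
```

With the values 3.0, 7.0, 5.0 the three rules select 5.0, 7.0 and 3.0 respectively.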

N

NAMESPACE - Static variable in interface org.tribuo.data.columnar.FieldProcessor
The namespacing separator.
NEGATIVE_NAME - Static variable in class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
The default negative response.
NewsPreprocessor - Class in org.tribuo.data.text.impl
A document pre-processor for 20 newsgroup data.
NewsPreprocessor() - Constructor for class org.tribuo.data.text.impl.NewsPreprocessor
Constructor.
next() - Method in class org.tribuo.data.columnar.ColumnarIterator
 
ngram - Variable in class org.tribuo.data.DataOptions
Ngram size to generate when using standard text format.
NgramProcessor - Class in org.tribuo.data.text.impl
A text processor that will generate token ngrams of a particular size.
NgramProcessor(Tokenizer, int, double) - Constructor for class org.tribuo.data.text.impl.NgramProcessor
Creates a processor that will generate token ngrams of size n.
numExamples(CommandInterpreter) - Method in class org.tribuo.data.DatasetExplorer
Shows the number of examples in this dataset.
numFeatures(CommandInterpreter) - Method in class org.tribuo.data.DatasetExplorer
Shows the number of features in this dataset.
numFolds - Variable in class org.tribuo.data.ConfigurableTrainTest.ConfigurableTrainTestOptions
The number of cross validation folds.
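
NgramProcessor is described above as generating token ngrams of size n. The following sketch shows the sliding-window idea on an already tokenized input. It assumes nothing about Tribuo's Tokenizer or feature naming: NgramSketch, ngrams, and the "/" separator are hypothetical choices made for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: builds token n-grams of exactly size n with a sliding window. */
final class NgramSketch {
    /** Returns every run of n consecutive tokens, joined with "/" (a placeholder separator). */
    static List<String> ngrams(List<String> tokens, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        List<String> out = new ArrayList<>();
        // The window [i, i + n) must fit inside the token list.
        for (int i = 0; i + n <= tokens.size(); i++) {
            out.add(String.join("/", tokens.subList(i, i + n)));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("the", "quick", "brown", "fox");
        System.out.println(ngrams(tokens, 2));
    }
}
```

A list of k tokens yields k - n + 1 n-grams, and none at all when n exceeds k.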

O

OffsetDateTimeExtractor - Class in org.tribuo.data.columnar.extractors
Extracts the field value and translates it to an OffsetDateTime based on the specified DateTimeFormatter.
OffsetDateTimeExtractor(String, String, String) - Constructor for class org.tribuo.data.columnar.extractors.OffsetDateTimeExtractor
Constructs a date time extractor that emits an OffsetDateTime by applying the supplied format to the specified field.
OffsetDateTimeExtractor(String, String, String, String, String) - Constructor for class org.tribuo.data.columnar.extractors.OffsetDateTimeExtractor
Constructs a date time extractor that emits an OffsetDateTime by applying the supplied format to the specified field.
org.tribuo.data - package org.tribuo.data
Provides classes for loading in data from disk, processing it into examples, and splitting datasets for things like cross-validation and train-test splits.
org.tribuo.data.columnar - package org.tribuo.data.columnar
Provides classes for processing columnar data and generating Examples.
org.tribuo.data.columnar.extractors - package org.tribuo.data.columnar.extractors
Provides implementations of FieldExtractor.
org.tribuo.data.columnar.processors.feature - package org.tribuo.data.columnar.processors.feature
Provides implementations of FeatureProcessor.
org.tribuo.data.columnar.processors.field - package org.tribuo.data.columnar.processors.field
Provides implementations of FieldProcessor.
org.tribuo.data.columnar.processors.response - package org.tribuo.data.columnar.processors.response
Provides implementations of ResponseProcessor.
org.tribuo.data.csv - package org.tribuo.data.csv
Provides classes which can load columnar data (using a RowProcessor) from a CSV (or other character delimited format) file.
org.tribuo.data.sql - package org.tribuo.data.sql
Provides classes which can load columnar data (using a RowProcessor) from a SQL source.
org.tribuo.data.text - package org.tribuo.data.text
Provides interfaces for converting text inputs into Features and Examples.
org.tribuo.data.text.impl - package org.tribuo.data.text.impl
Provides implementations of text data processors.
output - Variable in class org.tribuo.data.PreprocessAndSerialize.PreprocessAndSerializeOptions
Path to serialize the dataset.
outputFactory - Variable in class org.tribuo.data.ConfigurableTrainTest.ConfigurableTrainTestOptions
The output factory to construct.
outputFactory - Variable in class org.tribuo.data.text.DirectoryFileSource
The factory that converts a String into an Output.
outputFactory - Variable in class org.tribuo.data.text.TextDataSource
The factory that converts a String into an Output.
outputInfo(CommandInterpreter) - Method in class org.tribuo.data.DatasetExplorer
Shows the output information.
outputPath - Variable in class org.tribuo.data.CompletelyConfigurableTrainTest.ConfigurableTrainTestOptions
Path to serialize model to.
outputPath - Variable in class org.tribuo.data.DataOptions
Path to serialize model to.
outputPath - Variable in class org.tribuo.data.sql.SQLToCSV.SQLToCSVOptions
File to write query results as CSV, defaults to stdout.
outputRequired - Variable in class org.tribuo.data.columnar.ColumnarDataSource
Is an output required from each row?
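
OffsetDateTimeExtractor, listed earlier in this section, is described as applying a DateTimeFormatter pattern to a field value to produce an OffsetDateTime. The sketch below shows that parsing step with java.time alone. OffsetDateTimeSketch and parse are hypothetical names, and returning Optional.empty() on a parse failure is an assumption of this example, not a statement about the extractor's behaviour.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Optional;

/** Illustrative only: parses a field value into an OffsetDateTime using a format pattern. */
final class OffsetDateTimeSketch {
    /** Returns the parsed value, or Optional.empty() if the text does not fit the pattern. */
    static Optional<OffsetDateTime> parse(String value, String pattern) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(pattern);
        try {
            return Optional.of(OffsetDateTime.parse(value, formatter));
        } catch (DateTimeParseException e) {
            // Unparseable input is reported as an empty result in this sketch.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String pattern = "yyyy-MM-dd'T'HH:mm:ssXXX";
        System.out.println(parse("2021-03-04T05:06:07+01:00", pattern));
        System.out.println(parse("not a date", pattern));
    }
}
```

The pattern must include an offset component such as XXX, since an OffsetDateTime cannot be built from a date and time alone.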

P

parseLine(String, int) - Method in class org.tribuo.data.text.impl.SimpleTextDataSource
Parses a line in Tribuo's default text format.
partialExpandRegexMapping(Collection<String>) - Method in class org.tribuo.data.columnar.RowProcessor
Caveat Implementor! This method contains the logic of RowProcessor.expandRegexMapping(org.tribuo.Model<T>) without any of the checks that ensure the RowProcessor is in a valid state.
password - Variable in class org.tribuo.data.sql.SQLToCSV.SQLToCSVOptions
Password for the SQL database.
path - Variable in class org.tribuo.data.text.TextDataSource
The path that data was read from.
POSITIVE_NAME - Static variable in class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
The default positive response.
postConfig() - Method in class org.tribuo.data.columnar.extractors.DateExtractor
Used by the OLCUT configuration system, and should not be called by external code.
postConfig() - Method in class org.tribuo.data.columnar.extractors.OffsetDateTimeExtractor
Used by the OLCUT configuration system, and should not be called by external code.
postConfig() - Method in class org.tribuo.data.columnar.extractors.SimpleFieldExtractor
Used by the OLCUT configuration system, and should not be called by external code.
postConfig() - Method in class org.tribuo.data.columnar.processors.field.DateFieldProcessor
Used by the OLCUT configuration system, and should not be called by external code.
postConfig() - Method in class org.tribuo.data.columnar.processors.field.RegexFieldProcessor
Used by the OLCUT configuration system, and should not be called by external code.
postConfig() - Method in class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
 
postConfig() - Method in class org.tribuo.data.columnar.processors.response.FieldResponseProcessor
 
postConfig() - Method in class org.tribuo.data.columnar.processors.response.QuartileResponseProcessor
 
postConfig() - Method in class org.tribuo.data.columnar.RowProcessor
Used by the OLCUT configuration system, and should not be called by external code.
postConfig() - Method in class org.tribuo.data.csv.CSVDataSource
Used by the OLCUT configuration system, and should not be called by external code.
postConfig() - Method in class org.tribuo.data.sql.SQLDBConfig
Used by the OLCUT configuration system, and should not be called by external code.
postConfig() - Method in class org.tribuo.data.text.impl.BasicPipeline
Used by the OLCUT configuration system, and should not be called by external code.
postConfig() - Method in class org.tribuo.data.text.impl.FeatureHasher
Used by the OLCUT configuration system, and should not be called by external code.
postConfig() - Method in class org.tribuo.data.text.impl.NgramProcessor
Used by the OLCUT configuration system, and should not be called by external code.
postConfig() - Method in class org.tribuo.data.text.impl.RegexPreprocessor
Used by the OLCUT configuration system, and should not be called by external code.
postConfig() - Method in class org.tribuo.data.text.impl.SimpleStringDataSource
Used by the OLCUT configuration system, and should not be called by external code.
postConfig() - Method in class org.tribuo.data.text.impl.SimpleTextDataSource
Used by the OLCUT configuration system, and should not be called by external code.
postConfig() - Method in class org.tribuo.data.text.impl.TokenPipeline
Used by the OLCUT configuration system, and should not be called by external code.
PreprocessAndSerialize - Class in org.tribuo.data
Reads in a DataSource, processes all the data, and writes it out as a serialized dataset.
PreprocessAndSerialize.PreprocessAndSerializeOptions - Class in org.tribuo.data
Command line options.
PreprocessAndSerializeOptions() - Constructor for class org.tribuo.data.PreprocessAndSerialize.PreprocessAndSerializeOptions
 
preprocessors - Variable in class org.tribuo.data.text.DirectoryFileSource
Document preprocessors that should be run on the documents that make up this data set.
preprocessors - Variable in class org.tribuo.data.text.TextDataSource
Document preprocessors that should be run on the documents that make up this data set.
process(String) - Method in interface org.tribuo.data.columnar.FieldProcessor
Processes the field value and generates a (possibly empty) list of ColumnarFeatures.
process(String) - Method in class org.tribuo.data.columnar.processors.field.DateFieldProcessor
 
process(String) - Method in class org.tribuo.data.columnar.processors.field.DoubleFieldProcessor
 
process(String) - Method in class org.tribuo.data.columnar.processors.field.IdentityProcessor
 
process(String) - Method in class org.tribuo.data.columnar.processors.field.RegexFieldProcessor
 
process(String) - Method in class org.tribuo.data.columnar.processors.field.TextFieldProcessor
 
process(String) - Method in class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
Deprecated.
process(String) - Method in class org.tribuo.data.columnar.processors.response.EmptyResponseProcessor
Deprecated.
process(String) - Method in class org.tribuo.data.columnar.processors.response.FieldResponseProcessor
Deprecated.
process(String) - Method in class org.tribuo.data.columnar.processors.response.QuartileResponseProcessor
Deprecated.
process(String) - Method in interface org.tribuo.data.columnar.ResponseProcessor
Deprecated.
Use ResponseProcessor.process(List) and support multiple values instead. Returns Optional.empty() if it failed to process out a response.
process(String) - Method in class org.tribuo.data.text.impl.NgramProcessor
 
process(String) - Method in interface org.tribuo.data.text.TextProcessor
Extracts features from the supplied text.
process(String, String) - Method in class org.tribuo.data.text.impl.BasicPipeline
 
process(String, String) - Method in class org.tribuo.data.text.impl.NgramProcessor
 
process(String, String) - Method in class org.tribuo.data.text.impl.TokenPipeline
 
process(String, String) - Method in interface org.tribuo.data.text.TextPipeline
Extracts a list of features from the supplied text, using the tag to prepend the feature names.
process(String, String) - Method in interface org.tribuo.data.text.TextProcessor
Extracts features from the supplied text.
process(List<String>) - Method in class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
 
process(List<String>) - Method in class org.tribuo.data.columnar.processors.response.EmptyResponseProcessor
This method always returns Optional.empty().
process(List<String>) - Method in class org.tribuo.data.columnar.processors.response.FieldResponseProcessor
 
process(List<String>) - Method in class org.tribuo.data.columnar.processors.response.QuartileResponseProcessor
 
process(List<String>) - Method in interface org.tribuo.data.columnar.ResponseProcessor
Returns Optional.empty() if it failed to process out a response. This method has a default implementation for backwards compatibility with Tribuo 4.0 and 4.1.
process(List<ColumnarFeature>) - Method in interface org.tribuo.data.columnar.FeatureProcessor
Processes a list of ColumnarFeatures, transforming it by adding conjunctions or removing unnecessary features.
process(List<ColumnarFeature>) - Method in class org.tribuo.data.columnar.processors.feature.UniqueProcessor
 
processDoc(String) - Method in interface org.tribuo.data.text.DocumentPreprocessor
Processes the content of part of a document stored as a string, returning a new string.
processDoc(String) - Method in class org.tribuo.data.text.impl.CasingPreprocessor
 
processDoc(String) - Method in class org.tribuo.data.text.impl.NewsPreprocessor
 
processDoc(String) - Method in class org.tribuo.data.text.impl.RegexPreprocessor
 
protobufFormat - Variable in class org.tribuo.data.DatasetExplorer.DatasetExplorerOptions
Load the model from a protobuf.
protobufFormat - Variable in class org.tribuo.data.PreprocessAndSerialize.PreprocessAndSerializeOptions
Save the dataset as a protobuf.
provenance - Variable in class org.tribuo.data.text.impl.SimpleTextDataSource
The data source provenance.
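
Example (illustrative only). Several process(List<String>) entries above return Optional.empty() when no response can be produced. The sketch below shows that convention using only the JDK. The class OptionalResponseSketch and its parsing rule (every value must be a number) are assumptions made for illustration, not Tribuo code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in, not part of Tribuo: converts raw response strings
// into doubles and reports failure with Optional.empty() rather than throwing.
class OptionalResponseSketch {

    static Optional<List<Double>> process(List<String> values) {
        List<Double> parsed = new ArrayList<>();
        for (String value : values) {
            // A missing or blank field means no response can be produced.
            if (value == null || value.trim().isEmpty()) {
                return Optional.empty();
            }
            try {
                parsed.add(Double.parseDouble(value.trim()));
            } catch (NumberFormatException e) {
                // A non-numeric field is also treated as a failed response.
                return Optional.empty();
            }
        }
        return Optional.of(parsed);
    }

    public static void main(String[] args) {
        System.out.println(process(Arrays.asList("1.5", " 2 ")).isPresent());
        System.out.println(process(Arrays.asList("1.5", "n/a")).isPresent());
    }
}
```

Callers can then branch on isPresent() instead of catching exceptions for rows without a usable response.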

Q

Quartile - Class in org.tribuo.data.columnar.processors.response
A quartile to split data into 4 chunks.
Quartile(double, double, double) - Constructor for class org.tribuo.data.columnar.processors.response.Quartile
Constructs a quartile with the specified values.
QuartileResponseProcessor<T extends Output<T>> - Class in org.tribuo.data.columnar.processors.response
Processes the response into quartiles and emits them as classification outputs.
QuartileResponseProcessor(String, String, Quartile, OutputFactory<T>) - Constructor for class org.tribuo.data.columnar.processors.response.QuartileResponseProcessor
Constructs a response processor which emits 4 distinct bins for the output factory to process.
QuartileResponseProcessor(List<String>, List<Quartile>, OutputFactory<T>) - Constructor for class org.tribuo.data.columnar.processors.response.QuartileResponseProcessor
Constructs a response processor which emits 4 distinct bins for the output factory to process.
QUOTE - Static variable in class org.tribuo.data.csv.CSVIterator
Default quote character.
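
Example (illustrative only). The Quartile and QuartileResponseProcessor entries above describe splitting a response into 4 bins. The sketch below shows one way three thresholds can define four bins. The class QuartileBinSketch, the meaning and order of its three constructor arguments, and the handling of boundary values are all assumptions and may differ from Tribuo's Quartile class.

```java
// Hypothetical illustration, not Tribuo's Quartile class: three ascending
// thresholds split the real line into four bins numbered 0 to 3.
class QuartileBinSketch {
    private final double lower;
    private final double median;
    private final double upper;

    QuartileBinSketch(double lower, double median, double upper) {
        // Written as a negated conjunction so NaN thresholds are rejected too.
        if (!(lower <= median && median <= upper)) {
            throw new IllegalArgumentException("Thresholds must be ascending");
        }
        this.lower = lower;
        this.median = median;
        this.upper = upper;
    }

    // Assumption: a value equal to a threshold falls into the lower bin.
    int bin(double value) {
        if (value <= lower) {
            return 0;
        } else if (value <= median) {
            return 1;
        } else if (value <= upper) {
            return 2;
        } else {
            return 3;
        }
    }

    public static void main(String[] args) {
        QuartileBinSketch q = new QuartileBinSketch(10.0, 20.0, 30.0);
        for (double v : new double[] {5.0, 15.0, 25.0, 35.0}) {
            System.out.println(v + " -> bin " + q.bin(v));
        }
    }
}
```

A classification output could then be built from the bin index, which is the role the index gives to the OutputFactory.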

R

rawLines - Variable in class org.tribuo.data.text.impl.SimpleStringDataSource
Used because OLCUT doesn't support generic Iterables.
read() - Method in class org.tribuo.data.text.impl.SimpleStringDataSource
 
read() - Method in class org.tribuo.data.text.impl.SimpleTextDataSource
 
read() - Method in class org.tribuo.data.text.TextDataSource
Reads the data from the Path.
REAL - Enum constant in enum org.tribuo.data.columnar.FieldProcessor.GeneratedFeatureType
Real valued features.
RegexFieldProcessor - Class in org.tribuo.data.columnar.processors.field
A FieldProcessor which applies a regex to a field and generates ColumnarFeatures based on the matches.
RegexFieldProcessor(String, String, EnumSet<RegexFieldProcessor.Mode>) - Constructor for class org.tribuo.data.columnar.processors.field.RegexFieldProcessor
Constructs a field processor which emits features when the field value matches the supplied regex.
RegexFieldProcessor(String, Pattern, EnumSet<RegexFieldProcessor.Mode>) - Constructor for class org.tribuo.data.columnar.processors.field.RegexFieldProcessor
Constructs a field processor which emits features when the field value matches the supplied regex.
RegexFieldProcessor.Mode - Enum in org.tribuo.data.columnar.processors.field
Matching mode.
regexMappingProcessors - Variable in class org.tribuo.data.columnar.RowProcessor
The map of regexes to field processors.
RegexPreprocessor - Class in org.tribuo.data.text.impl
A simple document preprocessor which applies regular expressions to the input.
RegexPreprocessor(List<String>, List<String>) - Constructor for class org.tribuo.data.text.impl.RegexPreprocessor
Construct a regex preprocessor.
replaceNewlinesWithSpaces - Variable in class org.tribuo.data.columnar.RowProcessor
Should newlines be replaced with spaces before processing.
responseProcessor - Variable in class org.tribuo.data.columnar.RowProcessor
The processor which extracts the response.
ResponseProcessor<T extends Output<T>> - Interface in org.tribuo.data.columnar
An interface that will take the response field and produce an Output.
ResultSetIterator - Class in org.tribuo.data.sql
An iterator over a ResultSet returned from JDBC.
ResultSetIterator(ResultSet) - Constructor for class org.tribuo.data.sql.ResultSetIterator
Construct a result set iterator over the supplied result set.
ResultSetIterator(ResultSet, int) - Constructor for class org.tribuo.data.sql.ResultSetIterator
Constructs a result set iterator over the supplied result set using the specified fetch buffer size.
Row(long, List<String>, Map<String, String>) - Constructor for class org.tribuo.data.columnar.ColumnarIterator.Row
Constructs a row from a columnar source.
rowIterator() - Method in class org.tribuo.data.columnar.ColumnarDataSource
The iterator that emits ColumnarIterator.Row objects from the underlying data source.
rowIterator() - Method in class org.tribuo.data.csv.CSVDataSource
 
rowIterator() - Method in class org.tribuo.data.sql.SQLDataSource
 
rowProcessor - Variable in class org.tribuo.data.columnar.ColumnarDataSource
The RowProcessor to use.
rowProcessor - Variable in class org.tribuo.data.DataOptions
The name of the row processor from the config file.
RowProcessor<T extends Output<T>> - Class in org.tribuo.data.columnar
A processor which takes a Map of String to String and returns an Example.
RowProcessor() - Constructor for class org.tribuo.data.columnar.RowProcessor
For olcut.
RowProcessor(List<FieldExtractor<?>>, FieldExtractor<Float>, ResponseProcessor<T>, Map<String, FieldProcessor>, Map<String, FieldProcessor>, Set<FeatureProcessor>) - Constructor for class org.tribuo.data.columnar.RowProcessor
Deprecated.
Prefer RowProcessor.Builder to many-argument constructors
RowProcessor(List<FieldExtractor<?>>, FieldExtractor<Float>, ResponseProcessor<T>, Map<String, FieldProcessor>, Map<String, FieldProcessor>, Set<FeatureProcessor>, boolean) - Constructor for class org.tribuo.data.columnar.RowProcessor
Deprecated.
Prefer RowProcessor.Builder to many-argument constructors
RowProcessor(List<FieldExtractor<?>>, FieldExtractor<Float>, ResponseProcessor<T>, Map<String, FieldProcessor>, Set<FeatureProcessor>) - Constructor for class org.tribuo.data.columnar.RowProcessor
Deprecated.
Prefer RowProcessor.Builder to many-argument constructors
RowProcessor(List<FieldExtractor<?>>, ResponseProcessor<T>, Map<String, FieldProcessor>) - Constructor for class org.tribuo.data.columnar.RowProcessor
Deprecated.
Prefer RowProcessor.Builder to many-argument constructors
RowProcessor(ResponseProcessor<T>, Map<String, FieldProcessor>) - Constructor for class org.tribuo.data.columnar.RowProcessor
Constructs a RowProcessor using the supplied responseProcessor to extract the response variable, and the supplied fieldProcessorMap to control which fields are parsed and how they are parsed.
RowProcessor(ResponseProcessor<T>, Map<String, FieldProcessor>, Set<FeatureProcessor>) - Constructor for class org.tribuo.data.columnar.RowProcessor
Constructs a RowProcessor using the supplied responseProcessor to extract the response variable, and the supplied fieldProcessorMap to control which fields are parsed and how they are parsed.
RowProcessor.Builder<T extends Output<T>> - Class in org.tribuo.data.columnar
Builder for RowProcessor.
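
Example (illustrative only). The regexMappingProcessors entry above describes a map from regexes to field processors. The sketch below shows the lookup idea with java.util.regex.Pattern. The class RegexMappingSketch is hypothetical, processors are represented by plain strings, and both the full-match rule and the first-match-wins order are assumptions rather than documented Tribuo behaviour.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical illustration, not Tribuo code: selects a processor for a
// field by testing the field name against a list of compiled regexes.
class RegexMappingSketch {
    // LinkedHashMap keeps insertion order so the first matching regex wins.
    private final Map<Pattern, String> processors = new LinkedHashMap<>();

    void add(String regex, String processorName) {
        processors.put(Pattern.compile(regex), processorName);
    }

    Optional<String> lookup(String fieldName) {
        for (Map.Entry<Pattern, String> entry : processors.entrySet()) {
            // Assumption: the whole field name must match the regex.
            if (entry.getKey().matcher(fieldName).matches()) {
                return Optional.of(entry.getValue());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        RegexMappingSketch mapping = new RegexMappingSketch();
        mapping.add("feat_\\d+", "double");
        mapping.add("text_.*", "text");
        System.out.println(mapping.lookup("feat_12"));
        System.out.println(mapping.lookup("id"));
    }
}
```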

S

save(Path, Dataset<T>, String) - Method in class org.tribuo.data.csv.CSVSaver
Saves the dataset to the specified path.
save(Path, Dataset<T>, Set<String>) - Method in class org.tribuo.data.csv.CSVSaver
Saves the dataset to the specified path.
saveCSV(CommandInterpreter, String) - Method in class org.tribuo.data.DatasetExplorer
Saves out the dataset as a CSV file.
saveModel(Model<T>) - Method in class org.tribuo.data.DataOptions
Saves the model out to the path in DataOptions.outputPath.
scaleFeatures - Variable in class org.tribuo.data.DataOptions
Scales the features to the range 0-1 independently.
scaleIncZeros - Variable in class org.tribuo.data.DataOptions
Includes implicit zeros in the scale range calculation.
seed - Variable in class org.tribuo.data.DataOptions
RNG seed.
seed - Variable in class org.tribuo.data.text.SplitTextData.TrainTestSplitOptions
Seed for the RNG.
SEMICOLON - Enum constant in enum org.tribuo.data.DataOptions.Delimiter
Semicolon separator.
SEPARATOR - Static variable in class org.tribuo.data.csv.CSVIterator
Default separator character.
SERIALIZED - Enum constant in enum org.tribuo.data.DataOptions.InputFormat
Serialized Tribuo datasets.
SERIALIZED_PROTOBUF - Enum constant in enum org.tribuo.data.DataOptions.InputFormat
Protobuf serialized Tribuo datasets.
setFeatureProcessors(Set<FeatureProcessor>) - Method in class org.tribuo.data.columnar.RowProcessor.Builder
The FeatureProcessors to apply to each extracted feature list.
setFieldName(String) - Method in class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
Deprecated.
setFieldName(String) - Method in class org.tribuo.data.columnar.processors.response.EmptyResponseProcessor
Deprecated.
setFieldName(String) - Method in class org.tribuo.data.columnar.processors.response.FieldResponseProcessor
Deprecated.
setFieldName(String) - Method in class org.tribuo.data.columnar.processors.response.QuartileResponseProcessor
Deprecated.
setFieldName(String) - Method in interface org.tribuo.data.columnar.ResponseProcessor
Deprecated.
Response processors should be immutable; downstream objects assume that they are. Sets the field name this ResponseProcessor uses.
setFieldProcessors(Iterable<FieldProcessor>) - Method in class org.tribuo.data.columnar.RowProcessor.Builder
The FieldProcessors to apply to each row.
setMetadataExtractors(List<FieldExtractor<?>>) - Method in class org.tribuo.data.columnar.RowProcessor.Builder
If set, the supplied FieldExtractors will be run for each example, populating Example.getMetadata().
setRegexMappingProcessors(Map<String, FieldProcessor>) - Method in class org.tribuo.data.columnar.RowProcessor.Builder
A map from strings (interpreted as regular expressions by Pattern.compile(String)) to FieldProcessors such that if a field name matches a regular expression, the corresponding FieldProcessor is used to process it.
setReplaceNewLinesWithSpaces(boolean) - Method in class org.tribuo.data.columnar.RowProcessor.Builder
If true, replaces newlines in fields with spaces before passing them to FieldProcessors.
setWeightExtractor(FieldExtractor<Float>) - Method in class org.tribuo.data.columnar.RowProcessor.Builder
If set, the constructed RowProcessor will pass the extracted float to Example.setWeight(float) for each example.
showOutputStats(CommandInterpreter) - Method in class org.tribuo.data.DatasetExplorer
Shows the output statistics.
showProvenance(CommandInterpreter) - Method in class org.tribuo.data.DatasetExplorer
Shows the dataset provenance.
SimpleFieldExtractor<T> - Class in org.tribuo.data.columnar.extractors
Extracts a value from a single field to be placed in an Example's metadata field.
SimpleFieldExtractor() - Constructor for class org.tribuo.data.columnar.extractors.SimpleFieldExtractor
For olcut.
SimpleFieldExtractor(String) - Constructor for class org.tribuo.data.columnar.extractors.SimpleFieldExtractor
Constructs a simple field extractor which reads from the supplied field name and writes out to a metadata field with the same name.
SimpleFieldExtractor(String, String) - Constructor for class org.tribuo.data.columnar.extractors.SimpleFieldExtractor
Constructs a simple field extractor with the supplied field name and metadata field name.
SimpleStringDataSource<T extends Output<T>> - Class in org.tribuo.data.text.impl
A version of SimpleTextDataSource that accepts a List of Strings.
SimpleStringDataSource(List<String>, OutputFactory<T>, TextFeatureExtractor<T>) - Constructor for class org.tribuo.data.text.impl.SimpleStringDataSource
Constructs a simple string data source from the supplied lines.
SimpleStringDataSource.SimpleStringDataSourceProvenance - Class in org.tribuo.data.text.impl
Provenance for SimpleStringDataSource.
SimpleStringDataSourceProvenance(Map<String, Provenance>) - Constructor for class org.tribuo.data.text.impl.SimpleStringDataSource.SimpleStringDataSourceProvenance
Deserialization constructor.
SimpleTextDataSource<T extends Output<T>> - Class in org.tribuo.data.text.impl
A dataset for a simple data format for text classification experiments.
SimpleTextDataSource() - Constructor for class org.tribuo.data.text.impl.SimpleTextDataSource
For olcut.
SimpleTextDataSource(File, OutputFactory<T>, TextFeatureExtractor<T>) - Constructor for class org.tribuo.data.text.impl.SimpleTextDataSource
Constructs a simple text data source by reading lines from the supplied file.
SimpleTextDataSource(Path, OutputFactory<T>, TextFeatureExtractor<T>) - Constructor for class org.tribuo.data.text.impl.SimpleTextDataSource
Constructs a simple text data source by reading lines from the supplied path.
SimpleTextDataSource(OutputFactory<T>, TextFeatureExtractor<T>) - Constructor for class org.tribuo.data.text.impl.SimpleTextDataSource
Constructs a data source without a path.
SimpleTextDataSource.SimpleTextDataSourceProvenance - Class in org.tribuo.data.text.impl
Provenance for SimpleTextDataSource.
SimpleTextDataSourceProvenance(Map<String, Provenance>) - Constructor for class org.tribuo.data.text.impl.SimpleTextDataSource.SimpleTextDataSourceProvenance
Deserialization constructor.
splitFraction - Variable in class org.tribuo.data.text.SplitTextData.TrainTestSplitOptions
Split fraction.
SplitTextData - Class in org.tribuo.data.text
Splits data in our standard text format into training and testing portions.
SplitTextData() - Constructor for class org.tribuo.data.text.SplitTextData
 
SplitTextData.TrainTestSplitOptions - Class in org.tribuo.data.text
Command line options.
SQLDataSource<T extends Output<T>> - Class in org.tribuo.data.sql
A DataSource for loading columnar data from a database and applying FieldProcessors to it.
SQLDataSource(String, SQLDBConfig, OutputFactory<T>, RowProcessor<T>, boolean) - Constructor for class org.tribuo.data.sql.SQLDataSource
Constructs a SQLDataSource.
SQLDataSource.SQLDataSourceProvenance - Class in org.tribuo.data.sql
Provenance for SQLDataSource.
SQLDataSourceProvenance(Map<String, Provenance>) - Constructor for class org.tribuo.data.sql.SQLDataSource.SQLDataSourceProvenance
Deserialization constructor.
SQLDBConfig - Class in org.tribuo.data.sql
N.B.
SQLDBConfig(String, String, String, String, String, Map<String, String>) - Constructor for class org.tribuo.data.sql.SQLDBConfig
Constructs a SQL database configuration.
SQLDBConfig(String, String, String, Map<String, String>) - Constructor for class org.tribuo.data.sql.SQLDBConfig
Constructs a SQL database configuration.
SQLDBConfig(String, Map<String, String>) - Constructor for class org.tribuo.data.sql.SQLDBConfig
Constructs a SQL database configuration.
SQLToCSV - Class in org.tribuo.data.sql
Reads an SQL query from standard input and writes a CSV file containing the results to standard output.
SQLToCSV() - Constructor for class org.tribuo.data.sql.SQLToCSV
 
SQLToCSV.SQLToCSVOptions - Class in org.tribuo.data.sql
Command line options.
SQLToCSVOptions() - Constructor for class org.tribuo.data.sql.SQLToCSV.SQLToCSVOptions
 
startShell() - Method in class org.tribuo.data.DatasetExplorer
Starts the command shell.
SUM - Enum constant in enum org.tribuo.data.columnar.processors.feature.UniqueProcessor.UniqueType
Add together all the feature values.
SumAggregator - Class in org.tribuo.data.text.impl
A feature aggregator that aggregates occurrence counts across a number of feature lists.
SumAggregator() - Constructor for class org.tribuo.data.text.impl.SumAggregator
 
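
Example (illustrative only). The scaleFeatures and scaleIncZeros entries above describe scaling each feature to the range 0-1, optionally counting implicit zeros when computing the range. The sketch below shows min-max scaling for a single feature. The class MinMaxScaleSketch and the choice to map a constant feature to 0 are assumptions, not Tribuo's implementation.

```java
import java.util.Arrays;

// Hypothetical illustration, not Tribuo code: min-max scales the observed
// values of one feature into the range 0-1.
class MinMaxScaleSketch {

    static double[] scale(double[] observed, boolean includeImplicitZero) {
        // Seeding min and max with 0 makes an unobserved (implicit) zero
        // count towards the range.
        double min = includeImplicitZero ? 0.0 : Double.POSITIVE_INFINITY;
        double max = includeImplicitZero ? 0.0 : Double.NEGATIVE_INFINITY;
        for (double v : observed) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        double range = max - min;
        double[] scaled = new double[observed.length];
        for (int i = 0; i < observed.length; i++) {
            // Assumption: a constant feature (zero range) is mapped to 0.
            scaled[i] = range == 0.0 ? 0.0 : (observed[i] - min) / range;
        }
        return scaled;
    }

    public static void main(String[] args) {
        double[] values = {2.0, 4.0, 6.0};
        System.out.println(Arrays.toString(scale(values, false)));
        System.out.println(Arrays.toString(scale(values, true)));
    }
}
```

With the implicit zero included, the smallest observed value no longer maps to 0, which matters for sparse data where most examples omit the feature.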

T

TAB - Enum constant in enum org.tribuo.data.DataOptions.Delimiter
Tab separator.
termCounting - Variable in class org.tribuo.data.DataOptions
Use term counts instead of boolean when using the standard text format.
testingPath - Variable in class org.tribuo.data.DataOptions
Path to the testing file.
testSource - Variable in class org.tribuo.data.CompletelyConfigurableTrainTest.ConfigurableTrainTestOptions
Load the testing DataSource from the config file.
TEXT - Enum constant in enum org.tribuo.data.columnar.FieldProcessor.GeneratedFeatureType
Text features.
TEXT - Enum constant in enum org.tribuo.data.DataOptions.InputFormat
Text data in Tribuo's standard format (i.e., each line is "output ## text data").
TextDataSource<T extends Output<T>> - Class in org.tribuo.data.text
A base class for textual data sets.
TextDataSource() - Constructor for class org.tribuo.data.text.TextDataSource
For olcut.
TextDataSource(File, OutputFactory<T>, TextFeatureExtractor<T>, DocumentPreprocessor...) - Constructor for class org.tribuo.data.text.TextDataSource
Creates a text data set by reading it from a file.
TextDataSource(Path, OutputFactory<T>, TextFeatureExtractor<T>, DocumentPreprocessor...) - Constructor for class org.tribuo.data.text.TextDataSource
Creates a text data set by reading it from a path.
TextFeatureExtractor<T extends Output<T>> - Interface in org.tribuo.data.text
An interface for things that take text and turn them into examples that we can use to train or evaluate a classifier.
TextFeatureExtractorImpl<T extends Output<T>> - Class in org.tribuo.data.text.impl
An implementation of TextFeatureExtractor that takes a TextPipeline and generates ArrayExample.
TextFeatureExtractorImpl(TextPipeline) - Constructor for class org.tribuo.data.text.impl.TextFeatureExtractorImpl
Constructs a text feature extractor wrapping the supplied text pipeline.
TextFieldProcessor - Class in org.tribuo.data.columnar.processors.field
A FieldProcessor which takes a text field and runs a TextPipeline on it to generate features.
TextFieldProcessor(String, TextPipeline) - Constructor for class org.tribuo.data.columnar.processors.field.TextFieldProcessor
Constructs a field processor which uses the supplied text pipeline to process the field value.
TextPipeline - Interface in org.tribuo.data.text
A pipeline that takes a String and returns a List of Features.
TextProcessingException - Exception in org.tribuo.data.text
An exception thrown by the text processing system.
TextProcessingException(String) - Constructor for exception org.tribuo.data.text.TextProcessingException
Creates a TextProcessingException with the specified message.
TextProcessingException(String, Throwable) - Constructor for exception org.tribuo.data.text.TextProcessingException
Creates a TextProcessingException wrapping the supplied throwable with the specified message.
TextProcessingException(Throwable) - Constructor for exception org.tribuo.data.text.TextProcessingException
Creates a TextProcessingException wrapping the supplied throwable.
TextProcessor - Interface in org.tribuo.data.text
A TextProcessor takes some text and optionally a feature tag and generates a list of Features from that text.
TokenPipeline - Class in org.tribuo.data.text.impl
A pipeline for generating ngram features.
TokenPipeline(Tokenizer, int, boolean) - Constructor for class org.tribuo.data.text.impl.TokenPipeline
Creates a new token pipeline.
TokenPipeline(Tokenizer, int, boolean, int) - Constructor for class org.tribuo.data.text.impl.TokenPipeline
Creates a new token pipeline.
TokenPipeline(Tokenizer, int, boolean, int, boolean) - Constructor for class org.tribuo.data.text.impl.TokenPipeline
Creates a new token pipeline.
toString() - Method in class org.tribuo.data.columnar.ColumnarIterator.Row
 
toString() - Method in class org.tribuo.data.columnar.extractors.DateExtractor
 
toString() - Method in class org.tribuo.data.columnar.extractors.OffsetDateTimeExtractor
 
toString() - Method in class org.tribuo.data.columnar.extractors.SimpleFieldExtractor
 
toString() - Method in class org.tribuo.data.columnar.processors.field.DateFieldProcessor
 
toString() - Method in class org.tribuo.data.columnar.processors.field.DoubleFieldProcessor
 
toString() - Method in class org.tribuo.data.columnar.processors.field.IdentityProcessor
 
toString() - Method in class org.tribuo.data.columnar.processors.field.RegexFieldProcessor
 
toString() - Method in class org.tribuo.data.columnar.processors.field.TextFieldProcessor
 
toString() - Method in class org.tribuo.data.columnar.processors.response.BinaryResponseProcessor
 
toString() - Method in class org.tribuo.data.columnar.processors.response.EmptyResponseProcessor
 
toString() - Method in class org.tribuo.data.columnar.processors.response.FieldResponseProcessor
 
toString() - Method in class org.tribuo.data.columnar.processors.response.Quartile
 
toString() - Method in class org.tribuo.data.columnar.processors.response.QuartileResponseProcessor
 
toString() - Method in class org.tribuo.data.columnar.RowProcessor
 
toString() - Method in class org.tribuo.data.csv.CSVDataSource
 
toString() - Method in class org.tribuo.data.csv.CSVLoader.CSVLoaderProvenance
Deprecated.
 
toString() - Method in class org.tribuo.data.sql.SQLDataSource
 
toString() - Method in class org.tribuo.data.sql.SQLDBConfig
 
toString() - Method in class org.tribuo.data.text.DirectoryFileSource
 
toString() - Method in class org.tribuo.data.text.impl.BasicPipeline
 
toString() - Method in class org.tribuo.data.text.impl.SimpleStringDataSource
 
toString() - Method in class org.tribuo.data.text.impl.TextFeatureExtractorImpl
 
toString() - Method in class org.tribuo.data.text.impl.TokenPipeline
 
toString() - Method in class org.tribuo.data.text.TextDataSource
 
trainer - Variable in class org.tribuo.data.CompletelyConfigurableTrainTest.ConfigurableTrainTestOptions
Load a trainer from the config file.
trainer - Variable in class org.tribuo.data.ConfigurableTrainTest.ConfigurableTrainTestOptions
Load a trainer from the config file.
trainingPath - Variable in class org.tribuo.data.DataOptions
Path to the training file.
trainPath - Variable in class org.tribuo.data.text.SplitTextData.TrainTestSplitOptions
Output training data file.
trainSource - Variable in class org.tribuo.data.CompletelyConfigurableTrainTest.ConfigurableTrainTestOptions
Load the training DataSource from the config file.
TrainTestSplitOptions() - Constructor for class org.tribuo.data.text.SplitTextData.TrainTestSplitOptions
 
transformationMap - Variable in class org.tribuo.data.CompletelyConfigurableTrainTest.ConfigurableTrainTestOptions
Load a transformation map from the config file.
transformationMap - Variable in class org.tribuo.data.ConfigurableTrainTest.ConfigurableTrainTestOptions
Load a transformation map from the config file.
tryAdvance(Consumer<? super ColumnarIterator.Row>) - Method in class org.tribuo.data.columnar.ColumnarIterator
 
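
Example (illustrative only). The TEXT input format entry above describes lines of the form "output ## text data". The sketch below splits such a line into its two parts. The class TextLineSketch is hypothetical, and trimming whitespace and splitting only on the first "##" are assumptions about how such a line might be handled.

```java
import java.util.Optional;

// Hypothetical illustration, not Tribuo code: splits a line of the form
// "output ## text data" into its output and text parts.
class TextLineSketch {

    // Returns {output, text}, or Optional.empty() if the separator is absent.
    static Optional<String[]> parse(String line) {
        // Limit 2 splits on the first "##" only, so the text may contain "##".
        String[] parts = line.split("##", 2);
        if (parts.length != 2) {
            return Optional.empty();
        }
        return Optional.of(new String[] {parts[0].trim(), parts[1].trim()});
    }

    public static void main(String[] args) {
        Optional<String[]> parsed = parse("spam ## win a prize now");
        parsed.ifPresent(p -> System.out.println(p[0] + " | " + p[1]));
    }
}
```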

U

UniqueAggregator - Class in org.tribuo.data.text.impl
Aggregates feature tokens, generating unique features.
UniqueAggregator() - Constructor for class org.tribuo.data.text.impl.UniqueAggregator
Constructs an aggregator that replaces all features with the same name with a single feature with the last observed value of that feature.
UniqueAggregator(double) - Constructor for class org.tribuo.data.text.impl.UniqueAggregator
Constructs an aggregator that replaces all features with the same name with a single feature with the specified value.
UniqueProcessor - Class in org.tribuo.data.columnar.processors.feature
Processes a feature list, aggregating all the feature values with the same name.
UniqueProcessor(UniqueProcessor.UniqueType) - Constructor for class org.tribuo.data.columnar.processors.feature.UniqueProcessor
Creates a UniqueProcessor using the specified reduction operation.
UniqueProcessor.UniqueType - Enum in org.tribuo.data.columnar.processors.feature
The type of reduction operation to perform.
UPPERCASE - Enum constant in enum org.tribuo.data.text.impl.CasingPreprocessor.CasingOperation
Uppercase the input text.
username - Variable in class org.tribuo.data.sql.SQLToCSV.SQLToCSVOptions
Username for the SQL database.
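
Example (illustrative only). The UniqueAggregator and UniqueProcessor entries above describe collapsing features that share a name into one feature, using the last observed value, a fixed value, or a reduction such as a sum. The sketch below shows those three reductions over parallel name and value arrays. The class UniqueAggregationSketch and its array-based representation are assumptions, not Tribuo's Feature API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration, not Tribuo code: collapses features that share
// a name. names[i] and values[i] together describe one feature.
class UniqueAggregationSketch {

    // Keeps the last observed value for each name.
    static Map<String, Double> lastValue(String[] names, double[] values) {
        Map<String, Double> out = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            out.put(names[i], values[i]);
        }
        return out;
    }

    // Replaces every distinct name with the same fixed value.
    static Map<String, Double> fixedValue(String[] names, double value) {
        Map<String, Double> out = new LinkedHashMap<>();
        for (String name : names) {
            out.put(name, value);
        }
        return out;
    }

    // Adds together all values that share a name.
    static Map<String, Double> sum(String[] names, double[] values) {
        Map<String, Double> out = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            out.merge(names[i], values[i], Double::sum);
        }
        return out;
    }

    public static void main(String[] args) {
        String[] names = {"a", "b", "a"};
        double[] values = {1.0, 2.0, 3.0};
        System.out.println(lastValue(names, values));
        System.out.println(fixedValue(names, 1.0));
        System.out.println(sum(names, values));
    }
}
```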

V

validationPath - Variable in class org.tribuo.data.text.SplitTextData.TrainTestSplitOptions
Output validation data file.
value - Variable in enum org.tribuo.data.DataOptions.Delimiter
The delimiter character.
valueOf(String) - Static method in enum org.tribuo.data.columnar.FieldProcessor.GeneratedFeatureType
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.tribuo.data.columnar.processors.feature.UniqueProcessor.UniqueType
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.tribuo.data.columnar.processors.field.RegexFieldProcessor.Mode
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.tribuo.data.DataOptions.Delimiter
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.tribuo.data.DataOptions.InputFormat
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.tribuo.data.text.impl.CasingPreprocessor.CasingOperation
Returns the enum constant of this type with the specified name.
values() - Static method in enum org.tribuo.data.columnar.FieldProcessor.GeneratedFeatureType
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.tribuo.data.columnar.processors.feature.UniqueProcessor.UniqueType
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.tribuo.data.columnar.processors.field.RegexFieldProcessor.Mode
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.tribuo.data.DataOptions.Delimiter
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.tribuo.data.DataOptions.InputFormat
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.tribuo.data.text.impl.CasingPreprocessor.CasingOperation
Returns an array containing the constants of this enum type, in the order they are declared.

W

WEEK_OF_MONTH - Enum constant in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
The week of the month, as defined by ISO 8601 semantics for week of the year.
WEEK_OF_YEAR - Enum constant in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
The week of the year in ISO 8601.
weightExtractor - Variable in class org.tribuo.data.columnar.RowProcessor
The extractor for the example weight.
wrapFeatures(String, List<Feature>) - Static method in class org.tribuo.data.columnar.processors.field.TextFieldProcessor
Convert the Features from a text pipeline into ColumnarFeatures with the right field name.
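
Example (illustrative only). The WEEK_OF_YEAR and WEEK_OF_MONTH entries above refer to ISO 8601 week semantics. The sketch below shows how java.time computes such values. The class IsoWeekSketch is hypothetical, and using WeekFields.ISO.weekOfMonth() for the week of the month is one plausible reading of the description, not a statement of what DateFieldProcessor does internally.

```java
import java.time.LocalDate;
import java.time.temporal.IsoFields;
import java.time.temporal.WeekFields;

// Hypothetical illustration, not Tribuo code: ISO 8601 week numbers
// computed with java.time.
class IsoWeekSketch {

    // ISO week of the week-based year (weeks start on Monday, and week 1
    // is the first week with at least four days in the new year).
    static int weekOfYear(LocalDate date) {
        return date.get(IsoFields.WEEK_OF_WEEK_BASED_YEAR);
    }

    // The same Monday-start, four-day-minimum rule applied to a month.
    // Days before the first full-enough week are reported as week 0.
    static int weekOfMonth(LocalDate date) {
        return date.get(WeekFields.ISO.weekOfMonth());
    }

    public static void main(String[] args) {
        LocalDate newYear = LocalDate.of(2021, 1, 1);      // a Friday
        LocalDate firstMonday = LocalDate.of(2021, 1, 4);  // a Monday
        System.out.println(weekOfYear(newYear));      // 53 (belongs to 2020)
        System.out.println(weekOfYear(firstMonday));  // 1
        System.out.println(weekOfMonth(newYear));     // 0
        System.out.println(weekOfMonth(firstMonday)); // 1
    }
}
```

Note that 1 January 2021 falls in week 53 of the previous week-based year, so a week number alone does not identify the calendar year.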

Y

YEAR - Enum constant in enum org.tribuo.data.columnar.processors.field.DateFieldProcessor.DateFeatureType
The year.