All Classes and Interfaces
Class
Description
A feature aggregator that averages feature values across a feature list.
An example implementation of TextPipeline.
A ResponseProcessor that takes a single value of the field as the positive class and all other values as the negative class.
A document preprocessor which uppercases or lowercases the input.
The possible casing operations.
A ConfigurableDataSource base class which takes columnar data (e.g., csv or DB table rows) and generates Examples.
A Feature with extra bookkeeping for use inside the columnar package.
An abstract class for iterators that read data into a columnar format, usually from a file of some kind.
A representation of a row of untyped data from a columnar data source.
Build and run a predictor for a standard dataset.
Command line options.
Build and run a predictor for a standard dataset.
Command line options.
A DataSource for loading separable data from a text file (e.g., CSV, TSV) and applying FieldProcessors to it.
Provenance for CSVDataSource.
An iterator over a CSV file.
Load a DataSource/Dataset from a CSV file.
Deprecated.
Saves a Dataset in CSV format suitable for loading by CSVLoader.
Options for working with training and test data in a CLI.
The delimiters supported by CSV files in this options object.
The input formats supported by this options object.
A CLI for exploring a serialised Dataset.
Command line options.
Extracts the field value and translates it to a LocalDate based on the specified DateTimeFormatter.
Processes a column that contains a date value.
The types of date features which can be extracted.
A data source for a somewhat-common format for text classification datasets: a top level directory that contains a number of subdirectories.
Provenance for DirectoryFileSource.
An interface for things that can pre-process documents before they are broken into features.
Extracts the field value and converts it to a double.
Processes a column that contains a real value.
A ResponseProcessor that always emits an empty optional.
An interface for aggregating feature values into other values.
Hashes the feature names to reduce the dimensionality.
Takes a list of columnar features and adds new features or removes existing features.
A feature transformer maps a list of features to a new list of features.
Extracts a value from a field to be placed in an Example's metadata field.
An interface for things that process the columns in a data set.
The types of generated features.
A response processor that returns the value(s) in a given (set of) fields.
Extracts the field value and converts it to a float.
Extracts the field value and emits it as a String.
A FieldProcessor which converts the field name and value into a feature with a value of IdentityProcessor.FEATURE_VALUE.
An Extractor with special casing for loading the index from a Row.
Extracts the field value and converts it to an int.
A document pre-processor for 20 newsgroup data.
A text processor that will generate token ngrams of a particular size.
Extracts the field value and translates it to an OffsetDateTime based on the specified DateTimeFormatter.
Reads in a Datasource, processes all the data, and writes it out as a serialized dataset.
Command line options.
A quartile to split data into 4 chunks.
Processes the response into quartiles and emits them as classification outputs.
A FieldProcessor which applies a regex to a field and generates ColumnarFeatures based on the matches.
Matching mode.
A simple document preprocessor which applies regular expressions to the input.
An interface that will take the response field and produce an Output.
An iterator over a ResultSet returned from JDBC.
A processor which takes a Map of String to String and returns an Example.
Builder for RowProcessor.
Extracts a value from a single field to be placed in an Example's metadata field.
A version of SimpleTextDataSource that accepts a List of Strings.
Provenance for SimpleStringDataSource.
A dataset for a simple data format for text classification experiments.
Provenance for SimpleTextDataSource.
Splits data in our standard text format into training and testing portions.
Command line options.
A DataSource for loading columnar data from a database and applying FieldProcessors to it.
Provenance for SQLDataSource.
N.B.
Read an SQL query in on the standard input, write a CSV file containing the results to the standard output.
Command line options.
A feature aggregator that aggregates occurrence counts across a number of feature lists.
A base class for textual data sets.
An interface for things that take text and turn them into examples that we can use to train or evaluate a classifier.
A FieldProcessor which takes a text field and runs a TextPipeline on it to generate features.
A pipeline that takes a String and returns a List of Features.
An exception thrown by the text processing system.
A TextProcessor takes some text and optionally a feature tag and generates a list of Features from that text.
A pipeline for generating ngram features.
Aggregates feature tokens, generating unique features.
Processes a feature list, aggregating all the feature values with the same name.
The type of reduction operation to perform.
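
Several of the descriptions above name general techniques: generating token ngrams of a particular size, hashing feature names to reduce the dimensionality, and aggregating occurrence counts. The sketch below illustrates those three ideas in plain Java. It does not use or reproduce the classes listed on this page; the class name FeatureSketch, its method names, the "/" separator, and the use of String.hashCode as the hash function are all assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, not part of the library documented above.
final class FeatureSketch {

    private FeatureSketch() {}

    /** Joins each run of n consecutive tokens with "/" to form token ngrams. */
    static List<String> tokenNgrams(List<String> tokens, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        List<String> ngrams = new ArrayList<>();
        for (int i = 0; i + n <= tokens.size(); i++) {
            ngrams.add(String.join("/", tokens.subList(i, i + n)));
        }
        return ngrams;
    }

    /** Maps a feature name to a bucket in [0, dimension) using String.hashCode (an assumed hash). */
    static int hashBucket(String name, int dimension) {
        if (dimension < 1) {
            throw new IllegalArgumentException("dimension must be at least 1");
        }
        // floorMod keeps the result non-negative even when hashCode is negative.
        return Math.floorMod(name.hashCode(), dimension);
    }

    /** Counts how often each feature name occurs, keeping first-seen order. */
    static Map<String, Integer> countOccurrences(List<String> names) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String name : names) {
            counts.merge(name, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("the", "quick", "brown", "fox");
        List<String> bigrams = tokenNgrams(tokens, 2);
        System.out.println(bigrams);
        System.out.println(countOccurrences(List.of("x", "y", "x")));
        System.out.println(hashBucket(bigrams.get(0), 16));
    }
}
```

Hashing trades a smaller, fixed feature space for possible collisions between distinct names, so the bucket count is a tuning choice rather than a fixed rule.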