Package elki.datasource.parser
Class SimpleTransactionParser
- java.lang.Object
-
- elki.datasource.parser.AbstractStreamingParser
-
- elki.datasource.parser.SimpleTransactionParser
-
- All Implemented Interfaces:
elki.datasource.bundle.BundleStreamSource, Parser, StreamingParser
public class SimpleTransactionParser extends AbstractStreamingParser
Simple parser for transactional data, such as market baskets. To keep the input format simple and readable, all tokens are assumed to be text separated by whitespace, and each transaction is on a separate line.
An example file containing two transactions looks like this:
bread butter milk
paste tomato basil
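The input format described above can be illustrated with a minimal, self-contained sketch. It is not the ELKI implementation: the class name TransactionSketch is hypothetical, java.util.BitSet stands in for elki.data.BitVector, and a plain HashMap stands in for the keymap field. Skipping blank lines is also an assumption. Each distinct token receives the next free column index, and each line becomes one bit set.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of ELKI.
class TransactionSketch {
  /** Maps each distinct token to a column index, in order of first appearance. */
  final Map<String, Integer> keymap = new HashMap<>();

  /** Parse whitespace-separated transactions, one per line, into bit sets. */
  List<BitSet> parse(String text) {
    List<BitSet> transactions = new ArrayList<>();
    for (String line : text.split("\\R")) {
      line = line.trim();
      if (line.isEmpty()) {
        continue; // Assumption: blank lines are skipped.
      }
      BitSet bits = new BitSet();
      for (String token : line.split("\\s+")) {
        Integer idx = keymap.get(token);
        if (idx == null) {
          idx = keymap.size(); // Next unused column index.
          keymap.put(token, idx);
        }
        bits.set(idx);
      }
      transactions.add(bits);
    }
    return transactions;
  }

  public static void main(String[] args) {
    TransactionSketch sketch = new TransactionSketch();
    List<BitSet> result = sketch.parse("bread butter milk\npaste tomato basil\n");
    System.out.println(result);               // prints [{0, 1, 2}, {3, 4, 5}]
    System.out.println(sketch.keymap.size()); // prints 6
  }
}
```

A token that reappears in a later transaction reuses its existing index, so the number of distinct terms equals the size of the map.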
TODO: add a parameter to, e.g., use the first or last entry as labels instead of tokens.
- Since:
- 0.7.0
- Author:
- Erich Schubert
-
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static class | SimpleTransactionParser.Par | Parameterization class.
-
Field Summary
Fields
Modifier and Type | Field | Description
(package private) it.unimi.dsi.fastutil.longs.LongArrayList | buf | Buffer, will be reused.
(package private) elki.data.BitVector | curvec | Current vector.
(package private) it.unimi.dsi.fastutil.objects.Object2IntOpenHashMap<java.lang.String> | keymap | Map.
private static elki.logging.Logging | LOG | Class logger.
protected elki.datasource.bundle.BundleMeta | meta | Metadata.
(package private) elki.datasource.bundle.BundleStreamSource.Event | nextevent | Event to report next.
(package private) int | numterms | Number of different terms observed.
-
Fields inherited from class elki.datasource.parser.AbstractStreamingParser
reader, tokenizer
-
-
Constructor Summary
Constructors
Constructor | Description
SimpleTransactionParser(CSVReaderFormat format) | Constructor.
-
Method Summary
All Methods
Modifier and Type | Method | Description
void | cleanup() | Perform cleanup operations after parsing.
java.lang.Object | data(int rnum) |
protected elki.logging.Logging | getLogger() | Get the logger for this class.
elki.datasource.bundle.BundleMeta | getMeta() |
void | initStream(java.io.InputStream in) | Init the streaming parser for the given input stream.
elki.datasource.bundle.BundleStreamSource.Event | nextEvent() |
-
Methods inherited from class elki.datasource.parser.AbstractStreamingParser
asMultipleObjectsBundle, assignDBID, hasDBIDs, parse
-
-
-
-
Field Detail
-
LOG
private static final elki.logging.Logging LOG
Class logger.
-
numterms
int numterms
Number of different terms observed.
-
keymap
it.unimi.dsi.fastutil.objects.Object2IntOpenHashMap<java.lang.String> keymap
Map.
-
meta
protected elki.datasource.bundle.BundleMeta meta
Metadata.
-
nextevent
elki.datasource.bundle.BundleStreamSource.Event nextevent
Event to report next.
-
curvec
elki.data.BitVector curvec
Current vector.
-
buf
it.unimi.dsi.fastutil.longs.LongArrayList buf
Buffer, will be reused.
-
-
Constructor Detail
-
SimpleTransactionParser
public SimpleTransactionParser(CSVReaderFormat format)
Constructor.
- Parameters:
format - Input format
-
-
Method Detail
-
initStream
public void initStream(java.io.InputStream in)
Description copied from interface: StreamingParser
Init the streaming parser for the given input stream.
- Specified by:
initStream in interface StreamingParser
- Overrides:
initStream in class AbstractStreamingParser
- Parameters:
in - the stream to parse objects from
-
nextEvent
public elki.datasource.bundle.BundleStreamSource.Event nextEvent()
-
cleanup
public void cleanup()
Description copied from interface: Parser
Perform cleanup operations after parsing.
- Specified by:
cleanup in interface Parser
- Overrides:
cleanup in class AbstractStreamingParser
-
data
public java.lang.Object data(int rnum)
-
getMeta
public elki.datasource.bundle.BundleMeta getMeta()
-
getLogger
protected elki.logging.Logging getLogger()
Description copied from class: AbstractStreamingParser
Get the logger for this class.
- Specified by:
getLogger in class AbstractStreamingParser
- Returns:
- Logger.
-
-