Class ClusteringVectorParser

  • All Implemented Interfaces:
    elki.datasource.bundle.BundleStreamSource, elki.datasource.parser.Parser, elki.datasource.parser.StreamingParser

    public class ClusteringVectorParser
    extends elki.datasource.parser.AbstractStreamingParser
    Parser for simple clustering results in vector form, as written by ClusteringVectorDumper.

    This allows reading the output of multiple clustering runs and analyzing the results with ELKI algorithms.

    The input format is very simple: each line contains a sequence of cluster assignments in integer form, and an optional label:

     0 0 1 1 0 First
     0 0 0 1 2 Second
     
    This example represents two clusterings of 5 objects. The first clustering has two clusters; the second has three.
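
    The line format described above can be illustrated with a small standalone sketch. It does not use ELKI and is not the parser's actual implementation: the class ClusteringVectorSketch, its nested Row type, and the method parseLine are hypothetical names introduced here. The sketch assumes whitespace-separated tokens and treats every token that is not an integer as part of the label, whereas the real parser takes its separator from the configured CSVReaderFormat.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Standalone illustration of the line format; hypothetical helper, not part of ELKI. */
class ClusteringVectorSketch {
  /** One parsed line: cluster assignments plus an optional label (null if absent). */
  static final class Row {
    final int[] assignment;
    final String label;

    Row(int[] assignment, String label) {
      this.assignment = assignment;
      this.label = label;
    }

    /** Number of distinct cluster numbers used in this row. */
    int numClusters() {
      Set<Integer> seen = new HashSet<>();
      for (int c : assignment) {
        seen.add(c);
      }
      return seen.size();
    }
  }

  /** Assumption: tokens are whitespace-separated; non-integer tokens form the label. */
  static Row parseLine(String line) {
    List<Integer> ids = new ArrayList<>();
    StringBuilder label = new StringBuilder();
    for (String tok : line.trim().split("\\s+")) {
      if (tok.isEmpty()) {
        continue; // an empty input line yields a single empty token
      }
      try {
        ids.add(Integer.parseInt(tok));
      } catch (NumberFormatException e) {
        // Not an integer: collect it as part of the label.
        if (label.length() > 0) {
          label.append(' ');
        }
        label.append(tok);
      }
    }
    int[] assignment = new int[ids.size()];
    for (int i = 0; i < assignment.length; i++) {
      assignment[i] = ids.get(i);
    }
    return new Row(assignment, label.length() > 0 ? label.toString() : null);
  }

  public static void main(String[] args) {
    String[] lines = { " 0 0 1 1 0 First", " 0 0 0 1 2 Second" };
    for (String line : lines) {
      Row row = parseLine(line);
      System.out.println(row.label + ": " + row.assignment.length
          + " objects, " + row.numClusters() + " clusters");
    }
    // prints:
    // First: 5 objects, 2 clusters
    // Second: 5 objects, 3 clusters
  }
}
```

    Each row corresponds to one clustering, and position i in a row gives the cluster number of object i, so all rows of a file are expected to have the same length.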

    TODO: this parser is currently quite hacky and could use a cleanup.

    TODO: support noise, via negative cluster numbers?

    Since:
    0.7.0
    Author:
    Erich Schubert
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      static class  ClusteringVectorParser.Par
      Parameterization class.
      • Nested classes/interfaces inherited from interface elki.datasource.bundle.BundleStreamSource

        elki.datasource.bundle.BundleStreamSource.Event
    • Field Summary

      Fields 
      Modifier and Type Field Description
      (package private) it.unimi.dsi.fastutil.ints.IntArrayList buf1
      Buffers, will be reused.
      (package private) Clustering<Model> curclu
      Current clustering.
      (package private) elki.data.LabelList curlbl
      Current labels.
      (package private) boolean haslbl
      Flag if labels are present.
      (package private) java.util.ArrayList<java.lang.String> lbl
      Buffer for labels.
      private static elki.logging.Logging LOG
      Class logger.
      protected elki.datasource.bundle.BundleMeta meta
      Metadata.
      (package private) elki.datasource.bundle.BundleStreamSource.Event nextevent
      Event to report next.
      (package private) int numterms
      Number of different terms observed.
      (package private) elki.database.ids.DBIDRange range
      Range of the DBID values.
      • Fields inherited from class elki.datasource.parser.AbstractStreamingParser

        reader, tokenizer
    • Constructor Summary

      Constructors 
      Constructor Description
      ClusteringVectorParser(elki.datasource.parser.CSVReaderFormat format)
      Constructor.
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      java.lang.Object data(int rnum)
      protected elki.logging.Logging getLogger()
      elki.datasource.bundle.BundleMeta getMeta()
      void initStream(java.io.InputStream in)
      elki.datasource.bundle.BundleStreamSource.Event nextEvent()
      • Methods inherited from class elki.datasource.parser.AbstractStreamingParser

        asMultipleObjectsBundle, assignDBID, cleanup, hasDBIDs, parse
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • LOG

        private static final elki.logging.Logging LOG
        Class logger.
      • numterms

        int numterms
        Number of different terms observed.
      • meta

        protected elki.datasource.bundle.BundleMeta meta
        Metadata.
      • nextevent

        elki.datasource.bundle.BundleStreamSource.Event nextevent
        Event to report next.
      • curlbl

        elki.data.LabelList curlbl
        Current labels.
      • buf1

        it.unimi.dsi.fastutil.ints.IntArrayList buf1
        Buffers, will be reused.
      • range

        elki.database.ids.DBIDRange range
        Range of the DBID values.
      • lbl

        java.util.ArrayList<java.lang.String> lbl
        Buffer for labels.
      • haslbl

        boolean haslbl
        Flag if labels are present.
    • Constructor Detail

      • ClusteringVectorParser

        public ClusteringVectorParser(elki.datasource.parser.CSVReaderFormat format)
        Constructor.
        Parameters:
        format - Input format
    • Method Detail

      • initStream

        public void initStream(java.io.InputStream in)
        Specified by:
        initStream in interface elki.datasource.parser.StreamingParser
        Overrides:
        initStream in class elki.datasource.parser.AbstractStreamingParser
      • nextEvent

        public elki.datasource.bundle.BundleStreamSource.Event nextEvent()
      • data

        public java.lang.Object data(int rnum)
      • getMeta

        public elki.datasource.bundle.BundleMeta getMeta()
      • getLogger

        protected elki.logging.Logging getLogger()
        Specified by:
        getLogger in class elki.datasource.parser.AbstractStreamingParser