Class SegmentDirectory

  • All Implemented Interfaces:
    Closeable, AutoCloseable

    public abstract class SegmentDirectory
    extends Object
    implements Closeable
    Basic top-level interface to access segment indexes. Usage:
       
         SegmentDirectory segmentDir =
             SegmentDirectory.createFromLocalFS(dirName, segmentMetadata, ReadMode.mmap);
         try {
           // Cluster all writes first; they become visible only after saveAndClose().
           SegmentDirectory.Writer writer = segmentDir.createWriter();
           try {
             // look up an existing index through the writer
             writer.getIndexFor("column1", ColumnIndexType.FORWARD_INDEX);
             PinotDataBufferOld buffer =
                 writer.newIndexFor("column1", ColumnIndexType.FORWARD_INDEX, 1024);
             // write value 87 at index 512
             buffer.putLong(512, 87L);
             writer.saveAndClose();
           } finally {
             writer.close();
           }

           // Create the reader only after the writer has been saved and closed.
           SegmentDirectory.Reader reader = segmentDir.createReader();
           try {
             PinotDataBufferOld column1Dictionary =
                 reader.getIndexFor("column1", ColumnIndexType.DICTIONARY);
           } catch (Exception e) {
             // handle error
           } finally {
             reader.close();
           }
         } finally {
           // always close the directory, even if a write or read fails
           segmentDir.close();
         }
       
     
    Typical use cases for Pinot:
      1. Read existing indexes
      2. Read forward index and create new inverted index
      3. Drop inverted index
      4. Create dictionary, forward index and inverted index

    Semantics:
    ===========
    The semantics below are explicitly tied to the use cases above. Typically, you should cluster all the writes at the beginning (before reads). After writing, save/close the writer and create reader for reads. saveAndClose() is a costly operation. Reading after writes triggers full reload of data so use it with caution. For Pinot, this is a costly operation performed only at the initialization time so the penalty is acceptable.
      1. Single writer, multiple reader semantics
      2. Writes are not visible till the user calls saveAndClose()
      3. saveAndClose() is costly! Creating readers after writers is a costly operation.
      4. saveAndClose() does not guarantee atomicity. Failures during saveAndClose() can leave the directory corrupted.
      5. SegmentDirectory controls placement of data. User should not make any assumptions about data storage.
      6. Use factory-methods to instantiate SegmentDirectory. This is with the goal of supporting networked/distributed file system reads in the future.

    All things said, users can always get the bytebuffers through readers and change contents. If these buffers are mmapped then the changes will reflect in the segment storage.
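    The visibility and locking rules above can be illustrated with a toy in-memory model. This is only a sketch, not Pinot's implementation: ToySegmentDirectory, its nested Writer and Reader, and the put/get methods are hypothetical names invented for the example, and refusing a second writer by returning null is an assumption made here to model "single writer".

```java
import java.util.HashMap;
import java.util.Map;

/** Toy in-memory model of the semantics described above; NOT Pinot's implementation. */
public class ToySegmentDirectory {
  private final Map<String, Long> committed = new HashMap<>();
  private boolean writerOpen = false;

  /** Returns a writer, or null if one is already open (assumed way to model "single writer"). */
  public synchronized Writer createWriter() {
    if (writerOpen) {
      return null;
    }
    writerOpen = true;
    return new Writer();
  }

  /** Returns a reader over committed data, or null while the directory is locked for writes. */
  public synchronized Reader createReader() {
    if (writerOpen) {
      return null;
    }
    return new Reader(new HashMap<>(committed));
  }

  public class Writer implements AutoCloseable {
    private final Map<String, Long> staged = new HashMap<>();
    private boolean closed = false;

    /** Stages a value; it is not visible to readers yet. */
    public void put(String key, long value) {
      staged.put(key, value);
    }

    /** Publishes all staged values, then releases the write lock. */
    public void saveAndClose() {
      synchronized (ToySegmentDirectory.this) {
        committed.putAll(staged);
      }
      close();
    }

    /** Releases the write lock; staged values that were never saved are discarded. */
    @Override
    public void close() {
      synchronized (ToySegmentDirectory.this) {
        if (!closed) {
          closed = true;
          writerOpen = false;
        }
      }
    }
  }

  public static class Reader {
    private final Map<String, Long> snapshot;

    Reader(Map<String, Long> snapshot) {
      this.snapshot = snapshot;
    }

    public Long get(String key) {
      return snapshot.get(key);
    }
  }

  public static void main(String[] args) {
    ToySegmentDirectory dir = new ToySegmentDirectory();
    Writer writer = dir.createWriter();
    writer.put("column1", 87L);
    System.out.println(dir.createReader() == null); // true: locked for writes
    writer.saveAndClose();
    Reader reader = dir.createReader();
    System.out.println(reader.get("column1")); // 87
  }
}
```

    The model copies committed data into each reader, which loosely mirrors the note that creating readers after writes is costly; a real directory would reload index buffers instead.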
    • Constructor Detail

      • SegmentDirectory

        protected SegmentDirectory()
    • Method Detail

      • getIndexDir

        public abstract URI getIndexDir()
      • reloadMetadata

        public abstract void reloadMetadata()
                                     throws Exception
        Throws:
        Exception
      • getPath

        public abstract Path getPath()
        Get the path/URL for the directory
      • getDiskSizeBytes

        public abstract long getDiskSizeBytes()
      • getColumnsWithIndex

        public abstract Set<String> getColumnsWithIndex(ColumnIndexType type)
        Get the columns with specific index type, in this local segment directory.
        Returns:
        a set of columns with such index type.
      • prefetch

        public void prefetch(FetchContext fetchContext)
        This is a hint to the segment directory to begin prefetching buffers for the given context. Typically, this should be an async call made before operating on the segment.
        Parameters:
        fetchContext - context for this segment's fetch
      • acquire

        public void acquire(FetchContext fetchContext)
        This is an instruction to the segment directory to fetch buffers for the given context. When enabled, this should be a blocking call made before operating on the segment.
        Parameters:
        fetchContext - context for this segment's fetch
      • release

        public void release(FetchContext fetchContext)
        This is an instruction to the segment directory to release the fetched buffers for the given context. When enabled, this should be a call made after operating on the segment. It is possible that this is called multiple times.
        Parameters:
        fetchContext - context for this segment's fetch
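        The call order implied by prefetch, acquire and release can be sketched as follows. This is a minimal illustration under stated assumptions: Ctx and RecordingDirectory are hypothetical stand-ins for FetchContext and SegmentDirectory, process is an invented helper, and prefetch is invoked synchronously here although the description above suggests it is typically asynchronous.

```java
import java.util.ArrayList;
import java.util.List;

public class FetchLifecycleSketch {
  /** Hypothetical stand-in for FetchContext; the real class is not modeled here. */
  static final class Ctx {
    final String segmentName;

    Ctx(String segmentName) {
      this.segmentName = segmentName;
    }
  }

  /** Hypothetical stand-in that only records which hook was called, in order. */
  static class RecordingDirectory {
    final List<String> calls = new ArrayList<>();

    void prefetch(Ctx ctx) {
      calls.add("prefetch");
    }

    void acquire(Ctx ctx) {
      calls.add("acquire");
    }

    void release(Ctx ctx) {
      calls.add("release");
    }
  }

  /** Runs the work between acquire and release; release happens even if the work throws. */
  static void process(RecordingDirectory dir, Ctx ctx, Runnable work) {
    dir.prefetch(ctx); // hint, issued before operating on the segment
    dir.acquire(ctx);  // blocking fetch before operating on the segment
    try {
      work.run();
    } finally {
      dir.release(ctx); // always release what was acquired
    }
  }

  public static void main(String[] args) {
    RecordingDirectory dir = new RecordingDirectory();
    process(dir, new Ctx("segment_0"), () -> dir.calls.add("work"));
    System.out.println(dir.calls); // [prefetch, acquire, work, release]
  }
}
```

        Because release may be called more than once for the same context, an implementation that overrides it would presumably need to tolerate repeated calls.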
      • copyTo

        public void copyTo(File dest)
                    throws Exception
        Copy segment directory to a local directory.
        Parameters:
        dest - the destination directory
        Throws:
        Exception
      • getTier

        @Nullable
        public abstract String getTier()
        Get the storage tier where the segment directory is placed by server.
        Returns:
        storage tier, null by default.
      • setTier

        public abstract void setTier(@Nullable String tier)
        Set the storage tier where the segment directory is placed by server.
      • createReader

        public abstract SegmentDirectory.Reader createReader()
                                                      throws IOException,
                                                             org.apache.commons.configuration.ConfigurationException
        Create Reader for the directory
        Returns:
        Reader object if successfully created. null if the directory is already locked for writes
        Throws:
        IOException
        org.apache.commons.configuration.ConfigurationException
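        Since createReader can return null when the directory is locked for writes, callers may want an explicit null check before use. The sketch below shows one way to do that; the Directory and Reader interfaces, the describe method and the readOrSkip helper are hypothetical stand-ins written for this example, not Pinot types.

```java
import java.io.Closeable;
import java.io.IOException;

public class ReaderNullCheckSketch {
  /** Hypothetical stand-in for SegmentDirectory.Reader. */
  interface Reader extends Closeable {
    String describe();
  }

  /** Hypothetical stand-in exposing only createReader. */
  interface Directory {
    Reader createReader() throws IOException;
  }

  /** Uses a reader if one is available, otherwise reports that the read was skipped. */
  static String readOrSkip(Directory dir) throws IOException {
    Reader reader = dir.createReader();
    if (reader == null) {
      // null means the directory is locked for writes
      return "skipped: directory locked for writes";
    }
    try {
      return reader.describe();
    } finally {
      reader.close();
    }
  }

  public static void main(String[] args) throws IOException {
    Directory locked = () -> null;
    Directory open = () -> new Reader() {
      @Override
      public String describe() {
        return "read ok";
      }

      @Override
      public void close() {
        // nothing to release in this stand-in
      }
    };
    System.out.println(readOrSkip(locked)); // skipped: directory locked for writes
    System.out.println(readOrSkip(open));   // read ok
  }
}
```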