Class PennTreebankText

  • All Implemented Interfaces:
    ai.djl.training.dataset.Dataset

    public class PennTreebankText
    extends TextDataset
    The Penn Treebank (PTB) project selected 2,499 stories from a three-year Wall Street Journal (WSJ) collection of 98,732 stories for syntactic annotation.
    • Nested Class Summary

      Nested Classes
      Modifier and Type   Class                      Description
      static class        PennTreebankText.Builder   A builder to construct a PennTreebankText.
      • Nested classes/interfaces inherited from class ai.djl.training.dataset.RandomAccessDataset

        ai.djl.training.dataset.RandomAccessDataset.BaseBuilder<T extends ai.djl.training.dataset.RandomAccessDataset.BaseBuilder<T>>
      • Nested classes/interfaces inherited from interface ai.djl.training.dataset.Dataset

        ai.djl.training.dataset.Dataset.Usage
    • Method Detail

      • get

        public ai.djl.training.dataset.Record get(ai.djl.ndarray.NDManager manager,
                                                  long index)
                                           throws java.io.IOException
        Specified by:
        get in class ai.djl.training.dataset.RandomAccessDataset
        Throws:
        java.io.IOException
      • availableSize

        protected long availableSize()
        Specified by:
        availableSize in class ai.djl.training.dataset.RandomAccessDataset
      • prepare

        public void prepare(ai.djl.util.Progress progress)
                     throws java.io.IOException,
                            ai.djl.modality.nlp.embedding.EmbeddingException
        Prepares the dataset for use with tracked progress.
        Parameters:
        progress - the progress tracker
        Throws:
        java.io.IOException - for various exceptions depending on the dataset
        ai.djl.modality.nlp.embedding.EmbeddingException
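
The three methods above suggest a call order: prepare the dataset once, then read records by index up to the available size. The following is a minimal, self-contained sketch of that pattern only. It does not use the DJL classes: ToyTextDataset, ProgressListener, and countTokens are hypothetical stand-ins invented for illustration, and the whitespace tokenization and the "prepare is a no-op on the second call" behavior are assumptions of the sketch, not documented behavior of PennTreebankText.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class PrepareThenReadSketch {

    /** Hypothetical stand-in for a progress tracker; NOT ai.djl.util.Progress. */
    interface ProgressListener {
        void update(long done, long total);
    }

    /** Hypothetical toy dataset that mirrors the prepare / availableSize / get shape. */
    static class ToyTextDataset {
        private final String[] rawLines;
        private final List<String[]> samples = new ArrayList<>();
        private boolean prepared;

        ToyTextDataset(String... rawLines) {
            this.rawLines = rawLines;
        }

        /** Tokenizes every raw line once and reports progress after each line. */
        void prepare(ProgressListener progress) throws IOException {
            if (prepared) {
                return; // assumption of this sketch: repeated calls do nothing
            }
            for (int i = 0; i < rawLines.length; i++) {
                samples.add(rawLines[i].trim().split("\\s+"));
                progress.update(i + 1, rawLines.length);
            }
            prepared = true;
        }

        /** Number of records that can be read; zero until prepare has run. */
        long availableSize() {
            return samples.size();
        }

        /** Returns the tokens of the record at the given index. */
        String[] get(long index) throws IOException {
            if (!prepared) {
                throw new IllegalStateException("call prepare() before get()");
            }
            return samples.get((int) index);
        }
    }

    /** Prepares the dataset, then walks every index and sums the token counts. */
    static int countTokens(ToyTextDataset dataset) {
        try {
            dataset.prepare((done, total) ->
                    System.out.println("prepared " + done + "/" + total));
            int tokens = 0;
            for (long i = 0; i < dataset.availableSize(); i++) {
                tokens += dataset.get(i).length;
            }
            return tokens;
        } catch (IOException e) {
            // Both prepare and get declare IOException, so callers must handle it.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ToyTextDataset dataset = new ToyTextDataset(
                "pierre vinken will join the board",
                "mr. vinken is chairman");
        System.out.println("tokens: " + countTokens(dataset));
    }
}
```

With the real class, the same order would presumably apply: construct the dataset through PennTreebankText.Builder, call prepare with a progress tracker, and handle both IOException and EmbeddingException, since prepare declares both.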