class HDFSLogStore extends HadoopFileSystemLogStore with Logging

The LogStore implementation for HDFS, which uses Hadoop's FileContext APIs to provide the necessary atomicity and durability guarantees:

1. Atomic visibility of files: files are written using FileContext.rename, which is atomic on HDFS.

2. Consistent file listing: HDFS file listing is consistent.
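
The write-then-rename pattern behind the first guarantee can be sketched as follows. This is a minimal local-file-system analogy built on java.nio, not the Hadoop FileContext code this class uses; the object name AtomicWriteSketch and the temp-file naming are assumptions made for illustration. Readers never observe a half-written target because the content is completed in a temporary file first. One caveat: the existence check performed by the non-overwriting Files.move is not guaranteed to be atomic under concurrent writers on every local file system.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.nio.file.{Files, Path, StandardCopyOption}

// Hypothetical helper for illustration only; not part of the documented API.
object AtomicWriteSketch {
  /** Write `lines` to a temp file next to `target`, then rename it into place. */
  def write(target: Path, lines: Iterator[String], overwrite: Boolean): Unit = {
    val dir = target.toAbsolutePath.getParent
    // The temp file lives in the same directory so the final step is a plain rename.
    val tmp = Files.createTempFile(dir, "." + target.getFileName, ".tmp")
    try {
      val out = Files.newBufferedWriter(tmp, UTF_8)
      try lines.foreach { line => out.write(line); out.newLine() } finally out.close()
      if (overwrite) Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING)
      else Files.move(tmp, target) // throws FileAlreadyExistsException if target exists
    } finally {
      // No-op after a successful move; removes the temp file after a failure.
      Files.deleteIfExists(tmp)
    }
  }
}
```

Calling AtomicWriteSketch.write twice on the same path with overwrite = false fails the second time with java.nio.file.FileAlreadyExistsException, which mirrors the contract stated for write on this page.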

Linear Supertypes
Logging, HadoopFileSystemLogStore, LogStore, AnyRef, Any

Instance Constructors

  1. new HDFSLogStore(sparkConf: SparkConf, defaultHadoopConf: Configuration)

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  6. def createTempPath(path: Path): Path
    Attributes
    protected
    Definition Classes
    HadoopFileSystemLogStore
  7. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  8. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  9. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  10. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  11. def getFileContext(path: Path, hadoopConf: Configuration): FileContext
    Attributes
    protected
  12. def getHadoopConfiguration: Configuration
    Attributes
    protected
    Definition Classes
    HadoopFileSystemLogStore
  13. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  14. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  15. def initializeLogIfNecessary(isInterpreter: Boolean): Unit
    Attributes
    protected
    Definition Classes
    Logging
  16. def invalidateCache(): Unit

    Invalidate any caching that the implementation may be using.

    Definition Classes
    HadoopFileSystemLogStore → LogStore
  17. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  18. def isPartialWriteVisible(path: Path, hadoopConf: Configuration): Boolean

    Whether a partial write is visible when writing to path.

    Because this depends on the underlying file system implementation, the path is required as input in order to identify the underlying file system, even though in most cases a log store deals with only one file system.

    The default value is provided here only for legacy reasons and will be removed. Any LogStore implementation should override this method instead of relying on the default.

    Note: The default implementation ignores the hadoopConf parameter to provide backward compatibility. Subclasses should override this method and use hadoopConf properly to support passing Hadoop file system configurations through DataFrame options.

    Definition Classes
    HDFSLogStore → LogStore
  19. def isPartialWriteVisible(path: Path): Boolean

    Whether a partial write is visible when writing to path.

    Because this depends on the underlying file system implementation, the path is required as input in order to identify the underlying file system, even though in most cases a log store deals with only one file system.

    The default value is provided here only for legacy reasons and will be removed. Any LogStore implementation should override this method instead of relying on the default.

    Definition Classes
    HDFSLogStore → LogStore
  20. def isTraceEnabled(): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  21. def listFrom(path: Path, hadoopConf: Configuration): Iterator[FileStatus]

    List the paths in the same directory that are lexicographically greater than or equal to (UTF-8 sorting) the given path. The result should also be sorted by file name.

    Note: The default implementation ignores the hadoopConf parameter to provide backward compatibility. Subclasses should override this method and use hadoopConf properly to support passing Hadoop file system configurations through DataFrame options.

    Definition Classes
    HadoopFileSystemLogStore → LogStore
  22. def listFrom(path: Path): Iterator[FileStatus]

    List the paths in the same directory that are lexicographically greater than or equal to (UTF-8 sorting) the given path. The result should also be sorted by file name.

    Definition Classes
    HadoopFileSystemLogStore → LogStore
  23. def log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  24. def logDebug(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  25. def logDebug(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  26. def logError(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  27. def logError(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  28. def logInfo(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  29. def logInfo(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  30. def logName: String
    Attributes
    protected
    Definition Classes
    Logging
  31. def logTrace(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  32. def logTrace(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  33. def logWarning(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  34. def logWarning(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  35. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  36. val noAbstractFileSystemExceptionMessage: String
  37. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  38. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  39. def read(path: Path, hadoopConf: Configuration): Seq[String]

    Load the given file and return a Seq of lines. The line break will be removed from each line. This method loads the entire file into memory. Call readAsIterator if possible, as its implementation may be more efficient.

    Note: The default implementation ignores the hadoopConf parameter to provide backward compatibility. Subclasses should override this method and use hadoopConf properly to support passing Hadoop file system configurations through DataFrame options.

    Definition Classes
    HadoopFileSystemLogStore → LogStore
  40. def read(path: Path): Seq[String]

    Load the given file and return a Seq of lines. The line break will be removed from each line. This method loads the entire file into memory. Call readAsIterator if possible, as its implementation may be more efficient.

    Definition Classes
    HadoopFileSystemLogStore → LogStore
  41. final def read(fileStatus: FileStatus, hadoopConf: Configuration): Seq[String]

    Load the given file represented by fileStatus and return a Seq of lines. The line break will be removed from each line.

    Note: Using a stale FileStatus may produce an incorrect result.

    Definition Classes
    LogStore
  42. def readAsIterator(path: Path, hadoopConf: Configuration): ClosableIterator[String]

    Load the given file and return an iterator of lines. The line break will be removed from each line. The default implementation calls read to load the entire file into memory. An implementation should provide a more efficient approach if possible; for example, the file content can be loaded on demand.

    Note: The returned ClosableIterator should be closed when it is no longer used, to avoid a resource leak.

    Note: The default implementation ignores the hadoopConf parameter to provide backward compatibility. Subclasses should override this method and use hadoopConf properly to support passing Hadoop file system configurations through DataFrame options.

    Definition Classes
    HadoopFileSystemLogStore → LogStore
  43. def readAsIterator(path: Path): ClosableIterator[String]

    Load the given file and return an iterator of lines. The line break will be removed from each line. The default implementation calls read to load the entire file into memory. An implementation should provide a more efficient approach if possible; for example, the file content can be loaded on demand.

    Note: The returned ClosableIterator should be closed when it is no longer used, to avoid a resource leak.

    Definition Classes
    HadoopFileSystemLogStore → LogStore
  44. def readAsIterator(fileStatus: FileStatus, hadoopConf: Configuration): ClosableIterator[String]

    Load the file represented by the given fileStatus and return an iterator of lines. The line break will be removed from each line.

    Note-1: The returned ClosableIterator should be closed when it is no longer used, to avoid a resource leak.

    Note-2: Using a stale FileStatus may produce an incorrect result.

    Definition Classes
    LogStore
  45. def resolvePathOnPhysicalStorage(path: Path, hadoopConf: Configuration): Path

    Resolve the fully qualified path for the given path.

    Note: The default implementation ignores the hadoopConf parameter to provide backward compatibility. Subclasses should override this method and use hadoopConf properly to support passing Hadoop file system configurations through DataFrame options.

    Definition Classes
    HadoopFileSystemLogStore → LogStore
  46. def resolvePathOnPhysicalStorage(path: Path): Path

    Resolve the fully qualified path for the given path.

    Definition Classes
    HadoopFileSystemLogStore → LogStore
  47. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  48. def toString(): String
    Definition Classes
    AnyRef → Any
  49. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  50. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  51. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  52. def write(path: Path, actions: Iterator[String], overwrite: Boolean, hadoopConf: Configuration): Unit

    Write the given actions to the given path with or without overwrite as indicated. Implementations must throw java.nio.file.FileAlreadyExistsException if the file already exists and overwrite = false. Furthermore, implementations must ensure that the entire file is made visible atomically; that is, they should not generate partial files.

    Note: The default implementation ignores the hadoopConf parameter to provide backward compatibility. Subclasses should override this method and use hadoopConf properly to support passing Hadoop file system configurations through DataFrame options.

    Definition Classes
    HDFSLogStore → LogStore
  53. def write(path: Path, actions: Iterator[String], overwrite: Boolean = false): Unit

    Write the given actions to the given path with or without overwrite as indicated. Implementations must throw java.nio.file.FileAlreadyExistsException if the file already exists and overwrite = false. Furthermore, implementations must ensure that the entire file is made visible atomically; that is, they should not generate partial files.

    Definition Classes
    HDFSLogStore → LogStore
  54. def writeWithRename(path: Path, actions: Iterator[String], overwrite: Boolean, hadoopConf: Configuration): Unit

    An internal write implementation that uses FileSystem.rename().

    This implementation should only be used for the underlying file systems that support atomic renames, e.g., Azure is OK but HDFS is not.

    Attributes
    protected
    Definition Classes
    HadoopFileSystemLogStore
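
The listFrom contract listed above (entries in the same directory whose names are greater than or equal to the given name under UTF-8 ordering, returned sorted by name) can be illustrated on plain file names. This is a minimal sketch under stated assumptions, not the class's implementation: it works on strings rather than FileStatus objects, the object name ListFromSketch is hypothetical, the file names in the usage note are illustrative, and Arrays.compareUnsigned requires Java 9 or later.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.util.Arrays

// Hypothetical helper for illustration only; not part of the documented API.
object ListFromSketch {
  // Order names by their UTF-8 bytes, compared as unsigned values.
  private val utf8Order: Ordering[String] = new Ordering[String] {
    def compare(a: String, b: String): Int =
      Arrays.compareUnsigned(a.getBytes(UTF_8), b.getBytes(UTF_8))
  }

  /** Keep the names that sort at or after `start`, and return them sorted. */
  def listFrom(namesInDir: Seq[String], start: String): Seq[String] =
    namesInDir.filter(name => utf8Order.gteq(name, start)).sorted(utf8Order)
}
```

For example, given the names 00000000000000000000.json, 00000000000000000001.json, 00000000000000000002.json and _last_checkpoint, starting from 00000000000000000001.json drops the first name and keeps the other three in sorted order.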

Deprecated Value Members

  1. def getFileContext(path: Path): FileContext
    Attributes
    protected
    Annotations
    @deprecated
    Deprecated

    call the method that asks for a Hadoop Configuration object instead

  2. final def listFrom(path: String): Iterator[FileStatus]

    List the paths in the same directory that are lexicographically greater than or equal to (UTF-8 sorting) the given path. The result should also be sorted by file name.

    Definition Classes
    LogStore
    Annotations
    @deprecated
    Deprecated

    call the method that asks for a Hadoop Configuration object instead

  3. final def read(path: String): Seq[String]

    Load the given file and return a Seq of lines. The line break will be removed from each line. This method loads the entire file into memory. Call readAsIterator if possible, as its implementation may be more efficient.

    Definition Classes
    LogStore
    Annotations
    @deprecated
    Deprecated

    call the method that asks for a Hadoop Configuration object instead

  4. final def readAsIterator(path: String): ClosableIterator[String]

    Load the given file and return an iterator of lines. The line break will be removed from each line. The default implementation calls read to load the entire file into memory. An implementation should provide a more efficient approach if possible; for example, the file content can be loaded on demand.

    Definition Classes
    LogStore
    Annotations
    @deprecated
    Deprecated

    call the method that asks for a Hadoop Configuration object instead

  5. final def write(path: String, actions: Iterator[String]): Unit

    Write the given actions to the given path without overwriting any existing file. Implementations must throw java.nio.file.FileAlreadyExistsException if the file already exists. Furthermore, implementations must ensure that the entire file is made visible atomically; that is, they should not generate partial files.

    Definition Classes
    LogStore
    Annotations
    @deprecated
    Deprecated

    call the method that asks for a Hadoop Configuration object instead

  6. def writeWithRename(path: Path, actions: Iterator[String], overwrite: Boolean = false): Unit

    An internal write implementation that uses FileSystem.rename().

    This implementation should only be used for the underlying file systems that support atomic renames, e.g., Azure is OK but HDFS is not.

    Attributes
    protected
    Definition Classes
    HadoopFileSystemLogStore
    Annotations
    @deprecated
    Deprecated

    call the method that asks for a Hadoop Configuration object instead
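
The readAsIterator members listed on this page return an iterator that should be closed after use and that may load file content on demand. The following minimal sketch shows that shape. The trait ClosableLines is a hypothetical stand-in for the ClosableIterator type named in the signatures, and the sketch reads from any BufferedReader rather than from a Hadoop file system.

```scala
import java.io.{BufferedReader, Closeable}

// Hypothetical helper for illustration only; not part of the documented API.
object ClosableLinesSketch {
  // Stand-in for the ClosableIterator type named in the signatures on this page.
  trait ClosableLines extends Iterator[String] with Closeable

  /** Read lines lazily: each line is fetched only when the caller asks for it. */
  def lines(reader: BufferedReader): ClosableLines = new ClosableLines {
    private var nextLine: String = reader.readLine()
    def hasNext: Boolean = nextLine != null
    def next(): String = {
      if (nextLine == null) throw new NoSuchElementException("no more lines")
      val current = nextLine
      nextLine = reader.readLine() // readLine already strips the line break
      current
    }
    def close(): Unit = reader.close()
  }

  /** Usage pattern: always close the iterator, even if consuming it fails. */
  def countLines(reader: BufferedReader): Int = {
    val it = lines(reader)
    try it.size finally it.close()
  }
}
```

The try/finally in countLines is the part that matters for callers: the underlying stream is released whether or not the iteration completes.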
