trait LogStore extends AnyRef
General interface for all critical file system operations required to read and write the DeltaLog. The correctness of the DeltaLog is predicated on the atomicity and durability guarantees of the implementation of this interface. Specifically,
1. Atomic visibility of files: Any file written through this store must be made visible atomically. In other words, this should not generate partial files.
2. Mutual exclusion: Only one writer must be able to create (or rename) a file at the final destination.
3. Consistent listing: Once a file has been written in a directory, all future listings for that directory must return that file.
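As an illustration of these guarantees only (not the implementation shipped with Delta), the following sketch shows how an HDFS-like store might honor points 1 and 2 by writing to a temporary file and renaming it into place. The class name and the exact rename behavior are assumptions; the import of LogStore from the package documented here is assumed as well.

```scala
import java.io.{BufferedWriter, OutputStreamWriter}
import java.nio.charset.StandardCharsets.UTF_8
import java.nio.file.FileAlreadyExistsException
import java.util.UUID

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

// Hypothetical skeleton of a LogStore for an HDFS-like file system.
// Writing to a temporary file and renaming it into place means readers never
// see a partial file (guarantee 1); refusing to replace an existing
// destination approximates mutual exclusion (guarantee 2).
abstract class ExampleHdfsLogStore extends LogStore {

  override def write(
      path: Path,
      actions: Iterator[String],
      overwrite: Boolean,
      hadoopConf: Configuration): Unit = {
    val fs = path.getFileSystem(hadoopConf)
    val tempPath = new Path(path.getParent, s".${path.getName}.${UUID.randomUUID}.tmp")

    // 1. Write everything to a temporary file first.
    val out = new BufferedWriter(new OutputStreamWriter(fs.create(tempPath), UTF_8))
    try {
      actions.foreach { line => out.write(line); out.newLine() }
    } finally {
      out.close()
    }

    // 2. Move it to the final destination. The exists/rename pair below is only
    //    an approximation of mutual exclusion; a production store would use a
    //    truly atomic primitive offered by its storage system.
    if (fs.exists(path)) {
      if (!overwrite) {
        fs.delete(tempPath, false)
        throw new FileAlreadyExistsException(path.toString)
      }
      fs.delete(path, false) // overwrite requested: remove the old file first
    }
    if (!fs.rename(tempPath, path)) {
      fs.delete(tempPath, false)
      throw new FileAlreadyExistsException(path.toString)
    }
  }
}
```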
Abstract Value Members
- abstract def invalidateCache(): Unit
  Invalidate any caching that the implementation may be using.
- abstract def listFrom(path: Path): Iterator[FileStatus]
  List the paths in the same directory that are lexicographically greater than or equal to (UTF-8 sorting) the given path. The result should also be sorted by the file name.
  - Annotations: @deprecated
  - Deprecated: call the method that asks for a Hadoop Configuration object instead
- abstract def read(path: Path): Seq[String]
  Load the given file and return a Seq of lines. The line break will be removed from each line. This method loads the entire file into memory; call readAsIterator if possible, as its implementation may be more efficient.
  - Annotations: @deprecated
  - Deprecated: call the method that asks for a Hadoop Configuration object instead
- abstract def write(path: Path, actions: Iterator[String], overwrite: Boolean = false): Unit
  Write the given actions to the given path, with or without overwrite as indicated. The implementation must throw a java.nio.file.FileAlreadyExistsException if the file already exists and overwrite is false. Furthermore, the implementation must ensure that the entire file is made visible atomically; that is, it should not generate partial files.
  - Annotations: @deprecated
  - Deprecated: call the method that asks for a Hadoop Configuration object instead
Concrete Value Members
- def isPartialWriteVisible(path: Path, hadoopConf: Configuration): Boolean
  Whether a partial write is visible when writing to path. Because this depends on the underlying file system implementation, the path is required here to identify that file system, even though in most cases a log store only deals with one file system. The default value is only provided here for legacy reasons and will be removed; any LogStore implementation should override this method instead of relying on the default.
  Note: the default implementation ignores the hadoopConf parameter for backward compatibility. Subclasses should override this method and use hadoopConf properly to support passing Hadoop file system configurations through DataFrame options.
- def listFrom(path: Path, hadoopConf: Configuration): Iterator[FileStatus]
  List the paths in the same directory that are lexicographically greater than or equal to (UTF-8 sorting) the given path. The result should also be sorted by the file name.
  Note: the default implementation ignores the hadoopConf parameter for backward compatibility. Subclasses should override this method and use hadoopConf properly to support passing Hadoop file system configurations through DataFrame options.
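As a usage sketch, the lexicographic ordering guarantee lets a caller pick up all Delta commit files at or after a given version. The helper below and the zero-padded commit-file naming are assumptions about the caller, not part of this interface; the import of LogStore from the package documented here is assumed.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

// Hypothetical helper: list commit files at or after `version` in a table's
// _delta_log directory. With zero-padded file names, UTF-8 lexicographic
// order matches version order, so listFrom returns exactly the later commits.
def commitsFrom(store: LogStore, logPath: Path, version: Long, conf: Configuration) = {
  val start = new Path(logPath, f"$version%020d.json")
  store.listFrom(start, conf).filter(_.getPath.getName.endsWith(".json"))
}
```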
- final def read(fileStatus: FileStatus, hadoopConf: Configuration): Seq[String]
  Load the file represented by the given fileStatus and return a Seq of lines. The line break will be removed from each line.
  Note: using a stale FileStatus may return an incorrect result.
- def read(path: Path, hadoopConf: Configuration): Seq[String]
  Load the given file and return a Seq of lines. The line break will be removed from each line. This method loads the entire file into memory; call readAsIterator if possible, as its implementation may be more efficient.
  Note: the default implementation ignores the hadoopConf parameter for backward compatibility. Subclasses should override this method and use hadoopConf properly to support passing Hadoop file system configurations through DataFrame options.
- def readAsIterator(fileStatus: FileStatus, hadoopConf: Configuration): ClosableIterator[String]
  Load the file represented by the given fileStatus and return an iterator of lines. The line break will be removed from each line.
  Note 1: the returned ClosableIterator should be closed when it is no longer used, to avoid a resource leak.
  Note 2: using a stale FileStatus may return an incorrect result.
- def readAsIterator(path: Path, hadoopConf: Configuration): ClosableIterator[String]
  Load the given file and return an iterator of lines. The line break will be removed from each line. The default implementation calls read to load the entire file into memory; an implementation should provide a more efficient approach if possible, for example by loading the file content on demand.
  Note 1: the returned ClosableIterator should be closed when it is no longer used, to avoid a resource leak.
  Note 2: the default implementation ignores the hadoopConf parameter for backward compatibility. Subclasses should override this method and use hadoopConf properly to support passing Hadoop file system configurations through DataFrame options.
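Because the returned iterator holds an open stream, callers should close it even when processing fails. A minimal usage sketch follows; the helper name is hypothetical, and it assumes ClosableIterator exposes the standard Iterator operations plus close(), with LogStore imported from the package documented here.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

// Hypothetical caller: count the non-empty action lines in one log file while
// making sure the underlying stream is released even if iteration fails.
def countActions(store: LogStore, commitFile: Path, conf: Configuration): Int = {
  val lines = store.readAsIterator(commitFile, conf)
  try {
    lines.count(_.nonEmpty)
  } finally {
    lines.close()
  }
}
```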
- def resolvePathOnPhysicalStorage(path: Path, hadoopConf: Configuration): Path
  Resolve the fully qualified path for the given path.
  Note: the default implementation ignores the hadoopConf parameter for backward compatibility. Subclasses should override this method and use hadoopConf properly to support passing Hadoop file system configurations through DataFrame options.
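A plausible override is sketched below, under the assumption that qualifying the path against the file system derived from hadoopConf is sufficient for the storage in question; the class name is hypothetical and the LogStore import is assumed.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

abstract class ExampleQualifyingLogStore extends LogStore {
  // Qualify the path against its file system so that relative or scheme-less
  // paths become fully qualified.
  override def resolvePathOnPhysicalStorage(path: Path, hadoopConf: Configuration): Path =
    path.getFileSystem(hadoopConf).makeQualified(path)
}
```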
- def write(path: Path, actions: Iterator[String], overwrite: Boolean, hadoopConf: Configuration): Unit
  Write the given actions to the given path, with or without overwrite as indicated. The implementation must throw a java.nio.file.FileAlreadyExistsException if the file already exists and overwrite is false. Furthermore, the implementation must ensure that the entire file is made visible atomically; that is, it should not generate partial files.
  Note: the default implementation ignores the hadoopConf parameter for backward compatibility. Subclasses should override this method and use hadoopConf properly to support passing Hadoop file system configurations through DataFrame options.
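A usage sketch of the overwrite = false contract, as seen by a caller trying to publish a new log entry: the surrounding names and the zero-padded file naming are hypothetical, and FileAlreadyExistsException here signals that another writer won the race.

```scala
import java.nio.file.FileAlreadyExistsException

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

// Hypothetical caller: attempt to publish version `v` of the log. Because
// write with overwrite = false must be mutually exclusive, catching
// FileAlreadyExistsException means a concurrent writer committed `v` first.
def tryCommit(store: LogStore, logPath: Path, v: Long, actions: Seq[String], conf: Configuration): Boolean = {
  val commitFile = new Path(logPath, f"$v%020d.json")
  try {
    store.write(commitFile, actions.iterator, overwrite = false, hadoopConf = conf)
    true
  } catch {
    case _: FileAlreadyExistsException => false
  }
}
```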
Deprecated Value Members
- def isPartialWriteVisible(path: Path): Boolean
  Whether a partial write is visible when writing to path. Because this depends on the underlying file system implementation, the path is required here to identify that file system, even though in most cases a log store only deals with one file system. The default value is only provided here for legacy reasons and will be removed; any LogStore implementation should override this method instead of relying on the default.
  - Annotations: @deprecated
  - Deprecated: call the method that asks for a Hadoop Configuration object instead
- final def listFrom(path: String): Iterator[FileStatus]
  List the paths in the same directory that are lexicographically greater than or equal to (UTF-8 sorting) the given path. The result should also be sorted by the file name.
  - Annotations: @deprecated
  - Deprecated: call the method that asks for a Hadoop Configuration object instead
- final def read(path: String): Seq[String]
  Load the given file and return a Seq of lines. The line break will be removed from each line. This method loads the entire file into memory; call readAsIterator if possible, as its implementation may be more efficient.
  - Annotations: @deprecated
  - Deprecated: call the method that asks for a Hadoop Configuration object instead
- def readAsIterator(path: Path): ClosableIterator[String]
  Load the given file and return an iterator of lines. The line break will be removed from each line. The default implementation calls read to load the entire file into memory; an implementation should provide a more efficient approach if possible, for example by loading the file content on demand.
  Note: the returned ClosableIterator should be closed when it is no longer used, to avoid a resource leak.
  - Annotations: @deprecated
  - Deprecated: call the method that asks for a Hadoop Configuration object instead
- final def readAsIterator(path: String): ClosableIterator[String]
  Load the given file and return an iterator of lines. The line break will be removed from each line. The default implementation calls read to load the entire file into memory; an implementation should provide a more efficient approach if possible, for example by loading the file content on demand.
  - Annotations: @deprecated
  - Deprecated: call the method that asks for a Hadoop Configuration object instead
- def resolvePathOnPhysicalStorage(path: Path): Path
  Resolve the fully qualified path for the given path.
  - Annotations: @deprecated
  - Deprecated: call the method that asks for a Hadoop Configuration object instead
- final def write(path: String, actions: Iterator[String]): Unit
  Write the given actions to the given path without overwriting any existing file. The implementation must throw a java.nio.file.FileAlreadyExistsException if the file already exists. Furthermore, the implementation must ensure that the entire file is made visible atomically; that is, it should not generate partial files.
  - Annotations: @deprecated
  - Deprecated: call the method that asks for a Hadoop Configuration object instead