class BufferingLogDeletionIterator extends Iterator[FileStatus]
An iterator that helps select old log files for deletion. It takes an input iterator of log files, starting from the earliest file, and returns the files that should be deleted until the given maxTimestamp or maxVersion is reached. Note that this iterator may stop returning files earlier than maxTimestamp or maxVersion if it finds files that need to be preserved for adjusting the timestamps of subsequent files.
Let's go through an example. Assume the following commit history:
+---------+-----------+--------------------+
| Version | Timestamp | Adjusted Timestamp |
+---------+-----------+--------------------+
|       0 |         0 |                  0 |
|       1 |         5 |                  5 |
|       2 |        10 |                 10 |
|       3 |         7 |                 11 |
|       4 |         8 |                 12 |
|       5 |        14 |                 14 |
+---------+-----------+--------------------+
As you can see from the example, we require timestamps to be monotonically increasing with respect to the version of the commit, and each commit to have a unique timestamp. If we have a commit which doesn't obey one of these two requirements, we adjust the timestamp of that commit to be one millisecond greater than the previous commit.
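The adjustment rule can be sketched as a single pass over the raw timestamps in ascending version order. This is a minimal illustration, not the library's code: the object and method names are hypothetical, and the input uses the commit history from the example, with timestamp 5 for version 1.

```scala
object TimestampAdjustment {
  /** Adjusts raw commit timestamps (given in ascending version order) so that
    * they are strictly increasing: any timestamp that is not greater than the
    * previous adjusted timestamp becomes "previous + 1 millisecond".
    */
  def adjust(timestamps: Seq[Long]): Seq[Long] =
    if (timestamps.isEmpty) Seq.empty
    else
      timestamps.tail.scanLeft(timestamps.head) { (prev, ts) =>
        if (ts <= prev) prev + 1 else ts
      }
}

// Raw timestamps of versions 0 to 5 from the example history.
val adjusted = TimestampAdjustment.adjust(Seq(0L, 5L, 10L, 7L, 8L, 14L))
```

For the example history this yields 0, 5, 10, 11, 12, 14, matching the Adjusted Timestamp column. Versions 3 and 4 both depend on version 2, because their adjusted values are derived from its timestamp.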
Given the above commit history, the behavior of this iterator will be as follows:
- For maxVersion = 1 and maxTimestamp = 9, we can delete versions 0 and 1
- Until we receive maxVersion >= 4 and maxTimestamp >= 12, we can't delete versions 2 and 3. This is because version 2 is used to adjust the timestamps of commits up to version 4.
- For maxVersion >= 5 and maxTimestamp >= 14, we can delete everything.
The semantics of time travel guarantee that for a given timestamp, the user will ALWAYS get the same version. Suppose a user asks for the version at timestamp 11. If all files are present, we return version 3 (adjusted timestamp 11) for this query. If we delete versions 0-2, the original timestamp of version 3 (7) no longer has an anchor to adjust against, so re-executing the time travel query would return version 4. This is the motivation behind this iterator implementation.
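The time travel scenario above can be reproduced with a small model. The sketch below assumes that a timestamp query resolves to the latest commit whose adjusted timestamp is not greater than the requested one. `Commit`, `adjust`, and `versionAt` are hypothetical names used only for illustration, and the history uses timestamp 5 for version 1.

```scala
object TimeTravelExample {
  final case class Commit(version: Long, timestamp: Long)

  /** Makes timestamps strictly increasing in version order. */
  def adjust(commits: Seq[Commit]): Vector[Commit] =
    commits.foldLeft(Vector.empty[Commit]) { (acc, c) =>
      acc.lastOption match {
        case Some(prev) if c.timestamp <= prev.timestamp =>
          acc :+ c.copy(timestamp = prev.timestamp + 1)
        case _ =>
          acc :+ c
      }
    }

  /** Latest version whose adjusted timestamp is <= ts, if any. */
  def versionAt(commits: Seq[Commit], ts: Long): Option[Long] =
    adjust(commits).takeWhile(_.timestamp <= ts).lastOption.map(_.version)

  val history: Seq[Commit] = Seq(
    Commit(0, 0), Commit(1, 5), Commit(2, 10),
    Commit(3, 7), Commit(4, 8), Commit(5, 14))
}

import TimeTravelExample._

// With the full history, timestamp 11 resolves to version 3.
val before = versionAt(history, 11L)
// After dropping versions 0-2, version 3 keeps its raw timestamp 7 and
// version 4 keeps 8, so the same query now resolves to version 4.
val after = versionAt(history.drop(3), 11L)
```

The same query returning two different versions is exactly the inconsistency that retaining version 2 prevents.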
The implementation maintains an internal "maybeDelete" buffer of files that we are unsure about deleting because they may be needed to adjust the timestamps of later files. For each file we get from the underlying iterator, we check whether it needs time adjustment. If it does, we cannot immediately decide whether it is safe to delete that file, so we put it in the buffer. We then iteratively peek ahead at the following files and decide whether to delete all the buffered files or retain them.
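The buffering strategy can be sketched without Hadoop types by modelling each log file as a (version, timestamp) pair. This is a simplified illustration under stated assumptions, not the actual implementation: it processes the whole input eagerly instead of lazily, and `Commit`, `deletable`, and `withinLimits` are hypothetical names. A run of commits whose timestamps chain off one another is buffered together and released only if the last commit of the run is within both limits.

```scala
import scala.collection.mutable

object DeletionBufferSketch {
  final case class Commit(version: Long, timestamp: Long)

  /** Returns the commits that are safe to delete, in ascending version order.
    * `commits` must be sorted by version; both limits are inclusive.
    */
  def deletable(commits: Seq[Commit], maxTimestamp: Long, maxVersion: Long): List[Commit] = {
    val toDelete = mutable.ArrayBuffer.empty[Commit]
    // Commits whose fate depends on the commits that follow them.
    val maybeDelete = mutable.ArrayBuffer.empty[Commit]
    var last: Option[Commit] = None

    def withinLimits(c: Commit): Boolean =
      c.timestamp <= maxTimestamp && c.version <= maxVersion

    // Release the buffered run only if its last (latest) commit is deletable.
    def flush(): Unit = {
      if (maybeDelete.lastOption.exists(withinLimits)) toDelete ++= maybeDelete
      maybeDelete.clear()
    }

    commits.foreach { c =>
      val current = last match {
        case Some(prev) if c.timestamp <= prev.timestamp =>
          // Needs adjustment: stays in the same run as its anchor.
          c.copy(timestamp = prev.timestamp + 1)
        case _ =>
          // Starts a new run, so the previous run can be decided now.
          flush()
          c
      }
      maybeDelete += current
      last = Some(current)
    }
    flush()
    toDelete.toList
  }
}
```

With the example history (timestamp 5 for version 1), limits of maxVersion = 3 and maxTimestamp = 11 yield only versions 0 and 1, because versions 2, 3, and 4 form one run whose last adjusted timestamp is 12. Raising the limits to maxVersion = 4 and maxTimestamp = 12 releases versions 0 through 4.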
Instance Constructors
-
new
BufferingLogDeletionIterator(underlying: Iterator[FileStatus], maxTimestamp: Long, maxVersion: Long, versionGetter: (Path) ⇒ Long)
- underlying
The iterator which gives the list of files in ascending version order
- maxTimestamp
The timestamp until which we can delete (inclusive).
- maxVersion
The version until which we can delete (inclusive).
- versionGetter
A method to get the commit version from the file path.
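As an illustration of what a versionGetter might do, the sketch below parses the version from a file name, assuming log files are named with a zero-padded numeric version followed by an extension (for example `00000000000000000010.json`). The helper name `versionFromFileName` is hypothetical and works on a plain string. With a Hadoop `Path`, one would pass `path.getName` to it.

```scala
object VersionGetterSketch {
  /** Extracts the leading numeric version from a log file name.
    * Assumes the name starts with the version digits, e.g. "00000000000000000010.json".
    */
  def versionFromFileName(name: String): Long = {
    val digits = name.takeWhile(_.isDigit)
    require(digits.nonEmpty, s"File name does not start with a version: $name")
    digits.toLong
  }
}

val version = VersionGetterSketch.versionFromFileName("00000000000000000010.json")
```

A function of this shape, adapted to take a `Path`, is the kind of argument the versionGetter parameter expects.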
Type Members
-
class
GroupedIterator[B >: A] extends AbstractIterator[Seq[B]] with Iterator[Seq[B]]
- Definition Classes
- Iterator
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
def
++[B >: FileStatus](that: ⇒ GenTraversableOnce[B]): Iterator[B]
- Definition Classes
- Iterator
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
addString(b: StringBuilder): StringBuilder
- Definition Classes
- TraversableOnce
-
def
addString(b: StringBuilder, sep: String): StringBuilder
- Definition Classes
- TraversableOnce
-
def
addString(b: StringBuilder, start: String, sep: String, end: String): StringBuilder
- Definition Classes
- TraversableOnce
-
def
aggregate[B](z: ⇒ B)(seqop: (B, FileStatus) ⇒ B, combop: (B, B) ⇒ B): B
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
def
buffered: BufferedIterator[FileStatus]
- Definition Classes
- Iterator
-
def
clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
def
collect[B](pf: PartialFunction[FileStatus, B]): Iterator[B]
- Definition Classes
- Iterator
- Annotations
- @migration
- Migration
(Changed in version 2.8.0)
collect has changed. The previous behavior can be reproduced with toSeq.
-
def
collectFirst[B](pf: PartialFunction[FileStatus, B]): Option[B]
- Definition Classes
- TraversableOnce
-
def
contains(elem: Any): Boolean
- Definition Classes
- Iterator
-
def
copyToArray[B >: FileStatus](xs: Array[B], start: Int, len: Int): Unit
- Definition Classes
- Iterator → TraversableOnce → GenTraversableOnce
-
def
copyToArray[B >: FileStatus](xs: Array[B]): Unit
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
copyToArray[B >: FileStatus](xs: Array[B], start: Int): Unit
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
copyToBuffer[B >: FileStatus](dest: Buffer[B]): Unit
- Definition Classes
- TraversableOnce
-
def
corresponds[B](that: GenTraversableOnce[B])(p: (FileStatus, B) ⇒ Boolean): Boolean
- Definition Classes
- Iterator
-
def
count(p: (FileStatus) ⇒ Boolean): Int
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
drop(n: Int): Iterator[FileStatus]
- Definition Classes
- Iterator
-
def
dropWhile(p: (FileStatus) ⇒ Boolean): Iterator[FileStatus]
- Definition Classes
- Iterator
-
def
duplicate: (Iterator[FileStatus], Iterator[FileStatus])
- Definition Classes
- Iterator
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
exists(p: (FileStatus) ⇒ Boolean): Boolean
- Definition Classes
- Iterator → TraversableOnce → GenTraversableOnce
-
def
filter(p: (FileStatus) ⇒ Boolean): Iterator[FileStatus]
- Definition Classes
- Iterator
-
def
filterNot(p: (FileStatus) ⇒ Boolean): Iterator[FileStatus]
- Definition Classes
- Iterator
-
def
finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
-
def
find(p: (FileStatus) ⇒ Boolean): Option[FileStatus]
- Definition Classes
- Iterator → TraversableOnce → GenTraversableOnce
-
def
flatMap[B](f: (FileStatus) ⇒ GenTraversableOnce[B]): Iterator[B]
- Definition Classes
- Iterator
-
def
fold[A1 >: FileStatus](z: A1)(op: (A1, A1) ⇒ A1): A1
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
foldLeft[B](z: B)(op: (B, FileStatus) ⇒ B): B
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
foldRight[B](z: B)(op: (FileStatus, B) ⇒ B): B
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
forall(p: (FileStatus) ⇒ Boolean): Boolean
- Definition Classes
- Iterator → TraversableOnce → GenTraversableOnce
-
def
foreach[U](f: (FileStatus) ⇒ U): Unit
- Definition Classes
- Iterator → TraversableOnce → GenTraversableOnce
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
grouped[B >: FileStatus](size: Int): GroupedIterator[B]
- Definition Classes
- Iterator
-
def
hasDefiniteSize: Boolean
- Definition Classes
- Iterator → TraversableOnce → GenTraversableOnce
-
def
hasNext: Boolean
- Definition Classes
- BufferingLogDeletionIterator → Iterator
-
def
hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
indexOf[B >: FileStatus](elem: B, from: Int): Int
- Definition Classes
- Iterator
-
def
indexOf[B >: FileStatus](elem: B): Int
- Definition Classes
- Iterator
-
def
indexWhere(p: (FileStatus) ⇒ Boolean, from: Int): Int
- Definition Classes
- Iterator
-
def
indexWhere(p: (FileStatus) ⇒ Boolean): Int
- Definition Classes
- Iterator
-
def
isEmpty: Boolean
- Definition Classes
- Iterator → TraversableOnce → GenTraversableOnce
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
def
isTraversableAgain: Boolean
- Definition Classes
- Iterator → GenTraversableOnce
-
def
length: Int
- Definition Classes
- Iterator
-
def
map[B](f: (FileStatus) ⇒ B): Iterator[B]
- Definition Classes
- Iterator
-
def
max[B >: FileStatus](implicit cmp: Ordering[B]): FileStatus
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
maxBy[B](f: (FileStatus) ⇒ B)(implicit cmp: Ordering[B]): FileStatus
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
min[B >: FileStatus](implicit cmp: Ordering[B]): FileStatus
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
minBy[B](f: (FileStatus) ⇒ B)(implicit cmp: Ordering[B]): FileStatus
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
mkString: String
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
mkString(sep: String): String
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
mkString(start: String, sep: String, end: String): String
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
next(): FileStatus
- Definition Classes
- BufferingLogDeletionIterator → Iterator
-
def
nonEmpty: Boolean
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
def
padTo[A1 >: FileStatus](len: Int, elem: A1): Iterator[A1]
- Definition Classes
- Iterator
-
def
partition(p: (FileStatus) ⇒ Boolean): (Iterator[FileStatus], Iterator[FileStatus])
- Definition Classes
- Iterator
-
def
patch[B >: FileStatus](from: Int, patchElems: Iterator[B], replaced: Int): Iterator[B]
- Definition Classes
- Iterator
-
def
product[B >: FileStatus](implicit num: Numeric[B]): B
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
reduce[A1 >: FileStatus](op: (A1, A1) ⇒ A1): A1
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
reduceLeft[B >: FileStatus](op: (B, FileStatus) ⇒ B): B
- Definition Classes
- TraversableOnce
-
def
reduceLeftOption[B >: FileStatus](op: (B, FileStatus) ⇒ B): Option[B]
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
reduceOption[A1 >: FileStatus](op: (A1, A1) ⇒ A1): Option[A1]
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
reduceRight[B >: FileStatus](op: (FileStatus, B) ⇒ B): B
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
reduceRightOption[B >: FileStatus](op: (FileStatus, B) ⇒ B): Option[B]
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
reversed: List[FileStatus]
- Attributes
- protected[this]
- Definition Classes
- TraversableOnce
-
def
sameElements(that: Iterator[_]): Boolean
- Definition Classes
- Iterator
-
def
scanLeft[B](z: B)(op: (B, FileStatus) ⇒ B): Iterator[B]
- Definition Classes
- Iterator
-
def
scanRight[B](z: B)(op: (FileStatus, B) ⇒ B): Iterator[B]
- Definition Classes
- Iterator
-
def
seq: Iterator[FileStatus]
- Definition Classes
- Iterator → TraversableOnce → GenTraversableOnce
-
def
size: Int
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
sizeHintIfCheap: Int
- Attributes
- protected[collection]
- Definition Classes
- GenTraversableOnce
-
def
slice(from: Int, until: Int): Iterator[FileStatus]
- Definition Classes
- Iterator
-
def
sliceIterator(from: Int, until: Int): Iterator[FileStatus]
- Attributes
- protected
- Definition Classes
- Iterator
-
def
sliding[B >: FileStatus](size: Int, step: Int): GroupedIterator[B]
- Definition Classes
- Iterator
-
def
span(p: (FileStatus) ⇒ Boolean): (Iterator[FileStatus], Iterator[FileStatus])
- Definition Classes
- Iterator
-
def
sum[B >: FileStatus](implicit num: Numeric[B]): B
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
def
take(n: Int): Iterator[FileStatus]
- Definition Classes
- Iterator
-
def
takeWhile(p: (FileStatus) ⇒ Boolean): Iterator[FileStatus]
- Definition Classes
- Iterator
-
def
to[Col[_]](implicit cbf: CanBuildFrom[Nothing, FileStatus, Col[FileStatus]]): Col[FileStatus]
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
toArray[B >: FileStatus](implicit arg0: ClassTag[B]): Array[B]
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
toBuffer[B >: FileStatus]: Buffer[B]
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
toIndexedSeq: IndexedSeq[FileStatus]
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
toIterable: Iterable[FileStatus]
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
toIterator: Iterator[FileStatus]
- Definition Classes
- Iterator → GenTraversableOnce
-
def
toList: List[FileStatus]
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
toMap[T, U](implicit ev: <:<[FileStatus, (T, U)]): Map[T, U]
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
toSeq: Seq[FileStatus]
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
toSet[B >: FileStatus]: Set[B]
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
toStream: Stream[FileStatus]
- Definition Classes
- Iterator → GenTraversableOnce
-
def
toString(): String
- Definition Classes
- Iterator → AnyRef → Any
-
def
toTraversable: Traversable[FileStatus]
- Definition Classes
- Iterator → TraversableOnce → GenTraversableOnce
-
def
toVector: Vector[FileStatus]
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
def
withFilter(p: (FileStatus) ⇒ Boolean): Iterator[FileStatus]
- Definition Classes
- Iterator
-
def
zip[B](that: Iterator[B]): Iterator[(FileStatus, B)]
- Definition Classes
- Iterator
-
def
zipAll[B, A1 >: FileStatus, B1 >: B](that: Iterator[B], thisElem: A1, thatElem: B1): Iterator[(A1, B1)]
- Definition Classes
- Iterator
-
def
zipWithIndex: Iterator[(FileStatus, Int)]
- Definition Classes
- Iterator
Deprecated Value Members
-
def
/:[B](z: B)(op: (B, FileStatus) ⇒ B): B
- Definition Classes
- TraversableOnce → GenTraversableOnce
- Annotations
- @deprecated
- Deprecated
(Since version 2.12.10) Use foldLeft instead of /:
-
def
:\[B](z: B)(op: (FileStatus, B) ⇒ B): B
- Definition Classes
- TraversableOnce → GenTraversableOnce
- Annotations
- @deprecated
- Deprecated
(Since version 2.12.10) Use foldRight instead of :\