R - The type of the stream
public interface CacheStream<R> extends Stream<R>
A Stream that has additional operations to monitor or control behavior when used from a Cache. Note that
you may only use these additional methods on the CacheStream before any intermediate operations are performed,
because the intermediate operations return a plain Stream.
Whenever the iterator or spliterator methods are used, the user must close the Stream
that the method was invoked on after completion of its operation. Failure to do so may cause a thread leak if
the iterator or spliterator is not fully consumed.
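The closing requirement above can be sketched with try-with-resources. This is a minimal illustration only: a plain JDK stream stands in for a CacheStream (an assumption, since no cache is available here), and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.stream.Stream;

class CloseAfterIterator {
    // Hypothetical helper: reads only the first element, then relies on
    // try-with-resources to close the stream. Returns whether close ran.
    static boolean readFirstAndClose(List<String> source, StringBuilder firstOut) {
        AtomicBoolean closed = new AtomicBoolean(false);
        // Assumption: a plain JDK stream stands in for a CacheStream.
        try (Stream<String> stream = source.stream().onClose(() -> closed.set(true))) {
            Iterator<String> it = stream.iterator();
            if (it.hasNext()) {
                firstOut.append(it.next());
            }
            // The iterator is deliberately not fully consumed here.
        }
        return closed.get();
    }

    public static void main(String[] args) {
        StringBuilder first = new StringBuilder();
        boolean closed = readFirstAndClose(Arrays.asList("a", "b", "c"), first);
        System.out.println(first + " closed=" + closed); // prints "a closed=true"
    }
}
```

Closing through try-with-resources means the close handler runs even when the iterator is abandoned early or an exception is thrown.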
When using a stream that is backed by a distributed cache, these operations are performed using remote distribution controlled by the segments that each key maps to. All intermediate operations are lazy, even the special cases described in later paragraphs, and are not evaluated until a terminal operation is invoked on the stream. Essentially, each set of intermediate operations is shipped to each remote node, where they are applied to a local stream, and finally the terminal operation is completed. If this stream is parallel, the processing on remote nodes is also done using a parallel stream.
Parallel distribution is enabled by default for all operations except for iterator() and
spliterator(). Please see sequentialDistribution() and
parallelDistribution(). With parallel distribution disabled, only a single node will process the operation
at a time (this includes the local node).
Rehash awareness is enabled by default for all operations, which provides guaranteed consistency for all operations
except for forEach(Consumer). Please see that method for details about its consistency
guarantees. If you wish to disable rehash aware operations, call
disableRehashAware(), which should provide better performance for some operations. The
performance is most affected for the key aware operations iterator(),
spliterator() and forEach(Consumer).
Some intermediate operations are special in that they act like an intermediate iterator operation. That is, the operation is an intermediate operation, but it requires processing the results using an iterator before the stream can complete.
A good example of an intermediate iterator operation is the distinct intermediate operation. Upon calling
the terminal operation, an iterator operation is run remotely using all of
the intermediate operations up to the distinct operation. This iterator is then used to fuel a local
stream where all of the remaining intermediate operations are performed, and then finally the terminal operation is
applied as normal. Note that in this case the intermediate iterator still obeys the
distributedBatchSize(int) setting irrespective of the terminal operator.
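The flow described for distinct can be imitated with plain JDK streams. In this rough sketch each inner list plays the role of one remote node's data. That layout, the helper name, and the chosen filter/map operations are all assumptions for illustration; it is not how the cache implements the operation.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class IntermediateIteratorSketch {
    // Hypothetical helper: runs the pre-distinct operations per "node",
    // drains the results through one iterator, and finishes locally.
    static List<Integer> lengthsDistinctSorted(List<List<String>> nodes) {
        // Step 1: operations before distinct() run where the data lives.
        Function<List<String>, Stream<Integer>> remotePart =
                data -> data.stream().filter(s -> !s.isEmpty()).map(String::length);

        // Step 2: an iterator pulls the per-node results to the originator.
        Iterator<Integer> remoteResults = nodes.stream().flatMap(remotePart).iterator();

        // Step 3: the iterator fuels a local stream for distinct() and the rest.
        Stream<Integer> local = StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(remoteResults, Spliterator.ORDERED), false);
        return local.distinct().sorted().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<String>> nodes = Arrays.asList(
                Arrays.asList("a", "bb", ""),
                Arrays.asList("cc", "ddd"));
        System.out.println(lengthsDistinctSorted(nodes)); // prints [1, 2, 3]
    }
}
```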

| Modifier and Type | Interface and Description |
|---|---|
| static interface | CacheStream.SegmentCompletionListener: Functional interface that is used as a callback when segments are completed. |

Nested classes/interfaces inherited from interface java.util.stream.Stream: Stream.Builder<T>

| Modifier and Type | Method and Description |
|---|---|
| CacheStream<R> | disableRehashAware(): Disables tracking of rehash events that could occur to the underlying cache. |
| Stream<R> | distinct() |
| CacheStream<R> | distributedBatchSize(int batchSize): Controls how many keys are returned from a remote node when using a stream terminal operation with a distributed cache to back this stream. |
| CacheStream<R> | filterKeys(Set<?> keys): Filters which entries are returned by only returning ones that map to the given key. |
| CacheStream<R> | filterKeySegments(Set<Integer> segments): Filters which entries are returned by what segment they are present in. |
| void | forEach(Consumer<? super R> action) |
| Iterator<R> | iterator() |
| Stream<R> | limit(long maxSize) |
| CacheStream<R> | parallelDistribution(): This would enable sending requests to all other remote nodes when a terminal operator is performed. |
| CacheStream<R> | segmentCompletionListener(CacheStream.SegmentCompletionListener listener): Allows registration of a segment completion listener that is notified when a segment has completed processing. |
| CacheStream<R> | sequentialDistribution(): This would disable sending requests to all other remote nodes compared to one at a time. |
| Stream<R> | skip(long n) |
| Stream<R> | sorted() |
| Stream<R> | sorted(Comparator<? super R> comparator) |
| Spliterator<R> | spliterator() |

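Methods inherited from interface java.util.stream.Stream: allMatch, anyMatch, builder, collect, collect, concat, count, empty, filter, findAny, findFirst, flatMap, flatMapToDouble, flatMapToInt, flatMapToLong, forEachOrdered, generate, iterate, map, mapToDouble, mapToInt, mapToLong, max, min, noneMatch, of, of, peek, reduce, reduce, reduce, toArray, toArray
Methods inherited from interface java.util.stream.BaseStream: close, isParallel, onClose, parallel, sequential, unordered
CacheStream<R> sequentialDistribution()
Disables sending requests to all remote nodes at once, so that requests are sent to one node at a time.
Parallel distribution is enabled by default except for iterator() and
spliterator().
CacheStream<R> parallelDistribution()
Enables sending requests to all other remote nodes when a terminal operator is performed.
Parallel distribution is enabled by default except for iterator() and
spliterator().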
CacheStream<R> filterKeySegments(Set<Integer> segments)
Filters which entries are returned by what segment they are present in. This method can be more efficient than using a regular
Stream.filter(Predicate) method as this can control what nodes are
asked for data and what entries are read from the underlying CacheStore if present.
segments - The segments to use for this stream operation. Any segments not in this set will be ignored.
CacheStream<R> filterKeys(Set<?> keys)
Filters which entries are returned by only returning ones that map to the given keys. This method can be faster than a regular
Stream.filter(Predicate) if any keys must be retrieved remotely or if a
cache store is in use.
keys - The keys that this stream will only operate on.
CacheStream<R> distributedBatchSize(int batchSize)
Controls how many keys are returned from a remote node when using a stream terminal operation with a distributed
cache to back this stream. The terminal operators that track keys are iterator(), spliterator() and
forEach(Consumer). Please see those methods for additional information on how this value
may affect them.
This value may be used in the case of a terminal operator that doesn't track keys if an intermediate
operation is performed that requires bringing keys locally to do computations. Examples of such intermediate
operations are sorted(), sorted(Comparator),
distinct(), limit(long) and skip(long).
This value is always ignored when this stream is backed by a cache that is not distributed, as all values are already local.
batchSize - The size of each batch. This defaults to the state transfer chunk size.
CacheStream<R> segmentCompletionListener(CacheStream.SegmentCompletionListener listener)
Allows registration of a segment completion listener that is notified when a segment has completed processing.
This method is designed for the sole purpose of use with iterator(), to allow
a user to track completion of segments as they are returned from the iterator. Behavior of other methods
is not specified. Please see iterator() for more information.
Multiple listeners may be registered upon multiple invocations of this method. The ordering of notified listeners is not specified.
listener - The listener that will be called back as segments are completed.
CacheStream<R> disableRehashAware()
Disables tracking of rehash events that could occur to the underlying cache. Most terminal operations will run faster with rehash awareness disabled, even without a rehash occurring. However, if a rehash occurs with this disabled, be prepared to possibly receive only a subset of values.
void forEach(Consumer<? super R> action)
This operation is performed remotely on the node that is the primary owner for the key tied to each entry in this stream.
NOTE: This method, while being rehash aware, has the lowest consistency of all of the operators. This
operation will be performed on every entry at least once in the cluster, as long as the originator doesn't go
down while it is being performed. This is due to how the distributed action is performed. Essentially, the
distributedBatchSize(int) value controls how many elements are processed per node at a time
when rehash awareness is enabled. After those are complete, the keys are sent to the originator to confirm that those were
processed. If that node goes down during or before the response, those keys will be processed a second time.
This method is run distributed by default with a distributed backing cache. However, if you wish for this
operation to run locally, you can use the iterator() method to return all of the results
locally and then use the Iterator.forEachRemaining(Consumer) method for a single threaded variant. If you
wish to have a parallel variant, you can use StreamSupport.stream(Spliterator, boolean),
passing in the spliterator from the stream. In either case, remember you must close the stream after
you are done processing the iterator or spliterator.
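The two local variants mentioned above might look roughly like the following. A plain JDK stream again stands in for the CacheStream (an assumption), and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class LocalForEachSketch {
    // Single threaded variant: drain the iterator on the calling thread.
    static List<String> viaIterator(List<String> source) {
        List<String> seen = new ArrayList<>();
        try (Stream<String> stream = source.stream()) {
            stream.iterator().forEachRemaining(seen::add);
        } // the stream is closed here, as the documentation requires
        return seen;
    }

    // Parallel variant: wrap the spliterator in a new parallel stream.
    static int viaSpliterator(List<String> source) {
        // A concurrent collection is needed because several threads may add.
        Queue<String> seen = new ConcurrentLinkedQueue<>();
        try (Stream<String> stream = source.stream()) {
            StreamSupport.stream(stream.spliterator(), true).forEach(seen::add);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("a", "b", "c");
        System.out.println(viaIterator(data));    // prints [a, b, c]
        System.out.println(viaSpliterator(data)); // prints 3
    }
}
```

The parallel variant gives no ordering guarantee for the action, which is why the sketch only reports how many elements were processed.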
Iterator<R> iterator()
Usage of this operator requires closing this stream after you are done with the iterator. The preferred usage is a try-with-resources block on the stream.
This method has special usage with the CacheStream.SegmentCompletionListener in
that segments are completed as entries are retrieved from the iterator's next method.
This method obeys the distributedBatchSize(int) setting by only ever returning the
elements that mapped to that many keys. Note that when using methods such as
Stream.flatMap(Function), more than one element may map to a given key,
so this setting does not guarantee how many elements are returned per batch.
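To see why a batch of keys does not bound the number of elements, consider this small sketch. It is a plain JDK simulation under assumed names (elementsPerBatch is hypothetical): keys are grouped into batches, and each key is expanded with flatMap into one element per character.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BatchSizeSketch {
    // Hypothetical helper: splits keys into batches of batchSize and reports
    // how many elements each batch yields after a flatMap-style expansion.
    static List<Integer> elementsPerBatch(List<String> keys, int batchSize) {
        List<Integer> counts = new ArrayList<>();
        for (int i = 0; i < keys.size(); i += batchSize) {
            List<String> batch = keys.subList(i, Math.min(i + batchSize, keys.size()));
            // Each key expands into one element per character.
            long elements = batch.stream()
                    .flatMap(k -> k.chars().mapToObj(c -> (char) c))
                    .count();
            counts.add((int) elements);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Two batches of at most 2 keys each, yet each batch yields 3 elements.
        System.out.println(elementsPerBatch(Arrays.asList("ab", "c", "def"), 2)); // prints [3, 3]
    }
}
```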
Specified by: iterator in interface BaseStream<R,Stream<R>>
Spliterator<R> spliterator()
Usage of this operator requires closing this stream after you are done with the spliterator. The preferred usage is a try-with-resources block on the stream.
Specified by: spliterator in interface BaseStream<R,Stream<R>>
Stream<R> sorted()
This method has special usage when used with a distributed cache backing this stream. This operation will act
as an intermediate iterator operation, requiring data to be brought locally for proper behavior. This is
described in more detail in the CacheStream documentation.
This intermediate iterator operation will be performed locally only, requiring all elements to be in memory.
Stream<R> sorted(Comparator<? super R> comparator)
This method has special usage when used with a distributed cache backing this stream. This operation will act
as an intermediate iterator operation, requiring data to be brought locally for proper behavior. This is
described in more detail in the CacheStream documentation.
This intermediate iterator operation will be performed locally only, requiring all elements to be in memory.
Stream<R> limit(long maxSize)
This method has special usage when used with a distributed cache backing this stream. This operation will act
as an intermediate iterator operation, requiring data to be brought locally for proper behavior. This is
described in more detail in the CacheStream documentation.
This intermediate iterator operation will be performed both remotely and locally to reduce how many elements are sent back from each node.
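The idea of applying the limit both remotely and locally can be sketched as follows. This is a plain JDK illustration in which each inner list stands for one node's data (an assumption); the helper name is hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class DistributedLimitSketch {
    // Hypothetical helper: applies the limit per "node" and then once more
    // on the combined results at the originator.
    static List<Integer> limitAcrossNodes(List<List<Integer>> nodes, long maxSize) {
        return nodes.stream()
                // "Remote" limit: no node sends back more than maxSize elements.
                .flatMap(data -> data.stream().limit(maxSize))
                // "Local" limit: the originator trims the combined results.
                .limit(maxSize)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<Integer>> nodes = Arrays.asList(
                Arrays.asList(1, 2, 3),
                Arrays.asList(4, 5, 6));
        System.out.println(limitAcrossNodes(nodes, 2)); // prints [1, 2]
    }
}
```

Without the per-node limit the final result would be the same, but every node would ship all of its elements back before the originator discarded most of them.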
Stream<R> skip(long n)
This method has special usage when used with a distributed cache backing this stream. This operation will act
as an intermediate iterator operation, requiring data to be brought locally for proper behavior. This is
described in more detail in the CacheStream documentation.
This intermediate iterator operation will only be performed locally; however, the number of elements held in
memory is bounded by distributedBatchSize(int) unless the terminal operator holds on to them.
Stream<R> distinct()
This method has special usage when used with a distributed cache backing this stream. This operation will act
as an intermediate iterator operation, requiring data to be brought locally for proper behavior. This is
described in more detail in the CacheStream documentation.
This intermediate iterator operation will be performed locally and remotely, possibly requiring a subset of all elements to be in memory.
Copyright © 2015 JBoss, a division of Red Hat. All rights reserved.