public class GridGgfsGroupDataBlocksKeyMapper
extends org.gridgain.grid.kernal.processors.cache.GridCacheDefaultAffinityKeyMapper
GGFS class that provides the ability to group a file's data blocks together on one node.
All blocks within the same group are guaranteed to be cached together on the same node.
The group size parameter controls how many sequential blocks are cached together on the same node.
For example, if the block size is 64 KB and the group size is 256, then each group contains
64 KB * 256 = 16 MB. Larger group sizes reduce the number of splits required to run map-reduce
tasks, but make the amount of data stored on different nodes less even.
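The arithmetic in the example above can be checked with a short sketch. The names GroupSizeExample and groupBytes are illustrative and not part of the GGFS API; sizes are assumed to be given in bytes, with 1 KB = 1024 bytes.

```java
/** Illustrative helper, not part of GGFS: computes how much file data one block group holds. */
final class GroupSizeExample {
    private GroupSizeExample() {
        // Static helper only.
    }

    /**
     * @param blockSizeBytes Size of one data block in bytes (must be positive).
     * @param groupSize Number of sequential blocks per group (must be positive).
     * @return Number of bytes that end up together on one node.
     */
    static long groupBytes(long blockSizeBytes, int groupSize) {
        if (blockSizeBytes <= 0 || groupSize <= 0)
            throw new IllegalArgumentException("Sizes must be positive.");

        // multiplyExact throws ArithmeticException on overflow instead of wrapping silently.
        return Math.multiplyExact(blockSizeBytes, (long) groupSize);
    }
}
```

Under these assumptions, groupBytes(64 * 1024, 256) evaluates to 16777216 bytes, that is 16 MB.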
Note that the groupSize() parameter must be chosen to match the Hadoop split size, which is defined
in Hadoop via the mapred.max.split.size property. Ideally, all blocks accessed
within one split map to a single group, so that they are located on the same grid node.
For example, the default Hadoop split size is 64 MB and the default GGFS block size
is 64 KB. To make sure that each split reads only blocks on
the same node (without hopping between nodes over the network), the groupSize()
value has to be equal to 64 MB / 64 KB = 1024.
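The rule above (group size = split size / block size) can be sketched as a small helper. SplitGroupSizing and groupSizeFor are hypothetical names, not GGFS or Hadoop APIs. Rounding up when the split size is not a multiple of the block size is an assumption of this sketch; the documentation only covers the evenly divisible case.

```java
/** Illustrative helper, not part of GGFS: derives a group size from split and block sizes. */
final class SplitGroupSizing {
    private SplitGroupSizing() {
        // Static helper only.
    }

    /**
     * @param splitSizeBytes Hadoop split size in bytes (e.g. the mapred.max.split.size value).
     * @param blockSizeBytes GGFS block size in bytes.
     * @return Number of blocks needed to cover one split.
     */
    static int groupSizeFor(long splitSizeBytes, long blockSizeBytes) {
        if (splitSizeBytes <= 0 || blockSizeBytes <= 0)
            throw new IllegalArgumentException("Sizes must be positive.");

        // Ceiling division (assumption): a partially filled last block still counts as a block.
        long blocks = (splitSizeBytes + blockSizeBytes - 1) / blockSizeBytes;

        // toIntExact throws ArithmeticException if the result does not fit into an int.
        return Math.toIntExact(blocks);
    }
}
```

For the defaults quoted above, groupSizeFor(64L * 1024 * 1024, 64L * 1024) returns 1024.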
The GGFS data cache must be configured with this mapper. Here is an
example of how it can be specified in XML configuration:
<bean id="cacheCfgBase" class="org.gridgain.grid.cache.GridCacheConfiguration" abstract="true">
    ...
    <property name="affinityMapper">
        <bean class="org.gridgain.grid.ggfs.GridGgfsGroupDataBlocksKeyMapper">
            <!-- How many sequential blocks will be stored on the same node. -->
            <constructor-arg value="512"/>
        </bean>
    </property>
    ...
</bean>
| Constructor and Description |
|---|
| GridGgfsGroupDataBlocksKeyMapper(int grpSize): Constructs affinity mapper to group several data blocks with the same key. |
public GridGgfsGroupDataBlocksKeyMapper(int grpSize)
grpSize - Size of the group in blocks.
public Object affinityKey(Object key)
affinityKey in interface org.gridgain.grid.cache.affinity.GridCacheAffinityKeyMapper
affinityKey in class org.gridgain.grid.kernal.processors.cache.GridCacheDefaultAffinityKeyMapper
public int groupSize()
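The grouping idea behind affinityKey can be illustrated with a minimal sketch. This is not the GGFS implementation: the class BlockGroupSketch, the fileId and blockIdx parameters, and the use of a (file, group index) pair as the affinity key are all assumptions made for illustration. The sketch only shows how grpSize sequential blocks of one file can share a single key.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

/** Hypothetical sketch, not GGFS code: maps sequential blocks of a file to one shared key. */
final class BlockGroupSketch {
    private BlockGroupSketch() {
        // Static helper only.
    }

    /**
     * @param fileId Hypothetical file identifier.
     * @param blockIdx Zero-based index of the block within the file.
     * @param grpSize Size of the group in blocks.
     * @return Key shared by all blocks of the same group (assumed shape: file ID plus group index).
     */
    static Map.Entry<String, Long> affinityKey(String fileId, long blockIdx, int grpSize) {
        if (fileId == null || blockIdx < 0 || grpSize <= 0)
            throw new IllegalArgumentException("Invalid block coordinates or group size.");

        // Integer division: blocks 0..grpSize-1 fall into group 0, the next grpSize into group 1, etc.
        long grpIdx = blockIdx / grpSize;

        // SimpleImmutableEntry implements equals() and hashCode(), so equal groups yield equal keys.
        return new SimpleImmutableEntry<>(fileId, grpIdx);
    }
}
```

With grpSize set to 512, blocks 0 through 511 of a file produce equal keys in this sketch, while block 512 starts a new group.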
Copyright © 2014. All rights reserved.