org.apache.hadoop.hdfs.server.blockmanagement
Class BlockPlacementPolicyWithNodeGroup

java.lang.Object
  extended by org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
      extended by org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault
          extended by org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyWithNodeGroup

public class BlockPlacementPolicyWithNodeGroup
extends org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault

This class chooses the desired number of targets for placing block replicas in an environment with a node-group layer. The replica placement strategy is adjusted as follows: if the writer is on a datanode, the 1st replica is placed on the local node (or within the local node-group); otherwise it is placed on a random datanode. The 2nd replica is placed on a datanode on a different rack from the 1st replica's node. The 3rd replica is placed on a datanode that is in a different node-group but on the same rack as the 2nd replica's node.
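A simplified sketch of this three-tier strategy, with a hypothetical `Node` record and host names (this is an illustration of the placement rules, not the Hadoop implementation, which works against the cluster's `NetworkTopology`):

```java
// Illustration only: spread three replicas across racks and node-groups
// following the documented strategy. Node, host, and group names are hypothetical.
import java.util.ArrayList;
import java.util.List;

public class NodeGroupPlacementSketch {
    // A node is identified by its host, rack, and node-group.
    record Node(String host, String rack, String nodeGroup) {}

    /** Choose up to three targets: local node, remote rack, then
     *  same rack as the 2nd replica but a different node-group. */
    static List<Node> chooseTargets(Node writer, List<Node> candidates) {
        List<Node> results = new ArrayList<>();
        // 1st replica: the writer's own node (local node).
        results.add(writer);
        // 2nd replica: any node on a different rack from the 1st.
        for (Node n : candidates) {
            if (!n.rack().equals(writer.rack())) { results.add(n); break; }
        }
        if (results.size() < 2) return results;
        Node second = results.get(1);
        // 3rd replica: same rack as the 2nd, but a different node-group.
        for (Node n : candidates) {
            if (n.rack().equals(second.rack())
                    && !n.nodeGroup().equals(second.nodeGroup())
                    && !results.contains(n)) {
                results.add(n);
                break;
            }
        }
        return results;
    }
}
```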


Field Summary
 
Fields inherited from class org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault
clusterMap, considerLoad, heartbeatInterval, threadLocalBuilder, tolerateHeartbeatMultiplier
 
Method Summary
protected  int addToExcludedNodes(org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor localMachine, HashMap<org.apache.hadoop.net.Node,org.apache.hadoop.net.Node> excludedNodes)
          Find the other nodes in the same node-group as localMachine and add them to excludedNodes, since replicas should not be duplicated across nodes within the same node-group
protected  void adjustExcludedNodes(HashMap<org.apache.hadoop.net.Node,org.apache.hadoop.net.Node> excludedNodes, org.apache.hadoop.net.Node chosenNode)
          After choosing a node to place replica, adjust excluded nodes accordingly.
protected  org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor chooseLocalNode(org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor localMachine, HashMap<org.apache.hadoop.net.Node,org.apache.hadoop.net.Node> excludedNodes, long blocksize, int maxNodesPerRack, List<org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor> results, boolean avoidStaleNodes)
          Choose the local node of localMachine as the target.
protected  org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor chooseLocalRack(org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor localMachine, HashMap<org.apache.hadoop.net.Node,org.apache.hadoop.net.Node> excludedNodes, long blocksize, int maxNodesPerRack, List<org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor> results, boolean avoidStaleNodes)
          
protected  void chooseRemoteRack(int numOfReplicas, org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor localMachine, HashMap<org.apache.hadoop.net.Node,org.apache.hadoop.net.Node> excludedNodes, long blocksize, int maxReplicasPerRack, List<org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor> results, boolean avoidStaleNodes)
          
protected  String getRack(org.apache.hadoop.hdfs.protocol.DatanodeInfo cur)
          Get rack string from a data node
 void initialize(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.hdfs.server.namenode.FSClusterStats stats, org.apache.hadoop.net.NetworkTopology clusterMap)
          Used to set up a BlockPlacementPolicy object.
 Iterator<org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor> pickupReplicaSet(Collection<org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor> first, Collection<org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor> second)
          Pick the replica node set from which a replica will be deleted when the block is over-replicated.
 
Methods inherited from class org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault
chooseRandom, chooseRandom, chooseReplicaToDelete, chooseTarget, chooseTarget, isGoodTarget, verifyBlockPlacement
 
Methods inherited from class org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
adjustSetsWithChosenReplica, getInstance, splitNodesWithRack
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

initialize

public void initialize(org.apache.hadoop.conf.Configuration conf,
                       org.apache.hadoop.hdfs.server.namenode.FSClusterStats stats,
                       org.apache.hadoop.net.NetworkTopology clusterMap)
Description copied from class: org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
Used to set up a BlockPlacementPolicy object. This should be defined by all implementations of a BlockPlacementPolicy.

Overrides:
initialize in class org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault
Parameters:
conf - the configuration object
stats - retrieve cluster status from here
clusterMap - cluster topology
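Selecting this policy is done through configuration before initialize is called. A sketch of an hdfs-site.xml fragment follows; the property names shown are those commonly documented for node-group-aware Hadoop builds, so verify the exact keys against your Hadoop release before relying on them:

```xml
<!-- hdfs-site.xml sketch: enable node-group-aware block placement.
     Verify these keys against your Hadoop version. -->
<property>
  <name>dfs.block.replicator.classname</name>
  <value>org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyWithNodeGroup</value>
</property>
<property>
  <name>net.topology.impl</name>
  <value>org.apache.hadoop.net.NetworkTopologyWithNodeGroup</value>
</property>
<property>
  <name>net.topology.nodegroup.aware</name>
  <value>true</value>
</property>
```

The topology implementation must be node-group aware as well, since this policy resolves four-level paths of the form /rack/nodegroup/host rather than the default /rack/host.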

chooseLocalNode

protected org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor chooseLocalNode(org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor localMachine,
                                                                                           HashMap<org.apache.hadoop.net.Node,org.apache.hadoop.net.Node> excludedNodes,
                                                                                           long blocksize,
                                                                                           int maxNodesPerRack,
                                                                                           List<org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor> results,
                                                                                           boolean avoidStaleNodes)
                                                                                    throws org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.NotEnoughReplicasException
Choose the local node of localMachine as the target. If localMachine is not available, choose a node in the same node-group or on the same rack instead.

Overrides:
chooseLocalNode in class org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault
Returns:
the chosen node
Throws:
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.NotEnoughReplicasException
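The fallback order described above can be sketched as follows (hypothetical `Node` record and availability flag; not the Hadoop implementation, which also honors excluded nodes, block size, and per-rack limits):

```java
// Illustration only: fallback chain for the first target —
// local node, then local node-group, then local rack.
import java.util.List;
import java.util.Optional;

public class ChooseLocalNodeSketch {
    record Node(String host, String rack, String nodeGroup, boolean available) {}

    static Optional<Node> chooseLocalNode(Node localMachine, List<Node> cluster) {
        // Prefer the writer's own node when it can accept a replica.
        if (localMachine.available()) return Optional.of(localMachine);
        // Fall back to another available node in the same node-group...
        Optional<Node> sameGroup = cluster.stream()
                .filter(Node::available)
                .filter(n -> n.nodeGroup().equals(localMachine.nodeGroup()))
                .findFirst();
        if (sameGroup.isPresent()) return sameGroup;
        // ...and finally to any available node on the same rack.
        return cluster.stream()
                .filter(Node::available)
                .filter(n -> n.rack().equals(localMachine.rack()))
                .findFirst();
    }
}
```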

adjustExcludedNodes

protected void adjustExcludedNodes(HashMap<org.apache.hadoop.net.Node,org.apache.hadoop.net.Node> excludedNodes,
                                   org.apache.hadoop.net.Node chosenNode)
After choosing a node to place a replica, adjust the excluded nodes accordingly. This may do nothing here, as chosenNode has already been put into excludedNodes, but it can be overridden in a subclass to put more related nodes into excludedNodes.

Overrides:
adjustExcludedNodes in class org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault

chooseLocalRack

protected org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor chooseLocalRack(org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor localMachine,
                                                                                           HashMap<org.apache.hadoop.net.Node,org.apache.hadoop.net.Node> excludedNodes,
                                                                                           long blocksize,
                                                                                           int maxNodesPerRack,
                                                                                           List<org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor> results,
                                                                                           boolean avoidStaleNodes)
                                                                                    throws org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.NotEnoughReplicasException

Overrides:
chooseLocalRack in class org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault
Throws:
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.NotEnoughReplicasException

chooseRemoteRack

protected void chooseRemoteRack(int numOfReplicas,
                                org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor localMachine,
                                HashMap<org.apache.hadoop.net.Node,org.apache.hadoop.net.Node> excludedNodes,
                                long blocksize,
                                int maxReplicasPerRack,
                                List<org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor> results,
                                boolean avoidStaleNodes)
                         throws org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.NotEnoughReplicasException

Overrides:
chooseRemoteRack in class org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault
Throws:
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.NotEnoughReplicasException

getRack

protected String getRack(org.apache.hadoop.hdfs.protocol.DatanodeInfo cur)
Description copied from class: org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
Get rack string from a data node

Overrides:
getRack in class org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
Returns:
rack of data node

addToExcludedNodes

protected int addToExcludedNodes(org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor localMachine,
                                 HashMap<org.apache.hadoop.net.Node,org.apache.hadoop.net.Node> excludedNodes)
Find the other nodes in the same node-group as localMachine and add them to excludedNodes, since replicas should not be duplicated across nodes within the same node-group

Overrides:
addToExcludedNodes in class org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault
Returns:
number of new excluded nodes
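The idea behind this method can be sketched with a simplified model (hypothetical `Node` record; the real method operates on DatanodeDescriptor and the cluster topology):

```java
// Illustration only: once a node is chosen, exclude every node in its
// node-group so no two replicas land in the same node-group.
import java.util.List;
import java.util.Map;

public class ExcludeNodeGroupSketch {
    record Node(String host, String nodeGroup) {}

    /** Add localMachine's node-group peers to excludedNodes;
     *  return how many nodes were newly excluded. */
    static int addToExcludedNodes(Node localMachine, List<Node> cluster,
                                  Map<String, Node> excludedNodes) {
        int added = 0;
        for (Node n : cluster) {
            if (n.nodeGroup().equals(localMachine.nodeGroup())
                    && excludedNodes.putIfAbsent(n.host(), n) == null) {
                added++;  // count only nodes not already excluded
            }
        }
        return added;
    }
}
```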

pickupReplicaSet

public Iterator<org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor> pickupReplicaSet(Collection<org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor> first,
                                                                                                   Collection<org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor> second)
Pick the replica node set from which a replica will be deleted when the block is over-replicated. The first set contains replica nodes on racks that hold more than one replica, while the second set contains the remaining replica nodes. If the first set is not empty, divide it into two subsets: moreThanOne, containing nodes whose node-group holds more than one replica, and exactlyOne, containing the remaining nodes of the first set; then pick from moreThanOne if it is not empty, otherwise from exactlyOne. If the first set is empty, pick from the second set.

Overrides:
pickupReplicaSet in class org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault
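The selection logic described above can be sketched as follows (hypothetical `Node` record; the real method works on DatanodeDescriptor collections and returns an Iterator):

```java
// Illustration only: when deleting an excess replica, prefer nodes in
// node-groups that already hold more than one replica.
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PickupReplicaSetSketch {
    record Node(String host, String nodeGroup) {}

    /** first: replica nodes on racks with more than one replica;
     *  second: the remaining replica nodes. */
    static Collection<Node> pickupReplicaSet(Collection<Node> first,
                                             Collection<Node> second) {
        if (first.isEmpty()) return second;
        // Count replicas per node-group within `first`.
        Map<String, Integer> perGroup = new HashMap<>();
        for (Node n : first) perGroup.merge(n.nodeGroup(), 1, Integer::sum);
        // Split `first` by whether the node-group holds more than one replica.
        List<Node> moreThanOne = new ArrayList<>();
        List<Node> exactlyOne = new ArrayList<>();
        for (Node n : first) {
            (perGroup.get(n.nodeGroup()) > 1 ? moreThanOne : exactlyOne).add(n);
        }
        return moreThanOne.isEmpty() ? exactlyOne : moreThanOne;
    }
}
```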


Copyright © 2013 Apache Software Foundation. All Rights Reserved.