org.apache.hadoop.tools
Class DistCp

java.lang.Object
  extended by org.apache.hadoop.conf.Configured
      extended by org.apache.hadoop.tools.DistCp
All Implemented Interfaces:
org.apache.hadoop.conf.Configurable, org.apache.hadoop.util.Tool

public class DistCp
extends org.apache.hadoop.conf.Configured
implements org.apache.hadoop.util.Tool

DistCp is the main driver class for DistCpV2. For command-line use, DistCp::main() parses the command-line parameters and launches the DistCp job. For programmatic use, construct a DistCp object from a set of options (a DistCpOptions object) and call DistCp::execute() to launch the copy job. DistCp may alternatively be subclassed to fine-tune behaviour.


Field Summary
static Random rand
           
 
Constructor Summary
DistCp(org.apache.hadoop.conf.Configuration configuration, DistCpOptions inputOptions)
          Public Constructor.
 
Method Summary
 org.apache.hadoop.mapreduce.Job execute()
          Implements the core-execution.
static void main(String[] argv)
          Main function of the DistCp program.
 int run(String[] argv)
          Implementation of Tool::run().
 
Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.hadoop.conf.Configurable
getConf, setConf
 

Field Detail

rand

public static final Random rand
Constructor Detail

DistCp

public DistCp(org.apache.hadoop.conf.Configuration configuration,
              DistCpOptions inputOptions)
       throws Exception
Public Constructor. Creates a DistCp object with the specified input parameters (e.g. source paths and target location).

Parameters:
inputOptions - Options (indicating source paths and target location).
configuration - The Hadoop configuration against which the Copy-mapper must run.
Throws:
Exception - on failure.
Method Detail

run

public int run(String[] argv)
Implementation of Tool::run(). Orchestrates the copy of source file(s) to the target location by:
1. Creating a list of files to be copied to the target.
2. Launching a map-only job to copy the files. (Delegates to execute().)

Specified by:
run in interface org.apache.hadoop.util.Tool
Parameters:
argv - List of arguments passed to DistCp, from the ToolRunner.
Returns:
0 on success, -1 otherwise.
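
The relationship between run() and execute() described above (run() delegates the work to execute() and reports the outcome as an exit code) can be illustrated with a small sketch that needs no Hadoop classes. Everything here is a hypothetical stand-in: CopyDriverSketch, buildFileList(), and the assumed argv layout (source paths followed by one target) are not taken from the DistCp source, and nothing is actually copied.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a Tool-style driver; NOT the real DistCp code.
class CopyDriverSketch {
    private List<String> sources = new ArrayList<>();
    private String target;

    // Step 1 (stand-in): build the list of paths to be copied.
    List<String> buildFileList() throws Exception {
        if (sources.isEmpty() || target == null) {
            throw new Exception("at least one source and a target are required");
        }
        return new ArrayList<>(sources);
    }

    // Step 2 (stand-in): "launch" the copy and return a handle describing it.
    // The real execute() returns an org.apache.hadoop.mapreduce.Job instead.
    String execute() throws Exception {
        List<String> fileList = buildFileList();
        return "job[" + fileList.size() + " path(s) -> " + target + "]";
    }

    // Mirrors the documented contract of run(): 0 on success, -1 on failure.
    int run(String[] argv) {
        try {
            // Assumed layout for illustration: sources first, target last.
            if (argv.length >= 2) {
                sources = Arrays.asList(argv).subList(0, argv.length - 1);
                target = argv[argv.length - 1];
            }
            execute();
            return 0;
        } catch (Exception e) {
            System.err.println("Copy failed: " + e.getMessage());
            return -1;
        }
    }
}
```

Calling new CopyDriverSketch().run(new String[] {"/src/a", "/dst"}) follows the success path, while an empty argument array follows the failure path. Keeping the exception handling in run() lets execute() declare throws Exception, which matches the shape of the two method signatures documented on this page.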

execute

public org.apache.hadoop.mapreduce.Job execute()
                                        throws Exception
Implements the core-execution. Creates the list of files to copy and launches the Hadoop job that performs the copy.

Returns:
Job handle
Throws:
Exception - on failure.

main

public static void main(String[] argv)
Main function of the DistCp program. Parses the input arguments (via OptionsParser), and invokes the DistCp::run() method, via the ToolRunner.

Parameters:
argv - Command-line arguments sent to DistCp.


Copyright © 2012 Apache Software Foundation. All Rights Reserved.