public class XMLInputFormat
extends org.apache.hadoop.mapreduce.lib.input.TextInputFormat
InputFormat for XML documents (org.apache.hadoop.mapreduce API). The class recognizes begin-of-document and end-of-document
tags only: everything between those delimiting tags is returned in an uninterpreted Text
object.

| Modifier and Type | Class and Description |
|---|---|
| static class | XMLInputFormat.XMLRecordReader Simple RecordReader for XML documents (org.apache.hadoop.mapreduce API). |
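
The tag-delimited behavior described above can be illustrated with a minimal, self-contained sketch. It is not the code of XMLInputFormat or XMLRecordReader: the class name `TagDelimitedRecords` and the method `extract` are hypothetical, and the sketch scans an in-memory String rather than an InputSplit. It also assumes that the delimiting tags themselves are kept in each returned record, which this page does not specify.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper (not part of XMLInputFormat): shows tag-delimited
 * record extraction on an in-memory string.
 */
final class TagDelimitedRecords {

    private TagDelimitedRecords() {}

    /**
     * Returns every span that starts at startTag and ends at the next endTag.
     * Assumption: the delimiting tags are included in each record.
     * The content between the tags is copied verbatim and never parsed.
     */
    static List<String> extract(String input, String startTag, String endTag) {
        List<String> records = new ArrayList<>();
        int from = 0;
        while (true) {
            int begin = input.indexOf(startTag, from);
            if (begin < 0) {
                break; // no further start tag
            }
            int end = input.indexOf(endTag, begin + startTag.length());
            if (end < 0) {
                break; // start tag without a matching end tag: record is dropped
            }
            int stop = end + endTag.length();
            records.add(input.substring(begin, stop));
            from = stop; // continue scanning after the end tag
        }
        return records;
    }

    public static void main(String[] args) {
        String xml = "<root><page><title>A</title></page>"
                + "<page><title>B &amp; C</title></page></root>";
        // Each record is printed as raw text; the entity &amp; is left untouched.
        for (String record : extract(xml, "<page>", "</page>")) {
            System.out.println(record);
        }
    }
}
```

Because only the two delimiting tags are matched, nested markup and entities inside a record pass through unchanged. Any interpretation of the XML is left to the code that consumes the record.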

| Modifier and Type | Field and Description |
|---|---|
| static String | END_TAG_KEY |
| static String | START_TAG_KEY |

| Constructor and Description |
|---|
| XMLInputFormat() |

| Modifier and Type | Method and Description |
|---|---|
| org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text> | createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context) Create a record reader for a given split. |

Methods inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat:
addInputPath, addInputPaths, computeSplitSize, getBlockIndex, getFormatMinSplitSize, getInputPathFilter, getInputPaths, getMaxSplitSize, getMinSplitSize, getSplits, listStatus, setInputPathFilter, setInputPaths, setInputPaths, setMaxInputSplitSize, setMinInputSplitSize

public static final String START_TAG_KEY

public static final String END_TAG_KEY

public org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split,
org.apache.hadoop.mapreduce.TaskAttemptContext context)
Create a record reader for a given split. The framework will call RecordReader.initialize(InputSplit, TaskAttemptContext) before
the split is used.

Overrides:
createRecordReader in class org.apache.hadoop.mapreduce.lib.input.TextInputFormat

Parameters:
split - the split to be read
context - the information about the task

Throws:
IOException
InterruptedException

Copyright © 2015. All rights reserved.