public class WikipediaDocnoMapping extends Object implements DocnoMapping
Provides a mapping between Wikipedia internal ids (docids) and sequentially-numbered ints (docnos).
The main method of this class provides a simple program for accessing docno mappings.
Command-line arguments are as follows:

Nested classes/interfaces inherited from interface DocnoMapping:
DocnoMapping.Builder, DocnoMapping.BuilderUtils, DocnoMapping.DefaultBuilderOptions

| Constructor and Description |
|---|
| `WikipediaDocnoMapping()` Creates a `WikipediaDocnoMapping` object. |

| Modifier and Type | Method and Description |
|---|---|
| `DocnoMapping.Builder` | `getBuilder()` Returns the builder for this mapping. |
| `String` | `getDocid(int docno)` Returns the docid for a particular docno. |
| `int` | `getDocno(String docid)` Returns the docno for a particular docid. |
| `void` | `loadMapping(org.apache.hadoop.fs.Path p, org.apache.hadoop.fs.FileSystem fs)` Loads a mapping file. |
| `static void` | `main(String[] args)` Simple program that provides access to the docno/docid mappings. |
| `static int[]` | `readDocnoMappingData(org.apache.hadoop.fs.Path p, org.apache.hadoop.fs.FileSystem fs)` Reads a mappings file into memory. |
| `static void` | `writeDocnoMappingData(org.apache.hadoop.fs.FileSystem fs, String inputFile, int n, String outputFile)` Creates a mappings file from the contents of a flat text file containing docid to docno mappings. |
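
The lookup contract summarized above (`getDocno` and `getDocid` as inverse operations over an `int[]` of mapping data) can be illustrated with a small standalone sketch. This is not the Cloud9 implementation. The class name `DocnoMappingSketch`, the tab-separated `docid<TAB>docno` line format, 0-based docnos, and docids that increase with docno (so that a binary search is valid) are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; hypothetical class, not part of Cloud9. */
final class DocnoMappingSketch {
  private DocnoMappingSketch() {}

  /**
   * Builds a docid array from lines of the assumed form "docid<TAB>docno".
   * Each docid is stored at the index given by its docno (0-based here,
   * which is an assumption of this sketch).
   */
  static int[] buildDocids(List<String> lines) {
    int[] docids = new int[lines.size()];
    for (String line : lines) {
      String[] fields = line.split("\t");
      int docid = Integer.parseInt(fields[0].trim());
      int docno = Integer.parseInt(fields[1].trim());
      docids[docno] = docid;
    }
    return docids;
  }

  /**
   * Returns the docno for a docid, or -1 if the docid is not present.
   * Relies on the assumption that docids are in ascending order.
   */
  static int getDocno(int[] docids, String docid) {
    int pos = Arrays.binarySearch(docids, Integer.parseInt(docid));
    return pos >= 0 ? pos : -1;
  }

  /** Returns the docid stored for a docno. */
  static String getDocid(int[] docids, int docno) {
    return String.valueOf(docids[docno]);
  }

  public static void main(String[] args) {
    int[] docids = buildDocids(Arrays.asList("12\t0", "25\t1", "39\t2"));
    System.out.println(getDocno(docids, "25")); // prints 1
    System.out.println(getDocid(docids, 2));    // prints 39
  }
}
```

Keeping the docids in a sorted array makes the docno-to-docid direction a constant-time index and the reverse direction a logarithmic search, without a second data structure. Whether the actual class numbers docnos from 0 or 1, and how it handles unknown docids, is not stated in this page.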
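
public WikipediaDocnoMapping()
Creates a WikipediaDocnoMapping object.

public int getDocno(String docid)
Description copied from interface: DocnoMapping
Returns the docno for a particular docid.
Specified by:
getDocno in interface DocnoMapping
Parameters:
docid - the docid

public String getDocid(int docno)
Description copied from interface: DocnoMapping
Returns the docid for a particular docno.
Specified by:
getDocid in interface DocnoMapping
Parameters:
docno - the docno

public void loadMapping(org.apache.hadoop.fs.Path p,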
org.apache.hadoop.fs.FileSystem fs)
throws IOException
Description copied from interface: DocnoMapping
Loads a mapping file.
Specified by:
loadMapping in interface DocnoMapping
Parameters:
p - path to the mappings file
fs - reference to the FileSystem
Throws:
IOException

public static void writeDocnoMappingData(org.apache.hadoop.fs.FileSystem fs,
String inputFile,
int n,
String outputFile)
throws IOException
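Creates a mappings file from the contents of a flat text file containing docid to docno mappings. Used internally by WikipediaDocnoMappingBuilder.
Parameters:
inputFile - flat text file containing docid to docno mappings
outputFile - output mappings file
Throws:
IOException

public static int[] readDocnoMappingData(org.apache.hadoop.fs.Path p,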
org.apache.hadoop.fs.FileSystem fs)
throws IOException
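Reads a mappings file into memory.
Parameters:
p - path to the mappings file
fs - appropriate FileSystem
Throws:
IOException

public DocnoMapping.Builder getBuilder()
Description copied from interface: DocnoMapping
Returns the builder for this mapping.
Specified by:
getBuilder in interface DocnoMapping

public static void main(String[] args) throws IOException
Simple program that provides access to the docno/docid mappings.
Parameters:
args - command-line arguments
Throws:
IOException

Copyright © 2015. All rights reserved.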