public class ClueWarcDocnoMapping extends Object implements DocnoMapping
Object that maps between WARC-TREC-IDs (String identifiers) and docnos (sequentially numbered ints). This object provides mappings for the Clue Web English collection; the docnos are numbered from part 1 all the way through part 10.
Note that this class needs the data file docno.mapping,
loaded via the loadMapping(Path, FileSystem) method.
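
The numbering described above, where docnos run sequentially from part 1 through part 10, can be pictured as giving each part a starting offset equal to the number of documents in all earlier parts. The sketch below only illustrates that idea under stated assumptions: the class `PartOffsets`, its method names, the per-part document counts, and the choice of 1-based docnos are hypothetical and are not taken from ClueWarcDocnoMapping or from the docno.mapping file.

```java
/** Hypothetical helper: assigns contiguous docnos across several collection parts. */
final class PartOffsets {
    // offsets[p] = number of documents in all parts before part p (0-based part index);
    // the final entry holds the total number of documents.
    private final int[] offsets;

    PartOffsets(int[] docsPerPart) {
        offsets = new int[docsPerPart.length + 1];
        for (int p = 0; p < docsPerPart.length; p++) {
            offsets[p + 1] = offsets[p] + docsPerPart[p];
        }
    }

    /** Docno of the document at indexInPart (0-based) within part (0-based). Docnos start at 1 (assumption). */
    int docno(int part, int indexInPart) {
        if (part < 0 || part >= offsets.length - 1) {
            throw new IllegalArgumentException("unknown part: " + part);
        }
        int docsInPart = offsets[part + 1] - offsets[part];
        if (indexInPart < 0 || indexInPart >= docsInPart) {
            throw new IllegalArgumentException("index out of range: " + indexInPart);
        }
        return offsets[part] + indexInPart + 1;
    }

    /** Part (0-based) that contains the given docno. */
    int partOf(int docno) {
        if (docno < 1 || docno > offsets[offsets.length - 1]) {
            throw new IllegalArgumentException("unknown docno: " + docno);
        }
        // Linear scan is enough for a handful of parts.
        for (int p = 0; p < offsets.length - 1; p++) {
            if (docno <= offsets[p + 1]) {
                return p;
            }
        }
        throw new IllegalStateException("unreachable");
    }
}
```

With made-up counts of 3, 2 and 4 documents, the first document of the second part would receive docno 4, because three documents precede it.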
Nested classes/interfaces inherited from interface DocnoMapping:
DocnoMapping.Builder, DocnoMapping.BuilderUtils, DocnoMapping.DefaultBuilderOptions

| Constructor and Description |
|---|
| ClueWarcDocnoMapping() Creates a ClueWarcDocnoMapping object. |

| Modifier and Type | Method and Description |
|---|---|
| DocnoMapping.Builder | getBuilder() Returns the builder for this mapping. |
| String | getDocid(int docno) Returns the docid for a particular docno. |
| int | getDocno(String docid) Returns the docno for a particular docid. |
| void | loadMapping(org.apache.hadoop.fs.Path p, org.apache.hadoop.fs.FileSystem fs) Loads a mapping file. |
| static void | main(String[] args) Simple program that provides access to the docno/docid mappings. |
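
To make the two lookup directions concrete, here is a minimal in-memory stand-in that mirrors the signatures of getDocno(String) and getDocid(int). It is a sketch under assumptions, not the implementation behind ClueWarcDocnoMapping: the class name `SortedArrayDocnoMapping`, the sorted-array layout, the 1-based docnos, and the return values for unknown inputs (-1 and null) are all hypothetical choices made for illustration.

```java
import java.util.Arrays;

/** Hypothetical stand-in that maps docids to docnos and back using a sorted array. */
final class SortedArrayDocnoMapping {
    // Sorted docids; the docno of docids[i] is i + 1 (assumption of this sketch).
    private final String[] docids;

    SortedArrayDocnoMapping(String[] ids) {
        docids = ids.clone();
        Arrays.sort(docids);
    }

    /** Returns the docno for a docid, or -1 if the docid is unknown (sketch convention). */
    int getDocno(String docid) {
        int pos = Arrays.binarySearch(docids, docid);
        return pos >= 0 ? pos + 1 : -1;
    }

    /** Returns the docid for a docno, or null if the docno is out of range (sketch convention). */
    String getDocid(int docno) {
        if (docno < 1 || docno > docids.length) {
            return null;
        }
        return docids[docno - 1];
    }
}
```

Because both directions read the same sorted array, a docid that is present always round-trips: looking up its docno and then that docno's docid returns the original string.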
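
public ClueWarcDocnoMapping()
Creates a ClueWarcDocnoMapping object.

public int getDocno(String docid)
Description copied from interface: DocnoMapping
Returns the docno for a particular docid.
Specified by: getDocno in interface DocnoMapping
Parameters: docid - the docid

public String getDocid(int docno)
Description copied from interface: DocnoMapping
Returns the docid for a particular docno.
Specified by: getDocid in interface DocnoMapping
Parameters: docno - the docno

public void loadMapping(org.apache.hadoop.fs.Path p, org.apache.hadoop.fs.FileSystem fs) throws IOException
Description copied from interface: DocnoMapping
Loads a mapping file.
Specified by: loadMapping in interface DocnoMapping
Parameters: p - path to the mappings file; fs - reference to the FileSystem
Throws: IOException

public DocnoMapping.Builder getBuilder()
Description copied from interface: DocnoMapping
Returns the builder for this mapping.
Specified by: getBuilder in interface DocnoMapping

public static void main(String[] args) throws IOException
Simple program that provides access to the docno/docid mappings.
Throws: IOException

Copyright © 2015. All rights reserved.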