public class TrecWebDocument extends WebDocument
| Modifier and Type | Field and Description |
|---|---|
| `static String` | `XML_END_TAG` End delimiter of the document, which is `</DOC>`. |
| `static String` | `XML_START_TAG` Start delimiter of the document, which is `<DOC>`. |
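The two delimiters above suggest how individual records can be pulled out of a concatenated collection file. The following is a minimal sketch of that idea, not the library's implementation: the class `DocDelimiterSketch` and its methods `readNextRecord` and `splitAll` are hypothetical names, and the code assumes each delimiter starts its own line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; it is not part of the documented API.
final class DocDelimiterSketch {
    // Same values as the documented XML_START_TAG and XML_END_TAG fields.
    static final String XML_START_TAG = "<DOC>";
    static final String XML_END_TAG = "</DOC>";

    // Returns the next record (delimiters included), or null when no complete record is left.
    // Assumption: each delimiter appears at the start of its own line.
    static String readNextRecord(BufferedReader reader) throws IOException {
        String line;
        // Skip everything that precedes the start delimiter.
        while ((line = reader.readLine()) != null) {
            if (line.startsWith(XML_START_TAG)) {
                break;
            }
        }
        if (line == null) {
            return null; // end of input, no further record
        }
        StringBuilder record = new StringBuilder();
        record.append(line).append('\n');
        // Accumulate lines until the end delimiter has been appended.
        while (!line.startsWith(XML_END_TAG)) {
            line = reader.readLine();
            if (line == null) {
                return null; // truncated record: end delimiter never seen
            }
            record.append(line).append('\n');
        }
        return record.toString();
    }

    // Convenience wrapper that collects every complete record found in a string.
    static List<String> splitAll(String raw) {
        List<String> records = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(raw))) {
            String record;
            while ((record = readNextRecord(reader)) != null) {
                records.add(record);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    public static void main(String[] args) {
        String raw = "<DOC>\nfirst document\n</DOC>\n<DOC>\nsecond document\n</DOC>\n";
        for (String record : splitAll(raw)) {
            System.out.print(record);
        }
    }
}
```

Returning `null` for a truncated record is a design choice of this sketch; a caller that needs to distinguish "end of input" from "malformed input" would have to report the two cases separately.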
| Constructor and Description |
|---|
| `TrecWebDocument()` Creates an empty `TrecWebDocument` object. |
| Modifier and Type | Method and Description |
|---|---|
| `String` | `getContent()` Returns the content of this Gov2 document. |
| `String` | `getDocid()` Returns the docid of this Gov2 document. |
| `String` | `getURL()` |
| `static void` | `readDocument(TrecWebDocument doc, String s)` Reads a raw XML string into a `TrecWebDocument` object. |
| `void` | `readFields(DataInput in)` Deserializes this object. |
| `static boolean` | `readNextTrecWebDocument(TrecWebDocument doc, DataInputStream stream)` |
| `void` | `write(DataOutput out)` Serializes this object. |
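By the usual convention for this method pair, `write(DataOutput)` emits an object's fields and `readFields(DataInput)` restores them in the same order. The sketch below illustrates that round trip with a hypothetical stand-in class, `RecordSketch`. Its two fields and its byte layout are assumptions made for the example, not the actual layout used by `TrecWebDocument`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for illustration; the field layout is an assumption.
final class RecordSketch {
    private String docid = "";
    private String content = "";

    RecordSketch() {
    }

    RecordSketch(String docid, String content) {
        this.docid = docid;
        this.content = content;
    }

    String getDocid() {
        return docid;
    }

    String getContent() {
        return content;
    }

    // Serializes this object: fields are written in a fixed order.
    void write(DataOutput out) throws IOException {
        out.writeUTF(docid);
        // writeUTF is limited to 65535 encoded bytes, so the (possibly large)
        // content is written as a length-prefixed UTF-8 byte array instead.
        byte[] bytes = content.getBytes(StandardCharsets.UTF_8);
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    // Deserializes this object: fields are read back in the order write() used.
    void readFields(DataInput in) throws IOException {
        docid = in.readUTF();
        byte[] bytes = new byte[in.readInt()];
        in.readFully(bytes);
        content = new String(bytes, StandardCharsets.UTF_8);
    }

    // Writes the record to an in-memory buffer and reads it into a fresh instance.
    static RecordSketch roundTrip(RecordSketch original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            original.write(new DataOutputStream(buffer));
            RecordSketch copy = new RecordSketch();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        RecordSketch copy = roundTrip(new RecordSketch("doc-1", "<DOC>\nhello\n</DOC>"));
        System.out.println(copy.getDocid());
        System.out.println(copy.getContent());
    }
}
```

The empty constructor followed by `readFields` mirrors the pattern implied by the summary above, where an empty object is created first and then populated from a stream.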
Inherited methods: `getDisplayContent`, `getDisplayContentType`

`public static final String XML_START_TAG`
Start delimiter of the document, which is `<DOC>`.

`public static final String XML_END_TAG`
End delimiter of the document, which is `</DOC>`.

`public void write(DataOutput out) throws IOException`
Throws: `IOException`

`public void readFields(DataInput in) throws IOException`
Throws: `IOException`

`public String getDocid()`

`public String getContent()`
`getContent` in class `Indexable`

`public String getURL()`
`getURL` in class `WebDocument`

`public static void readDocument(TrecWebDocument doc, String s)`
Reads a raw XML string into a `TrecWebDocument` object.
Parameters:
`doc` - the `TrecWebDocument` object
`s` - raw XML string

`public static boolean readNextTrecWebDocument(TrecWebDocument doc, DataInputStream stream) throws IOException`
Throws: `IOException`

Copyright © 2015. All rights reserved.