java.lang.Object
  it.unimi.dsi.mg4j.document.AbstractDocument
    it.unimi.dsi.mg4j.document.HtmlDocumentFactory.HtmlDocument
protected class HtmlDocumentFactory.HtmlDocument
An HTML document. If a TITLE element is available, its content is returned by title()
instead of the default value.
Parsing is deferred until it is actually needed, so operations such as getting the document URI do not require parsing.
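The deferred parsing described above can be pictured with the following minimal sketch. It does not use MG4J: `LazyHtmlDocument`, its regex-based title extraction, the `fromString` helper, and the `parseCount` counter are hypothetical names introduced only for illustration, and the real class presumably relies on a proper HTML parser.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for a lazily parsed HTML document (not the MG4J class). */
class LazyHtmlDocument {
    private static final Pattern TITLE = Pattern.compile(
        "<title[^>]*>(.*?)</title>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    private final InputStream rawContent;
    private final String uri;
    private final String defaultTitle;
    private String parsedTitle; // set by the first parse, if a TITLE element exists
    private boolean parsed;
    private int parseCount;     // illustration only: how many times we parsed

    LazyHtmlDocument(InputStream rawContent, String uri, String defaultTitle) {
        this.rawContent = rawContent;
        this.uri = uri;
        this.defaultTitle = defaultTitle;
    }

    /** Convenience factory for the sketch: wraps a string as raw content. */
    static LazyHtmlDocument fromString(String html, String uri, String defaultTitle) {
        InputStream in = new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8));
        return new LazyHtmlDocument(in, uri, defaultTitle);
    }

    /** Metadata only: never touches the raw content. */
    CharSequence uri() {
        return uri;
    }

    /** Needs the TITLE element, so the first call triggers parsing. */
    CharSequence title() {
        ensureParsed();
        return parsedTitle != null ? parsedTitle : defaultTitle;
    }

    int parseCount() {
        return parseCount;
    }

    private void ensureParsed() {
        if (parsed) return;
        try {
            String html = new String(rawContent.readAllBytes(), StandardCharsets.UTF_8);
            Matcher m = TITLE.matcher(html);
            if (m.find()) parsedTitle = m.group(1).trim();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        parsed = true;
        parseCount++;
    }
}

class LazyParsingDemo {
    public static void main(String[] args) {
        LazyHtmlDocument doc = LazyHtmlDocument.fromString(
            "<html><head><title>Hello</title></head></html>",
            "http://example.org/", "untitled");
        System.out.println(doc.uri() + " parsed " + doc.parseCount() + " times");
        System.out.println(doc.title() + " parsed " + doc.parseCount() + " times");
    }
}
```

In this sketch, calling `uri()` leaves the raw stream unread, while the first call to `title()` consumes it once and later calls reuse the cached result.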
| Constructor Summary | |
|---|---|
| protected | HtmlDocumentFactory.HtmlDocument(InputStream rawContent, Reference2ObjectMap<Enum<?>,Object> metadata) |
| Method Summary | |
|---|---|
| Object | content(int field): Returns the content of the given field. |
| CharSequence | title(): The title of this document. |
| String | toString() |
| CharSequence | uri(): A URI that is associated with this document. |
| WordReader | wordReader(int field): Returns a word reader for the given DocumentFactory.FieldType.TEXT field. |
| Methods inherited from class it.unimi.dsi.mg4j.document.AbstractDocument |
|---|
| close, finalize |

| Methods inherited from class java.lang.Object |
|---|
| clone, equals, getClass, hashCode, notify, notifyAll, wait, wait, wait |
| Constructor Detail |
|---|
protected HtmlDocumentFactory.HtmlDocument(InputStream rawContent,
Reference2ObjectMap<Enum<?>,Object> metadata)
| Method Detail |
|---|
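public CharSequence title()
The title of this document.
Specified by: title in interface Document
Returns: the title of this document, or null.

public String toString()
Overrides: toString in class AbstractDocument

public CharSequence uri()
A URI that is associated with this document.
Specified by: uri in interface Document
Returns: the URI associated with this document, or null.

public Object content(int field)
throws IOException
Returns the content of the given field.
Specified by: content in interface Document
Parameters: field - the field index.
Returns: the field content; its actual type depends on the field type, as specified by the DocumentFactory that
built this document. For example, the returned object is a Reader if the field type is
DocumentFactory.FieldType.TEXT.
TODO complete this description!!!
Throws: IOException

public WordReader wordReader(int field)
Returns a word reader for the given DocumentFactory.FieldType.TEXT field.
Specified by: wordReader in interface Document
Parameters: field - the field index.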
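
Because content(int) returns a plain Object, callers have to check the field type before casting. The following sketch illustrates that pattern without using MG4J. `SketchFieldType`, `SketchDocument`, its `fieldType(int)` method, and `ContentDemo.readText` are hypothetical stand-ins, and the `INT` field type is an assumption, since this page names only `DocumentFactory.FieldType.TEXT`.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical field types; only TEXT is named in the documentation above. */
enum SketchFieldType { TEXT, INT }

/** Hypothetical two-field document: field 0 is text, field 1 is an integer. */
class SketchDocument {
    private final String text;
    private final long number;

    SketchDocument(String text, long number) {
        this.text = text;
        this.number = number;
    }

    /** Plays the role of the factory's knowledge of each field's type. */
    SketchFieldType fieldType(int field) {
        switch (field) {
            case 0: return SketchFieldType.TEXT;
            case 1: return SketchFieldType.INT;
            default: throw new IndexOutOfBoundsException("No such field: " + field);
        }
    }

    /** Returns a Reader for TEXT fields and a Long for INT fields. */
    Object content(int field) {
        switch (fieldType(field)) {
            case TEXT: return new StringReader(text);
            default:   return Long.valueOf(number);
        }
    }
}

class ContentDemo {
    /** Reads a TEXT field fully, checking the field type before the cast. */
    static String readText(SketchDocument doc, int field) {
        if (doc.fieldType(field) != SketchFieldType.TEXT) {
            throw new IllegalArgumentException("Field " + field + " is not a TEXT field");
        }
        try (Reader reader = (Reader) doc.content(field)) {
            StringBuilder sb = new StringBuilder();
            char[] buffer = new char[1024];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                sb.append(buffer, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        SketchDocument doc = new SketchDocument("hello world", 42L);
        System.out.println(readText(doc, 0));
        System.out.println(doc.content(1));
    }
}
```

Checking the type first turns a potential ClassCastException into an explicit, descriptive error at the call site.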