public final class ImageExtractor extends Object
| Modifier and Type | Field and Description |
|---|---|
| static ImageExtractor | INSTANCE |
| Modifier and Type | Method and Description |
|---|---|
| static ImageExtractor | getInstance(): Returns the singleton instance of ImageExtractor. |
| List<Image> | process(TextDocument doc, InputSource is): Processes the given TextDocument and the original HTML text (as an InputSource). |
| List<Image> | process(TextDocument doc, String origHTML): Processes the given TextDocument and the original HTML text (as a String). |
| List<Image> | process(URL url, BoilerpipeExtractor extractor): Fetches the given URL using HTMLFetcher and processes the retrieved HTML using the specified BoilerpipeExtractor. |
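
The two process(TextDocument, ...) overloads differ only in how the original HTML is supplied. As a minimal sketch of preparing that argument, the helper below wraps an in-memory HTML string in an org.xml.sax.InputSource using only JDK classes. The class name HtmlInputSources and its methods are hypothetical and not part of this library, and the actual call to process is shown only as a comment, on the assumption that a TextDocument named doc is already available.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import org.xml.sax.InputSource;

// Hypothetical helper (not part of the documented library): prepares the
// "original HTML" argument for the InputSource overload of process.
public final class HtmlInputSources {
    private HtmlInputSources() {}

    /** Wraps in-memory HTML in a SAX InputSource backed by a character stream. */
    public static InputSource fromString(String html) {
        return new InputSource(new StringReader(html));
    }

    /** Reads the character stream of the source back into a String, consuming it. */
    public static String readAll(InputSource source) {
        Reader reader = source.getCharacterStream();
        if (reader == null) {
            throw new IllegalArgumentException("InputSource has no character stream");
        }
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        try {
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String html = "<html><body><img src=\"a.png\" alt=\"A\"></body></html>";
        InputSource is = fromString(html);
        // Assumption: with the library on the classpath and a TextDocument `doc`
        // already built from the same HTML, the call would look like
        //   List<Image> images = ImageExtractor.INSTANCE.process(doc, is);
        // Here we only verify that the wrapped source round-trips the text.
        System.out.println(readAll(is).equals(html));
    }
}
```

An InputSource backed by a Reader can be consumed only once, so a fresh one should be created for each call rather than reused after reading.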
public static final ImageExtractor INSTANCE

public static ImageExtractor getInstance()
Returns the singleton instance of ImageExtractor.

public List<Image> process(TextDocument doc, String origHTML) throws BoilerpipeProcessingException
Processes the given TextDocument and the original HTML text (as a String).
Parameters:
doc - The processed TextDocument.
origHTML - The original HTML document.
Returns: A List of Images.
Throws: BoilerpipeProcessingException

public List<Image> process(TextDocument doc, InputSource is) throws BoilerpipeProcessingException
Processes the given TextDocument and the original HTML text (as an InputSource).
Parameters:
doc - The processed TextDocument.
is - The original HTML document.
Returns: A List of Images.
Throws: BoilerpipeProcessingException

public List<Image> process(URL url, BoilerpipeExtractor extractor) throws IOException, BoilerpipeProcessingException, SAXException
Fetches the given URL using HTMLFetcher and processes the retrieved HTML using the specified BoilerpipeExtractor.
Parameters:
url - The URL of the HTML document to fetch.
extractor - The BoilerpipeExtractor used to process the retrieved HTML.
Returns: A List of Images.
Throws: BoilerpipeProcessingException, IOException, SAXException

Copyright © 2013-2014. All Rights Reserved.