| Package | Description |
|---|---|
| de.l3s.boilerpipe.document | The classes in this package represent the simple Boilerpipe document model. |
| de.l3s.boilerpipe.sax | Classes related to parsing and producing HTML from/to Boilerpipe TextDocuments. |

| Modifier and Type | Class and Description |
|---|---|
| class | Image: Represents an image resource that is contained in the document. |
| class | Video: Represents a video resource that is contained in the document. |
| class | VimeoVideo: Represents a Vimeo video resource that is contained in the document. |
| class | YoutubeVideo: Represents a YouTube video resource that is contained in the document. |

| Modifier and Type | Method and Description |
|---|---|
| List<Media> | MediaExtractor.process(String doc, BoilerpipeExtractor extractor): Parses the media (pictures, videos) out of doc. |
| List<Media> | MediaExtractor.process(TextDocument doc, InputSource is): Processes the given TextDocument and the original HTML text (as an InputSource). |
| List<Media> | MediaExtractor.process(TextDocument doc, String origHTML): Processes the given TextDocument and the original HTML text (as a String). |
| List<Media> | MediaExtractor.process(URL url, BoilerpipeExtractor extractor): Fetches the given URL using HTMLFetcher and processes the retrieved HTML using the specified BoilerpipeExtractor. |

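
Each process overload listed above returns a List<Media>, so a caller may want to tell the listed media classes apart. The following is a minimal sketch of one way to do that, not the library's own code. It does not use Boilerpipe at all: Media, Image, Video, VimeoVideo, and YoutubeVideo are empty stand-in types that mirror only the names shown on this page. That VimeoVideo and YoutubeVideo extend Video is an assumption this page does not confirm. MediaSketch, countByType, and videosOnly are hypothetical helpers introduced for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for the types named on this page. They are NOT the
// library's classes; only the names mirror the tables.
interface Media {}

class Image implements Media {}

class Video implements Media {}

// Assumption: the provider-specific classes extend Video.
class VimeoVideo extends Video {}

class YoutubeVideo extends Video {}

class MediaSketch {

    // Counts the entries of a List<Media> by concrete class name,
    // preserving the order in which each type is first seen.
    static Map<String, Integer> countByType(List<Media> media) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Media m : media) {
            counts.merge(m.getClass().getSimpleName(), 1, Integer::sum);
        }
        return counts;
    }

    // Keeps only the entries that are videos of any kind.
    static List<Video> videosOnly(List<Media> media) {
        List<Video> videos = new ArrayList<>();
        for (Media m : media) {
            if (m instanceof Video) {
                videos.add((Video) m);
            }
        }
        return videos;
    }

    public static void main(String[] args) {
        // Stand-in for the list returned by one of the process(...) overloads.
        List<Media> result = new ArrayList<>();
        result.add(new Image());
        result.add(new YoutubeVideo());
        result.add(new Image());
        result.add(new VimeoVideo());

        System.out.println(countByType(result));
        System.out.println(videosOnly(result).size());
    }
}
```

With the real library, the list would come from a MediaExtractor call rather than being built by hand. The instanceof check is used here so that any subclass of Video is picked up without naming each provider.
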
Copyright © 2013-2014. All Rights Reserved.