java.lang.Object
  com.twitter.Extractor
public class Extractor
A class to extract usernames, lists, hashtags and URLs from Tweet text.
| Nested Class Summary | |
|---|---|
| static class | Extractor.Entity |
| Field Summary | |
|---|---|
| protected boolean | extractURLWithoutProtocol |
| Constructor Summary | |
|---|---|
| Extractor() | Create a new extractor. |
| Method Summary | | |
|---|---|---|
| List<String> | extractCashtags(String text) | Extract $cashtag references from Tweet text. |
| List<Extractor.Entity> | extractCashtagsWithIndices(String text) | Extract $cashtag references from Tweet text. |
| List<Extractor.Entity> | extractEntitiesWithIndices(String text) | Extract URLs, @mentions, lists and #hashtag from a given text/tweet. |
| List<String> | extractHashtags(String text) | Extract #hashtag references from Tweet text. |
| List<Extractor.Entity> | extractHashtagsWithIndices(String text) | Extract #hashtag references from Tweet text. |
| List<String> | extractMentionedScreennames(String text) | Extract @username references from Tweet text. |
| List<Extractor.Entity> | extractMentionedScreennamesWithIndices(String text) | Extract @username references from Tweet text. |
| List<Extractor.Entity> | extractMentionsOrListsWithIndices(String text) | |
| String | extractReplyScreenname(String text) | Extract a @username reference from the beginning of Tweet text. |
| List<String> | extractURLs(String text) | Extract URL references from Tweet text. |
| List<Extractor.Entity> | extractURLsWithIndices(String text) | Extract URL references from Tweet text. |
| boolean | isExtractURLWithoutProtocol() | |
| void | modifyIndicesFromUnicodeToUTF16(String text, List<Extractor.Entity> entities) | |
| void | modifyIndicesFromUTF16ToToUnicode(String text, List<Extractor.Entity> entities) | |
| void | setExtractURLWithoutProtocol(boolean extractURLWithoutProtocol) | |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
protected boolean extractURLWithoutProtocol
| Constructor Detail |
|---|
public Extractor()
| Method Detail |
|---|
public List<Extractor.Entity> extractEntitiesWithIndices(String text)
text - text of tweet
public List<String> extractMentionedScreennames(String text)
text - of the tweet from which to extract usernames
public List<Extractor.Entity> extractMentionedScreennamesWithIndices(String text)
text - of the tweet from which to extract usernames
public List<Extractor.Entity> extractMentionsOrListsWithIndices(String text)
public String extractReplyScreenname(String text)
text - of the tweet from which to extract the replied-to username
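
The summary describes extractReplyScreenname as extracting a @username from the beginning of the Tweet text. The sketch below illustrates that idea with the standard library only. The class name `ReplySketch`, the pattern, and the choice to return null when nothing matches are assumptions made for illustration. The library's actual matching rules are not given on this page and are likely stricter.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: a simplified take on "a @username at the beginning of the text". */
final class ReplySketch {

    // Assumption: optional leading whitespace, '@', then 1 to 20 ASCII
    // letters, digits or underscores. Not the library's actual pattern.
    private static final Pattern REPLY =
            Pattern.compile("^\\s*@([A-Za-z0-9_]{1,20})");

    /** Returns the leading username without '@', or null if the text does not start with one. */
    static String replyScreenname(String text) {
        if (text == null) {
            return null;
        }
        Matcher m = REPLY.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(replyScreenname("@alice thanks!"));  // leading mention
        System.out.println(replyScreenname("thanks @alice"));   // mention is not at the start
    }
}
```

Because the pattern is anchored with `^`, a mention that appears later in the text is ignored, which is what separates a reply from an ordinary mention in this sketch.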
public List<String> extractURLs(String text)
text - of the tweet from which to extract URLs
public List<Extractor.Entity> extractURLsWithIndices(String text)
text - of the tweet from which to extract URLs
public List<String> extractHashtags(String text)
text - of the tweet from which to extract hashtags
public List<Extractor.Entity> extractHashtagsWithIndices(String text)
text - of the tweet from which to extract hashtags
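
Several methods come in pairs: one returns List<String>, and its WithIndices variant returns List<Extractor.Entity>. The sketch below shows how such a pair can relate, using only java.util.regex. `HashtagSpanSketch`, its nested `Span` class, and the ASCII-only pattern are hypothetical stand-ins, not the library's Extractor.Entity or its hashtag rules. Letting the span cover the leading '#' while the value omits it is also an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: value-only extraction built on top of extraction with indices. */
final class HashtagSpanSketch {

    /** Hypothetical stand-in for an entity: a value plus its [start, end) char offsets. */
    static final class Span {
        final int start;
        final int end;
        final String value;

        Span(int start, int end, String value) {
            this.start = start;
            this.end = end;
            this.value = value;
        }
    }

    // Assumption: '#' followed by ASCII letters, digits or underscores.
    private static final Pattern HASHTAG = Pattern.compile("#([A-Za-z0-9_]+)");

    /** Returns each hashtag with the offsets of the whole match, '#' included. */
    static List<Span> hashtagsWithIndices(String text) {
        List<Span> spans = new ArrayList<>();
        if (text == null) {
            return spans;
        }
        Matcher m = HASHTAG.matcher(text);
        while (m.find()) {
            spans.add(new Span(m.start(), m.end(), m.group(1)));
        }
        return spans;
    }

    /** Returns only the hashtag values, derived from the indexed form. */
    static List<String> hashtags(String text) {
        List<String> values = new ArrayList<>();
        for (Span span : hashtagsWithIndices(text)) {
            values.add(span.value);
        }
        return values;
    }

    public static void main(String[] args) {
        String text = "Shipping #java and #unicode fixes";
        for (Span span : hashtagsWithIndices(text)) {
            System.out.println(span.value + " [" + span.start + ", " + span.end + ")");
        }
    }
}
```

The offsets are what a caller would need to highlight or link a hashtag in place, which is the usual reason to prefer the indexed form over the plain list of values.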
public List<String> extractCashtags(String text)
text - of the tweet from which to extract cashtags
public List<Extractor.Entity> extractCashtagsWithIndices(String text)
text - of the tweet from which to extract cashtags
public void setExtractURLWithoutProtocol(boolean extractURLWithoutProtocol)
public boolean isExtractURLWithoutProtocol()
public void modifyIndicesFromUnicodeToUTF16(String text, List<Extractor.Entity> entities)
public void modifyIndicesFromUTF16ToToUnicode(String text, List<Extractor.Entity> entities)
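
This page gives no description for the two modifyIndices methods. Their names suggest that they convert entity offsets between Unicode code point positions and UTF-16 char positions. The two differ whenever the text contains characters outside the Basic Multilingual Plane, such as most emoji. The sketch below shows that conversion for a single offset using only java.lang.String. `IndexConversionSketch` and its method names are hypothetical, and how the library applies the conversion to a list of entities is not shown here.

```java
/** Illustrative only: converts a single offset between code point and UTF-16 units. */
final class IndexConversionSketch {

    /** Code point offset to UTF-16 (char) offset within text. */
    static int codePointToUtf16(String text, int codePointIndex) {
        return text.offsetByCodePoints(0, codePointIndex);
    }

    /** UTF-16 (char) offset to code point offset within text. */
    static int utf16ToCodePoint(String text, int utf16Index) {
        return text.codePointCount(0, utf16Index);
    }

    public static void main(String[] args) {
        // U+1F600 lies outside the BMP, so it occupies two UTF-16 chars.
        String text = "\uD83D\uDE00 #java";
        int hashUtf16 = text.indexOf('#');
        int hashCodePoint = utf16ToCodePoint(text, hashUtf16);
        System.out.println("UTF-16 offset: " + hashUtf16);
        System.out.println("Code point offset: " + hashCodePoint);
        System.out.println("Round trip: " + codePointToUtf16(text, hashCodePoint));
    }
}
```

For text made up only of BMP characters the two offsets are identical, so a conversion of this kind matters only when supplementary characters appear before the entity.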