class JsoupBrowser extends Browser
A Browser implementation based on jsoup, a Java HTML parser library. JsoupBrowser
provides powerful and efficient document querying, but it doesn't run JavaScript in the pages. As such, it is
limited to working strictly with the HTML sent in the page source.
Currently, JsoupBrowser does not keep separate cookie stores for different domains and paths. In each request all
cookies set previously will be sent, regardless of the domain they were set on. If you make requests to different
domains and do not want this behavior, use a separate JsoupBrowser instance for each domain.
As the documents parsed by JsoupBrowser instances are not changed after loading, Document and Element
instances obtained from them are guaranteed to be immutable.
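The following is a minimal sketch of the workaround described above: one JsoupBrowser instance per domain, so that cookies set for one site are not sent to the other. The domain names, cookie name and cookie value are placeholders, and the package name `net.ruippeixotog.scalascraper.browser` is assumed from the scala-scraper library.

```scala
import net.ruippeixotog.scalascraper.browser.JsoupBrowser

object SeparateCookieStores {
  // One browser per domain (placeholder domains), each with its own cookie store.
  val shopBrowser = new JsoupBrowser()
  val forumBrowser = new JsoupBrowser()

  // Store a cookie in the first browser only.
  shopBrowser.setCookie("https://shop.example.com", "session", "abc123")

  // Read back what each instance currently holds.
  val shopCookies = shopBrowser.cookies("https://shop.example.com")
  val forumCookies = forumBrowser.cookies("https://forum.example.com")
}
```

Here `forumCookies` stays empty because the second instance never received the cookie.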
Instance Constructors
- new JsoupBrowser(userAgent: String = "jsoup/1.8", proxy: java.net.Proxy = null)
- userAgent
the user agent with which requests should be made
- proxy
an optional proxy configuration to use
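As an illustration of the constructor parameters, the sketch below builds a browser with the defaults, one with a custom user agent, and one with a proxy. The user agent string and the proxy host and port are made-up values, and the import path is assumed from the scala-scraper library.

```scala
import java.net.{InetSocketAddress, Proxy}
import net.ruippeixotog.scalascraper.browser.JsoupBrowser

object BrowserConstruction {
  // Defaults: user agent "jsoup/1.8" and no proxy (null).
  val default = new JsoupBrowser()

  // Custom user agent only (hypothetical value).
  val custom = new JsoupBrowser(userAgent = "my-crawler/0.1")

  // Placeholder proxy address; createUnresolved avoids a DNS lookup at this point.
  val httpProxy = new Proxy(
    Proxy.Type.HTTP,
    InetSocketAddress.createUnresolved("proxy.example.com", 8080))

  // Custom user agent and proxy together.
  val proxied = new JsoupBrowser(userAgent = "my-crawler/0.1", proxy = httpProxy)
}
```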
Type Members
- type DocumentType = JsoupDocument
The concrete type of documents created by this browser.
- Definition Classes
- JsoupBrowser → Browser
Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##: Int
- Definition Classes
- AnyRef → Any
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def clearCookies(): Unit
Clears the cookie store of this browser.
- Definition Classes
- JsoupBrowser → Browser
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @HotSpotIntrinsicCandidate() @native()
- def cookies(url: String): Map[String, String]
Returns the current set of cookies stored in this browser for a given URL.
- url
the URL whose stored cookies are to be returned
- returns
a mapping of cookie names to their respective values.
- Definition Classes
- JsoupBrowser → Browser
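A short sketch of how the cookie methods of this class might be combined: cookies are stored with `setCookie` and `setCookies`, read back with `cookies`, and discarded with `clearCookies`. The URL and the cookie names and values are placeholders chosen for illustration.

```scala
import net.ruippeixotog.scalascraper.browser.JsoupBrowser

object CookieStoreExample {
  val browser = new JsoupBrowser()
  val url = "https://example.com" // placeholder URL

  // Add one cookie, then several at once.
  browser.setCookie(url, "lang", "en")
  browser.setCookies(url, Map("theme" -> "dark", "session" -> "abc123"))

  // Snapshot of the cookies currently stored.
  val stored: Map[String, String] = browser.cookies(url)

  // Empty the store and read it again.
  browser.clearCookies()
  val afterClear: Map[String, String] = browser.cookies(url)
}
```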
- def defaultRequestSettings(conn: Connection): Connection
- Attributes
- protected[this]
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equals(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef → Any
- def executeRequest(conn: Connection): Response
- Attributes
- protected[this]
- def get(url: String): JsoupDocument
Retrieves and parses a web page using a GET request.
- url
the URL of the page to retrieve
- returns
a Document containing the retrieved web page.
- Definition Classes
- JsoupBrowser → Browser
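A possible way to wrap `get`, sketched under the assumption that the caller wants failures (malformed URLs, network errors) as values rather than exceptions. The helper name `pageTitle` is hypothetical and not part of the library; `title` is assumed to be available on the returned document.

```scala
import scala.util.Try
import net.ruippeixotog.scalascraper.browser.JsoupBrowser

object FetchTitle {
  private val browser = new JsoupBrowser()

  // Hypothetical helper: fetches a page with a GET request and returns its <title>.
  // Any non-fatal exception thrown while connecting or parsing ends up in the Try.
  def pageTitle(url: String): Try[String] =
    Try(browser.get(url).title)
}
```

A call such as `FetchTitle.pageTitle("https://example.com")` performs a real network request, so its result depends on the remote site.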
- final def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @HotSpotIntrinsicCandidate() @native()
- def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @HotSpotIntrinsicCandidate() @native()
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @HotSpotIntrinsicCandidate() @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @HotSpotIntrinsicCandidate() @native()
- def parseFile(file: File, charset: String): JsoupDocument
Parses a local HTML file with a specified charset.
- file
the HTML file to parse
- charset
the charset of the file
- returns
a Document containing the parsed web page.
- Definition Classes
- JsoupBrowser → Browser
- def parseFile(path: String): DocumentType
Parses a local HTML file encoded in UTF-8.
- path
the path in the local filesystem where the HTML file is located
- returns
a Document containing the parsed web page.
- Definition Classes
- Browser
- def parseFile(path: String, charset: String): DocumentType
Parses a local HTML file with a specified charset.
- path
the path in the local filesystem where the HTML file is located
- charset
the charset of the file
- returns
a Document containing the parsed web page.
- Definition Classes
- Browser
- def parseFile(file: File): DocumentType
Parses a local HTML file encoded in UTF-8.
- file
the HTML file to parse
- returns
a Document containing the parsed web page.
- Definition Classes
- Browser
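The `parseFile` overloads can be exercised without network access. The sketch below writes a small page to a temporary file (file name and contents are made up) and parses it once through the `File` overload and once through the path-plus-charset overload.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files
import net.ruippeixotog.scalascraper.browser.JsoupBrowser

object ParseFileExample {
  val browser = new JsoupBrowser()

  // Write a tiny HTML page to a temporary file so the example is self-contained.
  val path = Files.createTempFile("sample", ".html")
  Files.write(
    path,
    "<html><head><title>Local page</title></head><body></body></html>"
      .getBytes(StandardCharsets.UTF_8))

  // File overload: the content is read as UTF-8.
  val fromFile = browser.parseFile(path.toFile)

  // Path overload with an explicit charset.
  val fromPath = browser.parseFile(path.toString, "UTF-8")
}
```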
- def parseInputStream(inputStream: InputStream, charset: String): JsoupDocument
Parses an input stream with its content in a specified charset. The provided input stream is always closed before this method returns or throws an exception.
- inputStream
the input stream to parse
- charset
the charset of the input stream content
- returns
a Document containing the parsed web page.
- Definition Classes
- JsoupBrowser → Browser
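To illustrate the closing behavior described for `parseInputStream`, this sketch parses an in-memory stream whose `close` method records that it was called. The HTML content and the `closed` flag are illustrative only.

```scala
import java.io.ByteArrayInputStream
import net.ruippeixotog.scalascraper.browser.JsoupBrowser

object ParseStreamExample {
  val browser = new JsoupBrowser()

  // Set to true once the browser closes the stream.
  var closed = false

  val html = "<html><head><title>Streamed</title></head><body><p>Hello</p></body></html>"

  // In-memory stream that records when close() is called.
  val in = new ByteArrayInputStream(html.getBytes("UTF-8")) {
    override def close(): Unit = {
      closed = true
      super.close()
    }
  }

  // The stream is consumed and closed by the call; callers should not reuse it.
  val doc = browser.parseInputStream(in, "UTF-8")
}
```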
- def parseResource(name: String, charset: String = "UTF-8"): DocumentType
Parses a resource with a specified charset.
- name
the name of the resource to parse
- charset
the charset of the resource
- returns
a Document containing the parsed web page.
- Definition Classes
- Browser
- def parseString(html: String): JsoupDocument
Parses an HTML string.
- html
the HTML string to parse
- returns
a Document containing the parsed web page.
- Definition Classes
- JsoupBrowser → Browser
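`parseString` is convenient for trying out queries on literal HTML. The sketch below assumes the returned document exposes `title` and `root`, and that elements support `select` with a CSS query and a `text` accessor, as in the scala-scraper model. The markup is invented for the example.

```scala
import net.ruippeixotog.scalascraper.browser.JsoupBrowser

object ParseStringExample {
  val browser = new JsoupBrowser()

  val doc = browser.parseString(
    """<html><head><title>Inline</title></head>
      |<body><p class="msg">first</p><p class="msg">second</p></body></html>""".stripMargin)

  // Title taken from the <title> element.
  val title: String = doc.title

  // Text of every <p class="msg"> element, in document order.
  val messages: List[String] = doc.root.select("p.msg").map(_.text).toList
}
```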
- def post(url: String, form: Map[String, String]): JsoupDocument
Submits a form via a POST request and parses the resulting page.
- url
the URL of the page to retrieve
- form
a map containing the form fields to submit with their respective values
- returns
a Document containing the resulting web page.
- Definition Classes
- JsoupBrowser → Browser
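A sketch of submitting a form with `post`. The helper `login` and the field names `username` and `password` are hypothetical; a real form defines its own field names, and the target URL is supplied by the caller.

```scala
import scala.util.Try
import net.ruippeixotog.scalascraper.browser.JsoupBrowser

object SubmitForm {
  private val browser = new JsoupBrowser()

  // Hypothetical helper: posts two form fields and returns the title of the resulting page.
  // Failures (malformed URL, network or HTTP errors) are captured in the Try.
  def login(url: String, user: String, password: String): Try[String] =
    Try(browser.post(url, Map("username" -> user, "password" -> password)).title)
}
```

Because the same browser instance is reused, any cookies set by the response are stored in it and sent with later requests made through that instance.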
- def processResponse(res: Response): JsoupDocument
- Attributes
- protected[this]
- val proxy: java.net.Proxy
- def requestSettings(conn: Connection): Connection
- def setCookie(url: String, key: String, value: String): Map[String, String]
- def setCookies(url: String, m: Map[String, String]): Map[String, String]
- final def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
- def toString(): String
- Definition Classes
- AnyRef → Any
- val userAgent: String
The user agent used by this browser to retrieve HTML pages from the web.
- Definition Classes
- JsoupBrowser → Browser
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- def withProxy(proxy: Proxy): JsoupBrowser
Returns a new browser that uses the provided proxy for all connections.
- Definition Classes
- JsoupBrowser → Browser
Deprecated Value Members
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable]) @Deprecated
- Deprecated
(Since version 9)