class HtmlUnitBrowser extends Browser
A Browser implementation based on HtmlUnit, a GUI-less browser for Java
programs. HtmlUnitBrowser thoroughly simulates a web browser: in addition to parsing and modelling the HTML content
of pages, it executes the JavaScript code they contain. It supports several compatibility modes, allowing it to
emulate browsers such as Internet Explorer.
Both the net.ruippeixotog.scalascraper.model.Document and the net.ruippeixotog.scalascraper.model.Element
instances obtained from HtmlUnitBrowser can be mutated in the background. JavaScript code can change the attributes
and content of elements at any time, and those changes are visible both in queries to Document and through
previously stored references to Elements. The Document instance always represents the current page in the browser's "window". This means
the Document's location value can change, together with its root element, in the event of client-side page
refreshes or redirections. However, Element instances belong to a fixed DOM tree and they stop being meaningful as
soon as they are removed from the DOM or a client-side page reload occurs.
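The live-document behaviour described above can be sketched as follows. This is a minimal illustration, not taken from the library's own documentation: the HTML snippet, the `msg` id and the object name `LiveDocumentSketch` are made up for the example, and it assumes that inline scripts in a string passed to `parseString` are executed under the default client settings.

```scala
import net.ruippeixotog.scalascraper.browser.HtmlUnitBrowser

object LiveDocumentSketch {
  // Hypothetical page: an inline script rewrites the paragraph once it has been parsed.
  val html: String =
    """<html><head><title>Demo</title></head>
      |<body>
      |<p id="msg">before</p>
      |<script>document.getElementById('msg').textContent = 'after';</script>
      |</body></html>""".stripMargin

  def main(args: Array[String]): Unit = {
    val browser = new HtmlUnitBrowser() // default browser type, no proxy
    try {
      val doc = browser.parseString(html)
      // Reference into the live DOM tree; its text reflects whatever scripts have done so far.
      val msg = doc.root.select("#msg").head
      println(s"title: ${doc.title}")
      println(s"text:  ${msg.text}")
    } finally browser.closeAll()
  }
}
```

If the script ran, the printed text is the value written by JavaScript rather than the one in the original markup, which is the main difference from a purely static parser.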
Instance Constructors
- new HtmlUnitBrowser(browserType: BrowserVersion = BrowserVersion.CHROME, proxy: Option[ProxyConfig] = None)
- browserType
the browser type and version to simulate
- proxy
an optional proxy configuration to use
Type Members
- type DocumentType = HtmlUnitDocument
The concrete type of documents created by this browser.
- Definition Classes
- HtmlUnitBrowser → Browser
Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##: Int
- Definition Classes
- AnyRef → Any
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def clearCookies(): Unit
Clears the cookie store of this browser.
- Definition Classes
- HtmlUnitBrowser → Browser
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @HotSpotIntrinsicCandidate() @native()
- def closeAll(): Unit
Closes all windows opened in this browser.
- def cookies(url: String): Map[String, String]
Returns the current set of cookies stored in this browser for a given URL.
- url
the URL whose stored cookies are to be returned
- returns
a mapping of cookie names to their respective values.
- Definition Classes
- HtmlUnitBrowser → Browser
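A small sketch of the two cookie accessors, `cookies` and `clearCookies`. The URL is a placeholder and nothing is fetched here, so the store is expected to be empty; after a real `get` or `post` it would hold whatever cookies the server set. The object and helper names are invented for the example.

```scala
import net.ruippeixotog.scalascraper.browser.HtmlUnitBrowser

object CookieSketch {
  // Returns the names of the cookies the browser currently holds for `url`, sorted.
  def cookieNames(browser: HtmlUnitBrowser, url: String): List[String] =
    browser.cookies(url).keys.toList.sorted

  def main(args: Array[String]): Unit = {
    val browser = new HtmlUnitBrowser()
    val url = "http://example.com/" // placeholder URL; no request is made in this sketch
    println(cookieNames(browser, url))
    browser.clearCookies() // clears the browser's whole cookie store
    println(browser.cookies(url).isEmpty)
    browser.closeAll()
  }
}
```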
- def defaultClientSettings(client: WebClient): Unit
- Attributes
- protected[this]
- def defaultRequestSettings(req: WebRequest): Unit
- Attributes
- protected[this]
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equals(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef → Any
- def exec(req: WebRequest): HtmlUnitDocument
- def get(url: String): HtmlUnitDocument
Retrieves and parses a web page using a GET request.
- url
the URL of the page to retrieve
- returns
a Document containing the retrieved web page.
- Definition Classes
- HtmlUnitBrowser → Browser
- final def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @HotSpotIntrinsicCandidate() @native()
- def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @HotSpotIntrinsicCandidate() @native()
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @HotSpotIntrinsicCandidate() @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @HotSpotIntrinsicCandidate() @native()
- def parseFile(file: File, charset: String): HtmlUnitDocument
Parses a local HTML file with a specified charset.
- file
the HTML file to parse
- charset
the charset of the file
- returns
a Document containing the parsed web page.
- Definition Classes
- HtmlUnitBrowser → Browser
- def parseFile(path: String): DocumentType
Parses a local HTML file encoded in UTF-8.
- path
the path in the local filesystem where the HTML file is located
- returns
a Document containing the parsed web page.
- Definition Classes
- Browser
- def parseFile(path: String, charset: String): DocumentType
Parses a local HTML file with a specified charset.
- path
the path in the local filesystem where the HTML file is located
- charset
the charset of the file
- returns
a Document containing the parsed web page.
- Definition Classes
- Browser
- def parseFile(file: File): DocumentType
Parses a local HTML file encoded in UTF-8.
- file
the HTML file to parse
- returns
a Document containing the parsed web page.
- Definition Classes
- Browser
- def parseInputStream(inputStream: InputStream, charset: String): HtmlUnitDocument
Parses an input stream with its content in a specified charset. The provided input stream is always closed before this method returns or throws an exception.
- inputStream
the input stream to parse
- charset
the charset of the input stream content
- returns
a Document containing the parsed web page.
- Definition Classes
- HtmlUnitBrowser → Browser
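The stream-closing guarantee stated for `parseInputStream` can be observed with a stream that records its own `close()` call. This is only a sketch: `TrackedStream`, the object name and the HTML snippet are hypothetical helpers introduced for the example.

```scala
import java.io.ByteArrayInputStream
import java.nio.charset.StandardCharsets
import net.ruippeixotog.scalascraper.browser.HtmlUnitBrowser

object StreamSketch {
  // In-memory stream that remembers whether close() was called on it.
  final class TrackedStream(bytes: Array[Byte]) extends ByteArrayInputStream(bytes) {
    @volatile var closed = false
    override def close(): Unit = { closed = true; super.close() }
  }

  val html: String =
    "<html><head><title>From stream</title></head><body><h1>Hi</h1></body></html>"

  def main(args: Array[String]): Unit = {
    val in = new TrackedStream(html.getBytes(StandardCharsets.UTF_8))
    val browser = new HtmlUnitBrowser()
    try {
      // The charset must match the encoding used to produce the bytes.
      val doc = browser.parseInputStream(in, "UTF-8")
      println(doc.title)
      println(s"stream closed by the browser: ${in.closed}")
    } finally browser.closeAll()
  }
}
```

Because the browser takes ownership of the stream, callers should not wrap the call in their own resource-management block expecting to read from the stream afterwards.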
- def parseResource(name: String, charset: String = "UTF-8"): DocumentType
Parses a resource with a specified charset.
- name
the name of the resource to parse
- charset
the charset of the resource
- returns
a Document containing the parsed web page.
- Definition Classes
- Browser
- def parseString(html: String): HtmlUnitDocument
Parses an HTML string.
- html
the HTML string to parse
- returns
a Document containing the parsed web page.
- Definition Classes
- HtmlUnitBrowser → Browser
- def post(url: String, form: Map[String, String]): HtmlUnitDocument
Submits a form via a POST request and parses the resulting page.
- url
the URL of the page to retrieve
- form
a map containing the form fields to submit with their respective values
- returns
a Document containing the resulting web page.
- Definition Classes
- HtmlUnitBrowser → Browser
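To try `post` without depending on an external site, one option is a throwaway local server that echoes the submitted form body back as HTML. The sketch below uses the JDK's built-in `com.sun.net.httpserver` for that; the server, the `echo` id, the form field and the object name are all assumptions made for the example and are not part of this library.

```scala
import java.net.InetSocketAddress
import com.sun.net.httpserver.{HttpExchange, HttpHandler, HttpServer}
import net.ruippeixotog.scalascraper.browser.HtmlUnitBrowser

object PostSketch {
  // Starts a local server on a free port that echoes the request body inside a <p> element.
  private def startEchoServer(): HttpServer = {
    val server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0)
    server.createContext("/", new HttpHandler {
      def handle(ex: HttpExchange): Unit = {
        val body = scala.io.Source.fromInputStream(ex.getRequestBody, "UTF-8").mkString
        val html =
          s"<html><head><title>echo</title></head><body><p id='echo'>$body</p></body></html>"
        val bytes = html.getBytes("UTF-8")
        ex.getResponseHeaders.add("Content-Type", "text/html; charset=UTF-8")
        ex.sendResponseHeaders(200, bytes.length.toLong)
        ex.getResponseBody.write(bytes)
        ex.close()
      }
    })
    server.start()
    server
  }

  // Posts a one-field form to the local server and returns the text the server echoed back.
  def echoedForm(): String = {
    val server = startEchoServer()
    val browser = new HtmlUnitBrowser()
    try {
      val url = s"http://127.0.0.1:${server.getAddress.getPort}/"
      val doc = browser.post(url, Map("user" -> "alice"))
      doc.root.select("#echo").head.text
    } finally {
      browser.closeAll()
      server.stop(0)
    }
  }

  def main(args: Array[String]): Unit = println(echoedForm())
}
```

The same pattern works for `get` by replacing the `post` call with `browser.get(url)`; in that case the echoed body is empty because a GET request carries no form data.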
- final def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
- def toString(): String
- Definition Classes
- AnyRef → Any
- lazy val underlying: WebClient
- def userAgent: String
The user agent used by this browser to retrieve HTML pages from the web.
- Definition Classes
- HtmlUnitBrowser → Browser
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- def withProxy(proxy: Proxy): HtmlUnitBrowser
Returns a new browser that uses the provided proxy for all connections.
- Definition Classes
- HtmlUnitBrowser → Browser
Deprecated Value Members
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable]) @Deprecated
- Deprecated
(Since version 9)