java.lang.Object
  com.sun.slamd.http.HTMLDocument

public class HTMLDocument
This class defines an HTML document that may be included as part of a response sent by a Web server. It provides methods for performing various operations on the document, including extracting any links or images that it may contain, or retrieving the text of the document.
| Constructor Summary |
|---|
| HTMLDocument(java.lang.String documentURL, java.lang.String htmlData) - Creates a new HTML document using the provided data. |

| Method Summary | |
|---|---|
| java.lang.String[] | getAssociatedFiles() - Retrieves an array containing a set of URLs parsed from the HTML document that reference files that would normally be downloaded as part of retrieving a page in a browser. |
| java.lang.String[] | getDocumentFrames() - Retrieves an array containing a set of URLs parsed from the HTML document that reference frames used in the document. |
| java.lang.String[] | getDocumentImages() - Retrieves an array containing a set of URLs parsed from the HTML document that reference images used in the document. |
| java.lang.String[] | getDocumentLinks() - Retrieves an array containing a set of URLs parsed from the HTML document that are in the form of links to other content. |
| java.lang.String | getDocumentURL() - Retrieves the URL of this HTML document. |
| java.lang.String | getHTMLData() - Retrieves the original HTML data used to create this document. |
| java.lang.String | getTextData() - Retrieves the contents of the HTML document with all tags removed. |
| boolean | parse() - Actually parses the HTML document and extracts useful elements from it. |
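
The summary above says that getDocumentLinks() returns URLs parsed from the document, but it does not describe how they are found. The following is a minimal sketch of one way such extraction could work, assuming a simple regular-expression scan and resolution of relative references against the document URL. It is not SLAMD's implementation; the class name `LinkExtractionSketch` and the method `extractLinks` are hypothetical and exist only for illustration.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: NOT the parser used by com.sun.slamd.http.HTMLDocument.
// All names in this class are hypothetical.
final class LinkExtractionSketch {
    // Matches href="..." or href='...' inside an <a ...> tag, ignoring case.
    private static final Pattern ANCHOR_HREF = Pattern.compile(
            "<a\\b[^>]*?\\bhref\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the absolute form of every anchor href found in htmlData.
    // URI.resolve(String) throws IllegalArgumentException if an href is not
    // a syntactically valid URI reference; a real parser would handle that.
    static String[] extractLinks(String documentURL, String htmlData) {
        URI base = URI.create(documentURL);
        List<String> links = new ArrayList<>();
        Matcher m = ANCHOR_HREF.matcher(htmlData);
        while (m.find()) {
            // Resolve relative references against the document's own URL.
            links.add(base.resolve(m.group(2).trim()).toString());
        }
        return links.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String html = "<html><body><a href=\"/docs/index.html\">Docs</a>"
                + "<A HREF='page2.html'>Next</A></body></html>";
        for (String link : extractLinks("http://example.com/site/page1.html", html)) {
            System.out.println(link);
        }
    }
}
```

A regular expression is enough for a sketch like this, but it will miss unquoted attribute values and can be confused by markup inside comments or scripts, so a real parser would likely need to be more careful.
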
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
|---|
public HTMLDocument(java.lang.String documentURL, java.lang.String htmlData) throws java.net.MalformedURLException

Creates a new HTML document using the provided data.

Parameters:
documentURL - The URL for this document.
htmlData - The actual data contained in the HTML document.

Throws:
java.net.MalformedURLException - If the provided URL is malformed.

| Method Detail |
|---|
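public boolean parse()
Actually parses the HTML document and extracts useful elements from it.
Returns: true if the page could be parsed successfully, or false if not.

public java.lang.String getDocumentURL()
Retrieves the URL of this HTML document.

public java.lang.String getHTMLData()
Retrieves the original HTML data used to create this document.

public java.lang.String getTextData()
Retrieves the contents of the HTML document with all tags removed.
Returns: The contents of the HTML document with all tags removed, or null if a problem occurs while trying to parse the HTML.

public java.lang.String[] getAssociatedFiles()
Retrieves an array containing a set of URLs parsed from the HTML document that reference files that would normally be downloaded as part of retrieving a page in a browser.
Returns: The array of associated file URLs, or null if a problem occurs while trying to parse the HTML.

public java.lang.String[] getDocumentLinks()
Retrieves an array containing a set of URLs parsed from the HTML document that are in the form of links to other content.
Returns: The array of link URLs, or null if a problem occurs while trying to parse the HTML.

public java.lang.String[] getDocumentImages()
Retrieves an array containing a set of URLs parsed from the HTML document that reference images used in the document.

public java.lang.String[] getDocumentFrames()
Retrieves an array containing a set of URLs parsed from the HTML document that reference frames used in the document.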
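
getTextData(), getAssociatedFiles(), and getDocumentLinks() are documented as returning null if a problem occurs while parsing the HTML, so calling code needs to guard against null before iterating or concatenating. The sketch below shows one way a caller might normalize such results. The class `NullSafeResults` and its methods are hypothetical helpers written for this illustration; they are not part of the documented API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical caller-side helpers; not part of com.sun.slamd.http.
final class NullSafeResults {
    // Wraps a possibly-null URL array in a list so callers can iterate
    // without a separate null check.
    static List<String> toList(String[] urls) {
        return urls == null ? Collections.<String>emptyList() : Arrays.asList(urls);
    }

    // Substitutes an empty string for a null text result.
    static String orEmpty(String text) {
        return text == null ? "" : text;
    }

    public static void main(String[] args) {
        // Stand-ins for values that a parse failure would leave as null.
        String[] failedLinks = null;
        String failedText = null;

        for (String link : toList(failedLinks)) {
            System.out.println(link); // never reached for a null result
        }
        System.out.println("text length: " + orEmpty(failedText).length());
    }
}
```

Whether an empty result is an acceptable substitute for null depends on the caller: code that needs to distinguish "no links" from "parsing failed" should check the return value of parse() or test for null directly instead.
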