| Interface | Description |
|---|---|
| HtmlParseFilter | Extension point for DOM-based HTML parsers. |
| Parse | The result of parsing a page's raw content. |
| Parser | A parser for content generated by a Protocol implementation. |

| Class | Description |
|---|---|
| HTMLMetaTags | Holds the information about HTML "meta" tags extracted from a page. |
| HtmlParseFilters | Creates and caches plugins that implement HtmlParseFilter. |
| MetaTagsParser | Parses HTML meta tags (keywords, description) and stores them in the parse metadata so that they can be indexed by the index-metadata plugin with the prefix 'metatag.'. |
| Outlink | |
| OutlinkExtractor | Extracts Outlinks / URLs from plain text using regular expressions. |
| ParseData | Data extracted from a page's content. |
| ParseImpl | The result of parsing a page's raw content. |
| ParseOutputFormat | |
| ParserChecker | Parser checker, useful for testing parsers. |
| ParseResult | A utility class that stores the result of a parse. |
| ParserFactory | Creates and caches Parser plugins. |
| ParseSegment | |
| ParseStatus | |
| ParseText | |
| ParseUtil | |
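
The OutlinkExtractor entry describes pulling URLs out of plain text with regular expressions. The following is a minimal, standalone sketch of that general technique, not the Nutch implementation: the class name `OutlinkSketch`, the method `extractUrls`, and the URL pattern are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; this is NOT the Nutch OutlinkExtractor API.
final class OutlinkSketch {

    // Assumed pattern: an http, https, or ftp scheme followed by any run of
    // characters that are not whitespace, angle brackets, or quotes.
    private static final Pattern URL_PATTERN =
            Pattern.compile("\\b(?:https?|ftp)://[^\\s<>\"']+");

    // Returns every URL-like substring found in the text, in order of appearance.
    static List<String> extractUrls(String plainText) {
        List<String> urls = new ArrayList<>();
        if (plainText == null) {
            return urls;
        }
        Matcher matcher = URL_PATTERN.matcher(plainText);
        while (matcher.find()) {
            // Drop sentence punctuation that happens to trail the URL.
            String url = matcher.group().replaceAll("[.,;:!?)]+$", "");
            urls.add(url);
        }
        return urls;
    }

    public static void main(String[] args) {
        String text = "See http://nutch.apache.org/ and https://example.com/a?b=1.";
        for (String url : extractUrls(text)) {
            System.out.println(url);
        }
    }
}
```

A pattern this simple trades precision for readability; a production extractor would typically also validate and normalize each candidate URL before treating it as an outlink.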

| Exception | Description |
|---|---|
| ParseException | |
| ParserNotFound | |

Copyright © 2014 The Apache Software Foundation