java.lang.Object
  org.apache.manifoldcf.crawler.connectors.rss.RSSConnector.Filter
protected static class RSSConnector.Filter
Class that handles parsing and interpretation of the document specification. Scanning the document specification once and gathering all the data is believed to be faster than scanning it multiple times. Therefore, this class contains the *entire* interpreted set of data from a document specification.
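
The single-pass design described above could look roughly like the following sketch. It does not use the real `DocumentSpecification` API: the nested `Node` class, the node type names (`feed`, `access`, `metadata`, `feedtimeout`) and the default timeout are all hypothetical stand-ins chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch only: interprets a specification once and keeps everything it found. */
final class SinglePassFilterSketch {
    /** Hypothetical stand-in for one node of a document specification. */
    static final class Node {
        final String type;
        final String value;
        Node(String type, String value) {
            this.type = type;
            this.value = value;
        }
    }

    private final Map<String, String> seeds = new HashMap<>();
    private final Map<String, String> acls = new HashMap<>();
    private final List<String> metadata = new ArrayList<>();
    private int feedTimeoutValue = 60; // assumed default, not taken from the source

    /** Walks the node list exactly once, filling every field along the way. */
    SinglePassFilterSketch(List<Node> spec) {
        for (Node node : spec) {
            switch (node.type) {
                case "feed":
                    seeds.put(node.value, node.value);
                    break;
                case "access":
                    acls.put(node.value, node.value);
                    break;
                case "metadata":
                    metadata.add(node.value);
                    break;
                case "feedtimeout":
                    feedTimeoutValue = Integer.parseInt(node.value);
                    break;
                default:
                    break; // unknown node types are ignored in this sketch
            }
        }
    }

    // The getters only read the data gathered by the constructor.
    boolean isSeed(String url) { return seeds.containsKey(url); }
    String[] getAcls() { return acls.keySet().toArray(new String[0]); }
    List<String> getMetadata() { return metadata; }
    int getFeedTimeoutValue() { return feedTimeoutValue; }

    public static void main(String[] args) {
        SinglePassFilterSketch filter = new SinglePassFilterSketch(Arrays.asList(
            new Node("feed", "http://example.com/rss"),
            new Node("access", "group1"),
            new Node("feedtimeout", "30")));
        System.out.println(filter.isSeed("http://example.com/rss"));
        System.out.println(filter.getFeedTimeoutValue());
    }
}
```

Because all interpretation happens in the constructor, later lookups such as `isSeed` never touch the specification again.
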
| Field Summary | |
|---|---|
| protected java.util.HashMap | acls |
| protected java.lang.Integer | badFeedRescanInterval |
| protected RSSConnector.CanonicalizationPolicies | canonicalizationPolicies |
| protected int | chromedContentMode |
| protected int | dechromedContentMode |
| protected java.lang.Integer | defaultRescanInterval |
| protected int | feedTimeoutValue |
| protected RSSConnector.MappingRules | mappings |
| protected java.util.ArrayList | metadata |
| protected java.lang.Integer | minimumRescanInterval |
| protected java.util.HashMap | seeds |

| Constructor Summary | |
|---|---|
| RSSConnector.Filter(org.apache.manifoldcf.crawler.interfaces.DocumentSpecification spec, boolean warnOnBadSeed) | Constructor. |

| Method Summary | | |
|---|---|---|
| java.lang.String[] | getAcls() | Get the ACLs. |
| java.lang.Long | getBadFeedRescanTime(long currentTime) | Get the next time a "bad feed" should be rescanned. |
| RSSConnector.CanonicalizationPolicies | getCanonicalizationPolicies() | Get the canonicalization policies. |
| int | getChromedContentMode() | Get the chromed content mode. |
| int | getDechromedContentMode() | Get the dechromed content mode. |
| java.lang.Long | getDefaultRescanTime(long currentTime) | Get the next time (by default) a feed should be scanned. |
| int | getFeedTimeoutValue() | Get the feed timeout value. |
| java.util.ArrayList | getMetadata() | Get the specified metadata. |
| java.lang.Long | getMinimumRescanTime(long currentTime) | Get the minimum next time a feed should be scanned. |
| java.util.Iterator | getSeeds() | Iterate over all canonicalized seeds. |
| boolean | isLegalURL(java.lang.String url) | Check for legality of a URL. |
| boolean | isSeed(java.lang.String canonicalUrl) | Check if a document is a seed. |
| java.lang.String | mapDocumentURL(java.lang.String url) | Scan patterns and return the one that matches first. |
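
The three rescan-time getters take the current time and return a `java.lang.Long`, while the matching interval fields are nullable `java.lang.Integer` values. One plausible reading, sketched below, is that a missing interval yields `null` and a present one is added to `currentTime`. This is an assumption, not the documented behavior: the helper name `nextTime`, the `null` convention, and the treatment of the interval as milliseconds are all hypothetical.

```java
/** Sketch only: derives a "next scan" time from an optional interval. */
final class RescanTimeSketch {
    /**
     * Returns currentTime + interval, or null when no interval is configured.
     * Both values are assumed to use the same unit (milliseconds here).
     */
    static Long nextTime(Integer interval, long currentTime) {
        if (interval == null) {
            return null; // assumed meaning: no scheduled time
        }
        return Long.valueOf(currentTime + interval.longValue());
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        System.out.println(nextTime(Integer.valueOf(60000), now));
        System.out.println(nextTime(null, now));
    }
}
```

Widening the interval to `long` before the addition avoids `int` overflow when the interval is large.
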

| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |

| Field Detail |
|---|
| protected RSSConnector.MappingRules mappings |
| protected java.util.HashMap seeds |
| protected java.lang.Integer defaultRescanInterval |
| protected java.lang.Integer minimumRescanInterval |
| protected java.lang.Integer badFeedRescanInterval |
| protected int dechromedContentMode |
| protected int chromedContentMode |
| protected int feedTimeoutValue |
| protected java.util.ArrayList metadata |
| protected java.util.HashMap acls |
| protected RSSConnector.CanonicalizationPolicies canonicalizationPolicies |

| Constructor Detail |
|---|
| public RSSConnector.Filter(org.apache.manifoldcf.crawler.interfaces.DocumentSpecification spec, boolean warnOnBadSeed) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException |
| Throws: org.apache.manifoldcf.core.interfaces.ManifoldCFException |

| Method Detail |
|---|
| public boolean isSeed(java.lang.String canonicalUrl) |
| public java.util.Iterator getSeeds() |
| public java.util.ArrayList getMetadata() |
| public java.lang.String[] getAcls() |
| public int getFeedTimeoutValue() |
| public int getDechromedContentMode() |
| public int getChromedContentMode() |
| public java.lang.Long getDefaultRescanTime(long currentTime) |
| public java.lang.Long getMinimumRescanTime(long currentTime) |
| public java.lang.Long getBadFeedRescanTime(long currentTime) |
| public boolean isLegalURL(java.lang.String url) |
| public java.lang.String mapDocumentURL(java.lang.String url) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException |
| Throws: org.apache.manifoldcf.core.interfaces.ManifoldCFException |
| public RSSConnector.CanonicalizationPolicies getCanonicalizationPolicies() |
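
`mapDocumentURL` is described as scanning patterns and returning the first match. A first-match rule scan can be sketched as follows. This is not the `RSSConnector.MappingRules` implementation: the class, the `add` and `map` methods, the use of `java.util.regex`, and the choice to return `null` when nothing matches are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch only: ordered regex rules where the first full match wins. */
final class FirstMatchMappingSketch {
    private static final class Rule {
        final Pattern pattern;
        final String replacement;
        Rule(Pattern pattern, String replacement) {
            this.pattern = pattern;
            this.replacement = replacement;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    /** Appends a rule; rules are tried in the order they were added. */
    FirstMatchMappingSketch add(String regex, String replacement) {
        rules.add(new Rule(Pattern.compile(regex), replacement));
        return this;
    }

    /** Returns the mapped URL for the first rule matching the whole URL, else null. */
    String map(String url) {
        for (Rule rule : rules) {
            Matcher m = rule.pattern.matcher(url);
            if (m.matches()) {
                // The match spans the whole input, so the expanded replacement
                // (with $1, $2, ... substituted) is the complete result.
                StringBuffer out = new StringBuffer();
                m.appendReplacement(out, rule.replacement);
                return out.toString();
            }
        }
        return null; // assumed convention: no rule matched
    }

    public static void main(String[] args) {
        FirstMatchMappingSketch mappings = new FirstMatchMappingSketch()
            .add("http://old\\.example\\.com/(.*)", "http://new.example.com/$1");
        System.out.println(mappings.map("http://old.example.com/feed/item1"));
    }
}
```

Rule order matters in this sketch: a broad pattern placed first would shadow every more specific rule after it.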