org.apache.manifoldcf.crawler.connectors.webcrawler
Class RobotsManager.RobotsData

java.lang.Object
  extended by org.apache.manifoldcf.crawler.connectors.webcrawler.RobotsManager.RobotsData
Enclosing class:
RobotsManager

protected static class RobotsManager.RobotsData
extends java.lang.Object

A cached data item holding the records parsed from a host's robots.txt file, together with the item's expiration time.


Field Summary
protected  long expiration
           
protected  java.util.ArrayList records
           
 
Constructor Summary
RobotsManager.RobotsData(java.io.InputStream is, long expiration, java.lang.String hostName, org.apache.manifoldcf.crawler.interfaces.IVersionActivity activities)
          Constructs the data item from a robots.txt input stream.
 
Method Summary
 long getExpirationTime()
          Returns the expiration time of this cached item.
 boolean isFetchAllowed(java.lang.String userAgent, java.lang.String pathString)
          Checks whether the given user agent is allowed to fetch the given path.
protected  void parseRobotsTxt(java.io.BufferedReader r, java.lang.String hostName, org.apache.manifoldcf.crawler.interfaces.IVersionActivity activities)
          Parse the robots.txt file using a reader.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

expiration

protected long expiration

records

protected java.util.ArrayList records
Constructor Detail

RobotsManager.RobotsData

public RobotsManager.RobotsData(java.io.InputStream is,
                                long expiration,
                                java.lang.String hostName,
                                org.apache.manifoldcf.crawler.interfaces.IVersionActivity activities)
                         throws java.io.IOException,
                                org.apache.manifoldcf.core.interfaces.ManifoldCFException
Constructs the data item from a robots.txt input stream.

Throws:
java.io.IOException
org.apache.manifoldcf.core.interfaces.ManifoldCFException
Method Detail

isFetchAllowed

public boolean isFetchAllowed(java.lang.String userAgent,
                              java.lang.String pathString)
Checks whether the given user agent is allowed to fetch the given path.
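
The page does not document how the check is made, so the following is only a minimal sketch of one common robots.txt matching scheme, not the ManifoldCF implementation. The class FetchCheckSketch, its Record type, and the rules it applies (case-insensitive substring match on the agent name, fallback to the "*" record, longest matching path prefix wins, Allow wins ties) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch of a robots.txt fetch check; not ManifoldCF code. */
class FetchCheckSketch {
    /** One robots.txt record: the agents it names and its path rules. */
    static final class Record {
        final List<String> agents = new ArrayList<>();
        final List<String> disallows = new ArrayList<>();
        final List<String> allows = new ArrayList<>();
    }

    private final List<Record> records = new ArrayList<>();

    void add(Record r) {
        records.add(r);
    }

    /** Assumed rule: a record naming the agent beats the "*" record. */
    boolean isFetchAllowed(String userAgent, String pathString) {
        String ua = userAgent.toLowerCase(Locale.ROOT);
        Record specific = null;
        Record wildcard = null;
        for (Record r : records) {
            for (String agent : r.agents) {
                if (agent.equals("*")) {
                    if (wildcard == null) wildcard = r;
                } else if (!agent.isEmpty()
                        && ua.contains(agent.toLowerCase(Locale.ROOT))) {
                    if (specific == null) specific = r;
                }
            }
        }
        Record chosen = (specific != null) ? specific : wildcard;
        if (chosen == null) {
            return true; // no applicable record: nothing is forbidden
        }
        // Assumed rule: longest matching prefix wins; Allow wins ties.
        int bestAllow = longestPrefix(chosen.allows, pathString);
        int bestDisallow = longestPrefix(chosen.disallows, pathString);
        return bestAllow >= bestDisallow;
    }

    /** Length of the longest non-empty prefix matching path, or -1. */
    private static int longestPrefix(List<String> prefixes, String path) {
        int best = -1;
        for (String p : prefixes) {
            if (!p.isEmpty() && path.startsWith(p) && p.length() > best) {
                best = p.length();
            }
        }
        return best;
    }
}
```

Under these assumptions an empty Disallow value is ignored, which matches the usual reading that an empty Disallow permits everything.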


getExpirationTime

public long getExpirationTime()
Returns the expiration time of this cached item.
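
As a rough illustration of how a caller might use an expiration time on a cached item, the sketch below compares it with a caller-supplied clock value. CachedItemSketch and isExpired are hypothetical names, and the time unit (for example milliseconds since the epoch) is an assumption; this page does not state the unit.

```java
/** Hypothetical cached item with an expiration timestamp; not ManifoldCF code. */
class CachedItemSketch {
    private final long expiration;

    CachedItemSketch(long expiration) {
        this.expiration = expiration;
    }

    long getExpirationTime() {
        return expiration;
    }

    /** Assumed policy: the item is stale once 'now' reaches the expiration time. */
    static boolean isExpired(CachedItemSketch item, long now) {
        return now >= item.getExpirationTime();
    }
}
```

Passing the clock value in as a parameter, rather than reading it inside the method, keeps the check easy to test.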


parseRobotsTxt

protected void parseRobotsTxt(java.io.BufferedReader r,
                              java.lang.String hostName,
                              org.apache.manifoldcf.crawler.interfaces.IVersionActivity activities)
                       throws java.io.IOException,
                              org.apache.manifoldcf.core.interfaces.ManifoldCFException
Parses the robots.txt file using a reader. This method is not expected to close the reader; the caller remains responsible for closing the underlying stream.

Throws:
java.io.IOException
org.apache.manifoldcf.core.interfaces.ManifoldCFException
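
To illustrate the contract of reading from a BufferedReader without closing it, here is a minimal sketch of a line-oriented robots.txt reader. RobotsParseSketch and its parse method are hypothetical; the handling shown (strip "#" comments, split each line on the first colon, lowercase the field name) is an assumption about typical robots.txt parsing and is not taken from the ManifoldCF source.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical robots.txt line reader; not ManifoldCF code. */
class RobotsParseSketch {
    /**
     * Reads "field: value" lines until end of input.
     * Deliberately does not close the reader; the caller owns it.
     */
    static List<String[]> parse(BufferedReader r) throws IOException {
        List<String[]> directives = new ArrayList<>();
        String line;
        while ((line = r.readLine()) != null) {
            // Drop everything after a '#' comment marker.
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash);
            }
            // Lines without a colon (including blank lines) are skipped.
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue;
            }
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            if (!field.isEmpty()) {
                directives.add(new String[] { field, value });
            }
        }
        return directives;
    }
}
```

A caller would typically wrap the reader in a try-with-resources statement, so that the stream is closed by the code that opened it after parse returns.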