org.apache.manifoldcf.crawler.connectors.webcrawler
Class WebcrawlerConfig

java.lang.Object
  extended by org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConfig

public class WebcrawlerConfig
extends java.lang.Object

Constants for the Webcrawler connector configuration.


Field Summary
static java.lang.String _rcsid
           
static java.lang.String ATTR_BINREGEXP
          The bin regular expression
static java.lang.String ATTR_DOMAIN
          Domain/realm part of credentials (if any)
static java.lang.String ATTR_INSENSITIVE
          Whether the match is case insensitive
static java.lang.String ATTR_MATCHREGEXP
          Form name or link target regexp for authentication page
static java.lang.String ATTR_NAMEREGEXP
          Authentication parameter name regexp
static java.lang.String ATTR_PASSWORD
          Password part of credentials
static java.lang.String ATTR_TRUSTEVERYTHING
          "Trust everything" attribute - replacing truststore if set to 'true'
static java.lang.String ATTR_TRUSTSTORE
          Trust store section of authentication record
static java.lang.String ATTR_TYPE
          Type of security
static java.lang.String ATTR_URLREGEXP
          Regexp for access control node
static java.lang.String ATTR_USERNAME
          Username part of credentials
static java.lang.String ATTR_VALUE
          The value attribute (used for maxconnections and maxkbpersecond)
static java.lang.String ATTRVALUE_BASIC
          Type value for basic authentication
static java.lang.String ATTRVALUE_FORM
          Authentication page type: Form
static java.lang.String ATTRVALUE_LINK
          Authentication page type: Link
static java.lang.String ATTRVALUE_NTLM
          Type value for NTLM authentication
static java.lang.String ATTRVALUE_REDIRECTION
          Authentication page type: Redirection
static java.lang.String ATTRVALUE_SESSION
          Type value for session-based authentication
static java.lang.String NODE_ACCESSCREDENTIAL
          Access control description node
static java.lang.String NODE_AUTHPAGE
          Authentication page description node
static java.lang.String NODE_AUTHPARAMETER
          Authentication parameter node
static java.lang.String NODE_BINDESC
          The bin description node
static java.lang.String NODE_EXCLUDES
          Exclude regexps node.
static java.lang.String NODE_INCLUDES
          Include regexps node.
static java.lang.String NODE_LIMITTOSEEDS
          Limit to seeds.
static java.lang.String NODE_MAXCONNECTIONS
          The max connections node
static java.lang.String NODE_MAXFETCHESPERMINUTE
          The max fetch rate node
static java.lang.String NODE_MAXKBPERSECOND
          The bandwidth node
static java.lang.String NODE_SEEDS
          The seeds node.
static java.lang.String NODE_TRUST
          Trust store description node
static java.lang.String PARAMETER_EMAIL
          Email (a parameter)
static java.lang.String PARAMETER_ROBOTSUSAGE
          Robots usage (a parameter)
 
Constructor Summary
WebcrawlerConfig()
           
 
Method Summary
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

_rcsid

public static final java.lang.String _rcsid
See Also:
Constant Field Values

PARAMETER_ROBOTSUSAGE

public static final java.lang.String PARAMETER_ROBOTSUSAGE
Robots usage (a parameter)

See Also:
Constant Field Values

PARAMETER_EMAIL

public static final java.lang.String PARAMETER_EMAIL
Email (a parameter)

See Also:
Constant Field Values

NODE_BINDESC

public static final java.lang.String NODE_BINDESC
The bin description node

See Also:
Constant Field Values

ATTR_BINREGEXP

public static final java.lang.String ATTR_BINREGEXP
The bin regular expression

See Also:
Constant Field Values

ATTR_INSENSITIVE

public static final java.lang.String ATTR_INSENSITIVE
Whether the match is case insensitive

See Also:
Constant Field Values
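
As an illustration of how a bin regular expression and its case-insensitivity flag might be combined, here is a minimal sketch using only java.util.regex. The class BinRegexpSketch and its methods are hypothetical helpers, not part of ManifoldCF. Matching with find() rather than matches() is an assumption; the connector's own matching rules may differ.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of ManifoldCF.
class BinRegexpSketch {

    // Compiles the bin regexp, adding CASE_INSENSITIVE when the flag is set.
    static Pattern compile(String binRegexp, boolean insensitive) {
        int flags = insensitive ? Pattern.CASE_INSENSITIVE : 0;
        return Pattern.compile(binRegexp, flags);
    }

    // Assumption: a bin matches when the regexp is found anywhere in its name.
    static boolean matchesBin(Pattern pattern, String binName) {
        return pattern.matcher(binName).find();
    }

    public static void main(String[] args) {
        Pattern loose = compile("\\.example\\.com$", true);
        Pattern strict = compile("\\.example\\.com$", false);
        System.out.println(matchesBin(loose, "WWW.EXAMPLE.COM"));
        System.out.println(matchesBin(strict, "WWW.EXAMPLE.COM"));
    }
}
```

Passing the flag at compile time keeps the case handling out of the regular expression text itself, so the same expression can be reused with either setting.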

NODE_MAXCONNECTIONS

public static final java.lang.String NODE_MAXCONNECTIONS
The max connections node

See Also:
Constant Field Values

NODE_MAXKBPERSECOND

public static final java.lang.String NODE_MAXKBPERSECOND
The bandwidth node

See Also:
Constant Field Values

NODE_MAXFETCHESPERMINUTE

public static final java.lang.String NODE_MAXFETCHESPERMINUTE
The max fetch rate node

See Also:
Constant Field Values

ATTR_VALUE

public static final java.lang.String ATTR_VALUE
The value attribute (used for maxconnections and maxkbpersecond)

See Also:
Constant Field Values

NODE_ACCESSCREDENTIAL

public static final java.lang.String NODE_ACCESSCREDENTIAL
Access control description node

See Also:
Constant Field Values

ATTR_URLREGEXP

public static final java.lang.String ATTR_URLREGEXP
Regexp for access control node

See Also:
Constant Field Values

ATTR_TYPE

public static final java.lang.String ATTR_TYPE
Type of security

See Also:
Constant Field Values

ATTRVALUE_BASIC

public static final java.lang.String ATTRVALUE_BASIC
Type value for basic authentication

See Also:
Constant Field Values

ATTRVALUE_NTLM

public static final java.lang.String ATTRVALUE_NTLM
Type value for NTLM authentication

See Also:
Constant Field Values

ATTRVALUE_SESSION

public static final java.lang.String ATTRVALUE_SESSION
Type value for session-based authentication

See Also:
Constant Field Values

ATTR_DOMAIN

public static final java.lang.String ATTR_DOMAIN
Domain/realm part of credentials (if any)

See Also:
Constant Field Values

ATTR_USERNAME

public static final java.lang.String ATTR_USERNAME
Username part of credentials

See Also:
Constant Field Values

ATTR_PASSWORD

public static final java.lang.String ATTR_PASSWORD
Password part of credentials

See Also:
Constant Field Values

NODE_AUTHPAGE

public static final java.lang.String NODE_AUTHPAGE
Authentication page description node

See Also:
Constant Field Values

ATTRVALUE_FORM

public static final java.lang.String ATTRVALUE_FORM
Authentication page type: Form

See Also:
Constant Field Values

ATTRVALUE_LINK

public static final java.lang.String ATTRVALUE_LINK
Authentication page type: Link

See Also:
Constant Field Values

ATTRVALUE_REDIRECTION

public static final java.lang.String ATTRVALUE_REDIRECTION
Authentication page type: Redirection

See Also:
Constant Field Values

ATTR_MATCHREGEXP

public static final java.lang.String ATTR_MATCHREGEXP
Form name or link target regexp for authentication page

See Also:
Constant Field Values

NODE_AUTHPARAMETER

public static final java.lang.String NODE_AUTHPARAMETER
Authentication parameter node

See Also:
Constant Field Values

ATTR_NAMEREGEXP

public static final java.lang.String ATTR_NAMEREGEXP
Authentication parameter name regexp

See Also:
Constant Field Values

NODE_TRUST

public static final java.lang.String NODE_TRUST
Trust store description node

See Also:
Constant Field Values

ATTR_TRUSTSTORE

public static final java.lang.String ATTR_TRUSTSTORE
Trust store section of authentication record

See Also:
Constant Field Values

ATTR_TRUSTEVERYTHING

public static final java.lang.String ATTR_TRUSTEVERYTHING
"Trust everything" attribute - replacing truststore if set to 'true'

See Also:
Constant Field Values

NODE_SEEDS

public static final java.lang.String NODE_SEEDS
The seeds node. The value of this node holds the seeds, stored as the contents of one large text area.

See Also:
Constant Field Values

NODE_INCLUDES

public static final java.lang.String NODE_INCLUDES
Include regexps node. The value of this node contains the regexps that must match the canonical URL in order for that URL to be included. These regexps are newline separated, and # starts a comment.

See Also:
Constant Field Values

NODE_EXCLUDES

public static final java.lang.String NODE_EXCLUDES
Exclude regexps node. The value of this node contains the exclusion regexps; if any one of them matches, the URL is excluded. These regexps are newline separated, and # starts a comment.

See Also:
Constant Field Values
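
The include and exclude node values described above share one format: newline-separated regexps with # comments. The following sketch shows one way such a value could be parsed and applied. RegexpListSketch is a hypothetical helper, not part of ManifoldCF. It rests on three assumptions: a comment is a whole line whose first non-blank character is #, a URL is included when any one include regexp is found in it, and matching uses find().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of ManifoldCF.
class RegexpListSketch {

    // Splits a node value into lines, skips blank lines and lines that
    // start with '#', and compiles every remaining line as a regexp.
    static List<Pattern> parse(String nodeValue) {
        List<Pattern> patterns = new ArrayList<>();
        for (String line : nodeValue.split("\\r?\\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            patterns.add(Pattern.compile(trimmed));
        }
        return patterns;
    }

    // True when at least one pattern is found somewhere in the URL.
    static boolean matchesAny(List<Pattern> patterns, String url) {
        for (Pattern pattern : patterns) {
            if (pattern.matcher(url).find()) {
                return true;
            }
        }
        return false;
    }

    // Assumption: a URL is allowed when some include regexp matches
    // and no exclude regexp matches.
    static boolean isAllowed(String includes, String excludes, String url) {
        return matchesAny(parse(includes), url)
            && !matchesAny(parse(excludes), url);
    }
}
```

For example, with an include value of `^https?://www\.example\.com/` and an exclude value of `\.pdf$`, this sketch would allow `http://www.example.com/index.html` and reject `http://www.example.com/a.pdf`.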

NODE_LIMITTOSEEDS

public static final java.lang.String NODE_LIMITTOSEEDS
Limit to seeds. When the value attribute is true, only seed domains are permitted.

See Also:
Constant Field Values
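
To make the "limit to seeds" behavior concrete, here is a rough sketch of restricting URLs to the hosts that appear in the seeds. SeedLimitSketch is a hypothetical helper, not part of ManifoldCF. It assumes the seeds text holds one URL per line and that "seed domain" means the host name of a seed URL, compared case-insensitively. The connector's actual rules may differ.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration only; not part of ManifoldCF.
class SeedLimitSketch {

    // Returns the lowercased host of a URL, or null if it has none
    // or cannot be parsed.
    static String hostOf(String url) {
        try {
            String host = new URI(url).getHost();
            return host == null ? null : host.toLowerCase(Locale.ROOT);
        } catch (URISyntaxException e) {
            return null;
        }
    }

    // Assumption: the seeds text contains one seed URL per line.
    static Set<String> seedHosts(String seedsText) {
        Set<String> hosts = new HashSet<>();
        for (String line : seedsText.split("\\r?\\n")) {
            String host = hostOf(line.trim());
            if (host != null) {
                hosts.add(host);
            }
        }
        return hosts;
    }

    // With the limit off, every URL passes; with it on, only URLs
    // whose host is one of the seed hosts pass.
    static boolean isPermitted(Set<String> hosts, boolean limitToSeeds, String url) {
        if (!limitToSeeds) {
            return true;
        }
        String host = hostOf(url);
        return host != null && hosts.contains(host);
    }
}
```

Collecting the seed hosts once into a set keeps the per-URL check to a single lookup.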

Constructor Detail

WebcrawlerConfig

public WebcrawlerConfig()