org.apache.manifoldcf.crawler.jobs
Class HopCount.DocumentHash

java.lang.Object
  extended by org.apache.manifoldcf.crawler.jobs.HopCount.DocumentHash
Enclosing class:
HopCount

protected class HopCount.DocumentHash
extends java.lang.Object

The DocumentHash structure holds the document nodes we are interested in, including those whose answers are needed before processing can proceed. The main interface consists of submitting a set of questions and receiving the answers. Multiple requests may be made against each object, and in-memory caching is used to keep database activity to a minimum. Callers are presumed to make these requests inside the appropriate transactions, since both read and write database activity may occur.
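
The caching behavior described above can be illustrated with a minimal sketch. It is not the ManifoldCF implementation: the class name CachedAnswerSketch, the string question keys, and the function standing in for the database are all assumptions made for illustration. It only shows the general idea of answering a batch of questions while consulting the backing store once per distinct question.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToIntFunction;

// Hypothetical sketch; none of these names come from ManifoldCF.
class CachedAnswerSketch {
    // In-memory cache of question key -> answer (similar in spirit to questionLookupMap).
    private final Map<String, Integer> cache = new HashMap<>();
    // Stand-in for a database lookup (an assumption for illustration).
    private final ToIntFunction<String> backingStore;
    // Counts how many times the backing store was actually consulted.
    int storeLookups = 0;

    CachedAnswerSketch(ToIntFunction<String> backingStore) {
        this.backingStore = backingStore;
    }

    // Answer a batch of questions, going to the store only on a cache miss.
    int[] ask(String[] questions) {
        int[] answers = new int[questions.length];
        for (int i = 0; i < questions.length; i++) {
            Integer cached = cache.get(questions[i]);
            if (cached == null) {
                storeLookups++;
                cached = backingStore.applyAsInt(questions[i]);
                cache.put(questions[i], cached);
            }
            answers[i] = cached;
        }
        return answers;
    }
}
```

Repeating a question in the same batch, or in a later call on the same object, is served from the map, which is why reusing one object across requests reduces store traffic.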


Field Summary
protected  HopCount.NodeQueue childFetchQueue
          This is the queue for nodes that still need to be initialized, that is, nodes whose children have yet to be fetched.
protected  HopCount.NodeQueue evaluationQueue
          This is the queue for evaluating nodes.
protected  int hopcountMethod
          The hopcount method
protected  java.lang.Long jobID
          The job identifier
protected  java.lang.String[] legalLinkTypes
          These are the legal link types for the job
protected  java.util.Map questionLookupMap
          This is the map of known questions to DocumentNode objects.
 
Constructor Summary
HopCount.DocumentHash(java.lang.Long jobID, java.lang.String[] legalLinkTypes, int hopcountMethod)
          Constructor
 
Method Summary
 int[] askQuestions(HopCount.Question[] questions)
          Throw in some questions, and prepare for the answers.
protected  void evaluateNode(HopCount.DocumentNode node)
          Evaluate a node from the evaluation queue.
protected  void findChildren(java.util.Map referenceMap, java.lang.String query, java.util.ArrayList list)
          Get the children of a bunch of nodes.
protected  void getNodeChildren(HopCount.DocumentNode[] nodes)
          Fetch the children of a bunch of nodes, and initialize all of the nodes appropriately.
protected  void makeNodeComplete(HopCount.DocumentNode node)
          Make a node complete.
protected  void notifyParents(HopCount.DocumentNode node)
          Notify parents of a node's change of state.
protected  void queueParents(HopCount.DocumentNode node)
          Queue the parents on the evaluation queue.
protected  HopCount.DocumentNode[] queueQuestions(HopCount.Question[] questions)
          Queue up a set of questions.
protected  void removeChildLinks(HopCount.DocumentNode dn)
          Remove remaining links to children.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

jobID

protected java.lang.Long jobID
The job identifier


questionLookupMap

protected java.util.Map questionLookupMap
This is the map of known questions to DocumentNode objects.


childFetchQueue

protected HopCount.NodeQueue childFetchQueue
This is the queue for nodes that still need to be initialized, that is, nodes whose children have yet to be fetched.


evaluationQueue

protected HopCount.NodeQueue evaluationQueue
This is the queue for evaluating nodes. For all of these nodes, the processing has begun: all child nodes have been queued, and at least a partial answer is present. Evaluating one of these nodes involves potentially updating the node's answer, and when that is done, all listed parents will be requeued on this queue.
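
The requeue-on-change pattern described above resembles a standard worklist algorithm. The following is a minimal sketch under stated assumptions, not the actual HopCount code: the Node type, the UNKNOWN sentinel, and the rule "answer = smallest child answer + 1" are hypothetical choices made so the example is runnable.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical worklist sketch; the names and the answer rule are assumptions.
class EvaluationQueueSketch {
    static final int UNKNOWN = Integer.MAX_VALUE;

    static class Node {
        final String name;
        final List<Node> children = new ArrayList<>(); // nodes this answer depends on
        final List<Node> parents = new ArrayList<>();  // nodes that depend on this answer
        int answer = UNKNOWN;                          // partial answer so far
        boolean queued = false;                        // avoids duplicate queue entries
        Node(String name) { this.name = name; }
    }

    // Record that parent's answer depends on child's answer.
    static void link(Node parent, Node child) {
        parent.children.add(child);
        child.parents.add(parent);
    }

    private final Deque<Node> queue = new ArrayDeque<>();

    void enqueue(Node n) {
        if (!n.queued) {
            n.queued = true;
            queue.addLast(n);
        }
    }

    // Evaluate until the queue drains; returns the number of evaluations performed.
    int run() {
        int evaluations = 0;
        while (!queue.isEmpty()) {
            Node n = queue.removeFirst();
            n.queued = false;
            evaluations++;
            int best = n.answer;
            for (Node c : n.children) {
                if (c.answer != UNKNOWN && c.answer + 1 < best) {
                    best = c.answer + 1;
                }
            }
            if (best != n.answer) {
                // The answer improved, so every listed parent must be re-evaluated.
                n.answer = best;
                for (Node p : n.parents) {
                    enqueue(p);
                }
            }
        }
        return evaluations;
    }
}
```

Because a node is requeued only when one of its children actually changes, and in this sketch answers can only decrease, the loop terminates once no answer can improve further.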


legalLinkTypes

protected java.lang.String[] legalLinkTypes
These are the legal link types for the job


hopcountMethod

protected int hopcountMethod
The hopcount method

Constructor Detail

HopCount.DocumentHash

public HopCount.DocumentHash(java.lang.Long jobID,
                             java.lang.String[] legalLinkTypes,
                             int hopcountMethod)
Constructor

Method Detail

askQuestions

public int[] askQuestions(HopCount.Question[] questions)
                   throws ManifoldCFException
Throw in some questions, and prepare for the answers.

Throws:
ManifoldCFException

evaluateNode

protected void evaluateNode(HopCount.DocumentNode node)
                     throws ManifoldCFException
Evaluate a node from the evaluation queue.

Throws:
ManifoldCFException

getNodeChildren

protected void getNodeChildren(HopCount.DocumentNode[] nodes)
                        throws ManifoldCFException
Fetch the children of a bunch of nodes, and initialize all of the nodes appropriately.

Throws:
ManifoldCFException

findChildren

protected void findChildren(java.util.Map referenceMap,
                            java.lang.String query,
                            java.util.ArrayList list)
                     throws ManifoldCFException
Get the children of a bunch of nodes.

Throws:
ManifoldCFException

queueParents

protected void queueParents(HopCount.DocumentNode node)
Queue the parents on the evaluation queue.


makeNodeComplete

protected void makeNodeComplete(HopCount.DocumentNode node)
                         throws ManifoldCFException
Make a node complete. This involves writing the node's data to the database, if appropriate.

Throws:
ManifoldCFException

queueQuestions

protected HopCount.DocumentNode[] queueQuestions(HopCount.Question[] questions)
                                          throws ManifoldCFException
Queue up a set of questions. If a question has already been completed, nothing is done and its node is returned. If a question is already queued, its node may be modified if the new question is more specific than what was already there. In any case, if the answer isn't ready, null is returned.

Parameters:
questions - the set of questions to queue.
Throws:
ManifoldCFException
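
One possible reading of the queueQuestions contract is sketched below. This is not the ManifoldCF source: the Question and Node types here are simplified stand-ins, the integer detail field is a hypothetical way of modeling "more specific", and returning null per array entry for unready answers is an assumption about how the description maps onto the array return type.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; these types are stand-ins, not the ManifoldCF classes.
class QueueQuestionsSketch {
    static class Question {
        final String key;
        final int detail; // assumed measure of specificity; higher is more specific
        Question(String key, int detail) { this.key = key; this.detail = detail; }
    }

    static class Node {
        final String key;
        int detail;
        boolean complete = false;
        Node(String key, int detail) { this.key = key; this.detail = detail; }
    }

    final Map<String, Node> lookup = new HashMap<>();  // known questions -> nodes
    final Deque<Node> fetchQueue = new ArrayDeque<>(); // nodes awaiting initialization

    Node[] queueQuestions(Question[] questions) {
        Node[] result = new Node[questions.length];
        for (int i = 0; i < questions.length; i++) {
            Question q = questions[i];
            Node n = lookup.get(q.key);
            if (n == null) {
                // Unknown question: create a node and queue it for initialization.
                n = new Node(q.key, q.detail);
                lookup.put(q.key, n);
                fetchQueue.addLast(n);
            } else if (!n.complete && q.detail > n.detail) {
                // Already queued: tighten the node if the new question is more specific.
                n.detail = q.detail;
            }
            // Completed nodes are returned untouched; unready answers yield null.
            result[i] = n.complete ? n : null;
        }
        return result;
    }
}
```

A caller would treat each null entry as "answer pending" and ask again after the queues have been processed.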

notifyParents

protected void notifyParents(HopCount.DocumentNode node)
Notify parents of a node's change of state.


removeChildLinks

protected void removeChildLinks(HopCount.DocumentNode dn)
Remove remaining links to children.