org.apache.manifoldcf.crawler.jobs
Class IntrinsicLink

java.lang.Object
  extended by org.apache.manifoldcf.core.database.BaseTable
      extended by org.apache.manifoldcf.crawler.jobs.IntrinsicLink

public class IntrinsicLink
extends BaseTable

This class manages the table that keeps track of intrinsic relationships between documents.


Nested Class Summary
protected static class IntrinsicLink.DuplicateFinder
           
 
Field Summary
static java.lang.String _rcsid
           
static java.lang.String childIDHashField
           
static java.lang.String jobIDField
           
protected static int LINKSTATUS_BASE
          The standard value for this field.
protected static int LINKSTATUS_EXISTING
          This value means that the link existed before, and has been found during this scan.
protected static int LINKSTATUS_NEW
          This value means that the link is brand-new; it did not exist before this pass.
protected static java.util.Map linkstatusMap
           
static java.lang.String linkTypeField
           
static java.lang.String newField
           
static java.lang.String parentIDHashField
           
 
Fields inherited from class org.apache.manifoldcf.core.database.BaseTable
dbInterface, tableName
 
Constructor Summary
IntrinsicLink(IDBInterface database)
          Constructor.
 
Method Summary
 void analyzeTables()
          Analyze job tables that need analysis.
 void deinstall()
          Uninstall.
 void deleteOwner(java.lang.Long jobID)
          Delete an owner (and clean up the corresponding hopcount rows).
 IResultSet getDocumentChildren(java.lang.Long jobID, java.lang.String parentIDHash)
          Get document's children.
 java.lang.String[] getDocumentUniqueParents(java.lang.Long jobID, java.lang.String childIDHash)
          Get document's parents.
 void install(java.lang.String jobsTable, java.lang.String jobsColumn)
          Install or upgrade.
protected  void performExistsCheck(java.util.Map presentMap, java.lang.String query, java.util.ArrayList list)
          Do the exists check, in batch.
protected  void performRemoveLinks(java.lang.String query, java.util.ArrayList list, java.lang.String commonNewExpression, java.util.ArrayList commonNewParams)
           
protected  void performRestoreLinks(java.lang.String query, java.util.ArrayList list)
           
 java.lang.String[] recordReferences(java.lang.Long jobID, java.lang.String sourceDocumentIDHash, java.lang.String[] targetDocumentIDHashes, java.lang.String linkType)
          Record references from source to targets.
 void removeLinks(java.lang.Long jobID, java.lang.String commonNewExpression, java.util.ArrayList commonNewParams, java.lang.String[] sourceDocumentIDHashes, java.lang.String sourceTableName, java.lang.String sourceTableIDColumn, java.lang.String sourceTableJobColumn, java.lang.String sourceTableCriteria, java.util.ArrayList sourceTableParams)
          Remove all target links of the specified source documents that are not marked as "new" or "existing", and return the others to their base state.
 void reset()
          Reset, at startup time.
 void restoreLinks(java.lang.Long jobID, java.lang.String[] sourceDocumentIDHashes)
          Return all target links of the specified source documents to their base state.
static java.lang.String statusToString(int status)
          Convert link status to string.
static int stringToStatus(java.lang.String status)
          Convert string to link status.
 
Methods inherited from class org.apache.manifoldcf.core.database.BaseTable
addTableIndex, analyzeTable, beginTransaction, constructDistinctOnClause, constructOffsetLimitClause, constructRegexpClause, constructSubstringClause, endTransaction, getDatabaseCacheKey, getDBInterface, getMaxInClause, getMaxOrClause, getTableIndexes, getTableName, getTableSchema, getTransactionID, makeTableKey, noteModifications, performAddIndex, performAlter, performCreate, performDelete, performDrop, performInsert, performLock, performModification, performQuery, performQuery, performRemoveIndex, performUpdate, prepareRowForSave, readRow, reindexTable, signalRollback
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

_rcsid

public static final java.lang.String _rcsid
See Also:
Constant Field Values

LINKSTATUS_BASE

protected static final int LINKSTATUS_BASE
The standard value for this field. It means that the link existed prior to this scan and has not yet been found again during it.

See Also:
Constant Field Values

LINKSTATUS_NEW

protected static final int LINKSTATUS_NEW
This value means that the link is brand-new; it did not exist before this pass.

See Also:
Constant Field Values

LINKSTATUS_EXISTING

protected static final int LINKSTATUS_EXISTING
This value means that the link existed before, and has been found during this scan.

See Also:
Constant Field Values
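
The three status values describe a small per-link lifecycle over one scan. The sketch below models that lifecycle as a plain Java enum. The enum and its method names are hypothetical and are inferred only from the field descriptions above and the removeLinks description; the class itself uses int constants, and the rule that a "new" link stays "new" when seen again is an assumption.

```java
import java.util.Optional;

// Hypothetical model of the link status lifecycle; IntrinsicLink itself
// uses the int constants LINKSTATUS_BASE, LINKSTATUS_NEW, LINKSTATUS_EXISTING.
enum LinkStatusSketch {
    BASE, NEW, EXISTING;

    // A link seen during the current scan: a BASE link becomes EXISTING.
    // Assumption: NEW and EXISTING links keep their status when seen again.
    LinkStatusSketch onSeenDuringScan() {
        return this == BASE ? EXISTING : this;
    }

    // End of scan, following the removeLinks description: a link still in
    // BASE was never seen and is dropped (empty result); the others go back to BASE.
    Optional<LinkStatusSketch> onScanFinished() {
        return this == BASE ? Optional.empty() : Optional.of(BASE);
    }
}
```

Under this model, a link that was never recorded during the scan is the only kind that disappears when the scan finishes.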

jobIDField

public static final java.lang.String jobIDField
See Also:
Constant Field Values

linkTypeField

public static final java.lang.String linkTypeField
See Also:
Constant Field Values

parentIDHashField

public static final java.lang.String parentIDHashField
See Also:
Constant Field Values

childIDHashField

public static final java.lang.String childIDHashField
See Also:
Constant Field Values

newField

public static final java.lang.String newField
See Also:
Constant Field Values

linkstatusMap

protected static java.util.Map linkstatusMap

Constructor Detail

IntrinsicLink

public IntrinsicLink(IDBInterface database)
              throws ManifoldCFException
Constructor.

Parameters:
database - is the database handle.
Throws:
ManifoldCFException

Method Detail

install

public void install(java.lang.String jobsTable,
                    java.lang.String jobsColumn)
             throws ManifoldCFException
Install or upgrade.

Throws:
ManifoldCFException

deinstall

public void deinstall()
               throws ManifoldCFException
Uninstall.

Throws:
ManifoldCFException

analyzeTables

public void analyzeTables()
                   throws ManifoldCFException
Analyze job tables that need analysis.

Throws:
ManifoldCFException

deleteOwner

public void deleteOwner(java.lang.Long jobID)
                 throws ManifoldCFException
Delete an owner (and clean up the corresponding hopcount rows).

Throws:
ManifoldCFException

reset

public void reset()
           throws ManifoldCFException
Reset, at startup time. Links can only be added in a transactionally safe way by processing documents, and cached hopcount records are updated only when requested. It is therefore safest to move any "new" or "existing" links back to the base state on startup. The next time the document is processed, its links will be updated properly.

Throws:
ManifoldCFException
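
The effect described for reset can be pictured with an in-memory stand-in for the table. This is only a sketch: the class name, the map layout, and the numeric status values are hypothetical, and the real method works on database rows rather than a Java map.

```java
import java.util.Map;

// Hypothetical in-memory illustration of the startup reset.
class ResetSketch {
    static final int BASE = 0, NEW = 1, EXISTING = 2; // illustrative values only

    // Moves every link back to BASE, for all jobs and sources.
    // No row is deleted. Returns the number of rows that changed.
    static int reset(Map<String, Integer> rows) {
        int changed = 0;
        for (Map.Entry<String, Integer> row : rows.entrySet()) {
            if (row.getValue() != BASE) {
                row.setValue(BASE);
                changed++;
            }
        }
        return changed;
    }
}
```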

recordReferences

public java.lang.String[] recordReferences(java.lang.Long jobID,
                                           java.lang.String sourceDocumentIDHash,
                                           java.lang.String[] targetDocumentIDHashes,
                                           java.lang.String linkType)
                                    throws ManifoldCFException
Record references from source to targets. These references will be marked as either "new" or "existing".

Returns:
the target document IDs that are considered "new".
Throws:
ManifoldCFException
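
A minimal sketch of the marking behavior described above, using a map in place of the database table. The class, the key format, and the status values are hypothetical. The job ID and link type are left out for brevity, and leaving rows that are already "new" or "existing" unchanged is an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the link table (one job, one link type).
class RecordReferencesSketch {
    static final int BASE = 0, NEW = 1, EXISTING = 2; // illustrative values only

    // Key: "sourceHash->targetHash"; value: one of the status codes above.
    final Map<String, Integer> rows = new HashMap<>();

    // Marks each source->target link as NEW (no row yet) or EXISTING
    // (row was in BASE) and returns the targets that were added as NEW.
    String[] recordReferences(String sourceHash, String[] targetHashes) {
        List<String> newTargets = new ArrayList<>();
        for (String target : targetHashes) {
            String key = sourceHash + "->" + target;
            Integer status = rows.get(key);
            if (status == null) {
                rows.put(key, NEW);
                newTargets.add(target);
            } else if (status == BASE) {
                rows.put(key, EXISTING);
            }
            // Rows already NEW or EXISTING were handled earlier in this pass.
        }
        return newTargets.toArray(new String[0]);
    }
}
```

In this model, calling the method twice with the same targets returns the new targets only the first time.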

performExistsCheck

protected void performExistsCheck(java.util.Map presentMap,
                                  java.lang.String query,
                                  java.util.ArrayList list)
                           throws ManifoldCFException
Do the exists check, in batch.

Throws:
ManifoldCFException
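
"In batch" suggests that the caller collects a bounded number of keys per query rather than issuing one query per link. The helper below sketches that partitioning step only. The class and method names are hypothetical, and the assumption that the bound comes from something like the inherited getMaxInClause or getMaxOrClause is not confirmed by this page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that splits a key list into bounded batches.
class BatchSketch {
    // Returns consecutive batches of at most maxPerBatch elements each,
    // preserving the original order of the items.
    static <T> List<List<T>> batches(List<T> items, int maxPerBatch) {
        if (maxPerBatch <= 0) {
            throw new IllegalArgumentException("maxPerBatch must be positive");
        }
        List<List<T>> result = new ArrayList<>();
        for (int start = 0; start < items.size(); start += maxPerBatch) {
            int end = Math.min(start + maxPerBatch, items.size());
            result.add(new ArrayList<>(items.subList(start, end)));
        }
        return result;
    }
}
```

Each batch would then correspond to one query whose parameter list stays within the database's clause limit.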

removeLinks

public void removeLinks(java.lang.Long jobID,
                        java.lang.String commonNewExpression,
                        java.util.ArrayList commonNewParams,
                        java.lang.String[] sourceDocumentIDHashes,
                        java.lang.String sourceTableName,
                        java.lang.String sourceTableIDColumn,
                        java.lang.String sourceTableJobColumn,
                        java.lang.String sourceTableCriteria,
                        java.util.ArrayList sourceTableParams)
                 throws ManifoldCFException
Remove all target links of the specified source documents that are not marked as "new" or "existing", and return the others to their base state.

Throws:
ManifoldCFException
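
The two-part effect described above (delete what is still in base state, return the rest to base) can be sketched against an in-memory map. The names, key format, and status values are hypothetical. The sketch covers only the case where source document hashes are given explicitly, and it ignores the source-table parameters.

```java
import java.util.Iterator;
import java.util.Map;

// Hypothetical in-memory illustration of removeLinks for one source document.
class RemoveLinksSketch {
    static final int BASE = 0, NEW = 1, EXISTING = 2; // illustrative values only

    // rows: key "sourceHash->targetHash", value status code.
    // Links of the source still in BASE were not seen in this scan and are
    // removed; NEW and EXISTING links of the source are returned to BASE.
    static void removeLinks(Map<String, Integer> rows, String sourceHash) {
        String prefix = sourceHash + "->";
        Iterator<Map.Entry<String, Integer>> it = rows.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Integer> row = it.next();
            if (!row.getKey().startsWith(prefix)) {
                continue; // link belongs to another source document
            }
            if (row.getValue() == BASE) {
                it.remove();
            } else {
                row.setValue(BASE);
            }
        }
    }
}
```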

performRemoveLinks

protected void performRemoveLinks(java.lang.String query,
                                  java.util.ArrayList list,
                                  java.lang.String commonNewExpression,
                                  java.util.ArrayList commonNewParams)
                           throws ManifoldCFException
Throws:
ManifoldCFException

restoreLinks

public void restoreLinks(java.lang.Long jobID,
                         java.lang.String[] sourceDocumentIDHashes)
                  throws ManifoldCFException
Return all target links of the specified source documents to their base state.

Throws:
ManifoldCFException

performRestoreLinks

protected void performRestoreLinks(java.lang.String query,
                                   java.util.ArrayList list)
                            throws ManifoldCFException
Throws:
ManifoldCFException

getDocumentChildren

public IResultSet getDocumentChildren(java.lang.Long jobID,
                                      java.lang.String parentIDHash)
                               throws ManifoldCFException
Get document's children.

Returns:
rows that contain the children. Column names are 'linktype', 'childidentifier'.
Throws:
ManifoldCFException

getDocumentUniqueParents

public java.lang.String[] getDocumentUniqueParents(java.lang.Long jobID,
                                                   java.lang.String childIDHash)
                                            throws ManifoldCFException
Get document's parents.

Returns:
a set of document identifier hashes that constitute parents of the specified identifier.
Throws:
ManifoldCFException
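
"Unique" matters here because the same parent may reference a child through more than one link type. The sketch below shows that de-duplication over plain arrays. The class name and the row layout are hypothetical, and the real method queries the database instead.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory illustration of collecting unique parents.
class UniqueParentsSketch {
    // Each row is {linkType, parentHash, childHash}.
    // Returns each parent of childHash once, in first-seen order.
    static String[] uniqueParents(List<String[]> rows, String childHash) {
        Set<String> parents = new LinkedHashSet<>();
        for (String[] row : rows) {
            if (row[2].equals(childHash)) {
                parents.add(row[1]);
            }
        }
        return parents.toArray(new String[0]);
    }
}
```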

stringToStatus

public static int stringToStatus(java.lang.String status)
Convert string to link status.


statusToString

public static java.lang.String statusToString(int status)
Convert link status to string.
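
The two conversion methods, together with linkstatusMap, suggest a lookup between the int status and a short string stored in the table. The sketch below shows one way such a pair could work. The class name, the numeric values, the one-letter codes "B", "N", and "E", and the exception on unknown input are all assumptions for illustration, not values documented on this page.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical codec between int link statuses and stored string codes.
class LinkStatusCodecSketch {
    static final int BASE = 0, NEW = 1, EXISTING = 2; // illustrative values only

    private static final Map<String, Integer> STRING_TO_STATUS = new HashMap<>();
    static {
        STRING_TO_STATUS.put("B", BASE);     // assumed code for base state
        STRING_TO_STATUS.put("N", NEW);      // assumed code for "new"
        STRING_TO_STATUS.put("E", EXISTING); // assumed code for "existing"
    }

    // Looks up the int status for a stored string code.
    static int stringToStatus(String status) {
        Integer value = STRING_TO_STATUS.get(status);
        if (value == null) {
            throw new IllegalArgumentException("Bad link status: " + status);
        }
        return value;
    }

    // Produces the stored string code for an int status.
    static String statusToString(int status) {
        switch (status) {
            case BASE: return "B";
            case NEW: return "N";
            case EXISTING: return "E";
            default: throw new IllegalArgumentException("Bad link status: " + status);
        }
    }
}
```

Keeping both directions in one place makes it easy to check that every status round-trips through its string form.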