org.apache.manifoldcf.crawler.repository
Class RepositoryConnectionManager

java.lang.Object
  extended by org.apache.manifoldcf.core.database.BaseTable
      extended by org.apache.manifoldcf.crawler.repository.RepositoryConnectionManager
All Implemented Interfaces:
IRepositoryConnectionManager

public class RepositoryConnectionManager
extends BaseTable
implements IRepositoryConnectionManager

This class manages repository connection descriptions. Internally it manages multiple database tables, with appropriate caching. Note well: the database handle is instantiated here using the DBInterfaceFactory. This is acceptable because the database in which this table is located is fixed.


Nested Class Summary
protected static class RepositoryConnectionManager.RepositoryConnectionDescription
          This is the object description for a repository connection object.
protected static class RepositoryConnectionManager.RepositoryConnectionExecutor
          This is the executor object for locating repository connection objects.
 
Field Summary
static java.lang.String _rcsid
           
protected static java.lang.String authorityNameField
           
protected static java.lang.String classNameField
           
protected static java.lang.String configField
           
protected static java.lang.String descriptionField
           
protected  RepositoryHistoryManager historyManager
           
protected static java.lang.String maxCountField
           
protected static java.lang.String nameField
           
protected static java.util.Random random
           
protected  ThrottleSpecManager throttleSpecManager
           
 
Fields inherited from class org.apache.manifoldcf.core.database.BaseTable
dbInterface, tableName
 
Fields inherited from interface org.apache.manifoldcf.crawler.interfaces.IRepositoryConnectionManager
ACTIVITY_JOBABORT, ACTIVITY_JOBCONTINUE, ACTIVITY_JOBEND, ACTIVITY_JOBSTART, ACTIVITY_JOBWAIT, activitySet
 
Constructor Summary
RepositoryConnectionManager(IThreadContext threadContext, IDBInterface database)
          Constructor.
 
Method Summary
 boolean checkConnectorExists(java.lang.String name)
          Check if underlying connector exists.
 long countHistoryRows(java.lang.String connectionName, FilterCriteria criteria)
          Count the number of rows specified by a given set of criteria.
 IRepositoryConnection create()
          Create a new repository connection object.
 void deinstall()
          Uninstall the manager.
 void delete(java.lang.String name)
          Delete a repository connection.
 void exportConfiguration(java.io.OutputStream os)
          Export configuration.
 java.lang.String[] findConnectionsForConnector(java.lang.String className)
          Get a list of repository connections that share the same connector.
 IResultSet genHistoryActivityCount(java.lang.String connectionName, FilterCriteria criteria, SortOrder sort, BucketDescription idBucket, long interval, int startRow, int maxRowCount)
          Generate a report, listing the start time, activity count, and identifier bucket, given a time slice (interval) size.
 IResultSet genHistoryByteCount(java.lang.String connectionName, FilterCriteria criteria, SortOrder sort, BucketDescription idBucket, long interval, int startRow, int maxRowCount)
          Generate a report, listing the start time, bytes processed, and identifier bucket, given a time slice (interval) size.
 IResultSet genHistoryResultCodes(java.lang.String connectionName, FilterCriteria criteria, SortOrder sort, BucketDescription resultCodeBucket, BucketDescription idBucket, int startRow, int maxRowCount)
          Generate a report, listing the result bucket and identifier bucket.
 IResultSet genHistorySimple(java.lang.String connectionName, FilterCriteria criteria, SortOrder sort, int startRow, int maxRowCount)
          Generate a report, listing the start time, elapsed time, result code and description, number of bytes, and entity identifier.
 IRepositoryConnection[] getAllConnections()
          Obtain a list of the repository connections, ordered by name.
 java.lang.String getConnectionNameColumn()
          Return the name column.
protected static java.lang.String getRepositoryConnectionKey(java.lang.String connectionName)
          Construct a key which represents an individual repository connection.
protected  void getRepositoryConnectionsChunk(RepositoryConnection[] rval, java.util.Map returnIndex, java.lang.String idList, java.util.ArrayList params)
          Read a chunk of repository connections.
protected static java.lang.String getRepositoryConnectionsKey()
          Construct a key which represents the general list of repository connections.
protected  RepositoryConnection[] getRepositoryConnectionsMultiple(java.lang.String[] connectionNames)
          Fetch multiple repository connections at a single time.
 void importConfiguration(java.io.InputStream is)
          Import configuration.
 void install()
          Install the manager.
 boolean isReferenced(java.lang.String authorityName)
          Return true if the specified authority name is referenced.
 IRepositoryConnection load(java.lang.String name)
          Load a repository connection by name.
 IRepositoryConnection[] loadMultiple(java.lang.String[] names)
          Load multiple repository connections by name.
 void recordHistory(java.lang.String connectionName, java.lang.Long startTime, java.lang.String activityType, java.lang.Long dataSize, java.lang.String entityIdentifier, java.lang.String resultCode, java.lang.String resultDescription, java.lang.String[] childIdentifiers)
          Record time-stamped information about the activity of the connection.
 boolean save(IRepositoryConnection object)
          Save a repository connection object.
 
Methods inherited from class org.apache.manifoldcf.core.database.BaseTable
addTableIndex, analyzeTable, beginTransaction, constructDistinctOnClause, constructOffsetLimitClause, constructRegexpClause, constructSubstringClause, endTransaction, getDatabaseCacheKey, getDBInterface, getMaxInClause, getMaxOrClause, getTableIndexes, getTableName, getTableSchema, getTransactionID, makeTableKey, noteModifications, performAddIndex, performAlter, performCreate, performDelete, performDrop, performInsert, performLock, performModification, performQuery, performQuery, performRemoveIndex, performUpdate, prepareRowForSave, readRow, reindexTable, signalRollback
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.manifoldcf.crawler.interfaces.IRepositoryConnectionManager
getTableName
 

Field Detail

_rcsid

public static final java.lang.String _rcsid
See Also:
Constant Field Values

nameField

protected static final java.lang.String nameField
See Also:
Constant Field Values

descriptionField

protected static final java.lang.String descriptionField
See Also:
Constant Field Values

classNameField

protected static final java.lang.String classNameField
See Also:
Constant Field Values

authorityNameField

protected static final java.lang.String authorityNameField
See Also:
Constant Field Values

maxCountField

protected static final java.lang.String maxCountField
See Also:
Constant Field Values

configField

protected static final java.lang.String configField
See Also:
Constant Field Values

random

protected static java.util.Random random

historyManager

protected RepositoryHistoryManager historyManager

throttleSpecManager

protected ThrottleSpecManager throttleSpecManager

Constructor Detail

RepositoryConnectionManager

public RepositoryConnectionManager(IThreadContext threadContext,
                                   IDBInterface database)
                            throws ManifoldCFException
Constructor.

Parameters:
threadContext - is the thread context.
database - is the database handle.
Throws:
ManifoldCFException

Method Detail

install

public void install()
             throws ManifoldCFException
Install the manager.

Specified by:
install in interface IRepositoryConnectionManager
Throws:
ManifoldCFException

deinstall

public void deinstall()
               throws ManifoldCFException
Uninstall the manager.

Specified by:
deinstall in interface IRepositoryConnectionManager
Throws:
ManifoldCFException

exportConfiguration

public void exportConfiguration(java.io.OutputStream os)
                         throws java.io.IOException,
                                ManifoldCFException
Export configuration.

Specified by:
exportConfiguration in interface IRepositoryConnectionManager
Throws:
java.io.IOException
ManifoldCFException

importConfiguration

public void importConfiguration(java.io.InputStream is)
                         throws java.io.IOException,
                                ManifoldCFException
Import configuration.

Specified by:
importConfiguration in interface IRepositoryConnectionManager
Throws:
java.io.IOException
ManifoldCFException

getAllConnections

public IRepositoryConnection[] getAllConnections()
                                          throws ManifoldCFException
Obtain a list of the repository connections, ordered by name.

Specified by:
getAllConnections in interface IRepositoryConnectionManager
Returns:
an array of connection objects.
Throws:
ManifoldCFException

load

public IRepositoryConnection load(java.lang.String name)
                           throws ManifoldCFException
Load a repository connection by name.

Specified by:
load in interface IRepositoryConnectionManager
Parameters:
name - is the name of the repository connection.
Returns:
the loaded connection object, or null if not found.
Throws:
ManifoldCFException

loadMultiple

public IRepositoryConnection[] loadMultiple(java.lang.String[] names)
                                     throws ManifoldCFException
Load multiple repository connections by name.

Specified by:
loadMultiple in interface IRepositoryConnectionManager
Parameters:
names - are the names to load.
Returns:
the loaded connection objects.
Throws:
ManifoldCFException

create

public IRepositoryConnection create()
                             throws ManifoldCFException
Create a new repository connection object.

Specified by:
create in interface IRepositoryConnectionManager
Returns:
the new object.
Throws:
ManifoldCFException

save

public boolean save(IRepositoryConnection object)
             throws ManifoldCFException
Save a repository connection object.

Specified by:
save in interface IRepositoryConnectionManager
Parameters:
object - is the object to save.
Returns:
true if the object was created, false otherwise.
Throws:
ManifoldCFException

delete

public void delete(java.lang.String name)
            throws ManifoldCFException
Delete a repository connection.

Specified by:
delete in interface IRepositoryConnectionManager
Parameters:
name - is the name of the connection to delete. If the name does not exist, no error is returned.
Throws:
ManifoldCFException
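
The create, save, load, and delete contracts described above can be illustrated with a small in-memory stand-in. This is only a sketch: ConnectionStoreSketch and its Conn class are hypothetical names introduced for illustration, not ManifoldCF classes, and the real manager persists to database tables with caching rather than to a map.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory stand-in; the real manager is database-backed.
class ConnectionStoreSketch {
    // Minimal connection description: a name and a connector class name.
    static final class Conn {
        String name;
        String className;
    }

    // A TreeMap keeps entries ordered by name, as getAllConnections() promises.
    private final Map<String, Conn> byName = new TreeMap<>();

    // create(): returns a new, unsaved object.
    Conn create() {
        return new Conn();
    }

    // save(): true if the object was created, false if an existing one was replaced.
    boolean save(Conn c) {
        return byName.put(c.name, c) == null;
    }

    // load(): the connection, or null if not found.
    Conn load(String name) {
        return byName.get(name);
    }

    // delete(): removing a name that does not exist is not an error.
    void delete(String name) {
        byName.remove(name);
    }

    public static void main(String[] args) {
        ConnectionStoreSketch store = new ConnectionStoreSketch();
        Conn c = store.create();
        c.name = "web-1";
        c.className = "example.WebConnector"; // hypothetical connector class name
        System.out.println(store.save(c));    // first save creates the entry
        System.out.println(store.save(c));    // second save replaces it
        store.delete("no-such-connection");   // silently ignored
        store.delete("web-1");
        System.out.println(store.load("web-1") == null);
    }
}
```

Callers of the real load method should likewise check for a null return before using the result.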

isReferenced

public boolean isReferenced(java.lang.String authorityName)
                     throws ManifoldCFException
Return true if the specified authority name is referenced.

Specified by:
isReferenced in interface IRepositoryConnectionManager
Parameters:
authorityName - is the authority name.
Returns:
true if referenced, false otherwise.
Throws:
ManifoldCFException

findConnectionsForConnector

public java.lang.String[] findConnectionsForConnector(java.lang.String className)
                                               throws ManifoldCFException
Get a list of repository connections that share the same connector.

Specified by:
findConnectionsForConnector in interface IRepositoryConnectionManager
Parameters:
className - is the class name of the connector.
Returns:
the repository connections that use that connector.
Throws:
ManifoldCFException

checkConnectorExists

public boolean checkConnectorExists(java.lang.String name)
                             throws ManifoldCFException
Check if underlying connector exists.

Specified by:
checkConnectorExists in interface IRepositoryConnectionManager
Parameters:
name - is the name of the connection to check.
Returns:
true if the underlying connector is registered.
Throws:
ManifoldCFException

getConnectionNameColumn

public java.lang.String getConnectionNameColumn()
Return the name column.

Specified by:
getConnectionNameColumn in interface IRepositoryConnectionManager
Returns:
the name column.

recordHistory

public void recordHistory(java.lang.String connectionName,
                          java.lang.Long startTime,
                          java.lang.String activityType,
                          java.lang.Long dataSize,
                          java.lang.String entityIdentifier,
                          java.lang.String resultCode,
                          java.lang.String resultDescription,
                          java.lang.String[] childIdentifiers)
                   throws ManifoldCFException
Record time-stamped information about the activity of the connection. This information can originate from either the connector or the framework. It is recorded here because it is viewed as 'belonging' to an individual connection, and is segregated accordingly.

Specified by:
recordHistory in interface IRepositoryConnectionManager
Parameters:
connectionName - is the connection to which the record belongs. If the connection is deleted, the corresponding records will also be deleted. Cannot be null.
startTime - is either null or the start time of the activity, in milliseconds since the epoch (Jan 1, 1970). Every activity has an associated time; the startTime field records when the activity began. A null value indicates that the start time and the finishing time are the same.
activityType - is a string which is fully interpretable only in the context of the connector involved, which is used to categorize what kind of activity is being recorded. For example, a web connector might record a "fetch document" activity, while the framework might record "ingest document", "job start", "job finish", "job abort", etc. Cannot be null.
dataSize - is the number of bytes of data involved in the activity, or null if not applicable.
entityIdentifier - is a (possibly long) string which identifies the object involved in the history record. The interpretation of this field will differ from connector to connector. May be null.
resultCode - contains a terse description of the result of the activity. The description is limited in size to 255 characters, and can be interpreted only in the context of the current connector. May be null.
resultDescription - is a (possibly long) human-readable string which adds detail, if required, to the result described in the resultCode field. This field is not meant to be queried on. May be null.
childIdentifiers - is a set of child entity identifiers associated with this activity. May be null.
Throws:
ManifoldCFException
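
The parameter rules above (non-null connectionName and activityType, a null startTime meaning "started and finished at the same instant", a resultCode of at most 255 characters) can be sketched as a small value object. HistoryRecordSketch is a hypothetical class, not part of ManifoldCF. Two assumptions are made: the finishing time is passed in explicitly as endTime so the sketch stays deterministic, and an over-long resultCode is rejected, whereas the real method may handle it differently.

```java
// Hypothetical value object for illustration; not a ManifoldCF class.
class HistoryRecordSketch {
    final String connectionName;
    final long startTime;      // milliseconds since the epoch
    final long endTime;        // milliseconds since the epoch
    final String activityType;
    final Long dataSize;       // null when not applicable
    final String resultCode;   // may be null

    HistoryRecordSketch(String connectionName, Long startTime, long endTime,
                        String activityType, Long dataSize, String resultCode) {
        if (connectionName == null || activityType == null) {
            throw new IllegalArgumentException("connectionName and activityType cannot be null");
        }
        // Assumption: over-long result codes are rejected in this sketch.
        if (resultCode != null && resultCode.length() > 255) {
            throw new IllegalArgumentException("resultCode exceeds 255 characters");
        }
        this.connectionName = connectionName;
        // A null start time means the start and finishing times are the same.
        this.startTime = (startTime == null) ? endTime : startTime.longValue();
        this.endTime = endTime;
        this.activityType = activityType;
        this.dataSize = dataSize;
        this.resultCode = resultCode;
    }

    // Elapsed time of the activity in milliseconds.
    long elapsedMillis() {
        return endTime - startTime;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        HistoryRecordSketch instant =
            new HistoryRecordSketch("web-1", null, now, "fetch document", null, "OK");
        HistoryRecordSketch timed =
            new HistoryRecordSketch("web-1", Long.valueOf(now - 250L), now,
                                    "fetch document", Long.valueOf(1024L), "OK");
        System.out.println(instant.elapsedMillis());
        System.out.println(timed.elapsedMillis());
    }
}
```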

countHistoryRows

public long countHistoryRows(java.lang.String connectionName,
                             FilterCriteria criteria)
                      throws ManifoldCFException
Count the number of rows specified by a given set of criteria. This can be used to make decisions as to whether a query based on those rows will complete in an acceptable amount of time.

Specified by:
countHistoryRows in interface IRepositoryConnectionManager
Parameters:
connectionName - is the name of the connection.
criteria - is the filtering criteria, which selects the records of interest.
Returns:
the number of rows included by the criteria.
Throws:
ManifoldCFException
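
The decision described above, counting first and running the report only if the count is acceptable, might look like the following sketch. CountGuardSketch and the threshold are hypothetical; in real code the count supplier would wrap a countHistoryRows call and the report supplier one of the genHistory methods, with ManifoldCFException handled by the caller.

```java
import java.util.Optional;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical helper; not part of ManifoldCF.
class CountGuardSketch {
    // Runs the report only when the pre-count is at or below maxRows.
    // The report supplier must not return null.
    static <T> Optional<T> runIfSmallEnough(LongSupplier rowCount, long maxRows,
                                            Supplier<T> report) {
        if (rowCount.getAsLong() > maxRows) {
            return Optional.empty(); // too many rows: the caller should narrow the criteria
        }
        return Optional.of(report.get());
    }

    public static void main(String[] args) {
        // Stand-ins for countHistoryRows(...) and a report query.
        Optional<String> small = runIfSmallEnough(() -> 120L, 10000L, () -> "report rows");
        Optional<String> large = runIfSmallEnough(() -> 5000000L, 10000L, () -> "report rows");
        System.out.println(small.isPresent());
        System.out.println(large.isPresent());
    }
}
```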

genHistorySimple

public IResultSet genHistorySimple(java.lang.String connectionName,
                                   FilterCriteria criteria,
                                   SortOrder sort,
                                   int startRow,
                                   int maxRowCount)
                            throws ManifoldCFException
Generate a report, listing the start time, elapsed time, result code and description, number of bytes, and entity identifier. The records selected for this report are based on the filtering criteria object passed into this method. The record order is based on the sorting criteria object passed into this method. The resultset returned should have the following columns: "starttime","elapsedtime","resultcode","resultdesc","bytes","identifier".

Specified by:
genHistorySimple in interface IRepositoryConnectionManager
Parameters:
connectionName - is the name of the connection.
criteria - is the filtering criteria, which selects the records of interest.
sort - is the sorting order, which can specify sort based on the result columns.
startRow - is the first row to include (beginning with 0).
maxRowCount - is the maximum number of rows to include.
Throws:
ManifoldCFException
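
The interaction of sort, startRow, and maxRowCount can be sketched on an in-memory list: sort first, then take at most maxRowCount rows beginning at the zero-based startRow. PagingSketch is a hypothetical helper; the real method presumably performs the equivalent work in the database query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper; not part of ManifoldCF.
class PagingSketch {
    // Returns at most maxRowCount rows, starting at the zero-based startRow,
    // after sorting a copy of the input with the given comparator.
    static <T> List<T> page(List<T> rows, Comparator<? super T> sort,
                            int startRow, int maxRowCount) {
        List<T> sorted = new ArrayList<>(rows);
        sorted.sort(sort);
        // Clamp both ends so out-of-range requests yield an empty page.
        int from = Math.min(Math.max(startRow, 0), sorted.size());
        int to = (int) Math.min((long) from + Math.max(maxRowCount, 0), sorted.size());
        return new ArrayList<>(sorted.subList(from, to));
    }

    public static void main(String[] args) {
        List<Long> startTimes = Arrays.asList(30L, 10L, 20L, 40L);
        // Second and third rows in ascending start-time order.
        System.out.println(page(startTimes, Comparator.<Long>naturalOrder(), 1, 2));
    }
}
```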

genHistoryActivityCount

public IResultSet genHistoryActivityCount(java.lang.String connectionName,
                                          FilterCriteria criteria,
                                          SortOrder sort,
                                          BucketDescription idBucket,
                                          long interval,
                                          int startRow,
                                          int maxRowCount)
                                   throws ManifoldCFException
Generate a report, listing the start time, activity count, and identifier bucket, given a time slice (interval) size. The records selected for this report are based on the filtering criteria object passed into this method. The record order is based on the sorting criteria object passed into this method. The identifier bucket description is specified by the bucket description object. The resultset returned should have the following columns: "starttime","endtime","activitycount","idbucket".

Specified by:
genHistoryActivityCount in interface IRepositoryConnectionManager
Parameters:
connectionName - is the name of the connection.
criteria - is the filtering criteria, which selects the records of interest.
sort - is the sorting order, which can specify sort based on the result columns.
idBucket - is the description of the bucket based on processed entity identifiers.
interval - is the time interval, in milliseconds, to locate. There will be one row in the resultset for each distinct idBucket value, and the returned activity count will be the maximum found over the specified interval size.
startRow - is the first row to include (beginning with 0).
maxRowCount - is the maximum number of rows to include.
Throws:
ManifoldCFException
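
The "maximum found over the specified interval size" can be read as a sliding-window search: for one idBucket, find the interval-sized window that contains the most activities. The sketch below uses that reading and makes two assumptions that are not stated on this page: each activity is treated as a point event at its start time, and windows are half-open, [start, start + interval). MaxActivityWindowSketch is a hypothetical class; the real report is computed in the database.

```java
import java.util.Arrays;

// Hypothetical helper; not part of ManifoldCF.
class MaxActivityWindowSketch {
    // Returns {windowStart, windowEnd, count} for the interval-sized window
    // containing the most events. Candidate windows start at each event time.
    static long[] maxWindow(long[] eventTimes, long interval) {
        if (interval <= 0L) {
            throw new IllegalArgumentException("interval must be positive");
        }
        long[] t = eventTimes.clone();
        Arrays.sort(t);
        long bestStart = 0L;
        int bestCount = 0;
        int hi = 0;
        for (int lo = 0; lo < t.length; lo++) {
            // Advance hi past every event inside the window starting at t[lo].
            while (hi < t.length && t[hi] < t[lo] + interval) {
                hi++;
            }
            int count = hi - lo;
            if (count > bestCount) {
                bestCount = count;
                bestStart = t[lo];
            }
        }
        return new long[] { bestStart, bestStart + interval, bestCount };
    }

    public static void main(String[] args) {
        // Start times (ms) of the activities in one identifier bucket.
        long[] times = { 0L, 100L, 150L, 900L, 1000L, 1050L, 1090L };
        long[] row = maxWindow(times, 200L);
        System.out.println("starttime=" + row[0] + " endtime=" + row[1]
            + " activitycount=" + row[2]);
    }
}
```

Running maxWindow once per bucket yields one row per distinct idBucket value, matching the shape of the documented result set.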

genHistoryByteCount

public IResultSet genHistoryByteCount(java.lang.String connectionName,
                                      FilterCriteria criteria,
                                      SortOrder sort,
                                      BucketDescription idBucket,
                                      long interval,
                                      int startRow,
                                      int maxRowCount)
                               throws ManifoldCFException
Generate a report, listing the start time, bytes processed, and identifier bucket, given a time slice (interval) size. The records selected for this report are based on the filtering criteria object passed into this method. The record order is based on the sorting criteria object passed into this method. The identifier bucket description is specified by the bucket description object. The resultset returned should have the following columns: "starttime","endtime","bytecount","idbucket".

Specified by:
genHistoryByteCount in interface IRepositoryConnectionManager
Parameters:
connectionName - is the name of the connection.
criteria - is the filtering criteria, which selects the records of interest.
sort - is the sorting order, which can specify sort based on the result columns.
idBucket - is the description of the bucket based on processed entity identifiers.
interval - is the time interval, in milliseconds, to locate. There will be one row in the resultset for each distinct idBucket value, and the returned byte count will be the maximum found over the specified interval size.
startRow - is the first row to include (beginning with 0).
maxRowCount - is the maximum number of rows to include.
Throws:
ManifoldCFException

genHistoryResultCodes

public IResultSet genHistoryResultCodes(java.lang.String connectionName,
                                        FilterCriteria criteria,
                                        SortOrder sort,
                                        BucketDescription resultCodeBucket,
                                        BucketDescription idBucket,
                                        int startRow,
                                        int maxRowCount)
                                 throws ManifoldCFException
Generate a report, listing the result bucket and identifier bucket. The records selected for this report are based on the filtering criteria object passed into this method. The record order is based on the sorting criteria object passed into this method. The result code bucket description is specified by a bucket description object. The identifier bucket description is specified by a bucket description object. The resultset returned should have the following columns: "resultcodebucket","idbucket".

Specified by:
genHistoryResultCodes in interface IRepositoryConnectionManager
Parameters:
connectionName - is the name of the connection.
criteria - is the filtering criteria, which selects the records of interest.
sort - is the sorting order, which can specify sort based on the result columns.
resultCodeBucket - is the description of the bucket based on processed result codes.
idBucket - is the description of the bucket based on processed entity identifiers.
startRow - is the first row to include (beginning with 0).
maxRowCount - is the maximum number of rows to include.
Throws:
ManifoldCFException
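
The two buckets can be pictured as functions that map a result code and an entity identifier to coarser group labels. This page does not say how a BucketDescription derives a bucket, so the sketch below assumes a regular expression whose first capturing group is the bucket value. ResultCodeBucketSketch and both patterns are hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not part of ManifoldCF.
class ResultCodeBucketSketch {
    // Maps a value to its bucket: the first capturing group of the first match,
    // or the empty string when there is no match. The pattern must define at
    // least one capturing group.
    static String bucketOf(Pattern p, String value) {
        if (value == null) {
            return "";
        }
        Matcher m = p.matcher(value);
        if (!m.find()) {
            return "";
        }
        String group = m.group(1);
        return group == null ? "" : group;
    }

    // Returns the distinct "resultcodebucket|idbucket" pairs, sorted.
    // Each row is {resultCode, entityIdentifier}.
    static Set<String> distinctPairs(String[][] rows, Pattern codeBucket, Pattern idBucket) {
        Set<String> pairs = new TreeSet<>();
        for (String[] row : rows) {
            pairs.add(bucketOf(codeBucket, row[0]) + "|" + bucketOf(idBucket, row[1]));
        }
        return pairs;
    }

    public static void main(String[] args) {
        Pattern codeBucket = Pattern.compile("^(\\d)\\d\\d$");       // first digit of a 3-digit code
        Pattern idBucket = Pattern.compile("^https?://([^/]+)");     // host part of a URL
        String[][] rows = {
            { "200", "http://a.example/x" },
            { "204", "http://a.example/y" },
            { "404", "https://b.example/z" },
        };
        System.out.println(distinctPairs(rows, codeBucket, idBucket));
    }
}
```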

getRepositoryConnectionsKey

protected static java.lang.String getRepositoryConnectionsKey()
Construct a key which represents the general list of repository connections.

Returns:
the cache key.

getRepositoryConnectionKey

protected static java.lang.String getRepositoryConnectionKey(java.lang.String connectionName)
Construct a key which represents an individual repository connection.

Parameters:
connectionName - is the name of the connection.
Returns:
the cache key.

getRepositoryConnectionsMultiple

protected RepositoryConnection[] getRepositoryConnectionsMultiple(java.lang.String[] connectionNames)
                                                           throws ManifoldCFException
Fetch multiple repository connections at a single time.

Parameters:
connectionNames - is a list of connection names.
Returns:
the corresponding repository connection objects.
Throws:
ManifoldCFException

getRepositoryConnectionsChunk

protected void getRepositoryConnectionsChunk(RepositoryConnection[] rval,
                                             java.util.Map returnIndex,
                                             java.lang.String idList,
                                             java.util.ArrayList params)
                                      throws ManifoldCFException
Read a chunk of repository connections.

Parameters:
rval - is the place to put the connections that are read.
returnIndex - is a map from the object id (resource id) to the rval index.
idList - is the list of ids.
params - is the set of parameters.
params - is the set of parameters.
Throws:
ManifoldCFException
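
One plausible reading of these parameters is that the multiple-fetch method splits the requested names into chunks small enough for a SQL IN clause (BaseTable exposes getMaxInClause), passes each chunk as a placeholder list plus parameter values, and uses returnIndex to place each row read into the right slot of rval. That is an assumption about the implementation, not something this page confirms. ChunkedLoadSketch and its Chunk class are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; not part of ManifoldCF.
class ChunkedLoadSketch {
    // One chunk: a "?,?,?" placeholder list plus the matching parameter values.
    static final class Chunk {
        final String idList;
        final List<String> params;

        Chunk(String idList, List<String> params) {
            this.idList = idList;
            this.params = params;
        }
    }

    // Maps each requested name to its position in the result array.
    static Map<String, Integer> indexByName(String[] names) {
        Map<String, Integer> index = new HashMap<>();
        for (int i = 0; i < names.length; i++) {
            index.put(names[i], Integer.valueOf(i));
        }
        return index;
    }

    // Splits the names into chunks of at most maxInClause entries each.
    static List<Chunk> chunks(String[] names, int maxInClause) {
        if (maxInClause < 1) {
            throw new IllegalArgumentException("maxInClause must be at least 1");
        }
        List<Chunk> out = new ArrayList<>();
        for (int start = 0; start < names.length; start += maxInClause) {
            int end = Math.min(start + maxInClause, names.length);
            StringBuilder idList = new StringBuilder();
            List<String> params = new ArrayList<>();
            for (int i = start; i < end; i++) {
                if (i > start) {
                    idList.append(',');
                }
                idList.append('?');
                params.add(names[i]);
            }
            out.add(new Chunk(idList.toString(), params));
        }
        return out;
    }

    public static void main(String[] args) {
        String[] names = { "web-1", "web-2", "files-1" };
        for (Chunk c : chunks(names, 2)) {
            System.out.println("IN (" + c.idList + ") with " + c.params);
        }
        System.out.println(indexByName(names).get("files-1"));
    }
}
```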