java.lang.Object
  org.apache.manifoldcf.core.connector.BaseConnector
      org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
public abstract class BaseRepositoryConnector
This base class describes an instance of a connection between a repository and ManifoldCF's standard "pull" ingestion agent.

Each connector instance is used in only one thread at a time. Connection pooling for these objects is performed by the factory that instantiates repository connectors from symbolic names and configuration parameters, and instances are pooled by those parameters. That is, a pooled connector handle is reused only if all of the connection parameters for the handle match.

Implementers should provide a default constructor with this signature: xxx();

Connectors are either configured or not. If configured, they persist in a pool and are reused multiple times. Certain methods of a connector may be called before the connector is configured; these are essentially all the methods that permit inspection of the connector's capabilities.

The purpose of the repository connector is to allow documents to be fetched from the repository. Each repository connector describes a set of documents that are known only to that connector, and it therefore establishes a space of document identifiers. Each connector will only ever be asked to deal with identifiers that have in some way originated from that connector.

Documents are fetched in three stages. First, the getDocumentIdentifiers() method is called in the connector implementation; it returns a set of document identifiers. Second, those identifiers are passed to getDocumentVersions() to obtain the current document version strings. The last stage is processDocuments(), which queues up any additional documents needed and also ingests. This method is not called if the document version string indicates that no change took place.
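The three stages can be illustrated without the framework. The following is a minimal, self-contained sketch, not ManifoldCF code: ThreeStageSketch and its methods put, seed, versionsOf, and crawl are hypothetical stand-ins for seeding, getDocumentVersions(), and processDocuments(). It only shows how a saved version string lets the third stage be skipped for unchanged documents.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the three-stage fetch; none of these names are ManifoldCF APIs.
final class ThreeStageSketch {
    // Toy repository: document identifier -> current version string.
    private final Map<String, String> repository = new TreeMap<>();
    // Versions recorded by earlier crawls (what the framework would have saved).
    private final Map<String, String> savedVersions = new HashMap<>();

    void put(String id, String version) {
        repository.put(id, version);
    }

    // Stage 1: return the identifiers known to this connector, in sorted order.
    List<String> seed() {
        return new ArrayList<>(repository.keySet());
    }

    // Stage 2: return one version string per identifier, in the same order.
    String[] versionsOf(List<String> ids) {
        String[] versions = new String[ids.size()];
        for (int i = 0; i < versions.length; i++) {
            versions[i] = repository.get(ids.get(i));
        }
        return versions;
    }

    // Stage 3 runs only for documents whose version differs from the saved one.
    // Returns the identifiers that were processed in this pass.
    List<String> crawl() {
        List<String> processed = new ArrayList<>();
        List<String> ids = seed();
        String[] versions = versionsOf(ids);
        for (int i = 0; i < versions.length; i++) {
            String id = ids.get(i);
            String old = savedVersions.get(id); // null means never fetched before
            if (!versions[i].equals(old)) {
                processed.add(id);              // a real connector would ingest here
                savedVersions.put(id, versions[i]);
            }
        }
        return processed;
    }
}
```

In this sketch a second call to crawl() with an unchanged repository processes nothing, which mirrors the statement that processDocuments() is not called when the version indicates no change.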
| Field Summary | |
|---|---|
| static java.lang.String | _rcsid |

| Fields inherited from class org.apache.manifoldcf.core.connector.BaseConnector |
|---|
| currentContext, params |

| Fields inherited from interface org.apache.manifoldcf.crawler.interfaces.IRepositoryConnector |
|---|
| JOBMODE_CONTINUOUS, JOBMODE_ONCEONLY, MODEL_ADD, MODEL_ADD_CHANGE, MODEL_ADD_CHANGE_DELETE, MODEL_ALL, MODEL_PARTIAL |

| Constructor Summary | |
|---|---|
| BaseRepositoryConnector() | |

| Method Summary | |
|---|---|
| void | addSeedDocuments(ISeedingActivity activities, DocumentSpecification spec, long startTime, long endTime) - Queue "seed" documents. |
| void | addSeedDocuments(ISeedingActivity activities, DocumentSpecification spec, long startTime, long endTime, int jobMode) - Queue "seed" documents. |
| java.lang.String[] | getActivitiesList() - Return the list of activities that this connector supports. |
| java.lang.String[] | getBinNames(java.lang.String documentIdentifier) - Get the bin name strings for a document identifier. |
| int | getConnectorModel() - Tell the world what model this connector uses for getDocumentIdentifiers(). |
| IDocumentIdentifierStream | getDocumentIdentifiers(DocumentSpecification spec, long startTime, long endTime) - The short version of getDocumentIdentifiers. |
| IDocumentIdentifierStream | getDocumentIdentifiers(ISeedingActivity activities, DocumentSpecification spec, long startTime, long endTime) - The long version of getDocumentIdentifiers. |
| java.lang.String[] | getDocumentVersions(java.lang.String[] documentIdentifiers, DocumentSpecification spec) - The short version of getDocumentVersions. |
| java.lang.String[] | getDocumentVersions(java.lang.String[] documentIdentifiers, IVersionActivity activities, DocumentSpecification spec) - The long version of getDocumentVersions. |
| java.lang.String[] | getDocumentVersions(java.lang.String[] documentIdentifiers, java.lang.String[] oldVersions, IVersionActivity activities, DocumentSpecification spec) - Get document versions given an array of document identifiers. |
| java.lang.String[] | getDocumentVersions(java.lang.String[] documentIdentifiers, java.lang.String[] oldVersions, IVersionActivity activities, DocumentSpecification spec, int jobMode) - Get document versions given an array of document identifiers. |
| java.lang.String[] | getDocumentVersions(java.lang.String[] documentIdentifiers, java.lang.String[] oldVersions, IVersionActivity activities, DocumentSpecification spec, int jobMode, boolean usesDefaultAuthority) - Get document versions given an array of document identifiers. |
| int | getMaxDocumentRequest() - Get the maximum number of documents to amalgamate together into one batch, for this connector. |
| java.lang.String[] | getRelationshipTypes() - Return the list of relationship types that this connector recognizes. |
| IDocumentIdentifierStream | getRemainingDocumentIdentifiers(ISeedingActivity activities, DocumentSpecification spec, long startTime, long endTime) - This method returns the document identifiers that should be considered part of the seeds, but do not need to be queued for processing at this time. |
| void | outputSpecificationBody(IHTTPOutput out, DocumentSpecification ds, java.lang.String tabName) - Output the specification body section. |
| void | outputSpecificationHeader(IHTTPOutput out, DocumentSpecification ds, java.util.ArrayList tabsArray) - Output the specification header section. |
| void | processDocuments(java.lang.String[] documentIdentifiers, java.lang.String[] versions, IProcessActivity activities, DocumentSpecification spec, boolean[] scanOnly) - Process a set of documents. |
| void | processDocuments(java.lang.String[] documentIdentifiers, java.lang.String[] versions, IProcessActivity activities, DocumentSpecification spec, boolean[] scanOnly, int jobMode) - Process a set of documents. |
| java.lang.String | processSpecificationPost(IPostParameters variableContext, DocumentSpecification ds) - Process a specification post. |
| void | releaseDocumentVersions(java.lang.String[] documentIdentifiers, java.lang.String[] versions) - Free a set of documents. |
| boolean | requestInfo(Configuration output, java.lang.String command) - Request arbitrary connector information. |
| void | viewSpecification(IHTTPOutput out, DocumentSpecification ds) - View specification. |

| Methods inherited from class org.apache.manifoldcf.core.connector.BaseConnector |
|---|
| check, clearThreadContext, connect, deinstall, disconnect, getConfiguration, install, outputConfigurationBody, outputConfigurationHeader, poll, processConfigurationPost, setThreadContext, viewConfiguration |

| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |

| Methods inherited from interface org.apache.manifoldcf.core.interfaces.IConnector |
|---|
| check, clearThreadContext, connect, deinstall, disconnect, getConfiguration, install, outputConfigurationBody, outputConfigurationHeader, poll, processConfigurationPost, setThreadContext, viewConfiguration |

| Field Detail |
|---|
public static final java.lang.String _rcsid
| Constructor Detail |
|---|
public BaseRepositoryConnector()
| Method Detail |
|---|
public int getConnectorModel()
Specified by: getConnectorModel in interface IRepositoryConnector
public java.lang.String[] getActivitiesList()
Specified by: getActivitiesList in interface IRepositoryConnector
public java.lang.String[] getRelationshipTypes()
Specified by: getRelationshipTypes in interface IRepositoryConnector
public java.lang.String[] getBinNames(java.lang.String documentIdentifier)
Specified by: getBinNames in interface IRepositoryConnector
Parameters:
documentIdentifier - is the document identifier.
public boolean requestInfo(Configuration output,
java.lang.String command)
throws ManifoldCFException
Specified by: requestInfo in interface IRepositoryConnector
Parameters:
output - is the response object, to be filled in by this method.
command - is the command, which is taken directly from the API request.
Throws:
ManifoldCFException
public void addSeedDocuments(ISeedingActivity activities,
DocumentSpecification spec,
long startTime,
long endTime,
int jobMode)
throws ManifoldCFException,
ServiceInterruption
Specified by: addSeedDocuments in interface IRepositoryConnector
Parameters:
activities - is the interface this method should use to perform whatever framework actions are desired.
spec - is a document specification (that comes from the job).
startTime - is the beginning of the time range to consider, inclusive.
endTime - is the end of the time range to consider, exclusive.
jobMode - is an integer describing how the job is being run, whether continuous or once-only.
Throws:
ManifoldCFException
ServiceInterruption
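Because startTime is inclusive and endTime is exclusive, the two parameters describe a half-open interval, so consecutive seeding passes neither miss nor repeat a document whose timestamp falls exactly on a boundary. A minimal sketch of that filter, assuming a hypothetical in-memory map from identifiers to modification timestamps (SeedWindowSketch and seedsInWindow are illustrative names, not part of ManifoldCF):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper; it only illustrates the [startTime, endTime) convention.
final class SeedWindowSketch {
    // Returns, in sorted order, the identifiers whose timestamp t
    // satisfies startTime <= t < endTime.
    static List<String> seedsInWindow(Map<String, Long> modifiedAt, long startTime, long endTime) {
        List<String> seeds = new ArrayList<>();
        for (Map.Entry<String, Long> e : new TreeMap<String, Long>(modifiedAt).entrySet()) {
            long t = e.getValue();
            if (t >= startTime && t < endTime) {
                seeds.add(e.getKey());
            }
        }
        return seeds;
    }
}
```

With this convention, a pass over [100, 300) followed by a pass over [300, 400) sees a document stamped 300 exactly once, in the second pass.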
public void addSeedDocuments(ISeedingActivity activities,
DocumentSpecification spec,
long startTime,
long endTime)
throws ManifoldCFException,
ServiceInterruption
Parameters:
activities - is the interface this method should use to perform whatever framework actions are desired.
spec - is a document specification (that comes from the job).
startTime - is the beginning of the time range to consider, inclusive.
endTime - is the end of the time range to consider, exclusive.
Throws:
ManifoldCFException
ServiceInterruption
public IDocumentIdentifierStream getDocumentIdentifiers(ISeedingActivity activities,
DocumentSpecification spec,
long startTime,
long endTime)
throws ManifoldCFException,
ServiceInterruption
Parameters:
activities - is the interface this method should use to perform whatever framework actions are desired.
spec - is a document specification (that comes from the job).
startTime - is the beginning of the time range to consider, inclusive.
endTime - is the end of the time range to consider, exclusive.
Throws:
ManifoldCFException
ServiceInterruption
public IDocumentIdentifierStream getDocumentIdentifiers(DocumentSpecification spec,
long startTime,
long endTime)
throws ManifoldCFException,
ServiceInterruption
Parameters:
spec - is a document specification (that comes from the job).
startTime - is the beginning of the time range to consider, inclusive.
endTime - is the end of the time range to consider, exclusive.
Throws:
ManifoldCFException
ServiceInterruption
public IDocumentIdentifierStream getRemainingDocumentIdentifiers(ISeedingActivity activities,
DocumentSpecification spec,
long startTime,
long endTime)
throws ManifoldCFException,
ServiceInterruption
Parameters:
activities - is the interface this method should use to perform whatever framework actions are desired.
spec - is a document specification (that comes from the job).
startTime - is the beginning of the time range that was passed to getDocumentIdentifiers().
endTime - is the end of the time range that was passed to getDocumentIdentifiers().
Throws:
ManifoldCFException
ServiceInterruption
public java.lang.String[] getDocumentVersions(java.lang.String[] documentIdentifiers,
java.lang.String[] oldVersions,
IVersionActivity activities,
DocumentSpecification spec,
int jobMode,
boolean usesDefaultAuthority)
throws ManifoldCFException,
ServiceInterruption
Specified by: getDocumentVersions in interface IRepositoryConnector
Parameters:
documentIdentifiers - is the array of local document identifiers, as understood by this connector.
oldVersions - is the corresponding array of version strings that have been saved for the document identifiers. A null value indicates that this is a first-time fetch, while an empty string indicates that the previous document had an empty version string.
activities - is the interface this method should use to perform whatever framework actions are desired.
spec - is the current document specification for the current job. If there is a dependency on this specification, then the version string should include the pertinent data, so that reingestion will occur when the specification changes. This is primarily useful for metadata.
jobMode - is an integer describing how the job is being run, whether continuous or once-only.
usesDefaultAuthority - will be true only if the authority in use for these documents is the default one.
Throws:
ManifoldCFException
ServiceInterruption
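The parameter notes for getDocumentVersions() imply two conventions: a null old version means the document has never been fetched, and anything in the specification that affects the ingested result belongs in the version string. The following is a minimal sketch under assumptions: VersionStringSketch, the modification timestamp, the list of metadata field names, and the string layout are all hypothetical, and this is only one possible encoding.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not a ManifoldCF class.
final class VersionStringSketch {
    // Combines the document's own state with the specification data it depends on,
    // so that a change to either one produces a different version string.
    static String buildVersion(long lastModified, List<String> metadataFields) {
        List<String> sorted = new ArrayList<>(metadataFields);
        Collections.sort(sorted); // sorting makes the result independent of field order
        return lastModified + ":" + String.join(",", sorted);
    }

    // A null old version marks a first-time fetch; otherwise the strings are compared exactly.
    // An empty string is a legitimate previous version, not a missing one.
    static boolean needsProcessing(String oldVersion, String newVersion) {
        return oldVersion == null || !oldVersion.equals(newVersion);
    }
}
```

In this sketch, removing a metadata field from the specification changes the version string even when the document itself is untouched, which is the reingestion behaviour the spec parameter description asks for.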
public java.lang.String[] getDocumentVersions(java.lang.String[] documentIdentifiers,
java.lang.String[] oldVersions,
IVersionActivity activities,
DocumentSpecification spec,
int jobMode)
throws ManifoldCFException,
ServiceInterruption
Parameters:
documentIdentifiers - is the array of local document identifiers, as understood by this connector.
oldVersions - is the corresponding array of version strings that have been saved for the document identifiers. A null value indicates that this is a first-time fetch, while an empty string indicates that the previous document had an empty version string.
activities - is the interface this method should use to perform whatever framework actions are desired.
spec - is the current document specification for the current job. If there is a dependency on this specification, then the version string should include the pertinent data, so that reingestion will occur when the specification changes. This is primarily useful for metadata.
jobMode - is an integer describing how the job is being run, whether continuous or once-only.
Throws:
ManifoldCFException
ServiceInterruption
public java.lang.String[] getDocumentVersions(java.lang.String[] documentIdentifiers,
java.lang.String[] oldVersions,
IVersionActivity activities,
DocumentSpecification spec)
throws ManifoldCFException,
ServiceInterruption
Parameters:
documentIdentifiers - is the array of local document identifiers, as understood by this connector.
oldVersions - is the corresponding array of version strings that have been saved for the document identifiers. A null value indicates that this is a first-time fetch, while an empty string indicates that the previous document had an empty version string.
activities - is the interface this method should use to perform whatever framework actions are desired.
spec - is the current document specification for the current job. If there is a dependency on this specification, then the version string should include the pertinent data, so that reingestion will occur when the specification changes. This is primarily useful for metadata.
Throws:
ManifoldCFException
ServiceInterruption
public java.lang.String[] getDocumentVersions(java.lang.String[] documentIdentifiers,
IVersionActivity activities,
DocumentSpecification spec)
throws ManifoldCFException,
ServiceInterruption
Parameters:
documentIdentifiers - is the array of local document identifiers, as understood by this connector.
activities - is the interface this method should use to perform whatever framework actions are desired.
spec - is the current document specification for the current job. If there is a dependency on this specification, then the version string should include the pertinent data, so that reingestion will occur when the specification changes. This is primarily useful for metadata.
Throws:
ManifoldCFException
ServiceInterruption
public java.lang.String[] getDocumentVersions(java.lang.String[] documentIdentifiers,
DocumentSpecification spec)
throws ManifoldCFException,
ServiceInterruption
Parameters:
documentIdentifiers - is the array of local document identifiers, as understood by this connector.
spec - is the current document specification for the current job. If there is a dependency on this specification, then the version string should include the pertinent data, so that reingestion will occur when the specification changes. This is primarily useful for metadata.
Throws:
ManifoldCFException
ServiceInterruption
public void releaseDocumentVersions(java.lang.String[] documentIdentifiers,
java.lang.String[] versions)
throws ManifoldCFException
Specified by: releaseDocumentVersions in interface IRepositoryConnector
Parameters:
documentIdentifiers - is the set of document identifiers.
versions - is the corresponding set of version identifiers (individual identifiers may be null).
Throws:
ManifoldCFException
public int getMaxDocumentRequest()
Specified by: getMaxDocumentRequest in interface IRepositoryConnector
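The method summary describes getMaxDocumentRequest() as the maximum number of documents to amalgamate into one batch. How the framework applies that value is not shown on this page; the following is only a sketch of the general idea, with BatchSketch and split as hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper showing how a batch limit partitions an identifier array.
final class BatchSketch {
    // Splits documentIdentifiers into consecutive batches of at most maxDocumentRequest entries.
    static List<String[]> split(String[] documentIdentifiers, int maxDocumentRequest) {
        if (maxDocumentRequest < 1) {
            throw new IllegalArgumentException("maxDocumentRequest must be at least 1");
        }
        List<String[]> batches = new ArrayList<>();
        for (int start = 0; start < documentIdentifiers.length; start += maxDocumentRequest) {
            int end = Math.min(start + maxDocumentRequest, documentIdentifiers.length);
            batches.add(Arrays.copyOfRange(documentIdentifiers, start, end));
        }
        return batches;
    }
}
```

A connector whose repository handles bulk requests efficiently would return a larger limit, so that each call to the version and processing methods covers more documents.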
public void processDocuments(java.lang.String[] documentIdentifiers,
java.lang.String[] versions,
IProcessActivity activities,
DocumentSpecification spec,
boolean[] scanOnly,
int jobMode)
throws ManifoldCFException,
ServiceInterruption
Specified by: processDocuments in interface IRepositoryConnector
Parameters:
documentIdentifiers - is the set of document identifiers to process.
versions - is the corresponding document versions to process, as returned by getDocumentVersions() above. The implementation may choose to ignore this parameter and always process the current version.
activities - is the interface this method should use to queue up new document references and ingest documents.
spec - is the document specification.
scanOnly - is an array corresponding to the document identifiers. It is set to true to indicate when the processing should only find other references, and should not actually call the ingestion methods.
jobMode - is an integer describing how the job is being run, whether continuous or once-only.
Throws:
ManifoldCFException
ServiceInterruption
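The scanOnly array is parallel to documentIdentifiers: a true entry means the document should be examined for references but not ingested. A minimal sketch of that loop follows. The Activities interface is a hypothetical stand-in for IProcessActivity, its method names queueReference and ingest are assumptions made for this example, and so is the rule that every document references a child named by appending "/child" to its identifier.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; it does not use or reproduce the ManifoldCF interfaces.
final class ScanOnlySketch {
    // Stand-in for the framework's activity interface (names are assumptions).
    interface Activities {
        void queueReference(String parentId, String childId);
        void ingest(String id, String version);
    }

    static void processDocuments(String[] documentIdentifiers, String[] versions,
                                 Activities activities, boolean[] scanOnly) {
        for (int i = 0; i < documentIdentifiers.length; i++) {
            String id = documentIdentifiers[i];
            // Reference discovery happens for every document.
            activities.queueReference(id, id + "/child");
            // Ingestion is skipped when scanOnly[i] is true.
            if (!scanOnly[i]) {
                activities.ingest(id, versions[i]);
            }
        }
    }

    // Records calls so the behaviour can be inspected.
    static final class Recorder implements Activities {
        final List<String> queued = new ArrayList<>();
        final List<String> ingested = new ArrayList<>();

        @Override
        public void queueReference(String parentId, String childId) {
            queued.add(childId);
        }

        @Override
        public void ingest(String id, String version) {
            ingested.add(id + "@" + version);
        }
    }
}
```

Keeping reference discovery outside the scanOnly check matters: a scan-only document can still lead the crawl to documents that do need ingesting.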
public void processDocuments(java.lang.String[] documentIdentifiers,
java.lang.String[] versions,
IProcessActivity activities,
DocumentSpecification spec,
boolean[] scanOnly)
throws ManifoldCFException,
ServiceInterruption
Parameters:
documentIdentifiers - is the set of document identifiers to process.
versions - is the corresponding document versions to process, as returned by getDocumentVersions() above. The implementation may choose to ignore this parameter and always process the current version.
activities - is the interface this method should use to queue up new document references and ingest documents.
spec - is the document specification.
scanOnly - is an array corresponding to the document identifiers. It is set to true to indicate when the processing should only find other references, and should not actually call the ingestion methods.
Throws:
ManifoldCFException
ServiceInterruption
public void outputSpecificationHeader(IHTTPOutput out,
DocumentSpecification ds,
java.util.ArrayList tabsArray)
throws ManifoldCFException,
java.io.IOException
Specified by: outputSpecificationHeader in interface IRepositoryConnector
Parameters:
out - is the output to which any HTML should be sent.
ds - is the current document specification for this job.
tabsArray - is an array of tab names. Add to this array any tab names that are specific to the connector.
Throws:
ManifoldCFException
java.io.IOException
public void outputSpecificationBody(IHTTPOutput out,
DocumentSpecification ds,
java.lang.String tabName)
throws ManifoldCFException,
java.io.IOException
public java.lang.String processSpecificationPost(IPostParameters variableContext,
DocumentSpecification ds)
throws ManifoldCFException
Specified by: processSpecificationPost in interface IRepositoryConnector
Parameters:
variableContext - contains the post data, including binary file-upload information.
ds - is the current document specification for this job.
Throws:
ManifoldCFException
public void viewSpecification(IHTTPOutput out,
DocumentSpecification ds)
throws ManifoldCFException,
java.io.IOException
Specified by: viewSpecification in interface IRepositoryConnector
Parameters:
out - is the output to which any HTML should be sent.
ds - is the current document specification for this job.
Throws:
ManifoldCFException
java.io.IOException