java.lang.Object
org.apache.manifoldcf.core.connector.BaseConnector
org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector
public class FileConnector
This is the "repository connector" for a file system. It's a relative of the share crawler, and should have comparable basic functionality, with the exception of the ability to use ActiveDirectory and look at other shares.
| Nested Class Summary | |
|---|---|
| protected static class | FileConnector.IdentifierStream - Document identifier stream. |

| Field Summary | |
|---|---|
| static java.lang.String | _rcsid |
| protected static java.lang.String[] | activitiesList |
| protected static java.lang.String | ACTIVITY_READ |
| protected static java.lang.String | RELATIONSHIP_CHILD |

| Fields inherited from class org.apache.manifoldcf.core.connector.BaseConnector |
|---|
| currentContext, params |

| Fields inherited from interface org.apache.manifoldcf.crawler.interfaces.IRepositoryConnector |
|---|
| JOBMODE_CONTINUOUS, JOBMODE_ONCEONLY, MODEL_ADD, MODEL_ADD_CHANGE, MODEL_ADD_CHANGE_DELETE, MODEL_ALL, MODEL_PARTIAL |

| Constructor Summary | |
|---|---|
| FileConnector() | Constructor. |
| Method Summary | |
|---|---|
| protected static boolean | checkInclude(java.io.File file, java.lang.String fileName, org.apache.manifoldcf.crawler.interfaces.DocumentSpecification documentSpecification) - Check if a file or directory should be included, given a document specification. |
| protected static boolean | checkIngest(java.io.File file, org.apache.manifoldcf.crawler.interfaces.DocumentSpecification documentSpecification) - Check if a file should be ingested, given a document specification. |
| protected static boolean | checkMatch(java.lang.String sourceMatch, int sourceIndex, java.lang.String match) - Check a match between two strings with wildcards. |
| protected java.lang.String | convertToURI(java.lang.String documentIdentifier) - Convert a document identifier to a URI. |
| java.lang.String[] | getActivitiesList() - List the activities we might report on. |
| java.lang.String[] | getBinNames(java.lang.String documentIdentifier) - For any given document, list the bins that it is a member of. |
| org.apache.manifoldcf.crawler.interfaces.IDocumentIdentifierStream | getDocumentIdentifiers(org.apache.manifoldcf.crawler.interfaces.DocumentSpecification spec, long startTime, long endTime) - Given a document specification, get either a list of starting document identifiers (seeds), or a list of changes (deltas), depending on whether this is a "crawled" connector or not. |
| java.lang.String[] | getDocumentVersions(java.lang.String[] documentIdentifiers, org.apache.manifoldcf.crawler.interfaces.DocumentSpecification spec) - Get document versions given an array of document identifiers. |
| java.lang.String | getJSPFolder() - Return the path for the UI interface JSP elements. |
| java.lang.String[] | getRelationshipTypes() - Return the list of relationship types that this connector recognizes. |
| protected static int | matchSubPath(java.lang.String subPath, java.lang.String fullPath) - Match a sub-path. |
| void | outputConfigurationBody(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IHTTPOutput out, org.apache.manifoldcf.core.interfaces.ConfigParams parameters, java.lang.String tabName) - Output the configuration body section. |
| void | outputConfigurationHeader(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IHTTPOutput out, org.apache.manifoldcf.core.interfaces.ConfigParams parameters, java.util.ArrayList tabsArray) - Output the configuration header section. |
| void | outputSpecificationBody(org.apache.manifoldcf.core.interfaces.IHTTPOutput out, org.apache.manifoldcf.crawler.interfaces.DocumentSpecification ds, java.lang.String tabName) - Output the specification body section. |
| void | outputSpecificationHeader(org.apache.manifoldcf.core.interfaces.IHTTPOutput out, org.apache.manifoldcf.crawler.interfaces.DocumentSpecification ds, java.util.ArrayList tabsArray) - Output the specification header section. |
| protected static boolean | processCheck(boolean caseSensitive, java.lang.String sourceMatch, int sourceIndex, java.lang.String match, int matchIndex) - Recursive worker method for checkMatch. |
| java.lang.String | processConfigurationPost(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IPostParameters variableContext, org.apache.manifoldcf.core.interfaces.ConfigParams parameters) - Process a configuration post. |
| void | processDocuments(java.lang.String[] documentIdentifiers, java.lang.String[] versions, org.apache.manifoldcf.crawler.interfaces.IProcessActivity activities, org.apache.manifoldcf.crawler.interfaces.DocumentSpecification spec, boolean[] scanOnly) - Process a set of documents. |
| java.lang.String | processSpecificationPost(org.apache.manifoldcf.core.interfaces.IPostParameters variableContext, org.apache.manifoldcf.crawler.interfaces.DocumentSpecification ds) - Process a specification post. |
| void | viewConfiguration(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IHTTPOutput out, org.apache.manifoldcf.core.interfaces.ConfigParams parameters) - View configuration. |
| void | viewSpecification(org.apache.manifoldcf.core.interfaces.IHTTPOutput out, org.apache.manifoldcf.crawler.interfaces.DocumentSpecification ds) - View specification. |

| Methods inherited from class org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector |
|---|
| addSeedDocuments, addSeedDocuments, getConnectorModel, getDocumentIdentifiers, getDocumentVersions, getDocumentVersions, getDocumentVersions, getDocumentVersions, getMaxDocumentRequest, getRemainingDocumentIdentifiers, processDocuments, releaseDocumentVersions, requestInfo |

| Methods inherited from class org.apache.manifoldcf.core.connector.BaseConnector |
|---|
| check, clearThreadContext, connect, deinstall, disconnect, getConfiguration, install, poll, setThreadContext |

| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |

| Methods inherited from interface org.apache.manifoldcf.core.interfaces.IConnector |
|---|
| check, clearThreadContext, connect, deinstall, disconnect, getConfiguration, install, poll, setThreadContext |

| Field Detail |
|---|
public static final java.lang.String _rcsid
protected static final java.lang.String ACTIVITY_READ
protected static final java.lang.String RELATIONSHIP_CHILD
protected static final java.lang.String[] activitiesList
| Constructor Detail |
|---|
public FileConnector()
| Method Detail |
|---|
public java.lang.String getJSPFolder()
public java.lang.String[] getRelationshipTypes()
getRelationshipTypes in interface org.apache.manifoldcf.crawler.interfaces.IRepositoryConnector
getRelationshipTypes in class org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
public java.lang.String[] getActivitiesList()
getActivitiesList in interface org.apache.manifoldcf.crawler.interfaces.IRepositoryConnector
getActivitiesList in class org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
public java.lang.String[] getBinNames(java.lang.String documentIdentifier)
getBinNames in interface org.apache.manifoldcf.crawler.interfaces.IRepositoryConnector
getBinNames in class org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
protected java.lang.String convertToURI(java.lang.String documentIdentifier)
throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
documentIdentifier - is the document identifier.
org.apache.manifoldcf.core.interfaces.ManifoldCFException
public org.apache.manifoldcf.crawler.interfaces.IDocumentIdentifierStream getDocumentIdentifiers(org.apache.manifoldcf.crawler.interfaces.DocumentSpecification spec,
long startTime,
long endTime)
throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
getDocumentIdentifiers in class org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
spec - is a document specification (that comes from the job).
startTime - is the beginning of the time range to consider, inclusive.
endTime - is the end of the time range to consider, exclusive.
org.apache.manifoldcf.core.interfaces.ManifoldCFException
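
The startTime and endTime parameters describe a half-open range: the start is inclusive and the end is exclusive. The following is a minimal sketch of that convention only. The class and method names (TimeRangeSketch, inRange) are hypothetical and are not part of ManifoldCF, and this page does not say whether FileConnector itself consults the time range.

```java
// Hypothetical helper illustrating a half-open time range [startTime, endTime).
// Not part of the ManifoldCF API.
final class TimeRangeSketch {
    private TimeRangeSketch() {}

    /** Returns true when timestamp falls at or after startTime and strictly before endTime. */
    static boolean inRange(long timestamp, long startTime, long endTime) {
        return timestamp >= startTime && timestamp < endTime;
    }

    public static void main(String[] args) {
        System.out.println(inRange(100L, 100L, 200L)); // start is inclusive
        System.out.println(inRange(200L, 100L, 200L)); // end is exclusive
    }
}
```

With this convention, consecutive ranges such as [100, 200) and [200, 300) neither overlap nor leave a gap, so a timestamp of exactly 200 belongs to the second range only.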
public java.lang.String[] getDocumentVersions(java.lang.String[] documentIdentifiers,
org.apache.manifoldcf.crawler.interfaces.DocumentSpecification spec)
throws org.apache.manifoldcf.core.interfaces.ManifoldCFException,
org.apache.manifoldcf.agents.interfaces.ServiceInterruption
getDocumentVersions in class org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
documentIdentifiers - is the array of local document identifiers, as understood by this connector.
org.apache.manifoldcf.core.interfaces.ManifoldCFException
org.apache.manifoldcf.agents.interfaces.ServiceInterruption
public void processDocuments(java.lang.String[] documentIdentifiers,
java.lang.String[] versions,
org.apache.manifoldcf.crawler.interfaces.IProcessActivity activities,
org.apache.manifoldcf.crawler.interfaces.DocumentSpecification spec,
boolean[] scanOnly)
throws org.apache.manifoldcf.core.interfaces.ManifoldCFException,
org.apache.manifoldcf.agents.interfaces.ServiceInterruption
processDocuments in class org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
documentIdentifiers - is the set of document identifiers to process.
activities - is the interface this method should use to queue up new document references and ingest documents.
spec - is the document specification.
scanOnly - is an array corresponding to the document identifiers. It is set to true to indicate when the processing should only find other references, and should not actually call the ingestion methods.
org.apache.manifoldcf.core.interfaces.ManifoldCFException
org.apache.manifoldcf.agents.interfaces.ServiceInterruption
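
The scanOnly array is parallel to documentIdentifiers: the flag at index i applies to the identifier at index i. The sketch below illustrates only that pairing, by selecting the identifiers whose flag is false and which would therefore be eligible for ingestion. The class and method names (ScanOnlySketch, identifiersToIngest) are hypothetical, and the sketch deliberately avoids the real IProcessActivity interface, whose methods are not shown on this page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the parallel-array convention; not ManifoldCF code.
final class ScanOnlySketch {
    private ScanOnlySketch() {}

    /** Returns the identifiers whose scanOnly flag is false, in their original order. */
    static List<String> identifiersToIngest(String[] documentIdentifiers, boolean[] scanOnly) {
        if (documentIdentifiers.length != scanOnly.length) {
            throw new IllegalArgumentException("scanOnly must have one entry per document identifier");
        }
        List<String> result = new ArrayList<>();
        for (int i = 0; i < documentIdentifiers.length; i++) {
            // A true flag means: look for references only, skip ingestion.
            if (!scanOnly[i]) {
                result.add(documentIdentifiers[i]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] ids = {"/data/a.txt", "/data/dir", "/data/b.txt"};
        boolean[] flags = {false, true, false};
        System.out.println(identifiersToIngest(ids, flags));
    }
}
```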
public void outputConfigurationHeader(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext,
org.apache.manifoldcf.core.interfaces.IHTTPOutput out,
org.apache.manifoldcf.core.interfaces.ConfigParams parameters,
java.util.ArrayList tabsArray)
throws org.apache.manifoldcf.core.interfaces.ManifoldCFException,
java.io.IOException
outputConfigurationHeader in interface org.apache.manifoldcf.core.interfaces.IConnector
outputConfigurationHeader in class org.apache.manifoldcf.core.connector.BaseConnector
threadContext - is the local thread context.
out - is the output to which any HTML should be sent.
parameters - are the configuration parameters, as they currently exist, for this connection being configured.
tabsArray - is an array of tab names. Add to this array any tab names that are specific to the connector.
org.apache.manifoldcf.core.interfaces.ManifoldCFException
java.io.IOException
public void outputConfigurationBody(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext,
org.apache.manifoldcf.core.interfaces.IHTTPOutput out,
org.apache.manifoldcf.core.interfaces.ConfigParams parameters,
java.lang.String tabName)
throws org.apache.manifoldcf.core.interfaces.ManifoldCFException,
java.io.IOException
public java.lang.String processConfigurationPost(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext,
org.apache.manifoldcf.core.interfaces.IPostParameters variableContext,
org.apache.manifoldcf.core.interfaces.ConfigParams parameters)
throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
processConfigurationPost in interface org.apache.manifoldcf.core.interfaces.IConnector
processConfigurationPost in class org.apache.manifoldcf.core.connector.BaseConnector
threadContext - is the local thread context.
variableContext - is the set of variables available from the post, including binary file post information.
parameters - are the configuration parameters, as they currently exist, for this connection being configured.
org.apache.manifoldcf.core.interfaces.ManifoldCFException
public void viewConfiguration(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext,
org.apache.manifoldcf.core.interfaces.IHTTPOutput out,
org.apache.manifoldcf.core.interfaces.ConfigParams parameters)
throws org.apache.manifoldcf.core.interfaces.ManifoldCFException,
java.io.IOException
viewConfiguration in interface org.apache.manifoldcf.core.interfaces.IConnector
viewConfiguration in class org.apache.manifoldcf.core.connector.BaseConnector
threadContext - is the local thread context.
out - is the output to which any HTML should be sent.
parameters - are the configuration parameters, as they currently exist, for this connection being configured.
org.apache.manifoldcf.core.interfaces.ManifoldCFException
java.io.IOException
public void outputSpecificationHeader(org.apache.manifoldcf.core.interfaces.IHTTPOutput out,
org.apache.manifoldcf.crawler.interfaces.DocumentSpecification ds,
java.util.ArrayList tabsArray)
throws org.apache.manifoldcf.core.interfaces.ManifoldCFException,
java.io.IOException
outputSpecificationHeader in interface org.apache.manifoldcf.crawler.interfaces.IRepositoryConnector
outputSpecificationHeader in class org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
out - is the output to which any HTML should be sent.
ds - is the current document specification for this job.
tabsArray - is an array of tab names. Add to this array any tab names that are specific to the connector.
org.apache.manifoldcf.core.interfaces.ManifoldCFException
java.io.IOException
public void outputSpecificationBody(org.apache.manifoldcf.core.interfaces.IHTTPOutput out,
org.apache.manifoldcf.crawler.interfaces.DocumentSpecification ds,
java.lang.String tabName)
throws org.apache.manifoldcf.core.interfaces.ManifoldCFException,
java.io.IOException
public java.lang.String processSpecificationPost(org.apache.manifoldcf.core.interfaces.IPostParameters variableContext,
org.apache.manifoldcf.crawler.interfaces.DocumentSpecification ds)
throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
processSpecificationPost in interface org.apache.manifoldcf.crawler.interfaces.IRepositoryConnector
processSpecificationPost in class org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
variableContext - contains the post data, including binary file-upload information.
ds - is the current document specification for this job.
org.apache.manifoldcf.core.interfaces.ManifoldCFException
public void viewSpecification(org.apache.manifoldcf.core.interfaces.IHTTPOutput out,
org.apache.manifoldcf.crawler.interfaces.DocumentSpecification ds)
throws org.apache.manifoldcf.core.interfaces.ManifoldCFException,
java.io.IOException
viewSpecification in interface org.apache.manifoldcf.crawler.interfaces.IRepositoryConnector
viewSpecification in class org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector
out - is the output to which any HTML should be sent.
ds - is the current document specification for this job.
org.apache.manifoldcf.core.interfaces.ManifoldCFException
java.io.IOException
protected static boolean checkInclude(java.io.File file,
java.lang.String fileName,
org.apache.manifoldcf.crawler.interfaces.DocumentSpecification documentSpecification)
throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
fileName - is the canonical file name.
documentSpecification - is the specification.
org.apache.manifoldcf.core.interfaces.ManifoldCFException
protected static boolean checkIngest(java.io.File file,
org.apache.manifoldcf.crawler.interfaces.DocumentSpecification documentSpecification)
throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
file - is the file.
documentSpecification - is the specification.
org.apache.manifoldcf.core.interfaces.ManifoldCFException
protected static int matchSubPath(java.lang.String subPath,
java.lang.String fullPath)
subPath - is the sub path.
fullPath - is the full path.
protected static boolean checkMatch(java.lang.String sourceMatch,
int sourceIndex,
java.lang.String match)
sourceMatch - is the expanded string (no wildcards).
sourceIndex - is the starting point in the expanded string.
match - is the wildcard-based string.
protected static boolean processCheck(boolean caseSensitive,
java.lang.String sourceMatch,
int sourceIndex,
java.lang.String match,
int matchIndex)
caseSensitive - is true if file names are case sensitive.
sourceMatch - is the source string (w/o wildcards).
sourceIndex - is the current point in the source string.
match - is the match string (w/wildcards).
matchIndex - is the current point in the match string.
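
To make the relationship between checkMatch and its recursive worker processCheck more concrete, here is a minimal sketch of a recursive wildcard matcher that uses the same parameter names. It is not the connector's actual implementation. It assumes that '*' matches any run of zero or more characters and '?' matches exactly one character, which this page does not state, and it assumes the entry point matches case-sensitively. The class name WildcardSketch is hypothetical.

```java
// Hypothetical sketch of recursive wildcard matching; not the FileConnector source.
// Assumption: '*' matches zero or more characters, '?' matches exactly one.
final class WildcardSketch {
    private WildcardSketch() {}

    /** Entry point: match the source, starting at sourceIndex, against the whole pattern. */
    static boolean checkMatch(String sourceMatch, int sourceIndex, String match) {
        // Assumption: case-sensitive comparison at the entry point.
        return processCheck(true, sourceMatch, sourceIndex, match, 0);
    }

    /** Recursive worker: compares the source from sourceIndex with the pattern from matchIndex. */
    static boolean processCheck(boolean caseSensitive, String sourceMatch, int sourceIndex,
                                String match, int matchIndex) {
        // Pattern exhausted: succeed only if the source is exhausted too.
        if (matchIndex == match.length()) {
            return sourceIndex == sourceMatch.length();
        }
        char m = match.charAt(matchIndex);
        if (m == '*') {
            // Try letting '*' absorb 0, 1, 2, ... characters of the source.
            for (int i = sourceIndex; i <= sourceMatch.length(); i++) {
                if (processCheck(caseSensitive, sourceMatch, i, match, matchIndex + 1)) {
                    return true;
                }
            }
            return false;
        }
        // Any other pattern character needs one source character to consume.
        if (sourceIndex == sourceMatch.length()) {
            return false;
        }
        char s = sourceMatch.charAt(sourceIndex);
        boolean same = m == '?'
            || (caseSensitive ? s == m : Character.toLowerCase(s) == Character.toLowerCase(m));
        return same && processCheck(caseSensitive, sourceMatch, sourceIndex + 1, match, matchIndex + 1);
    }

    public static void main(String[] args) {
        System.out.println(checkMatch("dir/report.txt", 4, "*.txt"));
        System.out.println(processCheck(false, "REPORT.TXT", 0, "*.txt", 0));
    }
}
```

The sourceIndex argument lets a caller skip a prefix that has already been matched by other means, for example a directory portion, and apply the wildcard pattern only to the remainder of the string.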