org.apache.manifoldcf.crawler.interfaces
Class PerformanceStatistics

java.lang.Object
  extended by org.apache.manifoldcf.crawler.interfaces.PerformanceStatistics

public class PerformanceStatistics
extends java.lang.Object

An instance of this class keeps a running average of how long each connection takes to process a document. This information is used to limit the number of documents queued per connection to an amount that is reasonable given the observed characteristics of that connection.
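As an illustration of how a per-connection fetch rate might be turned into a queuing limit, the following is a minimal sketch and not the ManifoldCF implementation. The class name QueueLimitSketch, the method maxDocumentsToQueue, and the windowMinutes and hardCap parameters are all hypothetical; the page above does not describe how the crawler actually applies the rate.

```java
// Hypothetical sketch: none of these names come from ManifoldCF.
final class QueueLimitSketch {

    /**
     * Number of documents worth queuing for one connection, assuming the
     * caller wants roughly windowMinutes of work queued, capped at hardCap.
     */
    static int maxDocumentsToQueue(double fetchRatePerMinute, double windowMinutes, int hardCap) {
        if (fetchRatePerMinute <= 0.0 || windowMinutes <= 0.0) {
            return 0; // no usable rate or window, so queue nothing
        }
        // Round up so a slow but nonzero rate still allows at least one document.
        long wanted = (long) Math.ceil(fetchRatePerMinute * windowMinutes);
        return (int) Math.min(wanted, (long) hardCap);
    }
}
```

Under these assumptions, a connection measured at 20 documents per minute with a 5-minute window would be allowed 100 queued documents, while a much faster connection would be held to the cap.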


Nested Class Summary
protected static class PerformanceStatistics.AveragingQueue
          This class keeps track of some depth of fetch history for an individual connection, and is used to calculate a weighted average fetches-per-minute rate.
protected static class PerformanceStatistics.AveragingRecord
          This class contains the data for a single document set processed against the given connection.
 
Field Summary
static java.lang.String _rcsid
           
protected  java.util.HashMap connectionHash
          This hash is keyed by the connection name, and has elements of type AveragingQueue.
protected static double DEFAULT_FETCH_RATE
          This is the fetch rate that will be returned in the complete absence of any other information.
protected static long DEFAULT_FETCH_TIME
           
protected static double[] weights
          These are the weighting coefficients for the average.
 
Constructor Summary
PerformanceStatistics()
          Constructor
 
Method Summary
 double calculateConnectionFetchRate(java.lang.String connectionName)
          Obtain the current average document fetch rate (in documents per minute) for the given connection.
 void noteDocumentsCompleted(java.lang.String connectionName, int documentSetSize, long elapsedTime)
          Note the successful completion of a set of documents using a single connection, and record the statistics for them.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

_rcsid

public static final java.lang.String _rcsid
See Also:
Constant Field Values

DEFAULT_FETCH_RATE

protected static double DEFAULT_FETCH_RATE
This is the fetch rate that will be returned in the complete absence of any other information. It is a rough initial guess, used only at the very start of a job, and is chosen so that the queue is not flooded with documents from a single connection.


DEFAULT_FETCH_TIME

protected static long DEFAULT_FETCH_TIME

weights

protected static double[] weights
These are the weighting coefficients for the average. They should sum to 1.0.
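The role of coefficients that sum to 1.0 can be sketched as follows. This is an illustrative example only: the class WeightedRateSketch, the coefficient values, and the choice to fill missing history with a fallback rate are assumptions, not the values or behavior of the actual weights field.

```java
// Hypothetical sketch: the coefficient values below are invented for illustration.
final class WeightedRateSketch {

    // Assumed example coefficients, most recent sample first; they sum to 1.0.
    static final double[] WEIGHTS = {0.5, 0.25, 0.125, 0.125};

    /** Returns true when the coefficients sum to 1.0 within a small tolerance. */
    static boolean sumsToOne(double[] weights) {
        double sum = 0.0;
        for (double w : weights) {
            sum += w;
        }
        return Math.abs(sum - 1.0) < 1e-9;
    }

    /**
     * Weighted average of per-sample rates, most recent first.
     * Slots with no history yet are filled with fallbackRate (an assumption).
     */
    static double weightedAverage(double[] weights, double[] recentRates, double fallbackRate) {
        double total = 0.0;
        for (int i = 0; i < weights.length; i++) {
            double rate = i < recentRates.length ? recentRates[i] : fallbackRate;
            total += weights[i] * rate;
        }
        return total;
    }
}
```

Because the coefficients sum to 1.0, the result stays on the same scale as the inputs: averaging identical rates returns that same rate, and no separate normalization step is needed.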


connectionHash

protected java.util.HashMap connectionHash
This hash is keyed by the connection name, and has elements of type AveragingQueue.

Constructor Detail

PerformanceStatistics

public PerformanceStatistics()
Constructor

Method Detail

noteDocumentsCompleted

public void noteDocumentsCompleted(java.lang.String connectionName,
                                   int documentSetSize,
                                   long elapsedTime)
Note the successful completion of a set of documents using a single connection, and record the statistics for them.


calculateConnectionFetchRate

public double calculateConnectionFetchRate(java.lang.String connectionName)
Obtain the current average document fetch rate (in documents per minute) for the given connection.
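The interplay of the two methods above can be sketched with a simplified stand-in. FetchRateTrackerSketch is hypothetical and only mirrors the documented method signatures; it is not the ManifoldCF class. It assumes elapsedTime is in milliseconds (this page does not state the unit), uses invented values for the default rate and history depth, and substitutes a plain mean for the weighted average to keep the example short.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in mirroring the two documented method signatures.
final class FetchRateTrackerSketch {

    static final double ASSUMED_DEFAULT_RATE = 60.0; // documents per minute; invented value
    static final int HISTORY_DEPTH = 4;              // invented history depth

    // Keyed by connection name; each value holds recent per-set rates, newest first.
    private final Map<String, Deque<Double>> history = new HashMap<>();

    /** Record a completed document set; elapsedTime is assumed to be in milliseconds. */
    public synchronized void noteDocumentsCompleted(String connectionName, int documentSetSize, long elapsedTime) {
        if (documentSetSize <= 0 || elapsedTime <= 0L) {
            return; // nothing measurable to record
        }
        double docsPerMinute = documentSetSize * 60000.0 / elapsedTime;
        Deque<Double> rates = history.computeIfAbsent(connectionName, k -> new ArrayDeque<>());
        rates.addFirst(docsPerMinute);
        if (rates.size() > HISTORY_DEPTH) {
            rates.removeLast(); // drop the oldest sample
        }
    }

    /** Plain mean of the recorded rates, or the assumed default when none exist. */
    public synchronized double calculateConnectionFetchRate(String connectionName) {
        Deque<Double> rates = history.get(connectionName);
        if (rates == null || rates.isEmpty()) {
            return ASSUMED_DEFAULT_RATE;
        }
        double sum = 0.0;
        for (double r : rates) {
            sum += r;
        }
        return sum / rates.size();
    }
}
```

In this sketch, a connection with no recorded history reports the assumed default rate. After noteDocumentsCompleted("a", 10, 30000), the rate for "a" is 20 documents per minute, while other connections are unaffected.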