java.lang.Object
  org.apache.manifoldcf.crawler.connectors.rss.ThrottledFetcher
public class ThrottledFetcher
This class uses httpclient to fetch documents from web servers. It additionally controls the fetch rate in two ways: first, by limiting the overall bandwidth used per server, and second, by limiting the number of simultaneously open connections per server. It is also capable of limiting the maximum number of fetches per time period per server; however, this functionality is not strictly necessary at this time because the CF scheduler does that at a higher layer.

An instance of this class would very probably need to have a lifetime consistent with the long-term nature of these values, and be static.

This class sets up a separate HTTP connection pool for each server, so that the httpclient library can take on the task of limiting the number of connections. As a consequence, periodic polling is needed to determine when idle pooled connections can be freed.
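The class description says the per-server connection limit is delegated to per-server httpclient connection pools. The sketch below illustrates the same per-server idea with a plain `java.util.concurrent.Semaphore` instead, which is a substitute technique and not how this class does it. The class name `PerServerLimiter` and all of its methods are hypothetical and exist only for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

// Hypothetical illustration; not part of ManifoldCF.
class PerServerLimiter {
    // One semaphore per server name, created lazily on first use.
    private final Map<String, Semaphore> permitsByServer = new ConcurrentHashMap<>();
    private final int maxOpenConnectionsPerServer;

    PerServerLimiter(int maxOpenConnectionsPerServer) {
        this.maxOpenConnectionsPerServer = maxOpenConnectionsPerServer;
    }

    private Semaphore permitsFor(String serverName) {
        return permitsByServer.computeIfAbsent(
            serverName, k -> new Semaphore(maxOpenConnectionsPerServer));
    }

    /** Blocks until a connection slot for the given server is free. */
    void acquire(String serverName) throws InterruptedException {
        permitsFor(serverName).acquire();
    }

    /** Non-blocking variant: returns false when the server is at its limit. */
    boolean tryAcquire(String serverName) {
        return permitsFor(serverName).tryAcquire();
    }

    /** Returns a slot to the given server; call once per successful acquire. */
    void release(String serverName) {
        permitsFor(serverName).release();
    }
}
```

Each server has its own budget, so exhausting the slots for one server does not delay fetches from another.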
| Nested Class Summary | |
|---|---|
| protected static class | ThrottledFetcher.DataRecorder: This class takes care of recording data and results for posterity. |
| protected static class | ThrottledFetcher.DataSession: Helper class for the above. |
| protected class | ThrottledFetcher.Server: This class represents the throttling state kept for a single server. |
| protected static class | ThrottledFetcher.ThrottledConnection: This class represents an established connection to a URL. |
| protected static class | ThrottledFetcher.ThrottledInputstream: This class throttles an input stream based on the specified byte rate parameters. |
| Field Summary | |
|---|---|
| static java.lang.String | _rcsid |
| protected static java.lang.String | dataFileFolder |
| protected static ThrottledFetcher.DataRecorder | dataRecorder |
| protected static int | globalHandleCount: This counter keeps track of the total outstanding handles across everything, because we do try to control that. |
| protected static java.lang.Integer | globalHandleCounterLock: This is the lock object for that global handle counter. |
| protected static int | READ_CHUNK_LENGTH: The read chunk length. |
| protected static boolean | recordEverything: This flag determines whether we record everything to the disk, as a means of doing a web snapshot. |
| protected int | refCount: Reference count for how many connections to this pool there are. |
| protected static java.lang.String | resultLogFile |
| protected java.util.Map | serverMap: This hash maps the server string (without port) to a server object, where we can track the statistics and make sure we throttle appropriately. |
| Constructor Summary |
|---|
| ThrottledFetcher(): Constructor. |
| Method Summary | |
|---|---|
| IThrottledConnection | createConnection(java.lang.String serverName, double minimumMillisecondsPerBytePerServer, int maxOpenConnectionsPerServer, long minimumMillisecondsPerFetchPerServer, int connectionLimit, int connectionTimeoutMilliseconds): Establish a connection to a specified URL. |
| void | noteConnectionEstablished(): Note that there is a repository connection that is using this object. |
| void | noteConnectionReleased(): Connection pool no longer needed. |
| void | poll(): Poll. |
| protected static void | registerGlobalHandle(int maxHandles): Note that we're about to need a handle (and make sure we have enough). |
| protected static void | releaseGlobalHandle(): Note that we're done with a handle (so we can free it). |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
public static final java.lang.String _rcsid
protected static final boolean recordEverything
protected static final int READ_CHUNK_LENGTH
protected static int globalHandleCount
protected static java.lang.Integer globalHandleCounterLock
protected java.util.Map serverMap
protected int refCount
protected static final java.lang.String resultLogFile
protected static final java.lang.String dataFileFolder
protected static ThrottledFetcher.DataRecorder dataRecorder
| Constructor Detail |
|---|
public ThrottledFetcher()
| Method Detail |
|---|
protected static void registerGlobalHandle(int maxHandles)
    throws org.apache.manifoldcf.core.interfaces.ManifoldCFException

Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException

protected static void releaseGlobalHandle()
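The pairing of registerGlobalHandle and releaseGlobalHandle with the globalHandleCount and globalHandleCounterLock fields suggests a counter guarded by a lock. The following is a minimal sketch of that pattern, assuming that a caller waits while the count is at the maximum; this page does not say whether the real method waits or fails, so treat that behavior as an assumption. The class `GlobalHandleCounter` and its method names are hypothetical, and the sketch locks on a dedicated `Object` rather than an `Integer`.

```java
// Hypothetical illustration; not the ManifoldCF implementation.
class GlobalHandleCounter {
    private static final Object LOCK = new Object();
    private static int handleCount = 0;

    /** Waits until fewer than maxHandles are outstanding, then takes one. */
    static void register(int maxHandles) throws InterruptedException {
        synchronized (LOCK) {
            // Assumption: block rather than fail when the limit is reached.
            while (handleCount >= maxHandles) {
                LOCK.wait();
            }
            handleCount++;
        }
    }

    /** Gives a handle back and wakes any thread waiting in register(). */
    static void release() {
        synchronized (LOCK) {
            handleCount--;
            LOCK.notifyAll();
        }
    }

    /** Number of handles currently outstanding. */
    static int outstanding() {
        synchronized (LOCK) {
            return handleCount;
        }
    }
}
```

A caller would pair the two calls in a try/finally block so that a handle is always returned, even when the fetch fails.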
public IThrottledConnection createConnection(java.lang.String serverName,
                                             double minimumMillisecondsPerBytePerServer,
                                             int maxOpenConnectionsPerServer,
                                             long minimumMillisecondsPerFetchPerServer,
                                             int connectionLimit,
                                             int connectionTimeoutMilliseconds)
    throws org.apache.manifoldcf.core.interfaces.ManifoldCFException,
           org.apache.manifoldcf.agents.interfaces.ServiceInterruption

Parameters:
serverName - is the FQDN of the server, e.g. foo.metacarta.com
minimumMillisecondsPerBytePerServer - is the minimum number of milliseconds to wait between bytes, on average, over all streams reading from this server. That means that the stream will block on fetch until the number of bytes being fetched, done in the average time interval required for that fetch, would not exceed the desired bandwidth.
minimumMillisecondsPerFetchPerServer - is the number of milliseconds between fetches, as a minimum, on a per-server basis. Set to zero for no limit.
maxOpenConnectionsPerServer - is the maximum number of open connections to allow for a single server. If more than this number of connections would need to be open, then this connection request will block until this number will no longer be exceeded.
connectionLimit - is the maximum desired number of outstanding connections at any one time.
connectionTimeoutMilliseconds - is the number of milliseconds to wait for the connection before timing out.

Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
org.apache.manifoldcf.agents.interfaces.ServiceInterruption
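Because minimumMillisecondsPerBytePerServer is expressed as time per byte rather than bytes per second, a worked conversion may help. A cap of 64,000 bytes per second corresponds to 1000 / 64000 = 0.015625 milliseconds per byte, so 16,384 bytes should take at least 256 milliseconds. The sketch below shows that arithmetic for a single stream only; the real class averages over all streams reading from a server, and the class `ByteRateMath` and its methods are hypothetical.

```java
// Hypothetical illustration of the milliseconds-per-byte arithmetic.
class ByteRateMath {
    /** Converts a bandwidth cap in bytes per second to milliseconds per byte. */
    static double millisecondsPerByte(double maxBytesPerSecond) {
        return 1000.0 / maxBytesPerSecond;
    }

    /**
     * Returns how many milliseconds a reader should wait before accepting
     * more data, so that the average rate since startMillis stays within
     * the cap. Returns 0 when the reader is already slow enough.
     */
    static long requiredDelayMillis(long startMillis,
                                    long nowMillis,
                                    long bytesSoFar,
                                    double minimumMillisecondsPerByte) {
        // Earliest time at which bytesSoFar bytes are allowed to have arrived.
        long earliestAllowed =
            startMillis + (long) Math.ceil(bytesSoFar * minimumMillisecondsPerByte);
        return Math.max(0L, earliestAllowed - nowMillis);
    }
}
```

For example, a stream that has read 16,384 bytes in its first 56 milliseconds under the 64,000 bytes per second cap would need to wait another 200 milliseconds.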
public void poll()
    throws org.apache.manifoldcf.core.interfaces.ManifoldCFException

Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException

public void noteConnectionEstablished()

public void noteConnectionReleased()
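The refCount field together with noteConnectionEstablished and noteConnectionReleased points to ordinary reference counting of the shared pool. A minimal sketch of that pattern follows. What the real class does when the count reaches zero is not stated on this page, so the `poolOpen` flag is an assumption standing in for whatever cleanup applies, and the class `PoolRefCounter` is hypothetical.

```java
// Hypothetical illustration of reference counting; not the ManifoldCF code.
class PoolRefCounter {
    private int refCount = 0;
    // Assumption: stands in for "pooled resources are currently held".
    private boolean poolOpen = false;

    /** Records that one more repository connection is using the pool. */
    synchronized void noteConnectionEstablished() {
        refCount++;
        poolOpen = true;
    }

    /** Records that a repository connection no longer needs the pool. */
    synchronized void noteConnectionReleased() {
        if (refCount == 0) {
            throw new IllegalStateException("released more often than established");
        }
        refCount--;
        if (refCount == 0) {
            // Last user is gone, so the shared resources can be dropped.
            poolOpen = false;
        }
    }

    synchronized int refCount() {
        return refCount;
    }

    synchronized boolean isPoolOpen() {
        return poolOpen;
    }
}
```

Every call to noteConnectionEstablished needs exactly one matching call to noteConnectionReleased, otherwise the count never returns to zero.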