java.lang.Object
  org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher
public class ThrottledFetcher
This class uses HttpClient to fetch content from web servers. It additionally controls the fetch rate in two ways: first, by limiting the overall bandwidth used per server, and second, by limiting the number of simultaneous open connections per server. Because these limits are long-lived, an instance of this class would very probably need a lifetime to match, and be static.
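The two controls described above can be illustrated with a minimal sketch. This is not the ThrottledFetcher source: the class PerServerThrottleSketch, its Bin type, the getBin and recordReadAndGetDelay methods, and the "minimum milliseconds per byte" unit are all assumptions made for illustration. It shows one plausible way to cap open connections with a semaphore and to cap bandwidth by computing how long a reader should pause after each chunk.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Semaphore;

/** Hypothetical sketch of per-server throttling; not the ThrottledFetcher source. */
class PerServerThrottleSketch {

    /** Limits for one bin (for example, one server). All names here are illustrative. */
    static final class Bin {
        /** Caps the number of simultaneously open connections. */
        final Semaphore connections;
        /** Bandwidth cap, expressed as the minimum time each byte must take (assumed unit). */
        final double minMillisPerByte;
        private final long startMillis;
        private long bytesRead;

        Bin(int connectionLimit, double minMillisPerByte, long nowMillis) {
            this.connections = new Semaphore(connectionLimit);
            this.minMillisPerByte = minMillisPerByte;
            this.startMillis = nowMillis;
        }

        /**
         * Records a read and returns how long the caller should sleep so that
         * the average rate since startMillis stays within the bandwidth cap.
         */
        synchronized long recordReadAndGetDelay(int byteCount, long nowMillis) {
            bytesRead += byteCount;
            long earliestAllowed = startMillis + (long) Math.ceil(bytesRead * minMillisPerByte);
            return Math.max(0L, earliestAllowed - nowMillis);
        }
    }

    /** Static pool of bins keyed by bin name, so the limits outlive any single fetch. */
    private static final Map<String, Bin> BINS = new HashMap<>();

    /** Returns the existing bin for this name, or creates it on first use. */
    static synchronized Bin getBin(String binName, int connectionLimit,
                                   double minMillisPerByte, long nowMillis) {
        return BINS.computeIfAbsent(binName,
            name -> new Bin(connectionLimit, minMillisPerByte, nowMillis));
    }
}
```

In this sketch a caller would acquire a permit from connections before opening a socket, release it when the connection closes, and sleep for the returned delay after each chunk it reads. Keeping the bins in a static map mirrors the point above about the lifetime of the limits.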
| Nested Class Summary | |
|---|---|
| protected static class | ThrottledFetcher.ConnectionBin - Connection pool for a bin. |
| protected static class | ThrottledFetcher.DataRecorder - This class takes care of recording data and results for posterity. |
| protected static class | ThrottledFetcher.DataSession - Helper class for DataRecorder. |
| protected static class | ThrottledFetcher.PoolException - Pool exception class. |
| protected static class | ThrottledFetcher.SocketCreateThread - Create a secure socket in a thread, so that we can "give up" after a while if the socket fails to connect. |
| protected static class | ThrottledFetcher.ThrottleBin - Throttles for a bin. |
| protected static class | ThrottledFetcher.ThrottledConnection - Throttled connections. |
| protected static class | ThrottledFetcher.ThrottledInputstream - This class throttles an input stream based on the specified byte rate parameters. |
| protected static class | ThrottledFetcher.WaitException - Wait exception class. |
| protected static class | ThrottledFetcher.WebSecureSocketFactory - HttpClient secure socket factory, which implements SecureProtocolSocketFactory. |
| Field Summary | |
|---|---|
| static java.lang.String | _rcsid |
| protected static java.util.HashMap | connectionBins - This is the static pool of ConnectionBin objects, keyed by bin name. |
| protected static java.lang.String | dataFileFolder |
| protected static ThrottledFetcher.DataRecorder | dataRecorder |
| protected static java.lang.Integer | poolLock - This global lock protects the "distributed pool" resource, and ensures that a connection can be pulled out of all the right pools and end up in the hands of only one thread. |
| protected static int | READ_CHUNK_LENGTH - The read chunk length. |
| protected static boolean | recordEverything - This flag determines whether we record everything to disk, as a means of taking a web snapshot. |
| protected static java.lang.String | resultLogFile |
| protected static java.util.HashMap | throttleBins - This is the static pool of ThrottleBin objects, keyed by bin name. |
| protected static long | TIME_15MIN |
| protected static long | TIME_1DAY |
| protected static long | TIME_2HRS |
| protected static long | TIME_5MIN |
| protected static long | TIME_6HRS |
| Constructor Summary | |
|---|---|
| ThrottledFetcher() | Constructor. |
| Method Summary | |
|---|---|
| static void | flushIdleConnections() - Flush connections that have timed out from inactivity. |
| static IThrottledConnection | getConnection(java.lang.String protocol, java.lang.String server, int port, PageCredentials authentication, org.apache.manifoldcf.core.interfaces.IKeystoreManager trustStore, ThrottleDescription throttleDescription, java.lang.String[] binNames, int connectionLimit) - Obtain a connection to the specified protocol, server, and port. |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
public static final java.lang.String _rcsid
protected static final boolean recordEverything
protected static final long TIME_2HRS
protected static final long TIME_5MIN
protected static final long TIME_15MIN
protected static final long TIME_6HRS
protected static final long TIME_1DAY
protected static java.util.HashMap connectionBins
protected static java.util.HashMap throttleBins
protected static java.lang.Integer poolLock
protected static final int READ_CHUNK_LENGTH
protected static final java.lang.String resultLogFile
protected static final java.lang.String dataFileFolder
protected static ThrottledFetcher.DataRecorder dataRecorder
| Constructor Detail |
|---|
public ThrottledFetcher()
| Method Detail |
|---|
public static IThrottledConnection getConnection(java.lang.String protocol,
java.lang.String server,
int port,
PageCredentials authentication,
org.apache.manifoldcf.core.interfaces.IKeystoreManager trustStore,
ThrottleDescription throttleDescription,
java.lang.String[] binNames,
int connectionLimit)
throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Parameters:
protocol - is the protocol, e.g. "http"
server - is the server IP address, e.g. "10.32.65.1"
port - is the port to connect to, e.g. 80. Pass -1 if the default port for the protocol is desired.
authentication - is the page credentials object to use for the fetch. If null, no credentials are available.
trustStore - is the current trust store in effect for the fetch.
throttleDescription - is the description of all the throttling that should take place.
binNames - is the set of bins, in order, that should be used for throttling this connection. Note that the bin names for a given IP address and port MUST be the same for every connection! This must be enforced by whatever builds the bins; it must do so given an IP address and port.
connectionLimit - is the maximum number of connections permitted.
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
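The requirement that bin names be identical for every connection to a given IP address and port can be met by deriving them with a pure function of those inputs. The following is a minimal sketch under that assumption. The class BinNameSketch, its methods, the choice of two bins per address, and the default-port table are hypothetical and are not part of ManifoldCF.

```java
import java.util.Locale;

/** Hypothetical helper that derives bin names deterministically; not ManifoldCF code. */
class BinNameSketch {

    /**
     * Builds the ordered bin names for a server and port. The result depends only
     * on the arguments, so repeated calls for the same address always agree.
     * A port of -1 is resolved to the protocol's default first, so that -1 and
     * the explicit default port map to the same bins.
     */
    static String[] binNamesFor(String protocol, String server, int port) {
        int effectivePort = (port == -1) ? defaultPort(protocol) : port;
        String host = server.toLowerCase(Locale.ROOT);
        // Illustrative choice: a per-endpoint bin followed by a per-host bin.
        return new String[] { host + ":" + effectivePort, host };
    }

    /** Default ports for the two protocols this sketch knows about. */
    static int defaultPort(String protocol) {
        switch (protocol.toLowerCase(Locale.ROOT)) {
            case "http":
                return 80;
            case "https":
                return 443;
            default:
                throw new IllegalArgumentException("Unknown protocol: " + protocol);
        }
    }
}
```

An array built this way could then be passed as the binNames argument. The design point is only that no per-call state (time, thread, request URL) influences the names.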
public static void flushIdleConnections()
throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
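As a rough illustration of what flushing connections that have timed out from inactivity can involve, here is a minimal sketch. It does not reflect how ThrottledFetcher or HttpClient implement it: the class IdleConnectionPoolSketch, its methods, and the use of string identifiers in place of real connections are assumptions for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

/** Hypothetical idle-connection pool; connections are represented by string ids. */
class IdleConnectionPoolSketch {

    /** An idle connection and the time at which it was returned to the pool. */
    private static final class IdleEntry {
        final String connectionId;
        final long idleSinceMillis;

        IdleEntry(String connectionId, long idleSinceMillis) {
            this.connectionId = connectionId;
            this.idleSinceMillis = idleSinceMillis;
        }
    }

    private final Deque<IdleEntry> idle = new ArrayDeque<>();
    private final long maxIdleMillis;

    IdleConnectionPoolSketch(long maxIdleMillis) {
        this.maxIdleMillis = maxIdleMillis;
    }

    /** Returns a connection to the pool and notes when it became idle. */
    synchronized void release(String connectionId, long nowMillis) {
        idle.addLast(new IdleEntry(connectionId, nowMillis));
    }

    /** Drops every connection idle for at least maxIdleMillis; returns how many were dropped. */
    synchronized int flushIdle(long nowMillis) {
        int flushed = 0;
        Iterator<IdleEntry> it = idle.iterator();
        while (it.hasNext()) {
            IdleEntry entry = it.next();
            if (nowMillis - entry.idleSinceMillis >= maxIdleMillis) {
                it.remove(); // a real pool would also close the underlying socket here
                flushed++;
            }
        }
        return flushed;
    }

    /** Number of connections currently idle in the pool. */
    synchronized int idleCount() {
        return idle.size();
    }
}
```

A caller would typically invoke the flush periodically from a housekeeping thread, passing the current time in milliseconds.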