public class Http extends HttpBase
| Modifier and Type | Field and Description |
|---|---|
| static org.slf4j.Logger | LOG |

Fields inherited from class HttpBase: accept, acceptLanguage, BUFFER_SIZE, maxContent, maxCrawlDelay, proxyHost, proxyPort, RESPONSE_TIME, responseTime, timeout, useHttp11, useProxy, userAgent

Fields inherited from interface Protocol: CHECK_BLOCKING, CHECK_ROBOTS, X_POINT_ID

| Constructor and Description |
|---|
| Http(): Constructs this plugin. |


| Modifier and Type | Method and Description |
|---|---|
| protected Response | getResponse(URL url, CrawlDatum datum, boolean redirect): Fetches the url with a configured HTTP client and gets the response. |
| static void | main(String[] args): Main method. |
| void | setConf(org.apache.hadoop.conf.Configuration conf): Reads the configuration from the Nutch configuration files and sets the configuration. |
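
The `redirect` parameter of `getResponse` is documented as "Follow redirects if and only if true". The sketch below illustrates that contract using only the JDK's `java.net.HttpURLConnection`. It is not the plugin's implementation, and the class `RedirectFlagSketch` and its `prepare` helper are hypothetical names introduced for illustration. No network request is made: the connection object is configured but never opened.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;

public class RedirectFlagSketch {

    /**
     * Hypothetical helper (not part of Nutch): creates a connection whose
     * redirect handling mirrors the boolean flag, without connecting.
     */
    public static HttpURLConnection prepare(URL url, boolean redirect) throws IOException {
        // openConnection() only builds the connection object; no traffic is
        // sent until connect() or a response accessor is called.
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        // Follow HTTP 3xx redirects if and only if the flag is true.
        conn.setInstanceFollowRedirects(redirect);
        conn.setRequestMethod("GET");
        return conn;
    }

    public static void main(String[] args) throws IOException {
        URL url = URI.create("http://example.com/").toURL();
        System.out.println(prepare(url, true).getInstanceFollowRedirects());  // true
        System.out.println(prepare(url, false).getInstanceFollowRedirects()); // false
    }
}
```

A caller that wants to inspect redirect targets itself, as a crawler typically does, would pass `false` and read the `Location` header from the 3xx response instead of letting the client follow it.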

Methods inherited from class HttpBase: getAccept, getAcceptLanguage, getConf, getMaxContent, getProtocolOutput, getProxyHost, getProxyPort, getRobotRules, getTimeout, getUseHttp11, getUserAgent, logConf, main, processDeflateEncoded, processGzipEncoded, useProxy

public void setConf(org.apache.hadoop.conf.Configuration conf)
Reads the configuration from the Nutch configuration files and sets the configuration.

public static void main(String[] args) throws Exception
Main method.
Parameters: args - Command line arguments
Throws: Exception

protected Response getResponse(URL url, CrawlDatum datum, boolean redirect) throws ProtocolException, IOException
Fetches the url with a configured HTTP client and gets the response.
Specified by: getResponse in class HttpBase
Parameters:
url - URL to be fetched
datum - Crawl data
redirect - Follow redirects if and only if true
Throws: ProtocolException, IOException

Copyright © 2014 The Apache Software Foundation