java.lang.Object
  nz.ac.waikato.mcennis.rat.crawler.CrawlerBase
    nz.ac.waikato.mcennis.rat.crawler.WebCrawler.Spider
public class WebCrawler.Spider
Helper class used for threads. Crawls sites in order. Each crawler receives an equal number of sites to crawl, regardless of current load.
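The equal-share policy described above can be illustrated with a minimal sketch. This is not the library's code: the class name `EqualShareDemo`, the method `distribute`, and the use of plain strings in place of `WebCrawler.SiteReference` are all hypothetical, and round-robin dealing is only one assumed way of giving each thread the same number of sites.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the RAT crawler package.
final class EqualShareDemo {

    /**
     * Deals sites to worker queues in turn, so each queue ends up with the
     * same number of sites (plus or minus one), without looking at how busy
     * any worker currently is.
     */
    static List<List<String>> distribute(List<String> sites, int workerCount) {
        if (workerCount <= 0) {
            throw new IllegalArgumentException("workerCount must be positive");
        }
        List<List<String>> queues = new ArrayList<>();
        for (int i = 0; i < workerCount; i++) {
            queues.add(new ArrayList<>());
        }
        // Round-robin: site i goes to worker i mod workerCount.
        for (int i = 0; i < sites.size(); i++) {
            queues.get(i % workerCount).add(sites.get(i));
        }
        return queues;
    }

    public static void main(String[] args) {
        List<String> sites = List.of("a.example", "b.example", "c.example",
                                     "d.example", "e.example");
        System.out.println(distribute(sites, 2));
    }
}
```

Because the assignment ignores load, a thread that happens to receive slow sites can finish well after the others; that is the trade-off the class description points at.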
| Field Summary |
|---|
| Fields inherited from class nz.ac.waikato.mcennis.rat.crawler.CrawlerBase |
|---|
cache, parser, proxy, spider |
| Constructor Summary | |
|---|---|
| WebCrawler.Spider(WebCrawler p) | Base constructor that stores a reference to the parent in each thread. |
| Method Summary | |
|---|---|
| protected void | add(WebCrawler.SiteReference site) Adds a new site to be crawled by this thread. |
| protected void | doParse(byte[] raw_data, java.lang.String[] parsers) Helper function separated from public parse to allow easy overloading. |
| protected boolean | isEmpty() Returns whether this thread is idle and waiting for more sites to crawl. |
| boolean | isRunning() |
| void | run() Starts the thread executing, parsing web sites in its queue until it receives a stop request. |
| Methods inherited from class nz.ac.waikato.mcennis.rat.crawler.CrawlerBase |
|---|
crawl, crawl, getParser, getProxyHost, getProxyPort, getProxyType, isCaching, isSpidering, set, setCaching, setProxy, setProxyHost, setProxyPort, setProxyType, setSpidering |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
|---|
public WebCrawler.Spider(WebCrawler p)
Parameters:
p - parent, which allows communication between the thread and the parent object.
| Method Detail |
|---|
protected void add(WebCrawler.SiteReference site)
Parameters:
site - Site to be crawled

protected boolean isEmpty()

public void run()
Specified by:
run in interface java.lang.Runnable

public boolean isRunning()

protected void doParse(byte[] raw_data,
                       java.lang.String[] parsers)
                throws java.io.IOException,
                       java.lang.Exception
Description copied from class: CrawlerBase
Helper function separated from public parse to allow easy overloading.
Overrides:
doParse in class CrawlerBase
Throws:
java.io.IOException
java.lang.Exception
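
The interplay of `add`, `isEmpty`, `isRunning`, and `run` described above can be sketched as a worker that drains a private queue until it is told to stop. This is a hedged illustration, not the actual `WebCrawler.Spider` source: the class `QueueWorkerSketch`, the methods `requestStop` and `processedSites`, and the use of strings instead of `WebCrawler.SiteReference` are assumptions, and the real class may signal a stop in a different way. In this sketch a stop request lets the worker finish whatever is already queued before it exits.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;

// Hypothetical illustration only; not the actual WebCrawler.Spider source.
final class QueueWorkerSketch implements Runnable {
    private final Queue<String> pending = new LinkedList<>();
    private final List<String> processed = new ArrayList<>();
    private boolean stopRequested = false;
    private boolean running = false;

    /** Queues a site for this worker and wakes it if it is waiting. */
    synchronized void add(String site) {
        pending.add(site);
        notifyAll();
    }

    /** True when the worker has nothing left to crawl. */
    synchronized boolean isEmpty() {
        return pending.isEmpty();
    }

    /** True while run() is executing. */
    synchronized boolean isRunning() {
        return running;
    }

    /** Assumed stop mechanism: finish queued sites, then exit. */
    synchronized void requestStop() {
        stopRequested = true;
        notifyAll();
    }

    /** Returns a copy of the sites handled so far, in order. */
    synchronized List<String> processedSites() {
        return new ArrayList<>(processed);
    }

    @Override
    public void run() {
        synchronized (this) {
            running = true;
        }
        try {
            while (true) {
                String site;
                synchronized (this) {
                    // Idle: wait for more sites or for a stop request.
                    while (pending.isEmpty() && !stopRequested) {
                        wait();
                    }
                    if (pending.isEmpty()) {
                        return; // stop requested and queue drained
                    }
                    site = pending.remove();
                }
                // Placeholder for fetching and parsing the site.
                synchronized (this) {
                    processed.add(site);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            synchronized (this) {
                running = false;
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        QueueWorkerSketch worker = new QueueWorkerSketch();
        Thread thread = new Thread(worker);
        thread.start();
        worker.add("a.example");
        worker.add("b.example");
        worker.requestStop();
        thread.join();
        System.out.println(worker.processedSites());
    }
}
```

Guarding the queue and flags with the worker's own monitor keeps `isEmpty` and `isRunning` consistent with what `run` is doing, at the cost of some contention when the parent polls frequently.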