java.lang.Object
  nz.ac.waikato.mcennis.rat.crawler.CrawlerBase
    nz.ac.waikato.mcennis.rat.crawler.WebCrawler.Spider
public class WebCrawler.Spider
Helper class that runs as a crawler thread. Crawls its sites in order. Each crawler receives an equal number of sites to crawl; the assignment does not take current load into account.
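The equal-share assignment described above could be sketched as a simple round-robin split. This is an illustration only, not the library's code: `RoundRobinSketch`, `distribute`, and the use of plain strings in place of `SiteReference` are assumptions made for the example, and the real class may divide sites differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: hands sites to a fixed number of per-thread queues in
// turn, so every queue ends up with an equal share (within one site) no
// matter how busy each thread currently is.
class RoundRobinSketch {
    static List<List<String>> distribute(List<String> sites, int threadCount) {
        if (threadCount <= 0) {
            throw new IllegalArgumentException("threadCount must be positive");
        }
        // One queue per crawler thread.
        List<List<String>> queues = new ArrayList<>();
        for (int i = 0; i < threadCount; i++) {
            queues.add(new ArrayList<>());
        }
        // Site i goes to queue (i mod threadCount); order within a queue is preserved.
        for (int i = 0; i < sites.size(); i++) {
            queues.get(i % threadCount).add(sites.get(i));
        }
        return queues;
    }
}
```

Because the split ignores load, a thread that draws slow sites can still be working after the others have gone idle.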
Field Summary |
---|
Fields inherited from class nz.ac.waikato.mcennis.rat.crawler.CrawlerBase |
---|
parser |
Constructor Summary | |
---|---|
WebCrawler.Spider(WebCrawler p)
Base constructor that stores a reference to the parent in each thread |
Method Summary | |
---|---|
protected void |
add(SiteReference site)
Adds a new site to be crawled by this thread |
protected void |
doParse(java.io.InputStream raw_data,
Properties parsers)
Helper function separated from public parse to allow easy overloading. |
protected boolean |
isEmpty()
Returns whether this thread is idle and waiting for more sites to crawl |
boolean |
isRunning()
Returns whether or not threads are actively crawling the Internet |
void |
run()
Starts the thread executing, parsing web sites in its queue until it receives a stop request. |
Methods inherited from class nz.ac.waikato.mcennis.rat.crawler.CrawlerBase |
---|
block, block, crawl, crawl, getCrawler, getFilter, getParsers, getProperties, getProxyHost, getProxyPort, getProxyType, isCaching, isSpidering, set, setCaching, setCrawler, setFilter, setParsers, setProxy, setProxyHost, setProxyPort, setProxyType, setSpidering |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public WebCrawler.Spider(WebCrawler p)
p - parent which allows communication between the thread and the parent object.
Method Detail |
---|
protected void add(SiteReference site)
site - Site to be crawled
protected boolean isEmpty()
public void run()
Specified by: run in interface java.lang.Runnable
public boolean isRunning()
protected void doParse(java.io.InputStream raw_data, Properties parsers) throws java.io.IOException, java.lang.Exception
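Description copied from class CrawlerBase: Helper function separated from public parse to allow easy overloading.
Overrides: doParse in class CrawlerBase
Throws:
java.io.IOException
java.lang.Exception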
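The methods above suggest a worker that drains a private queue until it is asked to stop. The following is a minimal sketch of that pattern under stated assumptions, not the actual implementation: `SpiderSketch`, `requestStop`, and `crawled` are hypothetical names, sites are plain strings rather than `SiteReference` objects, and the fetch-and-parse step is replaced by recording the site.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch of a queue-draining crawler thread.
class SpiderSketch implements Runnable {
    private final ConcurrentLinkedQueue<String> queue = new ConcurrentLinkedQueue<>();
    private final List<String> crawled = Collections.synchronizedList(new ArrayList<>());
    private volatile boolean stopRequested = false;
    private volatile boolean running = false;

    // Adds a new site to be crawled by this thread.
    void add(String site) { queue.add(site); }

    // True when no sites are waiting in this thread's queue.
    boolean isEmpty() { return queue.isEmpty(); }

    // True while run() is executing.
    boolean isRunning() { return running; }

    // Assumed stop mechanism: the loop exits after the current site.
    void requestStop() { stopRequested = true; }

    // Snapshot of the sites handled so far, in crawl order.
    List<String> crawled() {
        synchronized (crawled) {
            return new ArrayList<>(crawled);
        }
    }

    @Override
    public void run() {
        running = true;
        try {
            while (!stopRequested) {
                String site = queue.poll();
                if (site == null) {
                    // Idle: wait briefly for more sites to arrive.
                    try {
                        Thread.sleep(10);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                    continue;
                }
                // Placeholder for fetching the page and passing it to the parsers.
                crawled.add(site);
            }
        } finally {
            running = false;
        }
    }
}
```

A caller would typically wrap the object in a `Thread`, call `start()`, feed it sites with `add`, and poll `isEmpty()` before requesting a stop.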