java.lang.Object
  nz.ac.waikato.mcennis.rat.crawler.filter.StopCount
public class StopCount
Implements a filter with a global stop count: retrieval ceases once the given number of pages has been scheduled for retrieval.
NOTE: This code is NOT re-entrant. Only one copy of this filter may be active at a time.
NOTE: The count is held in a static variable because, in a multi-threaded environment, a single stop count must be shared across all threads.
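The shared static count described above can be sketched roughly as follows. This is an illustration only, not the RAT source: the class name `SharedStopCounter`, its methods `configure`, `tryAccept` and `reset`, and the choice of `AtomicInteger` are all assumptions made for the example. The real class may synchronize differently.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for illustration only; not the actual StopCount source.
final class SharedStopCounter {
    // Static, so every instance and every thread sees the same values.
    private static final AtomicInteger scheduled = new AtomicInteger(0);
    private static volatile int limit = 0;

    // Sets the total number of pages that may be scheduled.
    static void configure(int stopCount) {
        limit = stopCount;
    }

    // Atomically claims one slot; returns false once the limit is reached.
    static boolean tryAccept() {
        while (true) {
            int current = scheduled.get();
            if (current >= limit) {
                return false;
            }
            if (scheduled.compareAndSet(current, current + 1)) {
                return true;
            }
        }
    }

    // Returns the counter to zero so a later crawl can reuse it.
    static void reset() {
        scheduled.set(0);
    }

    public static void main(String[] args) throws InterruptedException {
        configure(100);
        reset();
        AtomicInteger accepted = new AtomicInteger();
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    if (tryAccept()) {
                        accepted.incrementAndGet();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        System.out.println(accepted.get()); // prints 100
    }
}
```

Because the state is static, a second crawl running in the same JVM would draw on the same budget, which is why only one copy of the filter may be active at a time.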
| Constructor Summary |
|---|
| StopCount() |
| Method Summary | |
|---|---|
| void | build(int stopCount): Configures the filter to accept pages for scheduling until the given count is reached, then reject all further scheduling requests. |
| boolean | check(java.lang.String site): Whether the URL this string represents should be retrieved. |
| boolean | check(java.lang.String site, Properties parameters): Whether the URL this string represents should be retrieved, given the parameters provided. |
| protected boolean | checkCount() |
| void | load(java.lang.String site): Submits the given site to the filter chain without retrieving it. |
| void | load(java.lang.String site, Properties parameters): Submits the given site and parameter combination to the filter chain without retrieving it. |
| StopCount | prototype(): Creates a new default instance of this class that shares no data with the original except static variables. |
| void | resetCount(): Resets the count to zero, allowing reuse of the filter in a later crawl. |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
|---|
public StopCount()
| Method Detail |
|---|
public boolean check(java.lang.String site)
Specified by: check in interface CrawlerFilter
Parameters:
site - URL of the site to be retrieved

public boolean check(java.lang.String site, Properties parameters)
Specified by: check in interface CrawlerFilter
Parameters:
site - URL to be retrieved
parameters - parameters governing the retrieval

protected boolean checkCount()

public void build(int stopCount)
Parameters:
stopCount - total number of pages before scheduling of pages stops

public void resetCount()

public void load(java.lang.String site)
Specified by: load in interface CrawlerFilter
Parameters:
site - URL to be added

public void load(java.lang.String site, Properties parameters)
Specified by: load in interface CrawlerFilter
Parameters:
site - URL to be added

public StopCount prototype()
Specified by: prototype in interface CrawlerFilter
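The interplay of build, check, resetCount and prototype can be illustrated with a small stand-in. `StopCountLifecycleSketch` below is a hypothetical re-creation that borrows the documented method names. It assumes each accepted check consumes one unit of the shared budget, which this page does not state. It is not the library class and does not implement CrawlerFilter.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical re-creation for illustration; the real StopCount may behave differently.
final class StopCountLifecycleSketch {
    // Static state: shared by every instance, including those made by prototype().
    private static final AtomicInteger count = new AtomicInteger(0);
    private static volatile int stopCount = 0;

    // Sets the total number of pages that may be accepted.
    void build(int limit) {
        stopCount = limit;
    }

    // Assumption: each accepted URL consumes one unit of the shared budget.
    boolean check(String site) {
        int before = count.getAndUpdate(c -> c < stopCount ? c + 1 : c);
        return before < stopCount;
    }

    // Returns the shared count to zero for a later crawl.
    void resetCount() {
        count.set(0);
    }

    // A fresh instance carries no per-instance data but still sees the static count.
    StopCountLifecycleSketch prototype() {
        return new StopCountLifecycleSketch();
    }

    public static void main(String[] args) {
        StopCountLifecycleSketch first = new StopCountLifecycleSketch();
        first.resetCount();
        first.build(2);
        StopCountLifecycleSketch second = first.prototype();

        System.out.println(first.check("http://example.org/a"));  // true
        System.out.println(second.check("http://example.org/b")); // true
        System.out.println(first.check("http://example.org/c"));  // false, budget spent
        first.resetCount();
        System.out.println(second.check("http://example.org/d")); // true
    }
}
```

In this sketch the second instance exhausts the same budget as the first, which mirrors the documented note that prototype() shares nothing except static variables.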