nz.ac.waikato.mcennis.rat.crawler.filter
Class StopCount

java.lang.Object
  extended by nz.ac.waikato.mcennis.rat.crawler.filter.StopCount
All Implemented Interfaces:
CrawlerFilter

public class StopCount
extends java.lang.Object
implements CrawlerFilter

Implements a filter with a global stop count: once the given number of pages has been scheduled for retrieval, all further retrieval requests are rejected. NOTE: This code is NOT re-entrant. Only one copy of this filter may be active at any one time. The count is held in a static variable because it must be shared across all threads in a multi-threaded environment.
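The shared-count behaviour described above can be illustrated with a small stand-alone sketch. It is not the RAT source: the class name SharedStopCountSketch, the use of AtomicInteger, and the assumption that each call to check counts as one scheduled page are all hypothetical choices made for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in, NOT the RAT implementation: shows a stop count
// kept in static fields so that every instance and thread shares it.
class SharedStopCountSketch {
    // Static: one counter and one limit for the whole JVM.
    private static final AtomicInteger scheduled = new AtomicInteger(0);
    private static volatile int limit = 0;

    void build(int stopCount) { limit = stopCount; }

    void resetCount() { scheduled.set(0); }

    // Assumption: every check counts as one scheduling request.
    // Each request receives a unique ticket; only the first `limit` pass.
    boolean check(String site) {
        return scheduled.incrementAndGet() <= limit;
    }

    // Runs several workers, each with its own filter instance, and
    // returns how many requests were accepted in total.
    static int demo(int stopCount, int threads, int requestsPerThread) {
        SharedStopCountSketch setup = new SharedStopCountSketch();
        setup.build(stopCount);
        setup.resetCount();
        AtomicInteger accepted = new AtomicInteger(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                SharedStopCountSketch filter = new SharedStopCountSketch();
                for (int j = 0; j < requestsPerThread; j++) {
                    if (filter.check("http://example.org/page" + j)) {
                        accepted.incrementAndGet();
                    }
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return accepted.get();
    }

    public static void main(String[] args) {
        // 4 threads x 10 requests, but only 5 pages may be scheduled.
        System.out.println("accepted = " + demo(5, 4, 10)); // prints "accepted = 5"
    }
}
```

Because the limit and the counter are static in this sketch, a second filter built with a different stop count would overwrite the first one's limit, which is one way to read the warning that only one copy may be active at a time.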


Constructor Summary
StopCount()
           
 
Method Summary
 void build(int stopCount)
          Configures the filter to accept scheduling requests until the given number of pages has been scheduled for parsing, then to reject all further requests.
 boolean check(java.lang.String site)
          Determines whether the URL represented by this string should be retrieved.
 boolean check(java.lang.String site, Properties parameters)
          Determines whether the URL represented by this string should be retrieved, given the parameters provided.
protected  boolean checkCount()
           
 void load(java.lang.String site)
          Submit the given site to the filter chain without retrieving it.
 void load(java.lang.String site, Properties parameters)
          Submit the given combination of site and parameters to the filter chain without retrieving it.
 StopCount prototype()
          Creates a new default instance of this class that shares no data with the original except static variables.
 void resetCount()
          Resets the count to zero, allowing reuse of the filter in a later crawl.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

StopCount

public StopCount()

Method Detail

check

public boolean check(java.lang.String site)
Description copied from interface: CrawlerFilter
Determines whether the URL represented by this string should be retrieved.

Specified by:
check in interface CrawlerFilter
Parameters:
site - URL of the site to be retrieved
Returns:
true if the URL should be retrieved, false otherwise

check

public boolean check(java.lang.String site,
                     Properties parameters)
Description copied from interface: CrawlerFilter
Determines whether the URL represented by this string should be retrieved, given the parameters provided.

Specified by:
check in interface CrawlerFilter
Parameters:
site - URL to be retrieved
parameters - parameters governing the retrieval
Returns:
true if the URL should be retrieved, false otherwise

checkCount

protected boolean checkCount()

build

public void build(int stopCount)
Configures the filter to accept scheduling requests until the given number of pages has been scheduled for parsing, then to reject all further requests.

Parameters:
stopCount - total number of pages before scheduling of pages stops

resetCount

public void resetCount()
Resets the count to zero, allowing reuse of the filter in a later crawl.
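
The need for resetCount between crawls can be shown with a minimal single-threaded sketch. The class ResettableCountSketch and its helper crawl are hypothetical stand-ins, not the RAT code, and the sketch assumes that each accepted check consumes one unit of the stop count.

```java
// Hypothetical stand-in for illustration only; NOT the RAT implementation.
class ResettableCountSketch {
    private static final Object LOCK = new Object();
    private static int count = 0; // pages scheduled so far, shared by all instances
    private static int limit = 0; // stop count set by build

    void build(int stopCount) {
        synchronized (LOCK) { limit = stopCount; }
    }

    // Assumption: each accepted request uses up one unit of the stop count.
    boolean check(String site) {
        synchronized (LOCK) {
            if (count < limit) {
                count++;
                return true;
            }
            return false;
        }
    }

    void resetCount() {
        synchronized (LOCK) { count = 0; }
    }

    // Hypothetical helper: offers `requests` URLs and counts the accepted ones.
    static int crawl(ResettableCountSketch filter, int requests) {
        int accepted = 0;
        for (int i = 0; i < requests; i++) {
            if (filter.check("http://example.org/" + i)) {
                accepted++;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        ResettableCountSketch filter = new ResettableCountSketch();
        filter.build(3);
        System.out.println(crawl(filter, 10)); // 3: first crawl stops at the limit
        System.out.println(crawl(filter, 10)); // 0: the count is still exhausted
        filter.resetCount();
        System.out.println(crawl(filter, 10)); // 3: the reset allows a new crawl
    }
}
```

Without the reset, the static count carries over from the first crawl and the second crawl schedules nothing.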


load

public void load(java.lang.String site)
Description copied from interface: CrawlerFilter
Submit the given site to the filter chain without retrieving it.

Specified by:
load in interface CrawlerFilter
Parameters:
site - URL to be added

load

public void load(java.lang.String site,
                 Properties parameters)
Description copied from interface: CrawlerFilter
Submit the given combination of site and parameters to the filter chain without retrieving it.

Specified by:
load in interface CrawlerFilter
Parameters:
site - URL to be added
parameters - parameters associated with the site

prototype

public StopCount prototype()
Description copied from interface: CrawlerFilter
Creates a new default instance of this class that shares no data with the original except static variables.

Specified by:
prototype in interface CrawlerFilter
Returns:
new filter of the same class as the parent
