Holger's Java API
java.lang.Object
  com.antelmann.net.SampleCrawlerSetting

public class SampleCrawlerSetting
As its name suggests, SampleCrawlerSetting is a sample implementation of CrawlerSetting. JSpider currently uses it as its default CrawlerSetting.
See Also: JSpider, Serialized Form

| Field Summary | |
|---|---|
| boolean | currentSiteOnly |
| static String[] | defaultRestrictURLPattern |
| int | depth |
| boolean | includeHTMLCode |
| String[] | includeTextPattern |
| String[] | restrictURLPattern |

| Constructor Summary | |
|---|---|
| SampleCrawlerSetting() | Searches all files 3 levels deep, in the current site only. |
| SampleCrawlerSetting(int depth, boolean currentSiteOnly, String[] restrictURLPattern, String[] includeTextPattern, boolean includeHTMLCode) | |
| SampleCrawlerSetting(int depth, String includeTextPattern) | |

| Method Summary | |
|---|---|
| boolean | followLinks(URL url, URL referer, int depth, List<URL> resultURLList, List<URL> closedURLList, List<Spider.URLWrapper> searchURLWrapperList) - Determines whether the given URL should be searched for links that are then examined in the next level. |
| boolean | isActive() - If inactive, followLinks() always returns false. |
| boolean | matchesCriteria(URL url, URL referer, int depth, List<URL> resultURLList, List<URL> closedURLList) - Decides whether the URL itself or its content matches what this CrawlerSetting searches for. Because this method is called on every URL encountered, it is also the place for any custom parsing this CrawlerSetting needs to do. |
| void | setActive(boolean flag) |
| String | toString() |

| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
| Field Detail |
|---|
public static final String[] defaultRestrictURLPattern
public int depth
public boolean currentSiteOnly
public String[] restrictURLPattern
public String[] includeTextPattern
public boolean includeHTMLCode
| Constructor Detail |
|---|
public SampleCrawlerSetting()
public SampleCrawlerSetting(int depth,
String includeTextPattern)
public SampleCrawlerSetting(int depth,
boolean currentSiteOnly,
String[] restrictURLPattern,
String[] includeTextPattern,
boolean includeHTMLCode)
| Method Detail |
|---|
public void setActive(boolean flag)
public boolean isActive()
public boolean followLinks(URL url,
URL referer,
int depth,
List<URL> resultURLList,
List<URL> closedURLList,
List<Spider.URLWrapper> searchURLWrapperList)
Specified by: followLinks in interface CrawlerSetting

Parameters:
url - the URL that is to be examined for its links
referer - url's referer URL
depth - distance from the original root URL where the search began
resultURLList - List of URLs that have already been found to match this CrawlerSetting's criteria
closedURLList - List of URLs that have already been found not to match the CrawlerSetting's criteria
searchURLWrapperList - List of Spider.URLWrapper objects already identified to be examined in the next level

See Also: Spider.URLWrapper

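
The source of followLinks is not reproduced on this page, so the following is only a minimal sketch of how a decision of this kind might combine the documented fields. FollowLinksSketch and its url helper are hypothetical names, the method signature is simplified (the three list parameters are omitted), and several semantics are assumptions: depth is treated as a cut-off, "current site" is taken to mean the same host as the referer, and restrictURLPattern is treated as a list of substrings that exclude a URL. Only the rule that an inactive setting always returns false comes from the documentation above.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

/** Hypothetical stand-in; NOT the actual SampleCrawlerSetting source. */
class FollowLinksSketch {
    int depth = 3;                     // assumed: maximum link distance to follow
    boolean currentSiteOnly = true;    // assumed: stay on the referer's host
    String[] restrictURLPattern = {};  // assumed: substrings that exclude a URL
    private boolean active = true;

    void setActive(boolean flag) { active = flag; }

    boolean isActive() { return active; }

    /** Simplified signature: the documented method also receives three lists. */
    boolean followLinks(URL url, URL referer, int currentDepth) {
        if (!active) {
            return false;              // documented: inactive means always false
        }
        if (currentDepth >= depth) {
            return false;              // assumed depth cut-off
        }
        if (currentSiteOnly && referer != null
                && !url.getHost().equalsIgnoreCase(referer.getHost())) {
            return false;              // assumed meaning of "current site"
        }
        String text = url.toString();
        for (String pattern : restrictURLPattern) {
            if (text.contains(pattern)) {
                return false;          // assumed: pattern match excludes the URL
            }
        }
        return true;
    }

    /** Hypothetical helper that turns a checked exception into an unchecked one. */
    static URL url(String s) {
        try {
            return URI.create(s).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        FollowLinksSketch setting = new FollowLinksSketch();
        URL root = url("http://example.com/");
        System.out.println(setting.followLinks(url("http://example.com/a.html"), root, 1));
        System.out.println(setting.followLinks(url("http://other.org/b.html"), root, 1));
        setting.setActive(false);
        System.out.println(setting.followLinks(url("http://example.com/a.html"), root, 1));
    }
}
```

The checks are ordered from cheapest to most expensive, so an inactive setting or an exhausted depth budget is rejected before any string comparison on the URL takes place.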
public boolean matchesCriteria(URL url,
URL referer,
int depth,
List<URL> resultURLList,
List<URL> closedURLList)
Specified by: matchesCriteria in interface CrawlerSetting

Parameters:
url - the URL in question to satisfy the criteria
referer - url's referer URL
depth - link distance from the original root URL where the search began
resultURLList - List of URLs that have already been found to match this CrawlerSetting's criteria
closedURLList - List of URLs that have already been found not to match the CrawlerSetting's criteria

public String toString()

Overrides: toString in class Object
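
For matchesCriteria, the fields includeTextPattern and includeHTMLCode suggest a text match against the content of each URL. The sketch below illustrates one way such a match might work; it is not the actual implementation. MatchesCriteriaSketch and its matches method are hypothetical, the content is passed in as a String instead of being fetched from a URL, and the semantics are assumptions: any single pattern is enough for a match, an empty pattern list matches everything, and includeHTMLCode set to false means that markup is stripped before matching.

```java
/** Hypothetical stand-in; NOT the actual SampleCrawlerSetting source. */
class MatchesCriteriaSketch {
    final String[] includeTextPattern; // assumed: plain substrings, any one suffices
    final boolean includeHTMLCode;     // assumed: false means tags are ignored

    MatchesCriteriaSketch(String[] includeTextPattern, boolean includeHTMLCode) {
        this.includeTextPattern = includeTextPattern;
        this.includeHTMLCode = includeHTMLCode;
    }

    /** Tests already-downloaded page content against the text patterns. */
    boolean matches(String content) {
        if (includeTextPattern == null || includeTextPattern.length == 0) {
            return true;               // assumed: no pattern means every page qualifies
        }
        // Replace each tag with a space so that words on both sides stay separated.
        String text = includeHTMLCode ? content : content.replaceAll("<[^>]*>", " ");
        for (String pattern : includeTextPattern) {
            if (text.contains(pattern)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String page = "<a href=\"spider.html\">home</a>";
        String[] patterns = {"spider"};
        // The word occurs only inside the tag, so the result depends on includeHTMLCode.
        System.out.println(new MatchesCriteriaSketch(patterns, false).matches(page));
        System.out.println(new MatchesCriteriaSketch(patterns, true).matches(page));
    }
}
```

Taking the content as a String keeps the sketch independent of network access; a real CrawlerSetting would have to obtain the content from the url parameter itself.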