Method: PHPCrawler::addURLFollowRule()



Adds a rule to the list of rules that decide which URLs found on a page should be followed explicitly.

Signature:

public addURLFollowRule($regex)

Parameters:

$regex string Regular-expression defining the rule

Returns:

bool  TRUE if the regex is valid and the rule was added to the list, otherwise FALSE.
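
How the method decides that a regex is "valid" is not stated here. As a rough sketch, assuming validity simply means that PHP's PCRE engine can compile the pattern, a check could look like the following. The helper name isValidRegex() is hypothetical and not part of PHPCrawl, and the patterns are written without the space before the "i" modifier.

```php
<?php
// Hypothetical helper, NOT part of PHPCrawl: reports whether PCRE accepts a pattern.
function isValidRegex(string $regex): bool
{
    // preg_match() returns false when the pattern cannot be compiled.
    // The @ operator suppresses the warning PHP would otherwise emit.
    return @preg_match($regex, '') !== false;
}

var_dump(isValidRegex('#(htm|html)$#i')); // well-formed pattern
var_dump(isValidRegex('#(htm|html$#i'));  // unbalanced parenthesis, cannot compile
```

In practice this suggests checking the return value of addURLFollowRule() rather than assuming the rule was accepted, since a rejected rule is not added to the list.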

Description:

If the crawler finds a URL that doesn't match any of the given regular expressions, the crawler
will ignore this URL and won't follow it.

NOTE: By default, that is, if no rule has been added to this list, the crawler will NOT filter ANY URLs: every URL the crawler finds
will be followed (except the ones "excluded" by other options, of course).

Example:

$crawler->addURLFollowRule("#(htm|html)$# i");
$crawler->addURLFollowRule("#(php|php3|php4|php5)$# i");

These rules let the crawler ONLY follow URLs/links that end with "htm", "html", "php", "php3", "php4" or "php5" (matched case-insensitively).
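
The follow decision described above can be approximated without the crawler itself. The sketch below is only an illustration of the documented behaviour, not PHPCrawl's actual implementation: shouldFollow() is a hypothetical helper, the example.com URLs are made up, and the two rules are written without the space before the "i" modifier.

```php
<?php
// Hypothetical helper, NOT PHPCrawl's implementation: mimics the documented
// behaviour of the follow-rule list for a single URL.
function shouldFollow(string $url, array $followRules): bool
{
    // No rules registered: nothing is filtered, every URL is followed.
    if (count($followRules) === 0) {
        return true;
    }

    // Otherwise the URL must match at least one of the rules.
    foreach ($followRules as $regex) {
        if (preg_match($regex, $url) === 1) {
            return true;
        }
    }

    return false;
}

// The two rules from the example above.
$rules = ['#(htm|html)$#i', '#(php|php3|php4|php5)$#i'];

var_dump(shouldFollow('http://www.example.com/index.HTML', $rules)); // bool(true)
var_dump(shouldFollow('http://www.example.com/logo.png', $rules));   // bool(false)
var_dump(shouldFollow('http://www.example.com/logo.png', []));       // bool(true)
```

Because both patterns are anchored with "$", a URL such as "http://www.example.com/page.php?id=1" would not match either rule: the match has to occur at the very end of the URL string, and here the string ends with the query part.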