library yolo_crawler
An event-based domain crawler
c4pone/yolo_crawler
- Monday, March 2, 2015
- by floriank
- 0 Watchers
- 2 Stars
- 39 Installations
- PHP
- 0 Dependents
- 0 Suggesters
- 1 Forks
- 0 Open issues
- 1 Versions
- 0 % Grown
yolo crawler
Find broken links example
<?php

require 'bootstrap/autoload.php';

use WP\Crawler\LinkFinder;
use WP\Crawler\DomainCrawler;
use WP\Crawler\Queue\QueueManager;
use WP\Crawler\Queue\ArrayQueue;
use WP\Crawler\Queue\Store\ArrayStore;
use WP\Crawler\Queue\Validator\ValidFileExtension;
use WP\Crawler\Queue\Validator\NoPseudoUrl;
use WP\Crawler\Event\LogSubscriber;
use WP\Crawler\Event\BrokenLinkFinderSubscriber;
use Symfony\Component\EventDispatcher\EventDispatcher;

if (isset($argv[1])) {
    // First CLI argument: the domain to crawl.
    $domain = $argv[1];

    // Build the queue manager and register the URL validators.
    $manager = new QueueManager(new ArrayQueue(), new ArrayStore());
    $manager->addValidator(new NoPseudoUrl())
        ->addValidator(new ValidFileExtension());

    $crawler = new DomainCrawler(
        $manager,
        new LinkFinder()
    );

    // Optional second CLI argument: the time to wait.
    if (isset($argv[2])) {
        $crawler->setWaitTime($argv[2]);
    }

    // Attach the logging and broken-link subscribers to the crawler's event dispatcher.
    $dispatcher = $crawler->getEventDispatcher();
    $dispatcher->addSubscriber(new LogSubscriber());
    $dispatcher->addSubscriber(new BrokenLinkFinderSubscriber());

    $crawler->crawl($domain);
} else {
    // No domain given: print a short usage hint.
    echo "\n";
    echo "Usage: " . $argv[0] . ' {domain} {time to wait}' . "\n";
}
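
The example above registers two queue validators, NoPseudoUrl and ValidFileExtension, without showing what such checks might look like. The sketch below is a rough, standalone illustration of that kind of URL filtering in plain PHP. The function names (isPseudoUrl, hasCrawlableExtension, filterCrawlableUrls), the list of rejected schemes and the list of allowed extensions are all assumptions made for illustration. They are not taken from the library, whose validator classes may behave differently.

```php
<?php
// Hypothetical helpers for illustration only; NOT the library's validators.

/** Returns true for links that do not point to a fetchable page. */
function isPseudoUrl(string $url): bool
{
    $url = trim($url);
    if ($url === '' || $url[0] === '#') {
        return true; // empty link or in-page anchor
    }
    // Assumed list of schemes a crawler would not fetch over HTTP.
    return preg_match('/^(javascript|mailto|tel|data):/i', $url) === 1;
}

/** Returns true when the URL path has no extension or an allowed one. */
function hasCrawlableExtension(string $url, array $allowed = ['html', 'htm', 'php']): bool
{
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return true; // no path component, e.g. "http://example.com"
    }
    $extension = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    return $extension === '' || in_array($extension, $allowed, true);
}

/** Keeps only the URLs that pass both checks, re-indexed from 0. */
function filterCrawlableUrls(array $urls): array
{
    return array_values(array_filter(
        $urls,
        function (string $url): bool {
            return !isPseudoUrl($url) && hasCrawlableExtension($url);
        }
    ));
}

// Usage: only the "/about" and "/index.php" links are kept.
$kept = filterCrawlableUrls([
    'http://example.com/about',
    'mailto:me@example.com',
    '#top',
    'http://example.com/logo.png',
    'javascript:void(0)',
    'http://example.com/index.php?x=1',
]);
```

Keeping each check as a small single-purpose function mirrors the chained addValidator() calls in the example: each rule can be added, removed or replaced without touching the others.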
dev-master
9999999-dev
MIT
event
crawler
yolo