library php-spider

A basic web spider written in PHP.

monkeyphp/php-spider

  • Tuesday, June 26, 2018
  • by MonkeyPHP
  • Repository
  • 1 Watchers
  • 3 Stars
  • 39 Installations
  • PHP
  • 0 Dependents
  • 0 Suggesters
  • 2 Forks
  • 1 Open issues
  • 4 Versions
  • 18 % Grown

The README.md

PhpSpider

I needed a really basic spider to crawl a site to warm the cache; this is the result.

Examples

Simple

The simplest way to use PhpSpider is to construct an instance and pass the root URL of the site you wish to crawl to the Spider::crawl method.

By design, PhpSpider will only crawl pages that:

  • Return a Content-Type header of text/html
  • Are on the same domain as the supplied root URL

use PhpSpider\Spider\Spider;

// Create a spider and crawl the site, starting from its root URL.
$spider = new Spider();
$spider->crawl('https://www.example.com');
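
The two crawl rules above can be illustrated with a small stand-alone check. This is only a sketch, not PhpSpider's actual implementation: the function name isCrawlable is hypothetical, and it assumes that "same domain" means an exact, case-insensitive host match.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of PhpSpider's API): decides whether a page
 * qualifies for crawling under the two rules listed above.
 */
function isCrawlable(string $rootUrl, string $pageUrl, string $contentType): bool
{
    // Rule 1: the response must be HTML. Drop parameters such as "; charset=utf-8".
    $mediaType = strtolower(trim(explode(';', $contentType)[0]));
    if ($mediaType !== 'text/html') {
        return false;
    }

    // Rule 2: the page must be on the same host as the root URL.
    // Assumption: "same domain" is treated as an exact host match.
    $rootHost = parse_url($rootUrl, PHP_URL_HOST);
    $pageHost = parse_url($pageUrl, PHP_URL_HOST);

    return is_string($rootHost)
        && is_string($pageHost)
        && strcasecmp($rootHost, $pageHost) === 0;
}

// Same host and an HTML response: qualifies.
var_dump(isCrawlable('https://www.example.com', 'https://www.example.com/about', 'text/html; charset=utf-8'));
// Different host: does not qualify.
var_dump(isCrawlable('https://www.example.com', 'https://other.example.org/', 'text/html'));
```

Under the exact-host assumption, a subdomain such as blog.example.com would be treated as a different site; whether PhpSpider itself behaves that way is not stated in this README.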

Advanced

If you need to change how PhpSpider behaves, you can attach event listeners. Listeners are notified as the crawl progresses and can affect how PhpSpider operates.

PhpSpider triggers five events:

  • Spider::SPIDER_CRAWL_PRE - Triggered before PhpSpider starts to crawl a site.
  • Spider::SPIDER_CRAWL_POST - Triggered once PhpSpider has finished its crawl.
  • Spider::SPIDER_CRAWL_PAGE_PRE - Triggered just before a page is crawled.
  • Spider::SPIDER_CRAWL_PAGE_POST - Triggered once a page has been crawled.
  • Spider::SPIDER_CRAWL_PAGE_ERROR - Triggered if an error occurs whilst crawling a page.

You can find examples in the examples directory included in this repository.

    $ php ./examples/example_0.php

use PhpSpider\Spider\Spider;
use Zend\EventManager\Event;

// Print the URI of each page just before it is crawled.
$listener = function ($event) {
    $uri = $event->getParam('uri');
    echo $uri, PHP_EOL;
};

$spider = new Spider();
$spider->getEventManager()->attach(Spider::SPIDER_CRAWL_PAGE_PRE, $listener, 1000);

$spider->crawl('https://www.example.com');
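
A listener does not have to be a closure; any callable works. As a sketch, the hypothetical UriCollector class below (not part of PhpSpider) records every URI it is notified about instead of printing it. It assumes only that the event exposes getParam('uri'), as in the example above. The anonymous class is a stand-in event so that the snippet runs without the library.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical listener (not part of PhpSpider): remembers every URI it is
 * notified about. It only assumes the event exposes getParam('uri').
 */
final class UriCollector
{
    /** @var string[] */
    private $uris = [];

    public function __invoke($event): void
    {
        $uri = $event->getParam('uri');
        if ($uri !== null) {
            $this->uris[] = (string) $uri;
        }
    }

    /** @return string[] */
    public function getUris(): array
    {
        return $this->uris;
    }
}

// Stand-in event for demonstration only; during a real crawl the event
// object is supplied by the spider's event manager.
$fakeEvent = new class {
    public function getParam($name, $default = null)
    {
        return $name === 'uri' ? 'https://www.example.com/' : $default;
    }
};

$collector = new UriCollector();
$collector($fakeEvent);
print_r($collector->getUris());
```

In a real crawl the collector would be attached in place of $listener, for example $spider->getEventManager()->attach(Spider::SPIDER_CRAWL_PAGE_PRE, $collector); and read with $collector->getUris() once the crawl has finished.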

To Run Tests

$ vendor/bin/phpunit -c tests/phpunit.xml

The Versions

26/06 2018

dev-master

9999999-dev https://github.com/monkeyphp/php-spider

BSD-3-Clause

The Requires

 

The Development Requires

by David White

24/09 2016

dev-develop

dev-develop https://github.com/monkeyphp/php-spider

BSD-3-Clause

The Requires

 

The Development Requires

by David White

24/09 2016
11/09 2016