library search-bot

Laravel package to crawl websites.

sukohi/search-bot

  • Wednesday, February 15, 2017
  • by Sukohi
  • Repository
  • 1 Watcher
  • 1 Star
  • 5 Installations
  • PHP
  • 0 Dependents
  • 0 Suggesters
  • 1 Fork
  • 0 Open issues
  • 8 Versions
  • 0 % Grown

The README.md

SearchBot

Laravel package to crawl websites. (Laravel 5+)

Requirements

Installation

Run the following command:

composer require sukohi/search-bot:1.*

Register the service providers in config/app.php:

'providers' => [
    // ...Others...
    Sukohi\SearchBot\SearchBotServiceProvider::class,
    Sukohi\LaravelAbsoluteUrl\LaravelAbsoluteUrlServiceProvider::class,
]

Then add the aliases:

'aliases' => [
    // ...Others...
    'LaravelAbsoluteUrl' => Sukohi\LaravelAbsoluteUrl\Facades\LaravelAbsoluteUrl::class,
    'SearchBot' => Sukohi\SearchBot\Facades\SearchBot::class,
]

Then run the following commands:

php artisan vendor:publish
php artisan migrate

You now have config/search_bot.php, where you can set domain restrictions.

Config

return [

    'main' => '*',
    'yahoo' => ['yahoo.com', 'www.yahoo.com'],
    'reddit' => ['www.reddit.com']

];
  • If you don't need to restrict domains, set the value to '*'.
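
To illustrate how a restriction entry like the ones above could be interpreted, here is a minimal sketch in plain PHP. The helper `isHostAllowed()` is hypothetical and is not part of the package. It assumes that '*' allows every host and that a list of hosts is matched exactly and case-insensitively. The package's real matching rules may differ.

```php
<?php

/**
 * Hypothetical helper (not part of the package): decide whether a URL's host
 * is allowed under one restriction entry from config/search_bot.php.
 *
 * @param string       $url         Absolute URL to check.
 * @param string|array $restriction '*' for no restriction, or a list of hosts.
 * @return bool
 */
function isHostAllowed($url, $restriction)
{
    // '*' means every domain is allowed.
    if ($restriction === '*') {
        return true;
    }

    $host = parse_url($url, PHP_URL_HOST);

    // Relative or malformed URLs have no host and are rejected.
    if (!is_string($host)) {
        return false;
    }

    // Exact, case-insensitive host comparison (an assumption of this sketch).
    $allowed = array_map('strtolower', (array) $restriction);

    return in_array(strtolower($host), $allowed, true);
}

$config = [
    'main'   => '*',
    'yahoo'  => ['yahoo.com', 'www.yahoo.com'],
    'reddit' => ['www.reddit.com'],
];

var_dump(isHostAllowed('http://www.yahoo.com/news', $config['yahoo'])); // bool(true)
var_dump(isHostAllowed('http://example.com/', $config['yahoo']));       // bool(false)
```

Under these assumptions, subdomains such as news.yahoo.com would need their own entry in the list.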

Usage

$starting_url = 'http://yahoo.com';
$options = [
    'type' => 'main',       // Optional (default: 'main')
    'url_deletion' => true  // Default: true
];
$result = \SearchBot::request($starting_url, $options);

if($result->exists()) {

    // Symfony\Component\BrowserKit\Response
    // See http://api.symfony.com/2.3/Symfony/Component/BrowserKit/Response.html
    $response = $result->response();

    // Symfony\Component\DomCrawler\Crawler
    // See http://api.symfony.com/2.3/Symfony/Component/DomCrawler/Crawler.html
    $crawler = $result->crawler();

    $result->links(function($url, $text){

        // Called for every link found, with its URL and text.

    });

    $result->queues(function($crawler_queue, $url, $text){

        // Called for each link that does not yet exist in the DB.
        // $crawler_queue already has its type and url set.
        $crawler_queue->save();

    });

} else {

    $e = $result->exception();
    echo $e->getMessage();
    $type = $result->type();
    $url = $result->url();

}
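
Because links() passes each link to a callback, a common need is getting those values back out into the surrounding scope. The sketch below shows one way to do that with a closure that captures an array by reference. `FakeResult` is a hypothetical stand-in written only so the example runs without Laravel. It is not the package's result class, and it mirrors only the documented `function ($url, $text)` callback shape.

```php
<?php

// Hypothetical stand-in for the object returned by \SearchBot::request(),
// present only so this sketch runs without Laravel or the package.
final class FakeResult
{
    private $links;

    public function __construct(array $links)
    {
        $this->links = $links;
    }

    // Mirrors the documented callback shape: function ($url, $text).
    public function links(callable $callback)
    {
        foreach ($this->links as $link) {
            $callback($link['url'], $link['text']);
        }
    }
}

$result = new FakeResult([
    ['url' => 'http://yahoo.com/news', 'text' => ' News '],
    ['url' => 'http://yahoo.com/mail', 'text' => 'Mail'],
]);

// Capture $collected by reference so the closure can append to it.
$collected = [];
$result->links(function ($url, $text) use (&$collected) {
    $collected[$url] = trim($text);
});

print_r($collected);
```

Keying the array by URL also removes duplicate links, which may or may not be what you want.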

Options

  • type

    A string of your choosing.
    Default is main.

  • url_deletion

    If true, the accessed URL is removed from the DB.
    Default is true.
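
The documented defaults can be summarized with a small sketch of how omitted options might be filled in. `resolveOptions()` is a hypothetical helper and is not part of the package. It only restates the two defaults listed above, and dropping unknown keys is a choice made for this example.

```php
<?php

/**
 * Hypothetical helper (not part of the package): fill in the documented
 * defaults for any option the caller leaves out.
 *
 * @param array $options Caller-supplied options.
 * @return array
 */
function resolveOptions(array $options = [])
{
    $defaults = [
        'type'         => 'main', // documented default
        'url_deletion' => true,   // documented default
    ];

    // Caller-supplied values win over defaults; unknown keys are dropped.
    return array_merge($defaults, array_intersect_key($options, $defaults));
}

var_dump(resolveOptions(['type' => 'yahoo']));
```

Here only type is overridden, so url_deletion keeps its default of true.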

License

This package is licensed under the MIT License.
Copyright 2017 Sukohi Kuhoh

The Versions

All versions are dated 15/02 2017, authored by Sukohi, and licensed under MIT.

  • dev-master (9999999-dev)
  • 1.0.x-dev (1.0.9999999.9999999-dev)
  • 1.0.5 (1.0.5.0)
  • 1.0.4 (1.0.4.0)
  • 1.0.3 (1.0.3.0)
  • 1.0.2 (1.0.2.0)
  • 1.0.1 (1.0.1.0)
  • 1.0.0 (1.0.0.0)