library aicrawler-docs

Documentation for AiCrawler, AiResponses and AiScrapers projects.

dan/aicrawler-docs

  • Sunday, November 15, 2015
  • by dan

The README.md

AiCrawler

Leverage AI design patterns by using heuristics with the Symfony DomCrawler.

Quickstart

The AiCrawler package is responsible for making boolean assertions about a node in the HTML DOM. It comes with a straightforward data point trait that records the results of your heuristics (rules) for a given "item" or context.

Install with Composer

$ composer require dan/aicrawler dev-master

Trivial example

$crawler = new AiCrawler('<html>...</html>');

$node = $crawler->filter('div[id="content-start"]');
$args = ['words' => 15];

// Does the content have at least 15 words?
$assertion = Heuristics::words($node, $args); // true / false
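
To make the idea of a boolean assertion on a node concrete, here is a minimal sketch of a word-count check written against PHP's built-in DOM extension only. It is not the library's implementation: `hasAtLeastWords()` is a hypothetical helper, and splitting on whitespace is an assumption about what counts as a word.

```php
<?php
// Hypothetical helper, not part of AiCrawler: returns true when the node's
// text (descendants included) has at least $min whitespace-separated words.
function hasAtLeastWords(DOMNode $node, int $min): bool
{
    $words = preg_split('/\s+/u', trim($node->textContent), -1, PREG_SPLIT_NO_EMPTY);

    return count($words) >= $min;
}

$doc = new DOMDocument();
$doc->loadHTML('<div id="content-start"><p>one two three four five</p></div>');

// Look the node up by its id attribute with XPath.
$xpath = new DOMXPath($doc);
$node = $xpath->query('//div[@id="content-start"]')->item(0);

var_dump(hasAtLeastWords($node, 5));  // bool(true)
var_dump(hasAtLeastWords($node, 15)); // bool(false)
```

The function returns only true or false, which is the shape the Heuristics::words() call above is described as having.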

A more expressive example

$crawler = new AiCrawler("<html>...</html>");

$args = [
    'elements' => [
        "elements" => "/p/ /blockquote/ /(u|o)l/ /h[1-6]/",
        "regex" => true,
        'words' => [
            'words' => 15,
            'descendants' => true,
            'words2' => [
                'words' => "/(cod(ing|ed|e)|program|language|php)/",
                'regex' => true,
                'descendants' => true
            ]
        ]
    ],
    'matches' => 3
];


/**
 * Do at least 3 of this div's children satisfy all of the following?
 *  - The child is a p, blockquote, ul, ol or any h element.
 *  - It contains at least 15 words (including text from its descendants).
 *  - It contains words such as coding, coded, code, program, language or php
 *    (including text from its descendants).
 */
$crawler->filter("div")->each(function(&$node) use ($args) {
    if (Heuristics::children($node, $args)) {
        $node->setDataPoint("example", "words", 1);
    }
});
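
The nested rule above can be read as "count the children that pass every sub-rule, then compare the count against matches". The sketch below spells that reading out using only PHP's DOM extension. Everything in it is an assumption made for illustration: `countMatchingChildren()` is a hypothetical function, the tag pattern is anchored, the word threshold is lowered to 5 to keep the sample markup short, and a plain array stands in for the data point trait.

```php
<?php
// Hypothetical stand-in for the nested rule; not the AiCrawler implementation.
function countMatchingChildren(
    DOMElement $parent,
    string $tagPattern,
    int $minWords,
    string $wordPattern
): int {
    $matches = 0;
    foreach ($parent->childNodes as $child) {
        if (!$child instanceof DOMElement) {
            continue; // skip text and comment nodes
        }
        $text = trim($child->textContent); // includes text from descendants
        $wordCount = count(preg_split('/\s+/u', $text, -1, PREG_SPLIT_NO_EMPTY));
        if (preg_match($tagPattern, $child->tagName)
            && $wordCount >= $minWords
            && preg_match($wordPattern, $text)
        ) {
            $matches++;
        }
    }

    return $matches;
}

$doc = new DOMDocument();
$doc->loadHTML(
    '<div>'
    . '<p>PHP is a language for the web</p>'
    . '<ul><li>write code every single day</li></ul>'
    . '<h2>Why program at all these days</h2>'
    . '<span>php code here, but a span is not an accepted element</span>'
    . '<p>too short</p>'
    . '</div>'
);
$div = $doc->getElementsByTagName('div')->item(0);

$count = countMatchingChildren(
    $div,
    '/^(p|blockquote|[uo]l|h[1-6])$/',
    5, // lowered from 15 for this short sample
    '/(cod(ing|ed|e)|program|language|php)/i'
);

$dataPoints = []; // plain array standing in for the data point trait
if ($count >= 3) {
    $dataPoints['example']['words'] = 1;
}
```

Here the first p, the ul and the h2 pass all three checks, the span fails the element check and the second p fails the word count, so the threshold of 3 is met and the data point is recorded.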

Sound interesting? Continue reading, review the Heuristics class, or go straight to a similar example with complete notes.

The Versions

15/11 2015

dev-master

9999999-dev

MIT

crawler ai scraper