dev-master (9999999-dev)
Documentation for AiCrawler, AiResponses and AiScrapers projects.
MIT
by Dan Richards
crawler ai scraper
Leverage AI design patterns by using heuristics with the Symfony DomCrawler.
The AiCrawler package is responsible for making boolean assertions on a node in the HTML DOM. It comes with a straightforward data point trait that records the results of your heuristics (rules) for a given "item" or context.
$ composer require dan/aicrawler dev-master
$crawler = new AiCrawler('<html>...</html>');
$node = $crawler->filter('div[id="content-start"]');
$args = ['words' => 15];
// Does the content have at least 15 words?
$assertion = Heuristics::words($node, $args); // true / false
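To make the idea of a boolean word-count assertion concrete, here is a minimal sketch that uses only PHP's built-in DOM extension. It is not AiCrawler's implementation of Heuristics::words: the function name hasAtLeastWords and the sample HTML are assumptions made for illustration, and the sketch always counts text from the node's descendants.

```php
<?php
// Hypothetical helper (not part of AiCrawler): returns true when the node's
// text, including text from its descendants, has at least $min words.
function hasAtLeastWords(DOMNode $node, int $min): bool
{
    $text = trim($node->textContent);
    // Split on runs of whitespace; fall back to an empty list on failure.
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY) ?: [];
    return count($words) >= $min;
}

// Sample markup, assumed for this sketch.
$html = '<html><body>'
    . '<div id="content-start">PHP is a popular general purpose scripting language.</div>'
    . '<div id="short">Too short.</div>'
    . '</body></html>';

libxml_use_internal_errors(true); // keep parser warnings out of the output
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$content = $xpath->query('//div[@id="content-start"]')->item(0);
$short = $xpath->query('//div[@id="short"]')->item(0);

var_dump(hasAtLeastWords($content, 8)); // bool(true): the div holds 8 words
var_dump(hasAtLeastWords($short, 8));   // bool(false): the div holds 2 words
```

A heuristic in this style stays a pure true/false check, so several of them can be combined or nested without knowing anything about how the result will be recorded.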
$crawler = new AiCrawler("<html>...</html>");
$args = [
    'elements' => [
        "elements" => "/p/ /blockquote/ /(u|o)l/ /h[1-6]/",
        "regex" => true,
        'words' => [
            'words' => 15,
            'descendants' => true,
            'words2' => [
                'words' => "/(cod(ing|ed|e)|program|language|php)/",
                'regex' => true,
                'descendants' => true
            ]
        ]
    ],
    'matches' => 3
];
/**
 * Are at least 3 of this div's children p, blockquote, ul, ol or h1-h6
 * elements that contain at least 15 words (including text from the child's
 * descendants) AND words such as coding, coded, code, program, language, php
 * (including text from the child's descendants)?
 */
$crawler->filter("div")->each(function(&$node) use ($args) {
    if (Heuristics::children($node, $args)) {
        $node->setDataPoint("example", "words", 1);
    }
});
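The nested rule above can be read as "count the children that pass every check, then compare the count with a threshold". The following is a rough sketch of that reading in plain PHP DOM code, under stated assumptions: childrenMatch, the $dataPoints array, the sample HTML and the lowered 4-word minimum are all hypothetical, and none of it reflects how Heuristics::children or setDataPoint are actually implemented.

```php
<?php
// Hypothetical helper (not AiCrawler's Heuristics::children): true when at
// least $minMatches element children pass the tag, word-count and keyword checks.
function childrenMatch(
    DOMElement $parent,
    string $tagPattern,
    int $minWords,
    string $wordPattern,
    int $minMatches
): bool {
    $matched = 0;
    foreach ($parent->childNodes as $child) {
        if (!$child instanceof DOMElement) {
            continue; // skip text nodes and comments
        }
        $text = trim($child->textContent); // includes descendants' text
        $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY) ?: [];
        if (preg_match($tagPattern, $child->tagName) === 1
            && count($words) >= $minWords
            && preg_match($wordPattern, $text) === 1) {
            $matched++;
        }
    }
    return $matched >= $minMatches;
}

// Sample markup, assumed for this sketch.
$html = '<html><body>'
    . '<div id="article">'
    . '<p>I enjoy coding in PHP daily.</p>'
    . '<blockquote>Every program needs a clear purpose.</blockquote>'
    . '<ul><li>Pick a language you like.</li></ul>'
    . '<p>Unrelated filler text goes here.</p>'
    . '</div>'
    . '<div id="nav"><p>Home page link.</p></div>'
    . '</body></html>';

libxml_use_internal_errors(true); // keep parser warnings out of the output
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Stand-in for the data point trait: results keyed by div id, item, then rule.
$dataPoints = [];
foreach ($xpath->query('//div') as $div) {
    $passes = childrenMatch(
        $div,
        '/^(p|blockquote|[uo]l|h[1-6])$/i',
        4, // lowered from 15 to keep the sample markup short
        '/(cod(ing|ed|e)|program|language|php)/',
        3
    );
    if ($passes) {
        $dataPoints[$div->getAttribute('id')]['example']['words'] = 1;
    }
}

print_r($dataPoints);
```

In this sample only the div with id "article" receives a data point: three of its children pass all three checks, while the "nav" div has a single short paragraph with no keyword.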
Sound interesting? Continue reading, review the Heuristics class or go right to a similar example with complete notes.