Wallogit.com
2017 © Pedro Peláez
Friendlier XPath extension of DOMDocument for those fluent in beloved XPath!
Bower: bower install wildhoney/xpath-document
XPathDocument lets you chain query methods, so each call delves one level deeper into the DOM hierarchy, using the previously matched node as its context.
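// Select every posts container in the document.
$posts = $xpathDocument->query('//div[@class="posts"]');
foreach ($posts as $post) {
    // Relative expression: evaluated with $post as the context node.
    $comments = $post->query('div[@class="comments"]');
}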
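For comparison, here is a minimal sketch of the same nested lookup written against PHP's built-in DOMDocument and DOMXPath, where the context node has to be passed explicitly as the second argument of DOMXPath::query. The sample markup and the variable names are assumptions made for illustration; this is not XPathDocument's own code.

```php
<?php
// Plain DOMDocument/DOMXPath version of a chained query (illustrative only).
$html = '<div class="posts">'
      . '<div class="comments">first</div>'
      . '<div class="comments">second</div>'
      . '</div>';

$document = new DOMDocument();
$document->loadHTML($html);
$xpath = new DOMXPath($document);

$found = [];
foreach ($xpath->query('//div[@class="posts"]') as $post) {
    // The second argument makes $post the context node, so the relative
    // expression only matches comment blocks that are children of this post.
    foreach ($xpath->query('div[@class="comments"]', $post) as $comment) {
        $found[] = $comment->textContent;
    }
}

print_r($found);
```

Chaining as shown earlier removes the need to carry the DOMXPath object and the context node around by hand.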
Each query returns an instance of XPathDocument_Dom_List. This class implements Iterator, ArrayAccess and Countable, so you can loop over the result with foreach, read individual nodes by index and pass the list to count().
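To show what those three interfaces buy you, the following is a rough sketch of a list class that implements them. SketchList is a hypothetical name and a simplified stand-in, not the actual XPathDocument_Dom_List source, and the `mixed` type hints assume PHP 8.0 or later.

```php
<?php
// Hypothetical stand-in for a node list; NOT the library's actual class.
final class SketchList implements Iterator, ArrayAccess, Countable
{
    private array $items;
    private int $position = 0;

    public function __construct(array $items)
    {
        // Re-index so positions always run 0, 1, 2, ...
        $this->items = array_values($items);
    }

    // Iterator: lets the object be used in foreach.
    public function current(): mixed { return $this->items[$this->position]; }
    public function key(): mixed { return $this->position; }
    public function next(): void { $this->position++; }
    public function rewind(): void { $this->position = 0; }
    public function valid(): bool { return array_key_exists($this->position, $this->items); }

    // ArrayAccess: lets the object be read and written with [].
    public function offsetExists(mixed $offset): bool { return isset($this->items[$offset]); }
    public function offsetGet(mixed $offset): mixed { return $this->items[$offset] ?? null; }
    public function offsetSet(mixed $offset, mixed $value): void
    {
        if ($offset === null) {
            $this->items[] = $value;        // $list[] = $value
        } else {
            $this->items[$offset] = $value; // $list[3] = $value
        }
    }
    public function offsetUnset(mixed $offset): void { unset($this->items[$offset]); }

    // Countable: lets the object be passed to count().
    public function count(): int { return count($this->items); }
}

$list = new SketchList(['a', 'b', 'c']);
$seen = [];
foreach ($list as $index => $item) {
    $seen[$index] = $item;
}
```

With a class shaped like this, `count($list)`, `$list[1]` and `foreach ($list as $item)` all work on the same object.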
Typically XPathDocument_Dom_List will hold a collection of XPathDocument_Dom_Element instances, but other instances are possible:
- XPathDocument_Dom_Element: generic elements with values and attributes;
- XPathDocument_Dom_Attr: specific for node attributes;
- XPathDocument_Dom_Text: specific for text values of nodes.

The latter two have a simple getText method for returning their values. However, XPathDocument_Dom_Element has the greatest flexibility.
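The three node kinds above mirror what an XPath expression can select in PHP's built-in DOM extension. As a small sketch, assuming a made-up one-link document, the snippet below uses plain DOMXPath (not XPathDocument) to show that the shape of the expression decides which node type comes back.

```php
<?php
$document = new DOMDocument();
$document->loadHTML('<p><a href="/about">About us</a></p>');
$xpath = new DOMXPath($document);

$element   = $xpath->query('//a')->item(0);        // selects the element itself
$attribute = $xpath->query('//a/@href')->item(0);  // selects an attribute node
$text      = $xpath->query('//a/text()')->item(0); // selects a text node

// Class names reported by the built-in DOM extension.
$classes = [get_class($element), get_class($attribute), get_class($text)];

// Attribute and text nodes carry a single value each.
$values = [$attribute->value, $text->nodeValue];

print_r($classes);
print_r($values);
```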
With an instance of XPathDocument_Dom_Element you have the following methods:
- getText: retrieve the value of the node;
- getHtml: retrieve the HTML value of the node;
- getName: retrieve the name of the node (span, div, etc.);
- getAttribute: retrieve an attribute by its name;
- query: use the node as the context for further querying.

Please see the Reddit.com example in example/index.php, which demonstrates how simple it is to crawl websites with XPathDocument!
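For orientation, the accessors listed above have rough counterparts in PHP's built-in DOM extension. The sketch below uses only those built-in calls on an invented snippet of markup. How XPathDocument implements its own methods is not shown in this document, so treat the mapping as an assumption (for instance, whether getHtml returns the node's outer or inner HTML).

```php
<?php
$document = new DOMDocument();
$document->loadHTML('<div id="post" class="entry">Hello <b>world</b></div>');
$node = (new DOMXPath($document))->query('//div[@id="post"]')->item(0);

$text  = $node->textContent;           // text of the node and its descendants
$html  = $document->saveHTML($node);   // serialised markup of the node
$name  = $node->nodeName;              // tag name, here "div"
$class = $node->getAttribute('class'); // attribute value looked up by name

echo $text, PHP_EOL, $html, PHP_EOL, $name, PHP_EOL, $class, PHP_EOL;
```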