WebScrapa
DISCLAIMER
This package was put together for learning purposes only.
A simple PHP web scraper package
Most websites do not let you save a copy of the data they display. The only option is then to copy and paste the data from your browser into a local file by hand, a tedious job that can take hours or even days. Web scraping is the technique of automating this process.
WebScrapa is a simple web scraper package written in PHP. It uses cURL to request and download a webpage. The downloaded page is converted into an XML DOM object, and XPath is used to navigate the elements in that object.
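The pipeline described above (download with cURL, load into a DOM, query with XPath) can be sketched with PHP's built-in extensions. This is a minimal illustration, not WebScrapa's actual source: the function names `fetchPage` and `queryHtml` are hypothetical, and the package's internals may differ.

```php
<?php
// Hypothetical helpers for illustration only; not part of WebScrapa's API.

/** Download a page with cURL and return its body, or throw on failure. */
function fetchPage(string $url): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow HTTP redirects
        CURLOPT_TIMEOUT        => 15,   // give up after 15 seconds
    ]);
    $html = curl_exec($ch);
    if ($html === false) {
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }
    return $html;
}

/** Parse HTML into a DOM and return the text value of every node matching $query. */
function queryHtml(string $html, string $query): array
{
    $dom = new DOMDocument();
    // Real-world HTML is rarely well formed, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query($query);
    if ($nodes === false) {
        throw new InvalidArgumentException("Invalid XPath expression: $query");
    }

    $values = [];
    foreach ($nodes as $node) {
        $values[] = trim($node->nodeValue);
    }
    return $values;
}
```

Keeping the download and the query in separate functions means the XPath step can be tried against a literal HTML string without any network access, for example `queryHtml($html, '//a/@href')`.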
Installation
composer require "florence/scrapa: v1.0"
Usage
Create an instance of the Scrap class:
$url = 'https://www.youtube.com/JustinBieber/about';
$query = '//ul[@class="about-custom-links"]//a[@class="about-channel-link "]/@href';
$scrap = new Scrap($url, $query);
To learn more about XPath and how to select elements by tag and by attributes such as CSS classes and IDs, see:
https://goo.gl/Gjd3R3
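As a rough illustration of selecting by tag, ID, and class, the sketch below runs a few XPath expressions through PHP's built-in `DOMXPath`. The markup, class names, and IDs are made up for this example. Note that `@class="..."` compares the whole attribute string, so extra classes or stray whitespace in the page's markup change what matches.

```php
<?php
// Illustrative markup; the element names, classes, and IDs are invented.
$html = <<<HTML
<div id="profile">
  <h1 class="title main">Jane Doe</h1>
  <a class="link" href="https://example.com/blog">Blog</a>
  <a class="link external" href="https://example.org">Site</a>
</div>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// By tag name: every <a> element anywhere in the document.
$byTag = $xpath->query('//a');

// By ID: the element whose id attribute is exactly "profile".
$byId = $xpath->query('//*[@id="profile"]');

// By exact class attribute: matches class="link" but not class="link external".
$exactClass = $xpath->query('//a[@class="link"]/@href');

// By one class among several: pad with spaces so "link" matches as a whole word.
$hasClass = $xpath->query(
    '//a[contains(concat(" ", normalize-space(@class), " "), " link ")]/@href'
);

foreach ($hasClass as $href) {
    echo $href->nodeValue, PHP_EOL;
}
```

The last expression is the usual XPath 1.0 idiom for "has this class", since XPath 1.0 has no direct equivalent of a CSS class selector.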
Use the toArrayScrapDOM method to get the results of your query as an array:
print_r($scrap->toArrayScrapDOM());
Use the toStringScrapDOM method to get the results of your query as a string:
print_r($scrap->toStringScrapDOM());
Run the example file
Clone the repo:
git clone https://github.com/andela-fokosun/webscrapa
From the cloned directory, install the dependencies:
composer install
Then, from your terminal, run the example:
php example.php
Run the tests:
vendor/bin/phpunit