helloiamlukas/chrome-php

A PHP Wrapper for Chrome Headless. Get the DOM of any webpage.

  • Friday, June 1, 2018
  • by helloiamlukas

The README.md

A Chrome Headless wrapper for PHP

Get the DOM of any webpage by using headless Chrome. Inspired by Browsershot.

Requirements

This package requires Puppeteer, the Node library for controlling headless Chrome. On Ubuntu 16.04 you can install it like this:

sudo apt-get update
curl -sL https://deb.nodesource.com/setup_8.x | sudo -E bash -
sudo apt-get install -y nodejs gconf-service libasound2 libatk1.0-0 libc6 libcairo2 libcups2 libdbus-1-3 libexpat1 libfontconfig1 libgcc1 libgconf-2-4 libgdk-pixbuf2.0-0 libglib2.0-0 libgtk-3-0 libnspr4 libpango-1.0-0 libpangocairo-1.0-0 libstdc++6 libx11-6 libx11-xcb1 libxcb1 libxcomposite1 libxcursor1 libxdamage1 libxext6 libxfixes3 libxi6 libxrandr2 libxrender1 libxss1 libxtst6 ca-certificates fonts-liberation libappindicator1 libnss3 lsb-release xdg-utils wget
sudo npm install --global --unsafe-perm puppeteer
sudo chmod -R o+rx /usr/lib/node_modules/puppeteer/.local-chromium

Installation

To add this package to your project, install it via Composer by running:

composer require helloiamlukas/chrome-php

Usage

Here is a quick example of how to use this package:

use ChromeHeadless\ChromeHeadless;

$html = ChromeHeadless::url('https://example.com')->getHtml();

Instead of getting the DOM as a string, you can also use the getDOMCrawler() method, which returns a Symfony\Component\DomCrawler\Crawler instance.

use ChromeHeadless\ChromeHeadless;

$dom = ChromeHeadless::url('https://example.com')->getDOMCrawler();

$title = $dom->filter('title')->text();

This makes it easy to filter the DOM for specific elements. See the Symfony DomCrawler documentation for the full Crawler API.
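If you would rather work with the plain HTML string from getHtml(), PHP's built-in DOM extension can parse it as well. The following is only a minimal sketch: the $html value is a hard-coded stand-in for what getHtml() would return, and it assumes the dom extension is enabled.

```php
<?php

// Stand-in for the string returned by ChromeHeadless::url(...)->getHtml().
$html = '<html><head><title>Example Domain</title></head><body><h1>Hi</h1></body></html>';

// Collect parser warnings internally instead of emitting them,
// since real-world markup is often not strictly valid.
libxml_use_internal_errors(true);

$document = new DOMDocument();
$document->loadHTML($html);
libxml_clear_errors();

// Read the text of the first <title> element, if there is one.
$titleNode = $document->getElementsByTagName('title')->item(0);
$title = $titleNode !== null ? trim($titleNode->textContent) : null;

echo $title, PHP_EOL;
```

For anything beyond simple lookups, the Crawler returned by getDOMCrawler() is likely the more convenient choice, since it supports CSS selectors through filter().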

Timeout

You can specify a timeout after which the process will be killed. The timeout should be given in seconds.

ChromeHeadless::url('https://example.com')
                ->setTimeout(10)
                ->getDOMCrawler();

If the process runs out of time, a Symfony\Component\Process\Exception\ProcessTimedOutException is thrown.

Custom Chrome Path

You can specify a custom path to your Chrome installation.

ChromeHeadless::url('https://example.com')
                ->setChromePath('/path/to/chrome')
                ->getDOMCrawler();

Custom User Agent

You can specify a custom user agent. By default, the standard Chrome Headless user agent is used.

ChromeHeadless::url('https://example.com')
                ->setUserAgent('nice-user-agent')
                ->getDOMCrawler();

Custom Headers

You can specify custom headers that will be sent with the request.

ChromeHeadless::url('https://example.com')
                ->setHeaders([
                    'DNT' => 1 // DO NOT TRACK
                ])
                ->getDOMCrawler();

Blacklist

You can specify a list of regular expressions for files that should not be loaded when you request a website. Each expression is checked against the URL of the file.

ChromeHeadless::url('https://example.com')
                ->setBlacklist([
                    'www.google-analytics.com',
                    'analytics.js'
                ])
                ->getDOMCrawler();
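To get a feel for how such patterns behave, here is a minimal sketch that checks a few URLs against a blacklist with PHP's preg_match(). It is not the package's own matching code, which this README does not show. The isBlacklisted() helper and the sample URLs are hypothetical, and the sketch assumes every entry is treated as an unanchored regular expression.

```php
<?php

/**
 * Hypothetical helper: returns true if any blacklist pattern matches the URL.
 * Assumes each entry is an unanchored regular expression without a '~' in it.
 */
function isBlacklisted(string $url, array $patterns): bool
{
    foreach ($patterns as $pattern) {
        // '~' is used as the delimiter so that slashes in URLs need no escaping.
        if (preg_match('~' . $pattern . '~', $url) === 1) {
            return true;
        }
    }

    return false;
}

$blacklist = [
    'www\.google-analytics\.com', // escaped dots match literal dots only
    'analytics\.js$',             // only URLs that end in analytics.js
];

var_dump(isBlacklisted('https://www.google-analytics.com/collect', $blacklist)); // bool(true)
var_dump(isBlacklisted('https://example.com/js/analytics.js', $blacklist));      // bool(true)
var_dump(isBlacklisted('https://example.com/js/app.js', $blacklist));            // bool(false)
```

In a regular expression an unescaped dot matches any character, so a pattern such as 'analytics.js' would also match 'analyticsXjs'. Escaping the dots keeps a pattern from blocking more than intended.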

Viewport

You can specify a custom viewport that will be used when you make a request. By default, the Chrome Headless standard of 800x600 px is used.

ChromeHeadless::url('https://example.com')
                ->setViewport([
                    'width' => 1920,
                    'height' => 1080
                ])
                ->getDOMCrawler();

Testing

You can run the tests with:

composer test

The Versions

  • dev-master (9999999-dev), June 1, 2018
  • v2.0 (2.0.0.0), May 27, 2018
  • v1.2 (1.2.0.0), May 21, 2018
  • v1.1 (1.1.0.0), May 20, 2018
  • v1.0 (1.0.0.0), April 6, 2018
  • dev-analysis-z302LG, April 6, 2018

All versions share the same source (https://github.com/helloiamlukas/chrome-php), description (A PHP Wrapper for Chrome Headless. Get the DOM of any webpage.), license (MIT), and keywords (dom, chrome, webpage, headless).