gufy/pdftohtml-php

PDF to HTML converter with PHP using Poppler-utils

Monday, March 6, 2017
by mgufron
The README.md


PDF to HTML PHP Class

This class lets you use PHP and poppler-utils to convert your PDF files to HTML files.

Important Notes

Please read the usage instructions below; this package has been significantly upgraded and many things in it have changed.

Installation

From your application's root directory, run this command to add the package to your app:

    composer require gufy/pdftohtml-php:~2

Or add this package to the require section of your composer.json:

{
    "require": {
        "gufy/pdftohtml-php": "~2"
    }
}

Requirements

Poppler-Utils (on Ubuntu, install it with apt: sudo apt-get install poppler-utils)
PHP configured with shell access enabled
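
The second requirement can be checked from PHP itself before running a conversion. The snippet below is a minimal sketch and not part of this package: `shellFunctionAvailable()` is a hypothetical helper, and since this page does not say which function the package calls internally, it simply checks the usual candidates. It assumes PHP 7.1 or newer.

```php
<?php
// Hypothetical helper (not part of gufy/pdftohtml-php): reports whether a
// shell-related PHP function exists and is not listed in disable_functions.
function shellFunctionAvailable(string $name, ?string $disabled = null): bool
{
    // Fall back to the live php.ini value when no list is passed in.
    if ($disabled === null) {
        $disabled = (string) ini_get('disable_functions');
    }
    $blocked = array_map('trim', explode(',', $disabled));

    return function_exists($name) && !in_array($name, $blocked, true);
}

foreach (['exec', 'shell_exec', 'proc_open'] as $fn) {
    echo $fn, ': ', shellFunctionAvailable($fn) ? 'available' : 'disabled', PHP_EOL;
}
```

If every candidate is reported as disabled, the hosting configuration most likely blocks shell access and the conversion cannot call poppler-utils.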

Usage

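Here is a sample:

<?php
// if you are using composer, load its autoloader
include 'vendor/autoload.php';

// create the converter for your pdf file
$pdf = new \Gufy\PdfToHtml\Pdf('file.pdf');

// convert to html string
$html = $pdf->html();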

// convert a specific page to html string
$page = $pdf->html(3);

// convert to html and return it as [Dom Object](https://github.com/paquettg/php-html-parser)
$dom = $pdf->getDom();

// get the total number of pages in your pdf
$total_pages = $pdf->getPages();

// if your pdf has more than one page, use this to change the current page of the dom to page 3
$dom->goToPage(3);

// then do as you please with that dom; you can find any element you want
$paragraphs = $dom->find('body > p');

// change pdftohtml bin location
\Gufy\PdfToHtml\Config::set('pdftohtml.bin', '/usr/local/bin/pdftohtml');

// change pdfinfo bin location
\Gufy\PdfToHtml\Config::set('pdfinfo.bin', '/usr/local/bin/pdfinfo');
?>
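
To convert every page rather than a single one, the `getPages()` and `html($page)` calls from the sample can be combined in a loop. This is a sketch under assumptions: `renderAllPages()` is a hypothetical helper, pages are assumed to be numbered from 1 (the sample only shows page 3), and the renderer is passed in as a callable so the loop can be tried without a PDF.

```php
<?php
// Hypothetical helper: calls $renderPage for pages 1..$totalPages and
// returns the results keyed by page number. Assumes 1-based page numbers.
function renderAllPages(int $totalPages, callable $renderPage): array
{
    $pages = [];
    for ($page = 1; $page <= $totalPages; $page++) {
        $pages[$page] = $renderPage($page);
    }
    return $pages;
}

// Stand-in renderer so the sketch runs on its own.
$demo = renderAllPages(3, function (int $page): string {
    return "<p>page {$page}</p>";
});
echo count($demo), ' pages rendered', PHP_EOL;
```

With the package loaded, the call might look like `renderAllPages((int) $pdf->getPages(), function ($n) use ($pdf) { return $pdf->html($n); });`, assuming `getPages()` returns the page count.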

Passing options to getDom

By default, getDom() extracts all images and creates an HTML file per page. You can pass options when extracting HTML:

<?php
$pdfDom = $pdf->getDom(['ignoreImages' => true]);

Available Options

singlePage, default: false
imageJpeg, default: false
ignoreImages, default: false
zoom, default: 1.5
noFrames, default: true
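
To see what a call such as `getDom(['ignoreImages' => true])` ends up with, the listed defaults can be merged with the options you pass. The sketch below only illustrates that idea: `resolveOptions()` is a hypothetical helper, and the package's own merging logic is not shown on this page. The default values are the ones from the list.

```php
<?php
// Defaults copied from the "Available Options" list.
const DEFAULT_OPTIONS = [
    'singlePage'   => false,
    'imageJpeg'    => false,
    'ignoreImages' => false,
    'zoom'         => 1.5,
    'noFrames'     => true,
];

// Hypothetical helper: overlays user options on the defaults and rejects
// names that are not in the list, so typos fail loudly.
function resolveOptions(array $options): array
{
    $unknown = array_diff_key($options, DEFAULT_OPTIONS);
    if ($unknown !== []) {
        throw new InvalidArgumentException(
            'Unknown option(s): ' . implode(', ', array_keys($unknown))
        );
    }
    return array_merge(DEFAULT_OPTIONS, $options);
}

$effective = resolveOptions(['ignoreImages' => true]);
var_export($effective);
```

Rejecting unknown names is a design choice of this sketch; whether the package itself ignores or rejects them is not stated here.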

Usage note for Windows Users

For those who need this package on Windows, there is a way. First, download the latest poppler-utils binary for Windows from http://blog.alivate.com.au/poppler-windows/.

After downloading it, extract it. There will be a directory called bin, which is the one we need. Then change your code like this:

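<?php
// if you are using composer, load its autoloader
include 'vendor/autoload.php';

// point the package at the binaries in the extracted bin directory
// (placeholder paths, replace them with your own)
\Gufy\PdfToHtml\Config::set('pdftohtml.bin', 'C:/path/to/poppler/bin/pdftohtml.exe');
\Gufy\PdfToHtml\Config::set('pdfinfo.bin', 'C:/path/to/poppler/bin/pdfinfo.exe');

$pdf = new \Gufy\PdfToHtml\Pdf('file.pdf');

// convert to html and return it as a Dom Object
$html = $pdf->getDom();

// get the total number of pages in your pdf
$total_pages = $pdf->getPages();

// if your pdf has more than one page, use this to change the current page to page 3
$html->goToPage(3);

// then do as you please with that dom; you can find any element you want
$paragraphs = $html->find('body > p');

?>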

Usage note for OS/X Users

Thanks to @kaleidoscopique for trying this package and making it run on OS/X.

1. Install brew

Brew is a popular package manager for OS/X (similar in style to aptitude): http://brew.sh/

2. Install poppler

brew install poppler

3. Verify the path of pdfinfo and pdftohtml

$ which pdfinfo
/usr/local/bin/pdfinfo

$ which pdftohtml
/usr/local/bin/pdftohtml

4. Whatever the paths are, use Gufy\PdfToHtml\Config::set to set them in your PHP code, using the same paths that the which command reports:

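<?php
// set the paths reported by which before creating the converter
\Gufy\PdfToHtml\Config::set('pdftohtml.bin', '/usr/local/bin/pdftohtml');
\Gufy\PdfToHtml\Config::set('pdfinfo.bin', '/usr/local/bin/pdfinfo');

$pdf = new \Gufy\PdfToHtml\Pdf('file.pdf');
$html = $pdf->html();
?>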
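
Instead of hard-coding the output of `which`, the lookup can also be done in PHP. The following is a rough sketch: `findBinary()` is a hypothetical helper that walks the directories in `PATH`, and only `Config::set` with the `pdftohtml.bin` and `pdfinfo.bin` keys comes from this README.

```php
<?php
// Hypothetical helper: returns the first executable named $name found in
// the given PATH-style string, or null when there is none.
function findBinary(string $name, ?string $path = null): ?string
{
    // Use the PATH environment variable when no search path is passed in.
    if ($path === null) {
        $path = (string) getenv('PATH');
    }
    foreach (explode(PATH_SEPARATOR, $path) as $dir) {
        if ($dir === '') {
            continue;
        }
        $candidate = rtrim($dir, '/\\') . DIRECTORY_SEPARATOR . $name;
        if (is_file($candidate) && is_executable($candidate)) {
            return $candidate;
        }
    }
    return null;
}

foreach (['pdftohtml', 'pdfinfo'] as $tool) {
    $found = findBinary($tool);
    echo $tool, ': ', $found === null ? 'not found in PATH' : $found, PHP_EOL;
    // With the package loaded, the result could be passed on like this:
    // \Gufy\PdfToHtml\Config::set($tool . '.bin', $found);
}
```

A `null` result means the tool is not on the `PATH` seen by PHP, which can differ from the `PATH` of your interactive shell.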

Feedback & Contribute

Open an issue for improvements or any bugs you find. I love to help and solve other people's problems. Thanks :+1:

The Versions

Every version is described as "PDF to HTML converter with PHP using Poppler-utils", is licensed under MIT, and is by Mochamad Gufron. Dates are given as day/month year.

dev-master (9999999-dev), 06/03 2017
v2.0.8 (2.0.8.0), 11/10 2016
v2.0.7 (2.0.7.0), 31/08 2016
v2.0.6 (2.0.6.0), 27/04 2016
v2.0.5 (2.0.5.0), 13/02 2016. Development requires: phpunit/phpunit ~4
v2.0.4 (2.0.4.0), 11/08 2015
v2.0.3 (2.0.3.0), 04/08 2015
v2.0.2 (2.0.2.0), 24/07 2015
v2.0.1 (2.0.1.0), 23/07 2015. Requires: paquettg/php-html-parser ~1
v2.0.0 (2.0.0.0), 23/07 2015. Requires: paquettg/php-html-parser ~1