library shellless
A PHP package to extract readable text from HTML.
sukohi/shellless
- Monday, March 13, 2017
- by Sukohi
Shellless
A PHP package to extract readable text from HTML.
Installation
Run the following command:
composer require sukohi/shellless:1.*
Usage
use Sukohi\Shellless\Shellless;
$html = file_get_contents('http://example.com/');
$shellless = new Shellless();
$result = $shellless->extract($html);
echo $result->title; // Page title
echo $result->best_text; // The longest text
echo $result->full_text; // Joined text, built from texts longer than 100 characters
print_r($result->all_texts); // Array of extracted texts
Options
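$shellless->setOptions([
    'join_step' => 5,          // Join texts separated by fewer than this many HTML tags
    'min_text_length' => 100   // Keep only texts longer than this many characters
]);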
Algorithm
- Join neighboring texts when fewer than 5 HTML tags separate them.
- Keep only the texts that are longer than 100 characters.
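The two steps above can be illustrated with a small, self-contained sketch. This is not the package's actual code: the function name sketchExtract(), its parameters, and the returned keys (kept, longest, joined) are hypothetical, and the way tags are counted is an assumption made only for illustration.

```php
<?php
// Illustrative sketch only; NOT the implementation used by sukohi/shellless.
// sketchExtract() and the returned keys are hypothetical names.

function sketchExtract(string $html, int $joinStep = 5, int $minTextLength = 100): array
{
    // Assumption: script/style bodies are not readable text, so drop them first.
    $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1>#is', '', $html) ?? $html;

    // Split into a flat list of tags and text pieces, keeping the tags as tokens.
    $tokens = preg_split(
        '/(<[^>]+>)/',
        $html,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );

    $groups = [];        // joined runs of neighboring text
    $current = '';       // run currently being built
    $tagsSinceText = 0;  // tags seen since the last non-empty text piece

    foreach ($tokens as $token) {
        if (preg_match('/^<[^>]+>$/', $token) === 1) {
            $tagsSinceText++;
            continue;
        }

        $text = trim(html_entity_decode($token));
        if ($text === '') {
            continue; // whitespace between tags
        }

        // Step 1: too many tags in between, so start a new run.
        if ($current !== '' && $tagsSinceText >= $joinStep) {
            $groups[] = $current;
            $current = '';
        }

        $current = ($current === '') ? $text : $current . ' ' . $text;
        $tagsSinceText = 0;
    }

    if ($current !== '') {
        $groups[] = $current;
    }

    // Step 2: keep runs longer than the minimum.
    // strlen() counts bytes; mb_strlen() would count characters where mbstring is available.
    $kept = array_values(array_filter(
        $groups,
        fn(string $group): bool => strlen($group) > $minTextLength
    ));

    // The longest run overall, regardless of the minimum length.
    $longest = '';
    foreach ($groups as $group) {
        if (strlen($group) > strlen($longest)) {
            $longest = $group;
        }
    }

    return [
        'kept' => $kept,
        'longest' => $longest,
        'joined' => implode("\n", $kept),
    ];
}

// Small thresholds so the effect is visible on a short snippet.
$html = '<p>Hello <b>world</b></p><div><div><div><span>Far away</span></div></div></div>';
$result = sketchExtract($html, 5, 10);
print_r($result['kept']); // "Hello world" is kept; "Far away" is too short
```

In this snippet six tags separate "world" from "Far away", so with a join step of 5 they end up in different runs, while raising the join step to 7 would merge them into a single run.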
License
This package is licensed under the MIT License.
Copyright 2017 Sukohi Kuhoh
Versions
- 1.0.x-dev (1.0.9999999.9999999-dev), MIT, by Sukohi
- dev-master (9999999-dev), MIT, by Sukohi
- 1.0.0 (1.0.0.0), MIT, by Sukohi