php-simple-html-purify
A simple HTML purifier for PHP. This library does not enforce any HTML specification; you have to configure every rule yourself.
Install
composer require rokety/php-simple-html-purify
How it works
                      +-----------+
                      | dirtyHtml |
                      +-----+-----+
                            |
+-------------------------->|<----------------------------+
|                           |                             |
|            +--------------v--------------+              |
|            |  apply Tag BlackList rules  |              |
|            |  apply Tag WhiteList rules  |              |
|            +--------------+--------------+              |
|                           |                             |
+------ No -------- if tag was kept?                      |
                            | Yes                         |
          +-----------------v-----------------+           |
          |  apply Attribute BlackList rules  |           |
          |  apply Attribute WhiteList rules  |           |
          +-----------------+-----------------+           |
                            |                             |
+------ No ----- if attribute was kept?                   |
|                           | Yes                         |
|       +-------------------v-------------------+         |
|       | apply AttributeValue BlackList rules  |         |
|       | apply AttributeValue WhiteList rules  |         |
|       +-------------------+-------------------+         |
|                           |                             |
|                 +---------v---------+                   |
+---------------->| collect valid tag |                   |
                  +---------+---------+                   |
                            |                             |
              have all tags been purified? ---- No -------+
                            | Yes
                 +----------v----------+
                 | generate cleanHtml  |
                 +---------------------+
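The decision order in the diagram can be sketched in plain PHP. This is only an illustration and not the library's implementation: `isKept()` and the array shapes are hypothetical, rules are matched by exact name only, and an empty whitelist is assumed to mean "no whitelist configured". The attribute-value level is left out; it would repeat the same check once more.

```php
<?php
// Hypothetical helper (not part of the library): decides whether one name
// survives a blacklist/whitelist pair, blacklist first.
function isKept(string $name, array $blackList, array $whiteList): bool
{
    // A blacklisted name is always dropped.
    if (in_array($name, $blackList, true)) {
        return false;
    }
    // Assumption: an empty whitelist means no whitelist was configured.
    if ($whiteList === []) {
        return true;
    }
    return in_array($name, $whiteList, true);
}

// Hypothetical pre-parsed input: one entry per tag.
$tags = [
    ['name' => 'script', 'attrs' => []],
    ['name' => 'div', 'attrs' => ['class' => 'box', 'style' => 'color: red']],
];

$kept = [];
foreach ($tags as $tag) {
    // Tag level: a dropped tag skips the attribute checks entirely.
    if (!isKept($tag['name'], ['script'], [])) {
        continue;
    }
    // Attribute level: a dropped attribute does not drop the tag.
    $attrs = [];
    foreach ($tag['attrs'] as $attrName => $value) {
        if (isKept($attrName, ['class'], [])) {
            $attrs[$attrName] = $value;
        }
    }
    $kept[] = ['name' => $tag['name'], 'attrs' => $attrs];
}

echo json_encode($kept), PHP_EOL;
// prints: [{"name":"div","attrs":{"style":"color: red"}}]
```

Here the `script` tag is rejected at the tag level, while `div` is collected with only its `style` attribute, because `class` is rejected at the attribute level.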
Example
Filter tags:
<?php
use PHPSimpleHtmlPurify\Purifier;
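use PHPSimpleHtmlPurify\Tag;
// The attribute examples further down also need these two classes.
// Assumption: they live in the same namespace as Purifier and Tag.
use PHPSimpleHtmlPurify\Attribute;
use PHPSimpleHtmlPurify\AttributeValue;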
require './vendor/autoload.php';
$dirtyHtml = '
';
$htmlPurifier = new Purifier();
$htmlPurifier->tagBlackList(new Tag('script')); // add script to the tag blacklist rules
echo $htmlPurifier->purify($dirtyHtml);//output:
$htmlPurifier = new Purifier();
$htmlPurifier->tagWhiteList(new Tag(['p', 'div'])); // add p and div to the tag whitelist rules
echo $htmlPurifier->purify($dirtyHtml);//output:
// Tag names also support regular expressions; see tests/*Test.php in the source directory.
Filter attributes:
$dirtyHtml = '
';
$htmlPurifier = new Purifier();
$htmlPurifier->attrBlackList(new Attribute('class')); // add class to the attribute blacklist; applies to all tags
echo $htmlPurifier->purify($dirtyHtml);//output:
$dirtyHtml = '
';
$htmlPurifier = new Purifier();
$htmlPurifier->attrWhiteList(new Attribute('style', false, new Tag('div'))); // add style to the attribute whitelist; applies to div tags only
echo $htmlPurifier->purify($dirtyHtml);//output:
// Attribute names also support regular expressions; see tests/*Test.php in the source directory.
Filter attribute values:
$dirtyHtml = '
';
$htmlPurifier = new Purifier();
$htmlPurifier->attrValueBlackList(new AttributeValue('/position *: *absolute;?/', true, new Attribute('style'))); // add a regular expression to the attribute value blacklist for style; applies to all tags
echo $htmlPurifier->purify($dirtyHtml);//output:
$dirtyHtml = '
';
$htmlPurifier = new Purifier();
$htmlPurifier->attrValueWhiteList(new AttributeValue(['/color: *#\d+;?/', '/font-size: *\d+px;?/'], true, new Attribute('style', false, new Tag('div')))); // add two regular expressions to the attribute value whitelist for style; applies to div tags only
echo $htmlPurifier->purify($dirtyHtml);//output:
For more use cases, see tests/*Test.php in the source directory.