cms39_htmLawed
PHP code to purify & filter HTML
make HTML markup in text secure and standard-compliant
process text for use in HTML, XHTML or XML documents
restrict HTML elements, attributes or URL protocols using black- or white-lists
balance tags, check element nesting, transform deprecated attributes and tags, make relative URLs absolute, etc.
fast, highly customizable, well-documented
single, 48 kb file
simple HTML Tidy alternative
free and licensed under LGPL v3 and GPL v2+
use to filter, secure & sanitize HTML in blog comments or forum posts, generate XML-compatible feed items from web-page excerpts, convert HTML to XHTML, pretty-print HTML, scrape web-pages, reduce spam, remove XSS code, etc.
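As a rough illustration of the comment-filtering use case above, the sketch below calls htmLawed() with a whitelist-style configuration. The location of htmLawed.php, the sample input and the particular element list are assumptions made for this example; the configuration keys should be checked against the htmLawed documentation for the version in use.

```php
<?php
// Assumption: htmLawed.php has been downloaded next to this script.
$library = __DIR__ . '/htmLawed.php';

// Whitelist-style configuration for untrusted comment text.
$config = array(
    'safe'           => 1,   // ask for htmLawed's restrictive anti-XSS defaults
    'elements'       => 'a, b, blockquote, br, em, i, li, ol, p, strong, ul',
    'deny_attribute' => 'style',
    'schemes'        => 'href: http, https, mailto',
);

// Hypothetical untrusted input, such as a blog comment.
$untrusted = '<p onclick="steal()">Hi <b>there<script>alert(1)</script></p>';

if (is_file($library)) {
    require_once $library;
    // htmLawed() returns the filtered markup as a string.
    echo htmLawed($untrusted, $config), "\n";
} else {
    echo "htmLawed.php not found; nothing was filtered.\n";
}
```

The guard around require_once only keeps the sketch runnable when the library file is absent; a real application would normally treat a missing library as a fatal error.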
htmLawed 1.2 beta change-log
**** 1.2.beta.7, 19 January 2015 release ****
Fix for a bug in cleaning of soft-hyphens in URL values, etc.
**** 1.2.beta.6, 2 August 2014 release ****
Fix for a potential security vulnerability arising from specially encoded text with serial opening tags
**** 1.2.beta.5, 12 March 2014 release ****
Incorporated the one change made for htmLawed 1.1.17 for PHP 5.5 compatibility
**** 1.2.beta.4, 11 September 2013 release ****
Incorporated the changes made from htmLawed 1.1.14 to 1.1.16: improved Tidy functionality and detection of specially crafted URL protocols
**** 1.2.beta.3, 6 August 2013 release ****
Improved checking for valid nesting within the a element
**** 1.2.beta.2, 9 June 2013 release ****
Support for new (HTML5) element: main
Support for HTML5's custom data-* attributes
Checks that the value of config. parameter 'unique_ids' does not have a non-word character
The value attribute of li, border of img, and start and type of ol are no longer considered as deprecated
**** 1.2.beta.1, 2 June 2013 release ****
Support for new (HTML5) element: hgroup
Corrected checking for child elements allowed within a to comply with HTML5 specification
**** 1.2.beta, 26 May 2013 release ****
Support for new (HTML5) elements: article, aside, audio, bdi, canvas, command, data, datalist, details, figcaption, figure, footer, header, keygen, mark, meter, nav, output, progress, section, source, summary, time, track, video, wbr (of these, audio, canvas, and video are considered 'unsafe' elements)
Support for these existing (HTML4 'head') elements: link, meta and style (for use within HTML 'body')
Changes in handling of previously supported elements, for compliance with current HTML(5) specification: embed, menu, and u are no longer deprecated (i.e., no tag transformation [when enforced]); acronym will get tag-transformed to abbr; big will get tag-transformed to span (with font-size:larger) and tt will get tag-transformed to code; address is not allowed within address; embed is not allowed within a or button; a is no longer considered an inline element
Global attributes, for compliance with current HTML(5) specification: many new global attributes, including inert and translate, 35 WAI-ARIA attributes like aria-busy, and the 5 item* attributes like itemscope for microdata specification; accesskey, tabindex, xml:space and xmlns are now global attributes; previous restrictions on global attributes (e.g., id not allowed in script) are removed; now 54 instead of 10 event attributes like onclick and oncuechange; the id attribute is now allowed to have any value, as long as it is unique (when enforced), is not an empty string, and does not contain space characters
Support for new or previously deprecated attributes of previously supported elements, for compliance with current HTML(5) specification -- a: download, media, ping, target; area: hreflang, media, rel, target, type; button, input: formaction, formenctype, formmethod, formnovalidate, formtarget; button, input, select, textarea: autofocus, required; button, fieldset, input, label, object, select, textarea: form; embed: hspace, vspace; fieldset: disabled, name; form: novalidate; iframe: sandbox, seamless, srcdoc; img: crossorigin; input: autocomplete, height, list, max, min, multiple, pattern, step, width; input, textarea: dirname, placeholder; menu: type, label; object: typemustmatch; ol: reversed; script: async, type; textarea: maxlength, wrap
Support for new values for the type attribute of input: tel, search, url, email, datetime, date, month, week, time, datetime-local, number, range, color.
Functions kses() and kses_hook() have been removed.
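The relaxed id rule in the 1.2.beta notes (any value is accepted as long as it is non-empty, contains no space characters, and is unique when uniqueness is enforced) can be pictured with the stand-alone sketch below. This is not htmLawed's own code: the function name idIsAcceptable(), its parameters and the sample values are hypothetical, and "space characters" is assumed here to mean the HTML5 set of space, tab, line feed, form feed and carriage return.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper mirroring the rule described in the change-log.
 * $seen records the ids accepted so far, keyed by id value.
 */
function idIsAcceptable(string $value, array &$seen, bool $enforceUnique = true): bool
{
    if ($value === '') {
        return false;                       // empty string is rejected
    }
    if (preg_match('/[ \t\n\f\r]/', $value) === 1) {
        return false;                       // contains a space character
    }
    if ($enforceUnique && isset($seen[$value])) {
        return false;                       // duplicate while uniqueness is enforced
    }
    $seen[$value] = true;                   // remember the accepted id
    return true;
}

$seen = array();
$results = array();
foreach (array('intro', 'intro', 'my id', '', '42') as $candidate) {
    $results[] = idIsAcceptable($candidate, $seen);
}
var_dump($results);
```

Note that a purely numeric value such as 42 passes under this rule, which HTML4 would not have allowed for id.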