Ariadne Component Library: html writer and parser Component

  • Thursday, January 28, 2016
  • by Auke
  • PHP

The README.md

arc/html

This component provides a unified HTML parser and writer. The writer lets you produce readable and correct HTML in code, without using templates. The parser is a wrapper around both DOMDocument and SimpleXML.

The parser and writer also work on fragments of HTML. The parser also makes sure that the output is identical to the input. When converting a node to a string, \arc\html returns the full HTML string, including tags. If you don't want that, you can always access the 'nodeValue' property to get the original SimpleXMLElement.

Finally, the parser also adds the ability to use basic CSS selectors to find elements in the HTML.

    use \arc\html as h;
    $htmlString = h::doctype()
     .h::html(
        h::head(
            h::title('Example site')
        ),
        h::body(
            ['class' => 'homepage'],
            h::h1('An example site')
        )
     );
    $html = \arc\html::parse($htmlString);
    $title = $html->head->title->nodeValue; // SimpleXMLElement 'Example site'
    $titleTag = $html->head->title; // <title>Example site</title>
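
The writer calls above read like nested function calls, one per tag. As a rough illustration of how such an interface can be built in plain PHP, here is a minimal sketch using __callStatic. The classes `H` and `Node` are hypothetical stand-ins invented for this example; they are not the \arc\html implementation and skip things it provides, such as indentation and the doctype helper.

```php
<?php
// Hypothetical stand-in for a tag writer; NOT the \arc\html source code.
final class Node
{
    private $markup;

    public function __construct(string $markup)
    {
        $this->markup = $markup;
    }

    public function __toString(): string
    {
        return $this->markup;
    }
}

final class H
{
    // PHP routes any undefined static call, such as H::title(...), here.
    public static function __callStatic(string $tag, array $args): Node
    {
        $attributes = '';
        $content = '';
        foreach ($args as $arg) {
            if (is_array($arg)) {
                // Arrays are treated as attribute lists.
                foreach ($arg as $name => $value) {
                    $attributes .= sprintf(' %s="%s"', $name, htmlspecialchars((string) $value));
                }
            } elseif ($arg instanceof Node) {
                // Child elements built earlier are inserted verbatim.
                $content .= $arg;
            } else {
                // Anything else is text content and gets escaped.
                $content .= htmlspecialchars((string) $arg);
            }
        }
        return new Node("<$tag$attributes>$content</$tag>");
    }
}

$markup = (string) H::body(['class' => 'homepage'], H::h1('An example site'));
echo $markup, "\n";
```

Returning a small object instead of a plain string lets the sketch tell already-built child elements apart from text, so text can be escaped without double-escaping nested tags.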

CSS selectors

    $title = current($html->find('title'));

The find() method always returns an array, which may be empty. By using current() you get the first element found, or false if nothing was found.

The following CSS selectors are supported:

  • tag1 tag2
    This matches tag2 if it is a descendant of tag1.
  • tag1 > tag2
    This matches tag2 if it is a direct child of tag1.
  • tag:first-child
    This matches tag only if it is the first child.
  • tag1 + tag2
    This matches tag2 only if it is immediately preceded by tag1.
  • tag1 ~ tag2
    This matches tag2 only if it has a previous sibling tag1.
  • tag[attr]
    This matches tag if it has the attribute attr.
  • tag[attr="foo"]
    This matches tag if it has the attribute attr with the value foo in its value list.
  • tag#id
    This matches any tag with the id id.
  • #id
    This matches any element with the id id.
  • tag.class-name
    This matches any tag with the class class-name.
  • .class-name
    This matches any element with the class class-name.
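
One common way to support selectors like these on top of DOMDocument is to translate them to XPath. The sketch below uses hand-written XPath equivalents for a few of the selectors in the list and runs them with PHP's built-in DOMXPath. It is only an illustration of the idea; whether \arc\html works this way internally is an assumption, and the sample markup is made up.

```php
<?php
// Illustrative only: hand-written XPath equivalents for some CSS selectors.
$doc = new DOMDocument();
$doc->loadHTML(
    '<html><body><ul id="menu">'
    . '<li class="first item">One</li><li class="item">Two</li>'
    . '</ul></body></html>'
);
$xpath = new DOMXPath($doc);

$selectors = [
    'ul li'          => '//ul//li',
    'ul > li'        => '//ul/li',
    'li:first-child' => '//li[not(preceding-sibling::*)]',
    'li + li'        => '//li/following-sibling::*[1][self::li]',
    'li[class]'      => '//li[@class]',
    '#menu'          => '//*[@id="menu"]',
    // Class matching has to look at whole words in the class list.
    '.item'          => '//*[contains(concat(" ", normalize-space(@class), " "), " item ")]',
];

$counts = [];
foreach ($selectors as $css => $query) {
    // Record how many nodes each translated query matches.
    $counts[$css] = $xpath->query($query)->length;
    printf("%-16s matches %d node(s)\n", $css, $counts[$css]);
}
```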

SimpleXML

The parsed HTML behaves almost identically to a SimpleXMLElement, with the exceptions noted above. So you can access attributes just as SimpleXMLElement allows:

    $class = $html->html->body['class'];
    $class = $html->html->body->attributes('version');

You can walk through the node tree:

    $title = $html->html->head->title;

Any method or property available in SimpleXMLElement is included in \arc\html parsed data.

DOMElement

In addition to SimpleXMLElement methods, you can also call any method and access most properties available in DOMElement:

    $class = $html->html->body->getAttribute('class');
    $title = current($html->getElementsByTagName('title'));
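
For comparison, this is roughly what the same lookups cost in plain PHP, where you convert between the two APIs yourself with dom_import_simplexml() and simplexml_import_dom(). The sketch uses only built-in PHP functions; the sample markup is made up for the example.

```php
<?php
// Plain PHP, without \arc\html: switching between SimpleXML and DOM by hand.
$sxe = simplexml_load_string(
    '<html><head><title>Example site</title></head><body class="homepage"/></html>'
);

// SimpleXML style: attribute via array access on the child element.
$classViaSimpleXml = (string) $sxe->body['class'];

// Convert the same node to a DOMElement to use DOM methods on it.
$bodyDom = dom_import_simplexml($sxe->body);
$classViaDom = $bodyDom->getAttribute('class');

// getElementsByTagName() returns a DOMNodeList, which needs converting
// before array functions such as current() can be used on it.
$titleList = iterator_to_array(
    $bodyDom->ownerDocument->getElementsByTagName('title'),
    false
);
$titleText = $titleList[0]->nodeValue;

// And back to SimpleXML again.
$titleSxe = simplexml_import_dom($titleList[0]);

echo $classViaSimpleXml, ' ', $classViaDom, ' ', $titleText, "\n";
```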

Parsing fragments

The arc\html parser also accepts partial HTML content. It doesn't require a single root element.

    $htmlString = <<< EOF
<li>
    <a href="anitem/">An item</a>
</li>
<li>
    <a href="anotheritem/">Another item</a>
</li>
EOF;
    $html = \arc\html::parse($htmlString);
    $links = $html->find('a');

And when you convert the HTML back to a string, it will still be a partial HTML fragment.
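
To see why this matters, here is what plain DOMDocument does with a similar fragment. This sketch uses only built-in PHP classes and shows that the round trip adds a doctype plus html and body wrappers that were not in the input; extracting the fragment again means serialising the body's children yourself.

```php
<?php
// Plain DOMDocument, without \arc\html: the fragment does not round-trip.
$fragment = '<li><a href="anitem/">An item</a></li>'
    . '<li><a href="anotheritem/">Another item</a></li>';

libxml_use_internal_errors(true); // keep any parser warnings out of the output
$doc = new DOMDocument();
$doc->loadHTML($fragment);

// The serialised document now contains markup that was never in the input.
$roundTrip = $doc->saveHTML();
echo $roundTrip;

// Getting the fragment back means serialising each child of <body> by hand.
$inner = '';
$body = $doc->getElementsByTagName('body')->item(0);
foreach ($body->childNodes as $child) {
    $inner .= $doc->saveHTML($child);
}
echo $inner, "\n";
```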

If you parse a single HTML tag, other than <html>, you must still reference this element to access it:

    $htmlString = <<< EOF
<ul>
    <li>
        <a href="anitem/">An item</a>
    </li>
    <li>
        <a href="anotheritem/">Another item</a>
    </li>
</ul>
EOF;
    $html = \arc\html::parse($htmlString);
    $ul = $html->ul;

Why use this instead of DOMDocument or SimpleXML?

arc\html::parse has the following differences:

  • When converted to a string, it returns the original HTML, without additions you didn't make.
  • You can use it with partial HTML fragments.
  • No need to remember to call importNode() before appendChild() or insertBefore().
  • No need to switch between SimpleXML and DOMDocument because you need that one method that is only available in the other API.
  • When returning a list of elements, you always get a simple array, not a magic NodeList.
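
The importNode() point refers to a plain DOMDocument rule: a node belongs to the document that created it and cannot be appended to another document directly. The sketch below shows that behaviour using only built-in PHP classes; the sample markup is made up for the example.

```php
<?php
// Plain DOMDocument, without \arc\html: moving a node between documents.
$source = new DOMDocument();
$source->loadXML('<ul><li>An item</li></ul>');

$target = new DOMDocument();
$target->loadXML('<ul/>');

$item = $source->getElementsByTagName('li')->item(0);

// Appending a node owned by another document is rejected.
try {
    $target->documentElement->appendChild($item);
    $directAppendFailed = false;
} catch (DOMException $e) {
    $directAppendFailed = true;
}

// The node has to be imported (here as a deep copy) into the target first.
$imported = $target->importNode($item, true);
$target->documentElement->appendChild($imported);

$result = $target->saveXML($target->documentElement);
echo $result, "\n";
```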

In addition, arc\html doubles as a simple way to generate valid and indented HTML, with readable and self-validating code.

The Versions

  • dev-master (9999999-dev), 28/01 2016
  • 1.0.1 (1.0.1.0), 28/01 2016
  • 1.0 (1.0.0.0), 24/01 2016
  • 0.9 (0.9.0.0), 23/01 2016
  • 0.1 (0.1.0.0), 17/01 2016

All versions: Ariadne Component Library: html writer and parser Component, by Auke van Slooten, MIT license, https://github.com/Ariadne-CMS/arc/wiki