PHP Based Docx Parser, based on original work by PhilGale92/docx
This PHP-based parser takes any docx file and creates a PHP array containing its structure, content & style information. Import any style data (as demonstrated within index.php) using the Word style name & any desired attributes, then run the parser.
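For orientation, a docx file is a zip archive whose main text lives in `word/document.xml`. The sketch below is not this parser's API; `readDocxParagraphs()` is a hypothetical helper that uses only PHP's bundled `ZipArchive` and DOM extensions to show the kind of raw input the parser works from. It assumes the zip and dom extensions are enabled.

```php
<?php
// Hypothetical helper for illustration only; it is NOT part of this parser.
// Returns the plain text of every paragraph (w:p) in a docx file.
function readDocxParagraphs(string $path): array
{
    $zip = new ZipArchive();
    if ($zip->open($path) !== true) {
        throw new RuntimeException("Cannot open $path as a zip archive");
    }
    // The main document body is stored under this fixed entry name.
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();
    if ($xml === false) {
        throw new RuntimeException('word/document.xml not found');
    }

    $dom = new DOMDocument();
    $dom->loadXML($xml);
    $xpath = new DOMXPath($dom);
    // WordprocessingML main namespace.
    $xpath->registerNamespace(
        'w',
        'http://schemas.openxmlformats.org/wordprocessingml/2006/main'
    );

    $paragraphs = [];
    foreach ($xpath->query('//w:p') as $p) {
        $text = '';
        // A paragraph's text is split across runs; w:t holds the text itself.
        foreach ($xpath->query('.//w:t', $p) as $t) {
            $text .= $t->textContent;
        }
        $paragraphs[] = $text;
    }
    return $paragraphs;
}
```

Calling `readDocxParagraphs('example.docx')` (the file name is a placeholder) would return one string per paragraph, with no style, list, or table information. Recovering that information is what the parser adds.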
====
Supports:
- Word styles
- Paragraphs
- Text indentation / tabbing
- Nested lists (& inline lists)
- Tables (vertical cell merging + colspans)
- Images (& finding the required image size)
- Hyperlinks (with mailto: support)
- Bold / underlined / italic text
- Textboxes (parser support added, but not rendered)
- Table of contents functionality (you will likely need to extend the docx class & modify the ->render() method)
====
Known Bugs:
| Example (incorrect render) |
|---|
| cell 1 |
| cell 2 |
| cell 3 |
| cell 4 |
Cells 1 and 2 are vertically merged. Then there is a border, and cells 3 and 4 are merged. The renderer cannot differentiate between consecutive vertical merges that have no standard cell between them.
The following layout is fine, as cell 3 is a standard cell dividing the two vertical merges:
| Example (works) |
|---|
| cell 1 |
| cell 2 |
| -------- |
| cell 3 |
| cell 4 |
| cell 5 |
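As background to this bug: in WordprocessingML, a cell that starts a vertical merge carries `<w:vMerge w:val="restart"/>`, and each cell that continues it carries `<w:vMerge/>`. The sketch below is not the library's renderer. `groupVerticalMerges()` is a hypothetical helper, and it assumes the caller has already reduced one table column to a list of `'restart'`, `'continue'`, or `null` (standard cell) values. It shows one way back-to-back merges could be separated by treating every `restart` as the start of a new span.

```php
<?php
// Hypothetical helper for illustration only; it is NOT part of this parser.
// $states holds one entry per row of a single column:
//   'restart'  => cell starts a vertical merge
//   'continue' => cell continues the merge above it
//   null       => standard (unmerged) cell
// Assumes well-formed input, i.e. 'continue' only follows a merged cell.
function groupVerticalMerges(array $states): array
{
    $spans = [];
    foreach (array_values($states) as $row => $state) {
        if ($state === 'continue' && $spans !== []) {
            // Extend the span that is currently open.
            $spans[count($spans) - 1]['rowspan']++;
            continue;
        }
        // 'restart' and standard cells both open a new span,
        // so two adjacent merges never collapse into one.
        $spans[] = ['start' => $row, 'rowspan' => 1];
    }
    return $spans;
}

// The failing layout above: cells 1 + 2 merged, then cells 3 + 4 merged.
$column = ['restart', 'continue', 'restart', 'continue'];
$spans = groupVerticalMerges($column);
```

For the failing layout this produces two spans of two rows each, starting at rows 0 and 2, rather than a single four-row span.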
====
Caveats:
====
Requirements: