library php-unstructured-text-parser

A PHP Class to help extract text out of documents that are not structured in a processing friendly manner

dpolocalbrycej/php-unstructured-text-parser

  • Thursday, April 12, 2018
  • by DPOLocalBryceJ
  • Repository
  • 1 Watchers
  • 0 Stars
  • 3 Installations
  • PHP
  • 0 Dependents
  • 0 Suggesters
  • 11 Forks
  • 0 Open issues
  • 6 Versions
  • 0 % Grown

The README.md

Unstructured Text Parser [PHP]

About this Class

This is a PHP class that helps extract text from documents that are not structured in a processing-friendly way. For example, to parse text out of form-generated emails, you create a template that matches the expected mail format and marks the variable text elements; the class then extracts those variables from the body text of incoming mails.

Useful when you want to parse data out of:

  • Emails generated from web forms
  • Documents with definable templates / expressions

Installation

1- Using Composer, run the following:

$ composer require dpolocalbrycej/php-unstructured-text-parser

2- Clone / Copy the files from this repository to your local libs directory:

$ git clone https://github.com/dpolocalbrycej/php-unstructured-text-parser.git

Usage example

<?php
include_once __DIR__ . '/../vendor/autoload.php';

$parser = new dpolocalbrycej\UnstructuredTextParser\TextParser('/path/to/templatesDirectory');

$textToParse = 'Text to be parsed, fetched from a file, mail, web service, or even added directly to a string variable like this';

print_r($parser->parseText($textToParse)); //performs brute force parsing against all available templates

print_r($parser->parseText($textToParse, true)); //slower, performs a similarity check on available templates before parsing
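
The second call above runs a similarity check before parsing. How the class scores templates is not shown in this README, so the following is only a minimal sketch of one possible approach, assuming PHP's built-in `similar_text()` as the similarity measure. The helper name `pickClosestTemplate()` and the sample templates are hypothetical and not part of the library.

```php
<?php
// Hypothetical helper, NOT the library's implementation: returns the name of
// the template whose raw text is most similar to the input text.
function pickClosestTemplate(array $templates, string $text): ?string
{
    $bestName = null;
    $bestScore = -1.0;

    foreach ($templates as $name => $template) {
        // similar_text() writes the similarity percentage into its third argument.
        similar_text($template, $text, $percent);

        if ($percent > $bestScore) {
            $bestScore = $percent;
            $bestName = (string) $name;
        }
    }

    // null when no templates were supplied.
    return $bestName;
}

// Assumed sample data for illustration only.
$templates = [
    'contact' => "Name: {%sender_name%}\nE-Mail: {%sender_email%}",
    'invoice' => "Invoice number: {%number%}\nTotal due: {%total%}",
];
$text = "Name: Pet Cat\nE-Mail: email@example.com";

echo pickClosestTemplate($templates, $text), PHP_EOL;
```

Scoring every template costs extra time up front, which fits the "slower" note in the comment above, but it lets the parser try the most likely template first instead of attempting all of them.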

Parsing Procedure

1- Grab a single copy of the text you want to parse.

2- Replace every piece of varying text in it with a named variable of the form {%VariableName%}.

3- Add the template file to the templates directory (the one passed to the parser's constructor) with a .txt extension, e.g. fileName.txt.

4- Pass the text you wish to parse to the parseText method of the class and let it do the rest.
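
To make the procedure above more concrete, here is a minimal sketch of how a `{%VariableName%}` template could be matched against text: the template is turned into a regular expression with one named capture group per variable. This illustrates the general technique only and is not necessarily how the class works internally. The helper names `templateToRegex()` and `extractVariables()` are hypothetical.

```php
<?php
// Hypothetical helper, NOT the library's implementation: converts a template
// with {%name%} placeholders into a regex with named capture groups.
function templateToRegex(string $template): string
{
    // Split into alternating literal text (even indexes) and variable names (odd indexes).
    $parts = preg_split('/\{%(\w+)%\}/', $template, -1, PREG_SPLIT_DELIM_CAPTURE);

    $regex = '';
    foreach ($parts as $i => $part) {
        $regex .= ($i % 2 === 0)
            ? preg_quote($part, '/')       // literal text must match exactly
            : '(?<' . $part . '>.*?)';     // variable becomes a lazy named group
    }

    // The "s" modifier lets a variable span several lines.
    return '/^' . $regex . '$/s';
}

// Returns the captured variables keyed by name, or null if the text does not match.
function extractVariables(string $template, string $text): ?array
{
    if (preg_match(templateToRegex($template), $text, $matches) !== 1) {
        return null;
    }

    // preg_match() also returns numeric indexes; keep only the named groups.
    return array_filter($matches, 'is_string', ARRAY_FILTER_USE_KEY);
}

$template = "Name: {%sender_name%}\nE-Mail: {%sender_email%}";
$text = "Name: Pet Cat\nE-Mail: email@example.com";

print_r(extractVariables($template, $text));
```

In this sketch every literal character of the template, including whitespace and line breaks, must appear unchanged in the text, so it is less forgiving than a production parser would likely need to be.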

Template Example

If the text document you want to parse looks like this:

Hi GitHub-er,
If you wish to parse message coming from a website that states info like:
Name: Pet Cat
E-Mail: email@example.com
Comment: Some text goes here

Thank You,
Best Regards
Admin

Then your template file (example_template.txt) should be:

Hi {%name_of_receiver%},
If you wish to parse message coming from a website that states info like:
Name: {%sender_name%}
E-Mail: {%sender_email%}
Comment: {%comment%}

Thank You,
Best Regards
Admin

The output of a successful parsing job would be:

Array(
    'name_of_receiver' => 'GitHub-er',
    'sender_name' => 'Pet Cat',
    'sender_email' => 'email@example.com',
    'comment' => 'Some text goes here'
)

The Versions

12/04 2018

dev-master

9999999-dev https://github.com/DPOLocalBryceJ/php-unstructured-text-parser

MIT

The Requires

 

text parser extract data php parser templates parsing regex parsing form parsing text parse

12/04 2018

1.2.7

1.2.7.0 https://github.com/DPOLocalBryceJ/php-unstructured-text-parser

MIT

The Requires

 

text parser extract data php parser templates parsing regex parsing form parsing text parse

12/04 2018

1.2.5

1.2.5.0 https://github.com/DPOLocalBryceJ/php-unstructured-text-parser

MIT

The Requires

 

text parser extract data php parser templates parsing regex parsing form parsing text parse

14/10 2017

1.2.0

1.2.0.0 https://github.com/aymanrb/php-unstructured-text-parser

MIT

The Requires

 

The Development Requires

text parser extract data php parser templates parsing regex parsing form parsing text parse

14/10 2017

1.1.0-beta

1.1.0.0-beta https://github.com/aymanrb/php-unstructured-text-parser

MIT

The Requires

 

The Development Requires

text parser extract data php parser

02/11 2014

v1.0.1-beta

1.0.1.0-beta https://github.com/aymanrb/php-unstructured-text-parser

MIT

The Requires

  • php >=5.3.0

 

The Development Requires

text parser extract data php parser