library apache-log-parser

PHP library to parse Apache log files

benmorel/apache-log-parser

  • Wednesday, May 30, 2018
  • by BenMorel

The README.md

Apache Log Parser

A PHP library to parse Apache logs.

Installation

This library is installable via Composer. Just run:

composer require benmorel/apache-log-parser

Requirements

This library requires PHP 7.1 or later.

Project status & release process

This library is under development.

The current releases are numbered 0.x.y. When a non-breaking change is introduced (adding new methods, optimizing existing code, etc.), y is incremented.

When a breaking change is introduced, a new 0.x version cycle is always started.

It is therefore safe to lock your project to a given release cycle, such as 0.1.*.

If you need to upgrade to a newer release cycle, check the release history for a list of changes introduced by each further 0.x.0 version.

Package contents

This library provides a single class, Parser.

Quick start

First construct a Parser object with the LogFormat defined in the httpd.conf file of the server that generated the log file:

use BenMorel\ApacheLogParser\Parser;

$logFormat = "%h %l %u %t \"%{Host}i\" \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"";
$parser = new Parser($logFormat);

The library converts every format string of your log format to a field name; the list of fields can be accessed through the getFieldNames() method:

var_export(
    $parser->getFieldNames()
);

array (
  0 => 'remoteHostname',
  1 => 'remoteLogname',
  2 => 'remoteUser',
  3 => 'time',
  4 => 'requestHeader:Host',
  5 => 'firstRequestLine',
  6 => 'status',
  7 => 'responseSize',
  8 => 'requestHeader:Referer',
  9 => 'requestHeader:User-Agent',
)

You're then ready to parse a single line of your log file. The parse() method accepts the log line and a boolean that selects the result format. Pass false to get a numeric array whose keys match those of the field names array:

$line = '1.2.3.4 - - [30/May/2018:15:00:23 +0200] "www.example.com" "GET / HTTP/1.0" 200 1234 "-" "Mozilla/5.0"';

var_export(
    $parser->parse($line, false)
);

array (
  0 => '1.2.3.4',
  1 => '-',
  2 => '-',
  3 => '30/May/2018:15:00:23 +0200',
  4 => 'www.example.com',
  5 => 'GET / HTTP/1.0',
  6 => '200',
  7 => '1234',
  8 => '-',
  9 => 'Mozilla/5.0',
)

Or as an associative array, with the field names as keys:

var_export(
    $parser->parse($line, true)
);

array (
  'remoteHostname'           => '1.2.3.4',
  'remoteLogname'            => '-',
  'remoteUser'               => '-',
  'time'                     => '30/May/2018:15:00:23 +0200',
  'requestHeader:Host'       => 'www.example.com',
  'firstRequestLine'         => 'GET / HTTP/1.0',
  'status'                   => '200',
  'responseSize'             => '1234',
  'requestHeader:Referer'    => '-',
  'requestHeader:User-Agent' => 'Mozilla/5.0',
)
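The values in the arrays above are all strings. As a minimal sketch, assuming the default %t layout shown in the example line, a record could be converted to native types with standard PHP functions. normalizeRecord() is a hypothetical helper for illustration, not part of the library:

```php
<?php
declare(strict_types=1);

/**
 * Converts selected string fields of a parsed record to native PHP types.
 * normalizeRecord() is a hypothetical helper, not part of the library.
 */
function normalizeRecord(array $record): array
{
    // Default %t layout: day/month/year:hour:minute:second zone.
    $time = DateTimeImmutable::createFromFormat('d/M/Y:H:i:s O', $record['time']);
    if ($time === false) {
        throw new InvalidArgumentException('Unexpected time format: ' . $record['time']);
    }

    $record['time'] = $time;
    $record['status'] = (int) $record['status'];
    // With %b, Apache logs "-" rather than 0 when no bytes were sent.
    $record['responseSize'] = $record['responseSize'] === '-' ? 0 : (int) $record['responseSize'];

    return $record;
}

// Sample values taken from the example record above.
$record = normalizeRecord([
    'time'         => '30/May/2018:15:00:23 +0200',
    'status'       => '200',
    'responseSize' => '1234',
]);

echo $record['time']->format(DATE_ATOM), PHP_EOL; // 2018-05-30T15:00:23+02:00
```

A custom %{FORMAT}t field would need a matching format string, so this sketch only covers the plain %t case.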

If a line cannot be parsed, an InvalidArgumentException is thrown. Be sure to wrap your parse() calls in a try-catch block:

try {
    $parser->parse($line, true);
} catch (\InvalidArgumentException $e) {
    // ...
}
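When processing a whole file, the same try-catch pattern can be applied per line so that one malformed entry does not abort the run. The following is only a sketch: parseLines() is a hypothetical helper, and it takes the parsing step as a callable so that it does not assume anything about the library beyond the parse() behaviour described above.

```php
<?php
declare(strict_types=1);

/**
 * Parses every line with the given callable and collects the failures.
 * parseLines() is a hypothetical helper, not part of the library.
 *
 * @param iterable $lines     Log lines, e.g. an array or an SplFileObject.
 * @param callable $parseLine Receives one line and returns the parsed record,
 *                            or throws InvalidArgumentException.
 */
function parseLines(iterable $lines, callable $parseLine): array
{
    $records = [];
    $failures = [];

    foreach ($lines as $number => $line) {
        $line = rtrim((string) $line, "\r\n");
        if ($line === '') {
            continue; // skip blank lines, such as the one after the last newline
        }

        try {
            $records[] = $parseLine($line);
        } catch (InvalidArgumentException $e) {
            $failures[$number] = $line; // keep the offending line for later inspection
        }
    }

    return ['records' => $records, 'failures' => $failures];
}

// With the library, the callable would wrap the parser ('access.log' is a placeholder path):
//     $result = parseLines(new SplFileObject('access.log'), function (string $line) use ($parser) {
//         return $parser->parse($line, true);
//     });
```

Keeping the failed lines, indexed by their position in the input, makes it easier to check afterwards whether the LogFormat string matches the one the server actually used.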

Field names returned by the library

This table shows how format strings are mapped to field names by the library:

Format string     Field name
%a                clientIp
%{c}a             clientIp:c
%A                localIp
%B                responseSize
%b                responseSize
%{VARNAME}C       cookie:VARNAME
%D                responseTime
%{VARNAME}e       env:VARNAME
%f                filename
%h                remoteHostname
%H                requestProtocol
%{VARNAME}i       requestHeader:VARNAME
%k                keepaliveRequests
%l                remoteLogname
%L                requestLogId
%m                requestMethod
%{VARNAME}n       note:VARNAME
%{VARNAME}o       responseHeader:VARNAME
%p                canonicalPort
%{FORMAT}p        canonicalPort:FORMAT
%P                processId
%{FORMAT}P        processId:FORMAT
%q                queryString
%r                firstRequestLine
%R                handler
%s                status
%t                time
%{FORMAT}t        time:FORMAT
%T                timeToServe
%{UNIT}T          timeToServe:UNIT
%u                remoteUser
%U                urlPath
%v                serverName
%V                serverName
%X                connectionStatus
%I                bytesReceived
%O                bytesSent
%S                bytesTransferred
%{VARNAME}^ti     requestTrailerLine:VARNAME
%{VARNAME}^to     responseTrailerLine:VARNAME

If two or more format strings yield the same field name, the second one will get a :2 suffix, the third one a :3 suffix, etc.
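The suffix rule can be illustrated with a few lines of PHP. This is a sketch that mimics the documented behaviour, not the library's own code; suffixDuplicates() is a hypothetical name.

```php
<?php
declare(strict_types=1);

/**
 * Appends ":2", ":3", ... to repeated names, leaving the first occurrence untouched.
 * suffixDuplicates() is a hypothetical helper that mimics the documented rule.
 */
function suffixDuplicates(array $names): array
{
    $seen = [];   // name => number of occurrences so far
    $result = [];

    foreach ($names as $name) {
        $count = ($seen[$name] ?? 0) + 1;
        $seen[$name] = $count;
        $result[] = $count === 1 ? $name : $name . ':' . $count;
    }

    return $result;
}

// According to the table, "%v %V %b %B" maps to these four base names.
$fieldNames = suffixDuplicates(['serverName', 'serverName', 'responseSize', 'responseSize']);

var_export($fieldNames);
```

For this input the sketch yields serverName, serverName:2, responseSize and responseSize:2, which is the naming to expect when a log format contains both %v and %V, or both %b and %B.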

Performance notes

You can expect to parse more than 250,000 records per second (> 50 MiB/s) when reading logs from a file on a modern server with an SSD.

Returning records as an associative array comes with a small performance penalty of about 6%.
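These figures depend on hardware and log format, so it may be worth measuring on your own data. A rough way to do so is sketched below; measureThroughput() is a hypothetical helper, and the parsing step is passed in as a callable so the numeric and associative modes can be compared with the same code.

```php
<?php
declare(strict_types=1);

/**
 * Returns the number of lines handled per second by the given callable.
 * measureThroughput() is a hypothetical helper, not part of the library.
 */
function measureThroughput(array $lines, callable $parseLine): float
{
    $start = microtime(true);

    foreach ($lines as $line) {
        $parseLine($line);
    }

    // Guard against a zero interval on very small inputs.
    $elapsed = max(microtime(true) - $start, 1e-9);

    return count($lines) / $elapsed;
}

// With the library, the two modes could be compared like this:
//     $numeric = measureThroughput($lines, function ($l) use ($parser) { return $parser->parse($l, false); });
//     $assoc   = measureThroughput($lines, function ($l) use ($parser) { return $parser->parse($l, true); });
```

The lines are loaded into memory beforehand, so this measures parsing only and leaves disk reads out of the result.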

The Versions

30/05 2018

dev-master

9999999-dev

MIT

The Requires

  • php >=7.1

 

30/05 2018

0.1.0

0.1.0.0

MIT

The Requires

  • php >=7.1

 
