
library hyperloglog

A HyperLogLog and MinHash data structure library for estimating set cardinalities, with support for unions and intersections.


joegreen0991/hyperloglog


  • Friday, February 6, 2015
  • by mrjgreen
  • Repository
  • 1 Watchers
  • 13 Stars
  • 924 Installations
  • PHP
  • 0 Dependents
  • 0 Suggesters
  • 5 Forks
  • 0 Open issues
  • 3 Versions
  • 0 % Grown

The README.md

HyperLogLog & MinHash

A PHP implementation of the HyperLogLog algorithm, based on the antirez/Redis implementation.
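For readers unfamiliar with the algorithm, the following is a minimal sketch of a HyperLogLog counter in PHP. It is not this library's code or API: the class name TinyHyperLogLog, its methods, and the choice of a 32-bit slice of MD5 as the hash are all assumptions made for illustration. It assumes 64-bit PHP and P >= 7, and it omits the bias offsets and large-range handling that a tuned implementation would need.

```php
<?php
// Illustrative sketch only. TinyHyperLogLog is a hypothetical name,
// not a class provided by joegreen0991/hyperloglog.
final class TinyHyperLogLog
{
    private $p;          // number of index bits
    private $m;          // number of registers, 2^p
    private $registers;  // one small integer per register

    public function __construct($p = 14)
    {
        $this->p = $p;
        $this->m = 1 << $p;
        $this->registers = array_fill(0, $this->m, 0);
    }

    public function add($value)
    {
        // Assumption: a 32-bit hash taken from the first 8 hex digits of MD5.
        $hash = hexdec(substr(md5((string) $value), 0, 8));
        $bits = 32 - $this->p;
        $index = $hash >> $bits;             // top p bits select a register
        $rest = $hash & ((1 << $bits) - 1);  // remaining bits feed the rank

        // Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        $rank = 1;
        for ($i = $bits - 1; $i >= 0 && (($rest >> $i) & 1) === 0; $i--) {
            $rank++;
        }
        if ($rank > $this->registers[$index]) {
            $this->registers[$index] = $rank;
        }
    }

    // Union: keep the register-wise maximum of both counters.
    public function merge(TinyHyperLogLog $other)
    {
        if ($other->p !== $this->p) {
            throw new InvalidArgumentException('P values must match');
        }
        foreach ($other->registers as $i => $r) {
            if ($r > $this->registers[$i]) {
                $this->registers[$i] = $r;
            }
        }
    }

    public function count()
    {
        $sum = 0.0;
        $zeros = 0;
        foreach ($this->registers as $r) {
            $sum += pow(2, -$r);
            if ($r === 0) {
                $zeros++;
            }
        }
        // Standard alpha constant, valid for m >= 128.
        $alpha = 0.7213 / (1 + 1.079 / $this->m);
        $estimate = $alpha * $this->m * $this->m / $sum;

        // Small-range correction: fall back to linear counting.
        if ($estimate <= 2.5 * $this->m && $zeros > 0) {
            $estimate = $this->m * log($this->m / $zeros);
        }
        return (int) round($estimate);
    }
}
```

Adding the same value twice leaves the registers unchanged, which is why the structure counts distinct values. The merge step shows why unions are cheap: two counters with the same P combine by taking the larger value in each register.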

Resources

Note!

This version has been tuned to work with a P value of 14, which gives 2^14 = 16,384 registers (16 KB at one byte per register).
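The arithmetic behind that figure can be checked with a few lines of PHP. The function names below are hypothetical helpers, not part of this library. The standard-error formula 1.04 / sqrt(m) is the usual textbook figure for HyperLogLog with m registers, and the byte count assumes one byte per register, as stated above.

```php
<?php
// Hypothetical helpers for illustration; not part of the library's API.

// Number of registers for a given P.
function hllRegisterCount($p)
{
    return 1 << $p;
}

// Memory in bytes, assuming one byte per register.
function hllRegisterBytes($p)
{
    return hllRegisterCount($p);
}

// Theoretical relative standard error of a HyperLogLog estimate.
function hllStandardError($p)
{
    return 1.04 / sqrt(hllRegisterCount($p));
}

printf(
    "P=%d: %d registers, %d KB, standard error about %.2f%%\n",
    14,
    hllRegisterCount(14),
    hllRegisterBytes(14) / 1024,
    hllStandardError(14) * 100
);
```

Each increment of P doubles the memory and improves the standard error by a factor of sqrt(2).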

A large bias, visible in the graphs below, appears once the set cardinality reaches around 2.5 * 2^P. Polynomial regression has been used to calculate bias offsets, BUT ONLY FOR P = 14. You are free to change the P value, but the bias offsets will then not be applied. Check the code for more information.
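To see where that bias is expected to begin for a given P, the threshold 2.5 * 2^P can be computed directly. The function name here is a hypothetical helper, not something the library exposes.

```php
<?php
// Hypothetical helper: the cardinality around which the bias described
// above starts to appear, i.e. 2.5 * 2^P.
function hllBiasOnset($p)
{
    return (int) (2.5 * (1 << $p));
}

foreach (array(14, 16, 20) as $p) {
    printf("P=%d: bias expected from roughly %d items\n", $p, hllBiasOnset($p));
}
```

For P = 14 this gives 40,960, which is the region the P = 14 offsets are meant to cover. For any other P no offsets are applied, so estimates near the corresponding threshold should be treated with more caution.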

Some Professional Looking Graphs

HyperLogLog

P = 14 (graph: HyperLogLog, P = 14)

P = 16 (graph: HyperLogLog, P = 16). Note the offset bias around 2.5 * 2^16 ~= 164,000.

P = 20 (graph: HyperLogLog, P = 20). Note the offset bias around 2.5 * 2^20 ~= 2,600,000.

MinHash

K = 8192 (graph: MinHash, K = 8192)
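As background on what K means, here is a minimal sketch of one common MinHash variant, the bottom-K sketch, which keeps the K smallest hash values of a set and uses them to estimate Jaccard similarity. Whether this library uses this exact variant is an assumption. The function names and the MD5-based 32-bit hash are hypothetical choices for illustration, and 64-bit PHP is assumed.

```php
<?php
// Illustrative bottom-K MinHash; function names are hypothetical and
// this is not the library's implementation.

// Keep the K smallest distinct hash values of the input items.
function minHashSketch(array $items, $k)
{
    $seen = array();
    foreach ($items as $item) {
        // Assumption: 32-bit hash from the first 8 hex digits of MD5.
        $seen[hexdec(substr(md5((string) $item), 0, 8))] = true;
    }
    $values = array_keys($seen);
    sort($values);
    return array_slice($values, 0, $k);
}

// Estimate Jaccard similarity |A n B| / |A u B| from two sketches.
function estimateJaccard(array $a, array $b, $k)
{
    // The K smallest values across both sketches form a sketch of the union.
    $union = array_unique(array_merge($a, $b));
    sort($union);
    $union = array_slice($union, 0, $k);
    if (count($union) === 0) {
        return 0.0;
    }
    // Values of the union sketch that appear in both input sketches.
    $shared = array_intersect($union, $a, $b);
    return count($shared) / (float) count($union);
}

$a = minHashSketch(range(0, 99), 8192);
$b = minHashSketch(range(50, 149), 8192);
printf("Estimated Jaccard similarity: %.3f\n", estimateJaccard($a, $b, 8192));
```

An intersection size can then be approximated as the Jaccard estimate multiplied by a cardinality estimate of the union, which is one way a HyperLogLog counter and a MinHash sketch can be combined. A larger K lowers the variance of the estimate at the cost of memory.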

The Versions

06/02 2015

dev-master

9999999-dev http://www.github.com/joegreen0991/HyperLogLog


MIT

The Requires

  • php >=5.3.0

 

unique hyperloglog hyper log log cardinalities cardinality min hash

06/02 2015

dev-bias

dev-bias http://www.github.com/joegreen0991/HyperLogLog


MIT

The Requires

  • php >=5.3.0

10/07 2014

v1.0.0

1.0.0.0 http://www.github.com/joegreen0991/HyperLogLog


MIT

The Requires

  • php >=5.3.0