2017 © Pedro Peláez
 

library gdpr-dump

A drop-in replacement for mysqldump that optionally sanitizes DB fields for better GDPR conformity.

image

machbarmacher/gdpr-dump

A drop-in replacement for mysqldump that optionally sanitizes DB fields for better GDPR conformity.

  • Monday, July 9, 2018
  • by axel-rutz
  • Repository
  • 3 Watchers
  • 4 Stars
  • 25 Installations
  • PHP
  • 0 Dependents
  • 0 Suggesters
  • 1 Forks
  • 6 Open issues
  • 1 Versions
  • 19 % Grown

The README.md

GDPR Dump

A drop-in replacement for mysqldump that optionally sanitizes DB fields for better GDPR conformity., (*1)

It is based on the ifsnop/mysqldump-php library, and can in principle dump any database that PDO supports., (*2)

How to use

There are presently two ways of manipulating data, the first is by manipulating the actual SQL queries that are run on the server (given by the gdpr-expressions path), and the second is by replacing column output before the dump is generated (given by the gdpr-replacements option)., (*3)

$ ../vendor/bin/mysqldump drupal --host=mariadb --user=drupal --password=xxxxxxxx users_field_data --gdpr-expressions='{"users_field_data":{"name":"uid","mail":"uid","pass":"\"\""}}' --debug-sql
...
--
-- Dumping data for table `users_field_data`
--

/* SELECT `uid`,`langcode`,`preferred_langcode`,`preferred_admin_langcode`,uid as name,"" as pass,uid as mail,`timezone`,`status`,`created`,`changed`,`access`,`login`,uid as init,`default_langcode` FROM `users_field_data` */

INSERT INTO `users_field_data` VALUES (0,'en','en',NULL,'0','','0','',0,1523397207,1523397207,0,0,'0',1);
INSERT INTO `users_field_data` VALUES (1,'en','en',NULL,'1','','1','UTC',1,1523397207,1523397207,0,0,'1',1);

The fields to obfuscate are passed via a --gdpr-expressions parameter. Note that we use uid expression to satisfy unique keys., (*4)

The same without obfuscation:, (*5)

$ ../vendor/bin/mysqldump drupal --host=mariadb --user=drupal --password=xxxxxxxx users_field_data --debug-sql
...
--
-- Dumping data for table `users_field_data`
--

/* SELECT `uid`,`langcode`,`preferred_langcode`,`preferred_admin_langcode`,`name`,`pass`,`mail`,`timezone`,`status`,`created`,`changed`,`access`,`login`,`init`,`default_langcode` FROM `users_field_data` */

INSERT INTO `users_field_data` VALUES (0,'en','en',NULL,'',NULL,NULL,'',0,1523397207,1523397207,0,0,NULL,1);
INSERT INTO `users_field_data` VALUES (1,'en','en',NULL,'admin','$S$Eb6kZl.9OFjoa69Z05pzUhaZJ6vpKaGZVpnjAxxLJ7ip0zOwanEV','admin@example.com','UTC',1,1523397207,1523397207,0,0,'admin@example.com',1);

Using gdpr-replacements

This uses Faker for most of the column sanitization., (*6)

Presently, the tool searches for the "gdpr-replacements" option, either passed as a command line argument, or as part of a MySql options file., (*7)

The "gdpr-replacements" option expects a JSON string with the following format, (*8)

{"tableName" : {"columnName1": {"formatter": "formatterType", ...}, {"columnName2": {"formatter": "formatterType"}, ...}, ...}

Where formatterType is one of the following * name - generates a name * phoneNumber - generates a phone number * username - generates a random user name * password - generates a random password * email - generates a random email address * date - generates a date * longText - generates a sentence * number - generates a number * randomText - generates a sentence * text - generates a paragraph * uri - generates a URI * clear - generates an empty string, (*9)

This will replace the given column's value with Faker output., (*10)

You can also save replacements mapping to JSON file and use it with --gdpr-replacements-file option., (*11)

Use with drush

As this mimicks mysqldump, it can be use with drush, backup_migrate and any tool that uses mysqldump. Drush example:, (*12)

$ export PATH=/var/www/html/vendor/bin:$PATH
$ which mysqldump
/var/www/html/vendor/bin/mysqldump
$ drush sql-dump --tables-list=users_field_data --extra-dump=$'--gdpr-expressions=\'{"users_field_data":{"name":"uid","mail":"uid","init":"uid","pass":"\\"\\""}}\' --debug-sql'

MySqlOptions file

You are able to have your gdpr-expressions/replacement options set in a mysql options file file. It is to appear under the [mysqldump] section., (*13)

So, for example, you might have /etc/my.cnf with the following content, (*14)

[mysqldump]
gdpr-replacements='{"fakertest":{"name": {"formatter":"name"}, "telephone": {"formatter":"phoneNumber"}}}'

Status and further development

Currently this is a proof of concept to spark a community process. Especially the --gdpr-expressions option is neither handy to write for humans, nor does it scale well. Here we might need better options., (*15)

Contributors notes

  • Note that the project follows PSR-2 for formatting.

The Versions

09/07 2018

dev-master

9999999-dev

A drop-in replacement for mysqldump that optionally sanitizes DB fields for better GDPR conformity.

  Sources   Download

GPL-2.0-or-later

The Requires

 

The Development Requires

by Axel Rutz