Download the PHP package php-anonymizer/anonymizer without Composer
On this page you can find all versions of the php package php-anonymizer/anonymizer. It is possible to download/install these versions without Composer. Possible dependencies are resolved automatically.
Package anonymizer
Short Description Library to remove confidential data from different data containers
License proprietary
Information about the package anonymizer
Toni's Data Anonymization Toolkit
00 Preliminary
Table of Contents
- 00 Preliminary
- Table of Contents
- Purpose
- Getting started
- 01 Basic usage
- Creating an Anonymizer instance
- Example Output
- 02 Writing definition rules
- 02.01 Basic rule syntax
- 02.02 Array rule syntax
- 02.03 Property access syntax
- 02.04 Fake data type annotation
- 03 Using Faker as a data provider
- 03.01 Use default Faker instance of builder
- 03.02 Use custom Faker instance
- 03.03 Set seed for Faker instance
- 04 Data Encoding
- 04.01 NoOpEncoder
- 04.02 CloneEncoder
- 04.03 JsonEncoder
- 04.04 YamlEncoder
- 04.05 SymfonyEncoder
- 06 Extended Information
- 06.01 Manual setup of Anonymizer
- 06.01.01 RuleSet parser
- 06.01.02 DependencyChecker
- 06.01.03 DataAccessProvider
- 06.01.04 DataGenerationProvider
- 06.01.05 DataEncodingProvider
- 06.01.06 DataProcessor
- 06.01.07 Anonymizer
- 06.02 Builder setup of Anonymizer
Purpose
This library is a simple data anonymization toolkit that lets you define rules for anonymizing data in a structured way. It saves you from repeatedly writing boilerplate code to navigate through your data structures. The library is designed to be flexible and extensible, so that it can be used in a wide range of use cases. It also ships with support for fakerphp/faker as a provider of randomized fake data.
Getting started
01 Basic usage
Creating an Anonymizer instance
For the most basic usage, modifying data in an array structure, all you have to do is create an instance of the Anonymizer class, register a rule set with registerRuleSet, and call the run method with the data you want to modify.
Example Output
02 Writing definition rules
02.01 Basic rule syntax
The default syntax for navigating the data to be anonymized is dot notation. Every word separated by a dot represents a level in the data structure. The following example shows how to write a rule for anonymizing the first_name and last_name fields in the order.person structure.
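To make the dot-notation traversal concrete, here is a minimal sketch in plain PHP. It does not use the library: maskPath is a hypothetical helper written for illustration, and the same-length asterisk masking is modelled on the default replacement described in section 02.04.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of the library: replaces the string found at
 * a dot-notation path with asterisks of the same (byte) length.
 */
function maskPath(array $data, string $path): array
{
    $node = &$data; // walk by reference so the change is written back

    foreach (explode('.', $path) as $key) {
        if (!is_array($node) || !array_key_exists($key, $node)) {
            return $data; // path does not exist: leave the data untouched
        }
        $node = &$node[$key]; // descend one level per path segment
    }

    if (is_string($node)) {
        $node = str_repeat('*', strlen($node));
    }

    return $data;
}

$data = ['order' => ['person' => ['first_name' => 'John', 'last_name' => 'Doe']]];

foreach (['order.person.first_name', 'order.person.last_name'] as $rule) {
    $data = maskPath($data, $rule);
}

print_r($data['order']['person']);
```

Each rule is applied independently, so a rule that points at a missing key simply leaves the data unchanged in this sketch.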
02.02 Array rule syntax
Additionally, array notation can be used to tell the anonymizer engine that there is a list of items at a certain level. This is done by putting [] in front of a keyword.
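The idea behind the array notation can be sketched in plain PHP as follows. This is not the library's parser: maskWithLists is a hypothetical function, and both the example path order.[]items.name and the reading that a []key segment means "key holds a list, apply the rest of the path to every item" are assumptions made for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch, not the library's parser: a segment written as "[]key"
 * is treated as a list stored under "key", and the remaining path segments
 * are applied to every item of that list.
 */
function maskWithLists(mixed $node, array $segments): mixed
{
    if ($segments === []) {
        // End of the path: mask strings, leave other values untouched.
        return is_string($node) ? str_repeat('*', strlen($node)) : $node;
    }

    $segment = array_shift($segments);
    $isList = str_starts_with($segment, '[]');
    $key = $isList ? substr($segment, 2) : $segment;

    if (!is_array($node) || !array_key_exists($key, $node)) {
        return $node; // path does not exist: nothing to do
    }

    if ($isList && is_array($node[$key])) {
        foreach ($node[$key] as $index => $item) {
            $node[$key][$index] = maskWithLists($item, $segments);
        }
    } elseif (!$isList) {
        $node[$key] = maskWithLists($node[$key], $segments);
    }

    return $node;
}

$data = ['order' => ['items' => [['name' => 'Alice'], ['name' => 'Bob']]]];
$result = maskWithLists($data, explode('.', 'order.[]items.name'));

print_r($result['order']['items']);
```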
02.03 Property access syntax [complex rule parser only]
The previous examples all assumed that the data structure being passed is an array. In addition, this library supports different ways of accessing object properties. The access method can be specified at any level, directly after the name of the property.
Note: The definition of the access method is optional. If omitted, the anonymizer will fall back to the configured default access method.
Example with direct property access on an object. This method requires the properties to be public and not readonly.
Example with property access via getter and setter methods. This method requires the properties to have matching getPropertyName and setPropertyName methods.
The safest way to access properties on objects is to use reflection.
Of course, it is also possible to mix these access methods in one rule set and to state the array access method explicitly.
If there are more specific requirements for accessing object properties, it is possible to write a custom data accessor by implementing the PhpAnonymizer\Anonymizer\DataAccess\DataAccessorInterface.
Supported access methods as of now are:
Access Method | Description |
---|---|
array | Access array elements by key name |
property | Access object properties by name |
reflection | Access object properties via reflection classes |
setter | Access object properties by getter and setter methods |
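The three object access methods in the table correspond to ordinary PHP mechanisms. The sketch below shows each of them in plain PHP, without the library. The Person class and the $mask closure are hypothetical, and the reflection part assumes PHP 8.1 or later, where private properties are accessible through ReflectionProperty without further setup.

```php
<?php
declare(strict_types=1);

// Hypothetical example class, not part of the library.
final class Person
{
    public string $nickname = 'Johnny';  // reachable by direct property access
    private string $firstName = 'John';  // reachable via getter and setter
    private string $lastName = 'Doe';    // reachable here only via reflection

    public function getFirstName(): string
    {
        return $this->firstName;
    }

    public function setFirstName(string $firstName): void
    {
        $this->firstName = $firstName;
    }
}

// Stand-in for the anonymization step: same-length asterisk mask.
$mask = static fn (string $value): string => str_repeat('*', strlen($value));
$person = new Person();

// "property": plain read and write, needs a public, non-readonly property.
$person->nickname = $mask($person->nickname);

// "setter": derive getPropertyName / setPropertyName from the property name.
$getter = 'get' . ucfirst('firstName');
$setter = 'set' . ucfirst('firstName');
$person->$setter($mask($person->$getter()));

// "reflection": reads and writes a private property that has no accessors.
$property = new ReflectionProperty(Person::class, 'lastName');
$property->setValue($person, $mask($property->getValue($person)));

var_dump($person);
```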
02.04 Fake data type annotation [complex rule parser only]
So far, all we have achieved is to replace the data with starred-out placeholders that have the same length as the original data. Since it is often more desirable to replace the data with realistic fake data, it is possible to tell the Anonymizer which kind of data should be used as a field's replacement.
Note: for this feature to work, the fakerphp/faker library must be installed. See section 03 Using Faker as a data provider for more information.
To use fake data, add a type annotation to the property field with a preceding # symbol within the square brackets, e.g. order.person.firstName[#firstName].
The fake data type annotations can also be combined with the other access methods. In this case, the fake data type must be preceded by the access method, e.g. order.person.firstName[property#firstName].
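The structure of an annotated rule segment, a property name followed by an optional [accessMethod#fakeType] part, can be illustrated with a small parser. This is a hypothetical sketch in plain PHP, not the library's rule parser. The function name parseSegment, the regular expression, and the returned array shape are all assumptions.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical parser sketch, not the library's implementation: splits one
 * rule segment such as "firstName[property#firstName]" into its parts.
 *
 * @return array{name: string, access: ?string, fakeType: ?string}
 */
function parseSegment(string $segment): array
{
    // name, then optionally "[", access method, "#fakeType", "]"
    $pattern = '/^(?<name>[^\[\]]+)(?:\[(?<access>[A-Za-z]*)(?:#(?<fake>[A-Za-z]+))?\])?$/';

    if (preg_match($pattern, $segment, $m) !== 1) {
        throw new InvalidArgumentException("Invalid rule segment: {$segment}");
    }

    return [
        'name'     => $m['name'],
        // Missing or empty parts are reported as null (use the default).
        'access'   => ($m['access'] ?? '') !== '' ? $m['access'] : null,
        'fakeType' => ($m['fake'] ?? '') !== '' ? $m['fake'] : null,
    ];
}

print_r(parseSegment('firstName[#firstName]'));
print_r(parseSegment('firstName[property#firstName]'));
```

A full rule such as order.person.firstName[#firstName] would first be split on dots, after which each segment can be parsed this way.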
03 Using Faker as a data provider
Before using Faker as a data provider, you need to install the fakerphp/faker library. This can be done with a single Composer command: composer require fakerphp/faker
Afterwards, you need to register a Faker instance with the Anonymizer instance. There are different ways of achieving this.
03.01 Use default Faker instance of builder
By default, it is possible to use the generic Faker instance that is shipped with the AnonymizerBuilder. This instance is created with the default locale en_US and all default providers.
03.02 Use custom Faker instance
If you want more control over how the Faker instance is created, you can pass an instance of Faker\Generator directly to the AnonymizerBuilder.
03.03 Set seed for Faker instance
It is also possible to set a seed for the Faker instance. This can be useful if you want reproducible results. In this case, the seed is given as a string keyword, which is hashed (md5) to an integer value.
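How a string keyword can become a reproducible integer seed is sketched below in plain PHP. The exact reduction the library applies to the md5 hash is not stated above, so seedFromKeyword (which keeps the first 7 hex digits) is an assumption. To stay self-contained, the sketch seeds PHP's own mt_rand generator instead of Faker. With Faker installed, the resulting integer could be passed to the seed method of Faker\Generator.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: derives an integer seed from a keyword via md5.
 * Only the first 7 hex digits are used so the value also fits a 32-bit int.
 */
function seedFromKeyword(string $keyword): int
{
    return (int) hexdec(substr(md5($keyword), 0, 7));
}

/**
 * Stand-in for "generate some fake data": seeds PHP's Mersenne Twister and
 * draws three numbers. The same keyword always yields the same numbers.
 */
function sample(string $keyword): array
{
    mt_srand(seedFromKeyword($keyword));

    return [mt_rand(), mt_rand(), mt_rand()];
}

var_dump(sample('my-project') === sample('my-project')); // same keyword, same sequence
```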
04 Data Encoding
The Anonymizer main service class supports the use of a data encoding class. The sole purpose of this data encoding class is to make the input data accessible for transformation:
Encoder | Description | Input Encode | Output Encode | Input Decode | Output Decode |
---|---|---|---|---|---|
NoOpEncoder | does not change the input data | mixed | mixed | mixed | mixed |
CloneEncoder | clones objects on decode | mixed | mixed | mixed | mixed |
JsonEncoder | encodes data as JSON | array | string | string | array |
YamlEncoder | encodes data as YAML | array | string | string | array |
SymfonyEncoder | transforms objects to array | object | array | array | object |
SymfonyToJsonEncoder | transforms object to json | object | array | array | string |
SymfonyToArrayEncoder | transforms objects to array | object | array | array | array |
ArrayToJsonEncoder | transforms array to json | array | array | array | string |
04.01 NoOpEncoder
The NoOpEncoder simply does nothing. It takes an argument and returns the same value to the consumer from both methods (encode and decode).
Notice: Arguments are passed by value, not by reference. If you pass a non-object value, the input variable will NOT be updated unless you reassign it manually.
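The difference described in the notice comes from PHP's value semantics and can be reproduced without the library. In the sketch below, passThrough and maskName are hypothetical stand-ins for a pass-through encoder and for the anonymization step.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for a pass-through encoder; not the library's class.
function passThrough(mixed $data): mixed
{
    return $data;
}

// Hypothetical stand-in for the anonymization step.
function maskName(mixed $data): mixed
{
    if (is_array($data)) {
        $data['name'] = '****'; // changes only the local copy of the array
    } elseif (is_object($data)) {
        $data->name = '****';   // changes the one shared object instance
    }

    return $data;
}

$array = ['name' => 'John'];
$arrayResult = maskName(passThrough($array));

$object = new stdClass();
$object->name = 'John';
$objectResult = maskName(passThrough($object));

var_dump($array['name']);  // the original array is unchanged
var_dump($object->name);   // the original object has been modified
```

For arrays, the anonymized data is only available in the returned value, so the caller has to use or reassign that result.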
Example output for the noop encoder on an array.
Example output for the noop encoder on an object.
04.02 CloneEncoder
When the decode method of the CloneEncoder is called, the encoder creates a COPY of the input values.
This way, the Anonymizer updates a new copy of the objects within the data to be anonymized, rather than manipulating the initial objects.
As the clone keyword only creates a shallow copy of the top-level object, the popular library myclabs/deep-copy is used to recursively clone all nested objects.
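Why a plain clone is not enough can be shown in a few lines of plain PHP. The Address and Customer classes are hypothetical. To keep the sketch free of dependencies, the deep copy uses a serialize/unserialize round trip as a stand-in, which is a different technique from the myclabs/deep-copy library named above.

```php
<?php
declare(strict_types=1);

// Hypothetical example classes, not part of the library.
final class Address
{
    public string $city = 'Berlin';
}

final class Customer
{
    public string $name = 'John';
    public Address $address;

    public function __construct()
    {
        $this->address = new Address();
    }
}

$original = new Customer();

// Shallow copy: the top-level object is new, the nested Address is shared.
$shallow = clone $original;
$sharesNested = $shallow->address === $original->address;

// Deep copy stand-in: a serialize round trip also duplicates nested objects.
$deep = unserialize(serialize($original));
$deep->address->city = '******';

var_dump($sharesNested);              // nested object is shared after clone
var_dump($original->address->city);   // untouched by the change on $deep
```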
Example output for the clone encoder after data has been changed.
In the next step, we change our rule set to not modify any data within the cloned object.
Example output for the clone encoder after data hasn't been changed.
04.03 JsonEncoder
The JsonEncoder is an encoder that helps you handle JSON data. With this encoder it is possible to modify sensitive data directly within the JSON document's string representation.
For the encoder to work, the json PHP extension is required (which has been part of the PHP core since 8.0).
The decode method transforms a JSON string into an array, and the encode method transforms an array back into JSON string notation (single line without PRETTY_PRINT).
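The decode and encode round trip described above corresponds to PHP's built-in json_decode and json_encode functions. The sketch below shows that round trip in plain PHP, with a manual replacement standing in for the anonymization step. It illustrates the data flow only and does not reproduce the JsonEncoder class itself.

```php
<?php
declare(strict_types=1);

$json = '{"order":{"person":{"first_name":"John","last_name":"Doe"}}}';

// decode: JSON string -> associative array
$data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

// Stand-in for the anonymization step on the decoded array.
$data['order']['person']['first_name'] = '****';
$data['order']['person']['last_name'] = '***';

// encode: array -> single-line JSON string (no JSON_PRETTY_PRINT flag)
$result = json_encode($data, JSON_THROW_ON_ERROR);

echo $result, PHP_EOL;
// {"order":{"person":{"first_name":"****","last_name":"***"}}}
```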
04.04 YamlEncoder
The YamlEncoder is an encoder that helps you handle YAML data. With this encoder it is possible to modify sensitive data directly within the YAML document's string representation.
For the encoder to work, the yaml PHP extension is required. It can be installed via pecl, for instance. Some Linux distributions also offer pre-compiled packages as an alternative to building it manually.
The decode method transforms a YAML string into an array, and the encode method transforms an array back into YAML string notation.
04.05 SymfonyEncoder
The last encoder is the SymfonyEncoder. This encoder is a bit more complex than the others, as it is able to transform objects into arrays and vice versa.
For this encoder to work, you need to have the symfony/serializer package installed and have to set up a Normalizer and a Denormalizer that follow Symfony's NormalizerInterface and DenormalizerInterface (e.g. ObjectNormalizer).
For the SymfonyEncoder to work, it is essential that your object can be normalized and denormalized properly using these Normalizer and Denormalizer objects.
You can install the symfony/serializer package via Composer: composer require symfony/serializer