Download the PHP package cerbero/json-parser without Composer
On this page you can find all versions of the php package cerbero/json-parser. It is possible to download/install these versions without Composer. Possible dependencies are resolved automatically.
Package: json-parser
Short description: Zero-dependencies pull parser to read large JSON from any source in a memory-efficient way.
License: MIT
Homepage: https://github.com/cerbero90/json-parser
Information about the package json-parser
🧩 JSON Parser
Zero-dependencies pull parser to read large JSON from any source in a memory-efficient way.
📦 Install
Via Composer: `composer require cerbero/json-parser`
🔮 Usage
- 🐣 Basics
- 💧 Sources
- 🎯 Pointers
- 🐼 Lazy pointers
- ⚙️ Decoders
- 💢 Errors handling
- ⏳ Progress
- 🛠 Settings
🐣 Basics
JSON Parser provides a minimal API to read large JSON from any source:
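A minimal sketch of that loop, assuming the package is autoloaded through Composer and the parser is created with `new JsonParser($source)`. The helper name `countEntries` is hypothetical and only there to make the snippet self-contained:

```php
<?php

use Cerbero\JsonParser\JsonParser;

// Hypothetical helper: counts the top-level entries of a JSON source
// while only one key and one value are held in memory at a time.
function countEntries(mixed $source): int
{
    $count = 0;

    foreach (new JsonParser($source) as $key => $value) {
        $count++;
    }

    return $count;
}
```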
Depending on our code style, we can instantiate the parser in 3 different ways:
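The three styles are not listed on this page. The sketch below shows them as recalled from the upstream README, so `JsonParser::parse()` and the namespaced `parseJson()` function should be treated as assumptions to verify against the package itself. `instantiationStyles` is a hypothetical helper name:

```php
<?php

use Cerbero\JsonParser\JsonParser;

use function Cerbero\JsonParser\parseJson;

// Hypothetical helper returning one parser per instantiation style.
function instantiationStyles(mixed $source): array
{
    return [
        new JsonParser($source),    // classic object instantiation
        JsonParser::parse($source), // static instantiation (assumed API)
        parseJson($source),         // namespaced function (assumed API)
    ];
}
```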
If we don't want to use `foreach()` to loop through each key and value, we can chain the `traverse()` method:
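A possible use of `traverse()`, sketched with a hypothetical helper named `collectValues`. The closure receives the value first and the key second:

```php
<?php

use Cerbero\JsonParser\JsonParser;

// Hypothetical helper: gathers every top-level key and value without foreach().
function collectValues(mixed $source): array
{
    $values = [];

    // Mind the parameter order: value first, key second.
    (new JsonParser($source))->traverse(
        function (mixed $value, string|int $key) use (&$values): void {
            $values[$key] = $value;
        }
    );

    return $values;
}
```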
⚠️ Please note the parameter order of the callback: the value is passed before the key.
💧 Sources
A JSON source is any data point that provides JSON. A wide range of sources is supported by default:
- strings, e.g. `{"foo":"bar"}`
- iterables, i.e. arrays or instances of `Traversable`
- file paths, e.g. `/path/to/large.json`
- resources, e.g. streams
- API endpoint URLs, e.g. `https://endpoint.json` or any instance of `Psr\Http\Message\UriInterface`
- PSR-7 requests, i.e. any instance of `Psr\Http\Message\RequestInterface`
- PSR-7 messages, i.e. any instance of `Psr\Http\Message\MessageInterface`
- PSR-7 streams, i.e. any instance of `Psr\Http\Message\StreamInterface`
- Laravel HTTP client requests, i.e. any instance of `Illuminate\Http\Client\Request`
- Laravel HTTP client responses, i.e. any instance of `Illuminate\Http\Client\Response`
- user-defined sources, i.e. any instance of `Cerbero\JsonParser\Sources\Source`
If the source we need to parse is not supported by default, we can implement our own custom source.
How to implement a custom source:
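The heart of a custom source is reading the JSON in small pieces. The sketch below is plain PHP and does not use the package: `readInChunks` is a hypothetical helper that a `getIterator()` implementation could delegate to with `yield from readInChunks($this->source)`, assuming `$this->source` holds an open stream resource.

```php
<?php

// Hypothetical helper: reads an open stream in fixed-size chunks so that
// the whole JSON never has to sit in memory at once.
function readInChunks($stream, int $bytes = 8192): Generator
{
    while (!feof($stream)) {
        $chunk = fread($stream, $bytes);

        if ($chunk === false) {
            break; // reading failed: stop feeding chunks
        }

        if ($chunk !== '') {
            yield $chunk;
        }
    }
}
```

In the same spirit, `matches()` could return `is_resource($this->source)` and `calculateSize()` could return `null` when the size of the stream is unknown. Both are suggestions, not the package's own implementation.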
To implement a custom source, we need to extend `Source` and implement 3 methods: `getIterator()`, `matches()` and `calculateSize()`.
The parent class `Source` gives us access to 2 properties:
- `$source`: the JSON source we pass to the parser, i.e.: `new JsonParser($source)`
- `$config`: the configuration we set by chaining methods like `$parser->pointer('/foo')`
The method `getIterator()` defines the logic to read the JSON source in a memory-efficient way. It feeds the parser with small pieces of JSON. Please refer to the [already existing sources](https://github.com/cerbero90/json-parser/tree/master/src/Sources) to see some implementations.
The method `matches()` determines whether the JSON source passed to the parser can be handled by our custom implementation. In other words, we are telling the parser if it should use our class for the JSON to parse.
Finally, `calculateSize()` computes the whole size of the JSON source. It's used to track the [parsing progress](#-progress), however it's not always possible to know the size of a JSON source. In this case, or if we don't need to track the progress, we can return `null`.
Once the custom source is implemented, we can pass an instance of it to the parser.
If you find yourself implementing the same custom source in different projects, feel free to send a PR and we will consider supporting your custom source by default. Thank you in advance for any contribution!
🎯 Pointers
A JSON pointer is a standard used to point to nodes within a JSON. This package leverages JSON pointers to extract only some sub-trees from large JSONs.
Consider this JSON for example. To extract only the first gender and avoid parsing the rest of the JSON, we can set the `/results/0/gender` pointer:
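A sketch of a single pointer, assuming a document shaped like `{"results":[{"gender":"..."}]}`. The helper name `firstGender` is hypothetical:

```php
<?php

use Cerbero\JsonParser\JsonParser;

// Hypothetical helper: extracts only the gender of the first result.
function firstGender(mixed $source): array
{
    $found = [];

    foreach ((new JsonParser($source))->pointer('/results/0/gender') as $key => $value) {
        $found[$key] = $value;
    }

    return $found;
}
```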
JSON Parser takes advantage of the `-` wildcard to point to any array index, so we can extract all the genders with the `/results/-/gender` pointer:
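The wildcard can be sketched in the same way. Every match is yielded under the same key, so this hypothetical `allGenders` helper collects the values into a list instead of an associative array:

```php
<?php

use Cerbero\JsonParser\JsonParser;

// Hypothetical helper: extracts the gender of every result.
function allGenders(mixed $source): array
{
    $genders = [];

    // "-" stands for any array index.
    foreach ((new JsonParser($source))->pointer('/results/-/gender') as $gender) {
        $genders[] = $gender;
    }

    return $genders;
}
```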
If we want to extract more sub-trees, we can set multiple pointers. Let's extract all genders and countries:
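A sketch with two pointers. The `pointers()` method is recalled from the upstream README rather than shown on this page, so treat it as an assumption. The `/results/-/location/country` path and the helper name `gendersAndCountries` are illustrative:

```php
<?php

use Cerbero\JsonParser\JsonParser;

// Hypothetical helper: returns [key, value] pairs for every gender and country.
function gendersAndCountries(mixed $source): array
{
    // Assumption: pointers() accepts a list of JSON pointers.
    $json = (new JsonParser($source))->pointers([
        '/results/-/gender',
        '/results/-/location/country',
    ]);

    $pairs = [];

    foreach ($json as $key => $value) {
        $pairs[] = [$key, $value];
    }

    return $pairs;
}
```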
⚠️ Intersecting pointers like `/foo` and `/foo/bar` are not allowed, but intersecting wildcards like `foo/-/bar` and `foo/0/bar` are possible.
We can also specify a callback to execute when JSON pointers are found. This is handy when we have different pointers and we need to run custom logic for each of them:
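A sketch of per-pointer callbacks, assuming (from the upstream README, not from this page) that `pointers()` also accepts an array mapping each pointer to a callback and that the callback's return value replaces the extracted value. `normalizedGendersAndCountries` is a hypothetical helper:

```php
<?php

use Cerbero\JsonParser\JsonParser;

// Hypothetical helper: applies a different transformation to each pointer.
function normalizedGendersAndCountries(mixed $source): array
{
    // Assumption: keys are JSON pointers, values are callbacks (value first, key second).
    $json = (new JsonParser($source))->pointers([
        '/results/-/gender' => fn (string $gender, string $key) => strtoupper($gender),
        '/results/-/location/country' => fn (string $country, string $key) => strtolower($country),
    ]);

    $values = [];

    foreach ($json as $value) {
        $values[] = $value;
    }

    return $values;
}
```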
⚠️ Please note the parameter order of the callbacks: the value is passed before the key.
The same can also be achieved by chaining the method `pointer()` multiple times:
Pointer callbacks can also be used to customize a key. We can achieve that by updating the key reference:
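A sketch of renaming a key, assuming `pointer()` accepts the callback as its second argument. The path `/results/-/name/first`, the new key `first_name` and the helper name `firstNames` are illustrative:

```php
<?php

use Cerbero\JsonParser\JsonParser;

// Hypothetical helper: yields every first name under a customized key.
function firstNames(mixed $source): array
{
    $json = (new JsonParser($source))->pointer(
        '/results/-/name/first',
        function (string $name, string &$key): void {
            $key = 'first_name'; // the key is received by reference
        }
    );

    $rows = [];

    foreach ($json as $key => $value) {
        $rows[] = [$key => $value];
    }

    return $rows;
}
```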
If the callbacks are enough to handle the pointers and we don't need to run any common logic for all pointers, we can avoid manually calling `foreach()` by chaining the method `traverse()`:
Otherwise, if some common logic for all pointers is needed but we prefer method chaining to manual loops, we can pass a callback to the `traverse()` method:
⚠️ Please note the parameter order of the callbacks: the value is passed before the key.
Sometimes the sub-trees extracted by pointers are small enough to be kept entirely in memory. In such cases, we can chain `toArray()` to eager load the extracted sub-trees into an array:
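A sketch of eager loading two small sub-trees. As before, `pointers()` is an assumption based on the upstream README, and `firstGenderAndCountry` is a hypothetical helper:

```php
<?php

use Cerbero\JsonParser\JsonParser;

// Hypothetical helper: loads two small sub-trees straight into an array.
function firstGenderAndCountry(mixed $source): array
{
    return (new JsonParser($source))
        ->pointers(['/results/0/gender', '/results/0/location/country']) // assumed API
        ->toArray();
}
```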
🐼 Lazy pointers
JSON Parser only keeps one key and one value in memory at a time. However, if the value is a large array or object, it may be inefficient or even impossible to keep it all in memory.
To solve this problem, we can use lazy pointers. These pointers recursively keep in memory only one key and one value at a time for any nested array or object.
Lazy pointers return a lightweight instance of `Cerbero\JsonParser\Tokens\Parser` instead of the actual large value. To lazy load nested keys and values, we can then loop through the parser:
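A sketch of a lazy pointer. The method name `lazyPointer()` is recalled from the upstream README and is not shown on this page, so treat it as an assumption. `readNameLazily` is a hypothetical helper:

```php
<?php

use Cerbero\JsonParser\JsonParser;
use Cerbero\JsonParser\Tokens\Parser;

// Hypothetical helper: reads the nested keys of /results/0/name one at a time.
function readNameLazily(mixed $source): array
{
    $name = [];

    // Assumption: lazyPointer() mirrors pointer() but yields a Parser for nested values.
    foreach ((new JsonParser($source))->lazyPointer('/results/0/name') as $value) {
        if ($value instanceof Parser) {
            foreach ($value as $nestedKey => $nestedValue) {
                $name[$nestedKey] = $nestedValue;
            }
        }
    }

    return $name;
}
```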
As mentioned above, lazy pointers are recursive. This means that no nested objects or arrays will ever be kept in memory:
To lazily parse the entire JSON, we can simply chain the `lazy()` method:
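A sketch of walking a fully lazy parser recursively. The helpers `flattenLazily` and `flattenJson` are hypothetical. Nested arrays and objects are assumed to arrive as `Parser` instances, which are iterated in turn instead of being loaded whole:

```php
<?php

use Cerbero\JsonParser\JsonParser;
use Cerbero\JsonParser\Tokens\Parser;

// Hypothetical helper: maps every scalar to its slash-separated path.
function flattenLazily(iterable $json, string $prefix = ''): array
{
    $flat = [];

    foreach ($json as $key => $value) {
        $path = $prefix . '/' . $key;

        if ($value instanceof Parser) {
            $flat += flattenLazily($value, $path); // descend without loading the sub-tree
        } else {
            $flat[$path] = $value;
        }
    }

    return $flat;
}

// Hypothetical entry point chaining lazy() on the parser.
function flattenJson(mixed $source): array
{
    return flattenLazily((new JsonParser($source))->lazy());
}
```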
We can recursively wrap any instance of `Cerbero\JsonParser\Tokens\Parser` by chaining `wrap()`. This lets us wrap lazy loaded JSON arrays and objects into classes with advanced functionalities, like mapping or filtering:
ℹ️ If your wrapper class implements the method `toArray()`, such method will be called when eager loading sub-trees into an array.
Lazy pointers also have all the other functionalities of normal pointers: they accept callbacks, can be set one by one or all together, can be eager loaded into an array and can be mixed with normal pointers as well:
⚙️ Decoders
By default JSON Parser uses the built-in PHP function `json_decode()` to decode one key and value at a time.
Normally it decodes values to associative arrays but, if we prefer to decode values to objects, we can set a custom decoder:
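A sketch of decoding to objects. The `decoder()` method, the `JsonDecoder` class and its `decodesToArray` argument are recalled from the upstream README and are not confirmed by this page, so verify them before relying on the snippet. `parseToObjects` is a hypothetical helper:

```php
<?php

use Cerbero\JsonParser\Decoders\JsonDecoder;
use Cerbero\JsonParser\JsonParser;

// Hypothetical helper: returns a parser that decodes values to objects.
function parseToObjects(mixed $source): JsonParser
{
    // Assumption: JsonDecoder takes a boolean named $decodesToArray.
    return (new JsonParser($source))->decoder(new JsonDecoder(decodesToArray: false));
}
```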
The simdjson extension offers a decoder faster than `json_decode()` that can be installed via `pecl install simdjson` if your server satisfies the requirements. JSON Parser leverages the simdjson decoder by default if the extension is loaded.
If we need a decoder that is not supported by default, we can implement our custom one.
How to implement a custom decoder:
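The decoding logic itself can be sketched in plain PHP, without the package. `decodeOrFail` is a hypothetical function that a `decodeJson()` implementation could delegate to. It throws on invalid input, which is the behaviour this section asks for:

```php
<?php

// Hypothetical helper: decodes a JSON value to an associative array
// and throws a JsonException instead of silently returning null.
function decodeOrFail(string $json): mixed
{
    return json_decode($json, associative: true, flags: JSON_THROW_ON_ERROR);
}
```

Throwing rather than returning `null` matters because `null` is also a valid decoded value, so a silent failure would be indistinguishable from the JSON literal `null`.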
To create a custom decoder, we need to implement the `Decoder` interface and its only method, `decode()`.
The method `decode()` defines the logic to decode the given JSON value and it needs to return an instance of `DecodedValue` in case of both success and failure.
To make custom decoder implementations even easier, JSON Parser provides an [abstract decoder](https://github.com/cerbero90/json-parser/tree/master/src/Decoders/AbstractDecoder.php) that hydrates `DecodedValue` for us, so that we just need to define how a JSON value should be decoded in `decodeJson()`.
> ⚠️ Please make sure to throw an exception in `decodeJson()` if the decoding process fails.
Once the custom decoder is implemented, we can set it on the parser.
To see some implementation examples, please refer to the [already existing decoders](https://github.com/cerbero90/json-parser/tree/master/src/Decoders).
If you find yourself implementing the same custom decoder in different projects, feel free to send a PR and we will consider supporting your custom decoder by default. Thank you in advance for any contribution!
💢 Errors handling
Not all JSONs are valid: some may present syntax errors due to an incorrect structure (e.g. `[}`) or decoding errors when values can't be decoded properly (e.g. `[1a]`). JSON Parser allows us to intervene and define the logic to run when these issues occur:
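A sketch of registering error handlers. The method names `onSyntaxError()` and `onDecodingError()` and the `json` property of `DecodedValue` are recalled from the upstream README and are assumptions here. `parseWithHandlers` and its `$log` callable are hypothetical:

```php
<?php

use Cerbero\JsonParser\Decoders\DecodedValue;
use Cerbero\JsonParser\Exceptions\SyntaxException;
use Cerbero\JsonParser\JsonParser;

// Hypothetical helper: reports both kinds of error through a logging callable.
function parseWithHandlers(mixed $source, callable $log): JsonParser
{
    return (new JsonParser($source))
        // Assumed API: invoked when the JSON structure is not valid.
        ->onSyntaxError(fn (SyntaxException $e) => $log('syntax: ' . $e->getMessage()))
        // Assumed API: invoked when a value can't be decoded.
        ->onDecodingError(fn (DecodedValue $decoded) => $log('decoding: ' . $decoded->json));
}
```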
We can even replace invalid values with placeholders, so that they don't cause the entire JSON parsing to fail:
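A sketch of patching invalid values. The method name `patchDecodingError()` is recalled from the upstream README and is an assumption here. `parseTolerantly` is a hypothetical helper:

```php
<?php

use Cerbero\JsonParser\JsonParser;

// Hypothetical helper: replaces values that can't be decoded with a placeholder.
function parseTolerantly(mixed $source, mixed $placeholder = null): JsonParser
{
    // Assumed API: the given placeholder stands in for every invalid value.
    return (new JsonParser($source))->patchDecodingError($placeholder);
}
```

For instance, `parseTolerantly($source, '<invalid>')` would be expected to yield the string `<invalid>` wherever a value cannot be decoded.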
For more advanced patching of decoding errors, we can pass a closure that has access to the `DecodedValue` instance:
Any exception thrown by this package implements the `JsonParserException` interface. This makes it easy to handle all exceptions in a single catch block:
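A sketch of the single catch block, assuming the interface lives in the `Cerbero\JsonParser\Exceptions` namespace alongside the exceptions listed in this section. `tryToArray` is a hypothetical helper:

```php
<?php

use Cerbero\JsonParser\Exceptions\JsonParserException;
use Cerbero\JsonParser\JsonParser;

// Hypothetical helper: returns the parsed JSON, or null if the package throws.
function tryToArray(mixed $source): ?array
{
    try {
        return (new JsonParser($source))->toArray();
    } catch (JsonParserException $e) {
        // Any exception thrown by the package ends up here.
        error_log('JSON parsing failed: ' . $e->getMessage());

        return null;
    }
}
```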
For reference, here is a comprehensive table of all the exceptions thrown by this package:
| `Cerbero\JsonParser\Exceptions\` | thrown when |
|---|---|
| `DecodingException` | a value in the JSON can't be decoded |
| `GuzzleRequiredException` | Guzzle is not installed and the JSON source is an endpoint |
| `IntersectingPointersException` | two JSON pointers intersect |
| `InvalidPointerException` | a JSON pointer syntax is not valid |
| `SyntaxException` | the JSON structure is not valid |
| `UnsupportedSourceException` | a JSON source is not supported |
⏳ Progress
When processing large JSONs, it can be helpful to track the parsing progress. JSON Parser provides convenient methods for accessing all the progress details:
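A sketch of reporting progress while looping. The `progress()` method and its `current()`, `total()` and `format()` accessors are recalled from the upstream README and are assumptions here. `parseWithProgress` and its `$report` callable are hypothetical:

```php
<?php

use Cerbero\JsonParser\JsonParser;

// Hypothetical helper: reports the progress after each key and value.
function parseWithProgress(mixed $source, callable $report): void
{
    $json = new JsonParser($source);

    foreach ($json as $key => $value) {
        $progress = $json->progress(); // assumed API

        // Assumed accessors: parsed bytes, total bytes (may be null), formatted percentage.
        $report($progress->current(), $progress->total(), $progress->format());
    }
}
```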
The total size of a JSON is calculated differently depending on the source. In some cases, it may not be possible to determine the size of a JSON and only the current progress is known:
🛠 Settings
JSON Parser also provides other settings to fine-tune the parsing process. For example we can set the number of bytes to read when parsing JSON strings or streams:
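A sketch of tuning the chunk size. The method name `bytes()` is recalled from the upstream README and is an assumption here. The 16 KB default and the helper name `parseInChunksOf` are illustrative only:

```php
<?php

use Cerbero\JsonParser\JsonParser;

// Hypothetical helper: sets how many bytes are read at a time.
function parseInChunksOf(mixed $source, int $bytes = 1024 * 16): JsonParser
{
    // Assumed API: bytes() sets the chunk size for JSON strings and streams.
    return (new JsonParser($source))->bytes($bytes);
}
```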
📆 Change log
Please see CHANGELOG for more information on what has changed recently.
🧪 Testing
💞 Contributing
Please see CODE_OF_CONDUCT for details.
🧯 Security
If you discover any security-related issues, please email [email protected] instead of using the issue tracker.
🏅 Credits
- Andrea Marco Sartori
- All Contributors
⚖️ License
The MIT License (MIT). Please see License File for more information.