Chunker
A multi-byte-safe stream for reading very large files (or strings) as sequential chunks with PHP.
Breaking very large files or strings into chunks and reading them one at a time (aka, "chunking") can reduce memory consumption.
Unfortunately, chunking multi-byte-encoded files, like those using UTF-8, can result in broken multi-byte characters. PHP's file functions, like file_get_contents(), use limits in bytes. When the limit falls in the middle of a multi-byte character, the result is a malformed byte sequence, which PHP represents with the "?" character.
For example, if a 12-byte limit lands in the middle of a three-byte euro symbol, only part of the symbol is read, and the chunk ends with "?" instead of the euro symbol.
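The failure can be reproduced without a file. The sketch below is an illustration under stated assumptions: the 13-byte string stands in for the example file's content, and substr() stands in for a byte-limited file read. It also shows mb_strcut(), a function from PHP's mbstring extension that cuts by bytes but stops at a character boundary.

```php
<?php
// Hypothetical content: ten one-byte characters followed by the
// three-byte euro symbol (13 bytes in total when encoded as UTF-8).
$content = "0123456789\u{20AC}";

// Take the first 12 bytes, as a byte-limited read would.
// substr() counts bytes, so the cut lands inside the euro symbol.
$broken = substr($content, 0, 12);

var_dump(strlen($content));                    // int(13)
var_dump(mb_check_encoding($broken, 'UTF-8')); // bool(false)

// mb_strcut() also counts bytes, but it shortens the result to the
// nearest character boundary instead of splitting the character.
$safe = mb_strcut($content, 0, 12, 'UTF-8');
var_dump($safe);                               // string(10) "0123456789"
```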
This library chunks very large files (and strings) in a multi-byte-safe way. It adjusts the chunk size slightly to ensure a well-formed byte sequence each time.
As a result, chunks may vary in length: a chunk may be cut short (for example, to two one-byte characters) so that the three-byte character that follows is kept whole in the next chunk.
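One way to picture the adjustment is the sketch below. It is not this library's implementation; safeChunks() is a hypothetical helper, and the input string and the three-byte chunk size are assumptions chosen to keep the output short.

```php
<?php
/**
 * Yield chunks of at most $size bytes, never splitting a character.
 * Hypothetical helper for illustration; not the library's code.
 */
function safeChunks(string $text, int $size, string $encoding = 'UTF-8'): Generator
{
    $offset = 0;
    $length = strlen($text); // total length in bytes

    while ($offset < $length) {
        // mb_strcut() cuts by bytes but shortens the result to the
        // nearest character boundary instead of splitting a character.
        $chunk = mb_strcut($text, $offset, $size, $encoding);

        if ($chunk === '') {
            // $size is smaller than the next character, or the input is malformed.
            throw new InvalidArgumentException('Chunk size too small for input.');
        }

        yield $chunk;
        $offset += strlen($chunk); // advance by the bytes actually consumed
    }
}

// Eight one-byte characters followed by a three-byte euro symbol.
$chunks = iterator_to_array(safeChunks("abcdefgh\u{20AC}", 3), false);

print_r($chunks); // four chunks: "abc", "def", "gh", and the euro symbol
```

The third chunk holds only two bytes because taking a third byte would split the euro symbol, which then arrives whole as the final chunk.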
Installation
This library requires PHP 7.4+.
It is multi-platform, and we strive to make it run equally well on Windows, Linux, and OSX.
It should be installed via Composer. To do so, add jstewmc/chunker to the require section of your composer.json file, and run composer update.
Instantiating a chunker
There are two types of chunkers: file and text.
Instantiating a file chunker
You can instantiate a File chunker using a file pathname.
Instantiating a text chunker
You can instantiate a Text chunker using a string.
Setting the character encoding
A file or string's character encoding lets PHP know how to understand its contents.
You can set the file's (or string's) character encoding explicitly, using the chunker's constructor argument, or you can let this library set it implicitly, using your application's internal character encoding.
Setting character encoding explicitly
To set the file or string's character encoding explicitly, use the constructor's second argument, $encoding. This is useful if you know the file or string differs from your application's encoding. An encoding must be a valid character encoding from PHP's Multibyte String (mbstring) extension.
Setting character encoding implicitly
To let this library set the file or string's character encoding implicitly, don't pass a character encoding to the constructor. The chunk's encoding is assumed to be your application's internal character encoding, the value returned by PHP's mb_internal_encoding().
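The two modes can be sketched as a single lookup. The helper below is hypothetical (resolveEncoding() is not part of this library), and checking the name against mb_list_encodings() is an assumption about what "valid" means; note that this check does not accept aliases such as "utf8".

```php
<?php
/**
 * Resolve the encoding to use: the explicit value if one is given,
 * otherwise the application's internal character encoding.
 * Hypothetical helper for illustration; not the library's code.
 */
function resolveEncoding(?string $encoding = null): string
{
    if ($encoding === null) {
        // Implicit: fall back to the application's internal encoding.
        return mb_internal_encoding();
    }

    // Explicit: compare case-insensitively against the names mbstring lists.
    $supported = array_map('strtolower', mb_list_encodings());
    if (!in_array(strtolower($encoding), $supported, true)) {
        throw new InvalidArgumentException("Unsupported encoding: $encoding");
    }

    return $encoding;
}

mb_internal_encoding('UTF-8');

echo resolveEncoding(), "\n";             // UTF-8
echo resolveEncoding('ISO-8859-1'), "\n"; // ISO-8859-1
```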
Setting the chunk size
The chunker's "chunk size" setting determines the memory consumption of each chunk. This library attempts to provide sensible defaults: 8,192 bytes for files and 2,000 characters for strings (a maximum of around 8,000 bytes).
To change the chunk size, use the constructor's third argument, $size (remember, it's bytes for files and characters for text).
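The "around 8,000 bytes" figure for text follows from the encoding: a UTF-8 character occupies between one and four bytes, so 2,000 characters occupy between 2,000 and 8,000 bytes. The sketch below checks that arithmetic, assuming UTF-8 and three sample strings chosen for illustration.

```php
<?php
// Assumes UTF-8, where a character occupies between one and four bytes.
$size = 2000; // chunk size in characters

$ascii = str_repeat('a', $size);         // one byte per character
$euros = str_repeat("\u{20AC}", $size);  // three bytes per character
$emoji = str_repeat("\u{1F600}", $size); // four bytes per character

foreach ([$ascii, $euros, $emoji] as $text) {
    // mb_substr() counts characters; strlen() counts bytes.
    $chunk = mb_substr($text, 0, $size, 'UTF-8');
    printf("%d characters, %d bytes\n", mb_strlen($chunk, 'UTF-8'), strlen($chunk));
}
// 2000 characters, 2000 bytes
// 2000 characters, 6000 bytes
// 2000 characters, 8000 bytes
```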
Consuming the chunks
Chunkers are designed to mimic a stream or array of chunks.
You can use getCurrentChunk() (alias, current()), getNextChunk() (alias, next()), and getPreviousChunk() (alias, previous()) to navigate between chunks. If a chunk does not exist, the methods return false.
These methods will usually be combined in a while loop. Keep in mind, it's important to strictly compare a chunk's value to false: a method may return Boolean false, and it may also return a non-Boolean value which evaluates to false (for example, the string "0").
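The difference between strict and loose comparison can be seen with a small in-memory stand-in. ArrayChunker below is a hypothetical class that only mimics the current() and next() behavior described above; it is not this library's implementation, and the chunk values are made up.

```php
<?php
/**
 * In-memory stand-in that mimics the navigation methods described above.
 * Hypothetical class for illustration; not the library's implementation.
 */
final class ArrayChunker
{
    private array $chunks;
    private int $index = 0;

    public function __construct(array $chunks)
    {
        $this->chunks = array_values($chunks);
    }

    /** @return string|false */
    public function current()
    {
        return $this->chunks[$this->index] ?? false;
    }

    /** @return string|false */
    public function next()
    {
        if (!isset($this->chunks[$this->index + 1])) {
            return false; // no next chunk: leave the index unchanged
        }

        return $this->chunks[++$this->index];
    }
}

// Strict comparison: visits all three chunks, including the string "0".
$chunker = new ArrayChunker(['10', '0', '42']);
$seen = [];
$chunk = $chunker->current();
while ($chunk !== false) {
    $seen[] = $chunk;
    $chunk = $chunker->next();
}
print_r($seen); // "10", "0", "42"

// Loose comparison: stops early, because the string "0" evaluates to false.
$chunker = new ArrayChunker(['10', '0', '42']);
$loose = [];
$chunk = $chunker->current();
while ($chunk) {
    $loose[] = $chunk;
    $chunk = $chunker->next();
}
print_r($loose); // only "10"
```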
The navigation methods are idempotent (i.e., calling next() at the end of the input does not update the internal chunk index), but chunks are only guaranteed to be deterministic in one direction (i.e., the last chunk moving forward may not equal the first chunk moving backwards).
You can use the countChunks() method to count the chunks in a file or string.
You can use hasChunk() or hasChunks() to see whether the file or string has exactly one chunk or has one or more chunks, respectively.
You can use hasPreviousChunk() or hasNextChunk() to see whether the file or string has a previous chunk or a next chunk, respectively.
Finally, you can use reset() to reset the chunker's internal pointer to zero.
Contributing
Contributions are welcome!
License
This library is released under the MIT license.