Download the PHP package white-rabbit-1-sketch/php-file-hash-map without Composer

Information about the package php-file-hash-map

Php File Hash Map

PhpFileHashMap is a PHP implementation of a file-based hash map that stores key-value pairs in a binary file. The hash map operates on a file system level, which makes it suitable for handling large amounts of data with minimal memory usage. This implementation allows persisting hash map data to a file while providing standard hash map operations like set, get, remove, and more.

Features

Warning!

This is not a data storage solution and was never intended to be used as one. Essentially, it is an implementation of the hash map data structure with data stored on disk, and its current applicability is specifically within this context. But of course, you can use it as storage if it suits your task and you understand all the nuances.

Performance Benchmarks

The performance of this file-based hash map may vary depending on the system configuration and the number of elements. On my MacBook Air M2, the hash map performed as follows (single thread):

Installation

You can install PhpFileHashMap via Composer by adding the following to your composer.json:

Usage

Creating a Hash Map

Adding Data

Retrieving Data

Removing Data

Checking for Key Existence

Iterating Over Keys and Values

Both the keys() and values() methods iterate over all the elements in the hash map and return the respective keys and values.

Both of these methods require scanning the entire hash map, including both the Map Index Section and the Heap Section, to collect the keys or values. This means that they need to read through all buckets, including any inactive (deleted) ones, and this can be resource-intensive in terms of time if the hash map contains a large number of entries or deleted buckets.

However, it’s important to note that these operations are not memory-intensive. Since the methods use generators, they do not load all keys or values into memory at once, making them efficient in terms of memory usage. Only one key or value is held in memory at a time during iteration.
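The memory behaviour described above can be illustrated with a plain PHP generator. This is a minimal sketch, not the library's implementation: the record format (one JSON line per record with a `deleted` flag) and the function name `iterateRecords` are assumptions made only for this illustration.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical record format: one JSON line per record,
 * {"k": key, "v": value, "deleted": bool}. Not the library's binary layout.
 *
 * @return Generator<string, mixed> yields key => value for live records only
 */
function iterateRecords(string $path): Generator
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        // Every record is read, including deleted ones, so the time cost
        // grows with file size; only the current record is held in memory.
        while (($line = fgets($handle)) !== false) {
            $record = json_decode($line, true);
            if (!is_array($record) || ($record['deleted'] ?? false)) {
                continue; // skip malformed or deleted records
            }
            yield $record['k'] => $record['v'];
        }
    } finally {
        fclose($handle);
    }
}

// Build a small demo file that contains one deleted record.
$path = tempnam(sys_get_temp_dir(), 'hm_');
file_put_contents($path, implode("\n", [
    json_encode(['k' => 'a', 'v' => 1, 'deleted' => false]),
    json_encode(['k' => 'b', 'v' => 2, 'deleted' => true]),
    json_encode(['k' => 'c', 'v' => 3, 'deleted' => false]),
]) . "\n");

$seen = [];
foreach (iterateRecords($path) as $key => $value) {
    $seen[$key] = $value;
}
unlink($path);
```

The loop still touches the deleted record on disk, which is why a file with many deleted entries takes longer to iterate even though memory use stays flat.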

Clearing the Hash Map

Nuances and Performance Considerations

This file-based hash map efficiently resolves collisions by utilizing chaining (linked lists of buckets). However, as the number of collisions increases, the performance may degrade. This degradation becomes particularly noticeable during write operations (insertion and deletion). Therefore, to ensure optimal performance, it is recommended to keep the hash map at a reasonable size relative to the expected number of elements.
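To make the effect of chaining concrete, here is a small in-memory sketch. It is not the library's code: the class `ChainedMap`, its slot function (based on `md5()`), and the `longestChain()` helper are assumptions chosen for illustration. It shows how chains grow when the number of slots is small relative to the number of keys.

```php
<?php
declare(strict_types=1);

/** In-memory illustration of separate chaining (hypothetical class). */
final class ChainedMap
{
    /** @var array<int, array<int, array{0: string, 1: mixed}>> slot => chain of [key, value] */
    private array $slots = [];

    public function __construct(private int $size) {}

    private function slot(string $key): int
    {
        // First 7 hex digits of md5 give a non-negative 28-bit integer.
        return ((int) hexdec(substr(md5($key), 0, 7))) % $this->size;
    }

    public function set(string $key, mixed $value): void
    {
        $slot = $this->slot($key);
        foreach ($this->slots[$slot] ?? [] as $i => [$existingKey]) {
            if ($existingKey === $key) {
                $this->slots[$slot][$i][1] = $value; // overwrite in place
                return;
            }
        }
        $this->slots[$slot][] = [$key, $value]; // append to the chain
    }

    public function get(string $key): mixed
    {
        // Lookup cost is proportional to the length of the chain in this slot.
        foreach ($this->slots[$this->slot($key)] ?? [] as [$existingKey, $value]) {
            if ($existingKey === $key) {
                return $value;
            }
        }
        return null;
    }

    public function longestChain(): int
    {
        return $this->slots === [] ? 0 : max(array_map('count', $this->slots));
    }
}

$small = new ChainedMap(4);    // far fewer slots than keys
$large = new ChainedMap(2048); // roughly 2x the number of keys
for ($i = 0; $i < 1000; $i++) {
    $small->set("key$i", $i);
    $large->set("key$i", $i);
}
```

With 1000 keys in 4 slots, at least one chain must hold 250 or more entries, and every insert into that slot walks the chain first. The larger map keeps chains short.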

Recommended Hash Map Size

For best performance, the size of the hash map should be chosen based on the estimated number of elements you plan to store. A good rule of thumb is to set the map size to roughly 1.5 to 2 times the expected number of elements. This helps reduce the likelihood of collisions and ensures fast access times.

For example, for an expected 1,000,000 elements, this rule gives a map size between 1,500,000 and 2,000,000.

By keeping the number of collisions low, you maintain fast read and write speeds, especially in the case of write-heavy workloads.
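The rule of thumb above is simple arithmetic and can be wrapped in a helper. The function name `recommendedMapSize` and its default factor are hypothetical and not part of the library.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: applies the 1.5x to 2x rule of thumb.
 */
function recommendedMapSize(int $expectedElements, float $factor = 2.0): int
{
    if ($expectedElements < 1) {
        throw new InvalidArgumentException('Expected element count must be positive.');
    }
    // Round up so the result never falls below the requested factor.
    return (int) ceil($expectedElements * $factor);
}

$lower = recommendedMapSize(1_000_000, 1.5);
$upper = recommendedMapSize(1_000_000, 2.0);
```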

Data File and Custom Location

By default, the hash map automatically creates a data file in the system's temporary directory. This file is used to store the hash map's data persistently.

Defragmentation

When keys are removed from the hash map, the corresponding buckets are not physically deleted from the file. Instead, they are marked as deleted. This is done to avoid the performance cost of file operations, as physically deleting data would require shifting the file contents, which can be expensive.

However, over time, especially with many deletions, the file may accumulate a significant number of deleted buckets, which could reduce performance. In such cases, it is advisable to perform defragmentation to reclaim space and optimize the file layout.

The defrag() method reorganizes the entire hash map file by recalculating the entire structure from scratch. This includes removing any deleted buckets and restructuring the map for better performance.

Note: Defragmentation is a resource-intensive operation, especially for large hash maps, as it requires reading and rewriting the entire file. Therefore, it should be used carefully and ideally not too frequently.
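The trade-off between marking records as deleted and rewriting the file can be sketched with a toy fixed-size record format. Everything here is an assumption for illustration (the 5-byte record layout, the function names, the temporary-file rewrite); the library's actual `defrag()` works on its own binary structure.

```php
<?php
declare(strict_types=1);

// Assumed toy format: 1-byte deleted flag + 4-byte unsigned big-endian value.
const RECORD_SIZE = 5;

function appendRecord($handle, int $value): void
{
    fseek($handle, 0, SEEK_END);
    fwrite($handle, pack('CN', 0, $value));
}

function markDeleted($handle, int $index): void
{
    // Flip the flag byte in place; nothing after it has to move.
    fseek($handle, $index * RECORD_SIZE);
    fwrite($handle, "\x01");
}

function compact(string $path): void
{
    // Rewrite the whole file, copying only live records.
    $tmp = $path . '.tmp';
    $in = fopen($path, 'rb');
    $out = fopen($tmp, 'wb');
    if ($in === false || $out === false) {
        throw new RuntimeException('Cannot open files for compaction.');
    }
    while (strlen($raw = (string) fread($in, RECORD_SIZE)) === RECORD_SIZE) {
        if (unpack('Cdeleted/Nvalue', $raw)['deleted'] === 0) {
            fwrite($out, $raw);
        }
    }
    fclose($in);
    fclose($out);
    rename($tmp, $path);
}

$path = tempnam(sys_get_temp_dir(), 'hm_');
$handle = fopen($path, 'r+b');
foreach ([10, 20, 30, 40] as $value) {
    appendRecord($handle, $value);
}
markDeleted($handle, 1);
markDeleted($handle, 2);
fclose($handle);

clearstatcache();
$sizeBefore = filesize($path); // deleted records still occupy space
compact($path);
clearstatcache();
$sizeAfter = filesize($path);

$remaining = [];
foreach (str_split(file_get_contents($path), RECORD_SIZE) as $raw) {
    $remaining[] = unpack('Cdeleted/Nvalue', $raw)['value'];
}
unlink($path);
```

Deleting costs a single one-byte write, while compaction reads and rewrites every record, which mirrors why defragmentation should be run sparingly.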

Serialization

By default, this hash map uses PHP's built-in serialize() and unserialize() functions to handle the serialization of values stored in the map. This allows you to store any PHP data type, including objects, arrays, and other complex structures.

Serialization Override

The default methods for serializing and unserializing data are:

These methods can be easily overridden if you need to use a different serialization format (e.g., JSON, MessagePack, etc.) or a custom approach. By overriding these methods, you can control how data is converted before being stored in the hash map and after being retrieved.
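As an illustration of the override pattern, the sketch below uses a stand-in base class rather than the library's own class, because the real class and method signatures are not shown here. `ValueCodec` and `JsonValueCodec` are hypothetical names; the point is only the shape of swapping `serialize()`/`unserialize()` for JSON.

```php
<?php
declare(strict_types=1);

/** Hypothetical stand-in for a map class with overridable serialization hooks. */
class ValueCodec
{
    public function serialize(mixed $value): string
    {
        return \serialize($value); // PHP's native format
    }

    public function unserialize(string $data): mixed
    {
        return \unserialize($data);
    }
}

/** Override: store values as JSON instead of PHP's native format. */
class JsonValueCodec extends ValueCodec
{
    public function serialize(mixed $value): string
    {
        return json_encode($value, JSON_THROW_ON_ERROR);
    }

    public function unserialize(string $data): mixed
    {
        // Decode objects as associative arrays.
        return json_decode($data, true, 512, JSON_THROW_ON_ERROR);
    }
}

$value = ['id' => 7, 'tags' => ['a', 'b']];

$native = new ValueCodec();
$nativeRoundTrip = $native->unserialize($native->serialize($value));

$json = new JsonValueCodec();
$jsonEncoded = $json->serialize($value);
$jsonRoundTrip = $json->unserialize($jsonEncoded);
```

JSON is readable from other languages, but unlike `serialize()` it does not restore PHP objects to their original classes, so the choice depends on what you store.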

Serialization of Closures

PHP's built-in serialize() cannot serialize closures and throws an exception when asked to. To handle this, you can use the opis/closure library to serialize and unserialize closures.

To enable serialization of closures, you need to install the opis/closure library. This can be done via Composer:

Once the library is installed, you can easily customize the serialization and unserialization methods of your hash map to handle closures. Here's an example of how to do it:

Restrictions

When using PhpFileHashMap, keep in mind the following limitations due to its file system-based storage:

1. Concurrent Access

If multiple processes attempt to access the same hash map file simultaneously, race conditions may occur. To ensure data integrity, you must implement locking mechanisms when working with the same file in parallel processes.

Locking is intentionally not implemented in this library to keep it lightweight and to give developers the freedom to choose their preferred locking strategy. Examples of possible solutions include:
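As one illustration of such a strategy, the sketch below wraps a critical section in an exclusive advisory lock using PHP's `flock()` on a separate lock file. The helper name `withExclusiveLock` and the lock-file path are hypothetical, and the counter file stands in for any read-modify-write on a shared hash map file.

```php
<?php
declare(strict_types=1);

/**
 * Runs $operation while holding an exclusive advisory lock.
 * Hypothetical helper; not provided by the library.
 */
function withExclusiveLock(string $lockPath, callable $operation): mixed
{
    $lock = fopen($lockPath, 'c'); // create if missing, never truncate
    if ($lock === false) {
        throw new RuntimeException("Cannot open lock file $lockPath");
    }
    try {
        if (!flock($lock, LOCK_EX)) { // blocks until the lock is available
            throw new RuntimeException('Could not acquire lock.');
        }
        return $operation();
    } finally {
        flock($lock, LOCK_UN);
        fclose($lock);
    }
}

$lockPath = sys_get_temp_dir() . '/hash-map-demo.lock';
$counterPath = tempnam(sys_get_temp_dir(), 'hm_');
file_put_contents($counterPath, '0');

$result = 0;
for ($i = 0; $i < 3; $i++) {
    // Read-modify-write is safe only while the lock is held.
    $result = withExclusiveLock($lockPath, function () use ($counterPath): int {
        $next = (int) file_get_contents($counterPath) + 1;
        file_put_contents($counterPath, (string) $next);
        return $next;
    });
}

unlink($counterPath);
unlink($lockPath);
```

Advisory locks only protect against processes that also take the lock, so every reader and writer of the shared file would need to go through the same helper.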

2. Distributed Systems

This library does not handle synchronization in distributed environments. If you need to share the same hash map file across multiple machines, synchronization must be implemented externally.

Examples of how this can be addressed:

These restrictions are by design to maintain the simplicity and portability of PhpFileHashMap, leaving implementation details of complex infrastructure to the developer.

Why Choose This Library Over SQLite?

In summary, if you need a high-performance, simple key-value storage solution without the overhead of a full-fledged database engine, this library offers a more optimized, flexible, and customizable alternative to SQLite.

Data File Structure

The file structure of the hash map is designed to efficiently manage large amounts of data. It consists of two main sections: the Map Index Section and the Heap Section.

1. Map Index Section

The Map Index Section is located at the beginning of the file. It contains a series of integers, each representing the offset of a bucket in the Heap Section. The number of entries in the index is equal to the number of buckets in the hash map.
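The idea of an index of offsets at the start of the file can be sketched as follows. The details are assumptions, not the library's actual format: entries are taken to be 8-byte little-endian integers (which requires 64-bit PHP), `0` is treated as "empty slot", and the slot function is an arbitrary choice for the demo.

```php
<?php
declare(strict_types=1);

const OFFSET_BYTES = 8; // assumed width of one index entry

function slotFor(string $key, int $slotCount): int
{
    // First 7 hex digits of md5 give a non-negative integer.
    return ((int) hexdec(substr(md5($key), 0, 7))) % $slotCount;
}

function writeIndexEntry($handle, int $slot, int $heapOffset): void
{
    // Entry position is computed directly from the slot number.
    fseek($handle, $slot * OFFSET_BYTES);
    fwrite($handle, pack('P', $heapOffset));
}

function readIndexEntry($handle, int $slot): int
{
    fseek($handle, $slot * OFFSET_BYTES);
    return unpack('P', fread($handle, OFFSET_BYTES))[1];
}

$slotCount = 16;
$handle = tmpfile();

// Pre-allocate the index: one zeroed entry per slot.
fwrite($handle, str_repeat("\0", $slotCount * OFFSET_BYTES));
$heapStart = $slotCount * OFFSET_BYTES; // the heap begins right after the index

// Point the slot for a key at the first byte of the heap.
$slot = slotFor('user:42', $slotCount);
writeIndexEntry($handle, $slot, $heapStart);

$stored = readIndexEntry($handle, $slot);
$empty = readIndexEntry($handle, ($slot + 1) % $slotCount);
fclose($handle);
```

Because every entry has a fixed width, finding the head of a key's chain takes one seek and one small read, regardless of how many elements the file holds.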

2. Heap Section

The Heap Section contains all the actual data for the hash map’s buckets. Each bucket is a block of data that includes the following elements:

Additionally, the Heap Section begins with two integers that hold the following data:

The rest of the heap consists of individual buckets, which contain the serialized data for each key-value pair.

File Layout Example

Author and License

Author: Mikhail Chuloshnikov

License: MIT License

This library is released under the MIT License. See the LICENSE file for more details.


All versions of php-file-hash-map with dependencies

Requires: ext-mbstring (any version), php ^8.1