
tomkyle/binning


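Determine the optimal number of bins 𝒌 for histogram creation and the optimal bin width 𝒉 using various statistical methods. Its unified interface includes implementations of well-known binning rules: Pearson’s Square Root Rule, Sturges’s Rule, Doane’s Rule, Scott’s Rule, the Freedman-Diaconis Rule, Terrell-Scott’s Rule, and the Rice University Rule.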

Requirements

This library requires PHP 8.3 or newer. Support for older PHP versions, such as the PHP 7.2+ support that markrogoyski/math-php provides, is not planned.

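Installation

Install the library with Composer by running composer require tomkyle/binning in your project directory.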

Usage

The BinSelection class provides several methods for determining the optimal number of bins for histogram creation and optimal bin width. You can either use specific methods directly or the general suggestBins() and suggestBinWidth() methods with different strategies.

Determine Bin Width

Use the suggestBinWidth method to get the optimal bin width based on the selected method. The method returns the bin width, often referred to as 𝒉, as a float value.

Determine Number of Bins

Use the suggestBins method to get the optimal number of bins based on the selected method. The method returns the number of bins, often referred to as 𝒌, as an integer value.
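The idea of choosing a rule by strategy can be sketched with a small standalone dispatcher. The function name sketchSuggestBins and the method keys 'sqrt', 'sturges', and 'rice' are hypothetical and are not the library's API; the formulas are the ones given in the rule sections of this document.

```php
<?php

/**
 * Hypothetical helper (NOT the library's API): picks a bin count k
 * for a sample using one of three rules that depend only on n.
 *
 * @param array<int, int|float> $data
 */
function sketchSuggestBins(array $data, string $method = 'sturges'): int
{
    $n = count($data);
    if ($n === 0) {
        throw new InvalidArgumentException('Data must not be empty.');
    }

    return match ($method) {
        'sqrt'    => (int) ceil(sqrt($n)),             // Square Root Rule
        'sturges' => 1 + (int) ceil(log($n, 2)),       // Sturges's Rule
        'rice'    => 2 * (int) ceil($n ** (1 / 3)),    // Rice University Rule
        default   => throw new InvalidArgumentException("Unknown method: {$method}"),
    };
}

$data = range(1, 50); // 50 observations
foreach (['sqrt', 'sturges', 'rice'] as $method) {
    echo $method, ': ', sketchSuggestBins($data, $method), PHP_EOL;
}
```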


Explicit method calls

You can also call the specific methods directly to get the bin width 𝒉 or number of bins 𝒌.

The result array contains additional information like the data range 𝑹, the inter-quartile range IQR, or standard deviation stddev, which can be useful for further analysis.


1. Pearson’s Square Root Rule (1892)

Simple rule using the square root of the sample size.

$$ k = \left \lceil \sqrt{n} \ \right \rceil $$
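As a minimal sketch, the formula translates directly into PHP. The function name squareRootBins is an illustrative assumption and not part of the library.

```php
<?php

// Hypothetical helper: Square Root Rule, k = ceil(sqrt(n)).
function squareRootBins(int $n): int
{
    if ($n < 1) {
        throw new InvalidArgumentException('Sample size must be at least 1.');
    }

    return (int) ceil(sqrt($n));
}

echo squareRootBins(150), PHP_EOL; // sqrt(150) ≈ 12.25, rounded up to 13
```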


2. Sturges’s Rule (1926)

Based on the logarithm of the sample size. Good for normal distributions.

$$ k = 1 + \left \lceil \ \log_2(n) \ \right \rceil $$
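The same formula as a short sketch in PHP, with sturgesBins as a hypothetical function name that is not part of the library:

```php
<?php

// Hypothetical helper: Sturges's Rule, k = 1 + ceil(log2(n)).
function sturgesBins(int $n): int
{
    if ($n < 1) {
        throw new InvalidArgumentException('Sample size must be at least 1.');
    }

    return 1 + (int) ceil(log($n, 2));
}

echo sturgesBins(150), PHP_EOL; // log2(150) ≈ 7.23, so k = 1 + 8 = 9
```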


3. Doane’s Rule (1976)

Improvement of Sturges’ rule that accounts for data skewness.

$$ k = 1 + \left\lceil \ \log_2(n) + \log_2\left(1 + \frac{|g_1|}{\sigma_{g_1}}\right) \ \right \rceil $$
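The following sketch shows how the skewness term can be computed. It assumes that g_1 is the moment-based sample skewness (third central moment divided by the second central moment to the power 3/2) and that σ_{g_1} = sqrt(6(n − 2) / ((n + 1)(n + 3))), which is the usual definition for Doane's formula; the library may compute these quantities differently. The function name doaneBins is hypothetical.

```php
<?php

/**
 * Hypothetical helper: Doane's Rule under the assumptions stated above.
 *
 * @param array<int, int|float> $data
 */
function doaneBins(array $data): int
{
    $n = count($data);
    if ($n < 3) {
        throw new InvalidArgumentException('Doane\'s Rule needs at least 3 values.');
    }

    $mean = array_sum($data) / $n;

    // Second and third central moments.
    $m2 = 0.0;
    $m3 = 0.0;
    foreach ($data as $x) {
        $d = $x - $mean;
        $m2 += $d ** 2;
        $m3 += $d ** 3;
    }
    $m2 /= $n;
    $m3 /= $n;

    if ($m2 == 0.0) {
        throw new InvalidArgumentException('Data must not be constant.');
    }

    $g1      = $m3 / ($m2 ** 1.5);                               // sample skewness
    $sigmaG1 = sqrt(6 * ($n - 2) / (($n + 1) * ($n + 3)));       // its standard error

    return 1 + (int) ceil(log($n, 2) + log(1 + abs($g1) / $sigmaG1, 2));
}

// Symmetric data has zero skewness, so the result equals Sturges's Rule.
echo doaneBins(range(1, 50)), PHP_EOL;

// Right-skewed data gets additional bins from the skewness term.
echo doaneBins([1, 1, 1, 1, 2, 2, 3, 4, 8, 20]), PHP_EOL;
```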


4. Scott’s Rule (1979)

Based on the standard deviation and sample size. Good for continuous data.

$$ h = \frac{3.49\,\hat{\sigma}}{\sqrt[3]{n}} $$

$$ R = \max_i x_i - \min_i x_i $$

$$ k = \left \lceil \ \frac{R}{h} \ \right \rceil $$

The result is an array with the keys width, bins, range, and stddev, which can be mapped to variables with array destructuring.
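A standalone sketch of Scott's Rule that returns an array with those four keys and destructures it into variables. The function name scottRuleSketch is hypothetical, and the use of the sample standard deviation (divisor n − 1) for σ̂ is an assumption; the library may use a different estimator.

```php
<?php

/**
 * Hypothetical helper: Scott's Rule, h = 3.49 * stddev / n^(1/3).
 *
 * @param array<int, int|float> $data
 * @return array{width: float, bins: int, range: int|float, stddev: float}
 */
function scottRuleSketch(array $data): array
{
    $n = count($data);
    if ($n < 2) {
        throw new InvalidArgumentException('Scott\'s Rule needs at least 2 values.');
    }

    $mean = array_sum($data) / $n;
    $sumSquares = 0.0;
    foreach ($data as $x) {
        $sumSquares += ($x - $mean) ** 2;
    }
    $stddev = sqrt($sumSquares / ($n - 1)); // assumption: sample standard deviation

    if ($stddev == 0.0) {
        throw new InvalidArgumentException('Data must not be constant.');
    }

    $h = 3.49 * $stddev / ($n ** (1 / 3));
    $R = max($data) - min($data);

    return [
        'width'  => $h,
        'bins'   => (int) ceil($R / $h),
        'range'  => $R,
        'stddev' => $stddev,
    ];
}

// Map the result keys to variables with array destructuring.
['width' => $h, 'bins' => $k, 'range' => $R, 'stddev' => $stddev] = scottRuleSketch(range(1, 100));

printf("h = %.3f, k = %d, R = %s, stddev = %.3f\n", $h, $k, $R, $stddev);
```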


5. Freedman-Diaconis Rule (1981)

Based on the interquartile range (IQR). Robust against outliers.

$$ \mathrm{IQR} = Q_3 - Q_1 $$

$$ h = 2 \times \frac{\mathrm{IQR}}{\sqrt[3]{n}} $$

$$ R = \max_i x_i - \min_i x_i $$

$$ k = \left \lceil \frac{R}{h} \right \rceil $$

The result is an array with the keys width, bins, range, and IQR, which can be mapped to variables with array destructuring.
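A standalone sketch of the Freedman-Diaconis Rule with the same result shape. The function names fdQuantile and freedmanDiaconisSketch are hypothetical. Quartiles are computed by linear interpolation between order statistics, which is one of several common conventions and an assumption here; the library may use another one.

```php
<?php

/**
 * Hypothetical helper: quantile by linear interpolation on sorted data.
 *
 * @param array<int, int|float> $sorted ascending, zero-indexed
 */
function fdQuantile(array $sorted, float $p): float
{
    $pos = $p * (count($sorted) - 1);
    $lo  = (int) floor($pos);
    $hi  = (int) ceil($pos);

    return $sorted[$lo] + ($pos - $lo) * ($sorted[$hi] - $sorted[$lo]);
}

/**
 * Hypothetical helper: Freedman-Diaconis Rule, h = 2 * IQR / n^(1/3).
 *
 * @param array<int, int|float> $data
 * @return array{width: float, bins: int, range: int|float, IQR: float}
 */
function freedmanDiaconisSketch(array $data): array
{
    $n = count($data);
    if ($n < 2) {
        throw new InvalidArgumentException('At least 2 values are required.');
    }

    sort($data); // sorts the local copy and reindexes it from 0

    $iqr = fdQuantile($data, 0.75) - fdQuantile($data, 0.25);
    if ($iqr == 0.0) {
        throw new InvalidArgumentException('IQR is zero; the rule is undefined.');
    }

    $h = 2 * $iqr / ($n ** (1 / 3));
    $R = $data[$n - 1] - $data[0];

    return [
        'width' => $h,
        'bins'  => (int) ceil($R / $h),
        'range' => $R,
        'IQR'   => $iqr,
    ];
}

// Map the result keys to variables with array destructuring.
['width' => $h, 'bins' => $k, 'range' => $R, 'IQR' => $IQR] = freedmanDiaconisSketch(range(1, 100));

printf("h = %.3f, k = %d, R = %s, IQR = %.2f\n", $h, $k, $R, $IQR);
```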


6. Terrell-Scott’s Rule (1985)

Uses the cube root of twice the sample size and, for large samples, provides more bins than Sturges. This is the original Rice Rule:

$$ k = \left \lceil \ \sqrt[3]{2n} \enspace \right \rceil = \left \lceil \ (2n)^{1/3} \ \right \rceil $$
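A minimal sketch of the formula in PHP, with terrellScottBins as a hypothetical function name that is not part of the library:

```php
<?php

// Hypothetical helper: Terrell-Scott's Rule, k = ceil((2n)^(1/3)).
function terrellScottBins(int $n): int
{
    if ($n < 1) {
        throw new InvalidArgumentException('Sample size must be at least 1.');
    }

    return (int) ceil((2 * $n) ** (1 / 3));
}

echo terrellScottBins(150), PHP_EOL; // (300)^(1/3) ≈ 6.69, rounded up to 7
```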


7. Rice University Rule

Uses the cube root of the sample size and generally provides more bins than Sturges. Formula as taught by David M. Lane at Rice University. N.B.: This Rice Rule does not seem to be the original; Terrell-Scott’s rule (1985) appears to be. Note also that the two variants can yield different results under certain circumstances. Lane’s variant from the early 2000s is, however, the more commonly cited one:

$$ k = 2 \times \left \lceil \ \sqrt[3]{n} \enspace \right \rceil = 2 \times \left \lceil \ n^{1/3} \ \right \rceil $$
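The difference between the two variants becomes visible when both formulas are evaluated for the same sample size. In this sketch, riceBins is a hypothetical function name, and the Terrell-Scott value is computed inline from the formula in the previous section.

```php
<?php

// Hypothetical helper: Rice University Rule (Lane's variant), k = 2 * ceil(n^(1/3)).
function riceBins(int $n): int
{
    if ($n < 1) {
        throw new InvalidArgumentException('Sample size must be at least 1.');
    }

    return 2 * (int) ceil($n ** (1 / 3));
}

$n = 100;

$rice         = riceBins($n);                      // 2 * ceil(4.64) = 10
$terrellScott = (int) ceil((2 * $n) ** (1 / 3));   // ceil(5.85)     = 6

echo "Rice: {$rice}, Terrell-Scott: {$terrellScott}", PHP_EOL;
```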


Method Selection Guidelines

| Rule | Strengths & Weaknesses |
|------|------------------------|
| Freedman–Diaconis | Uses the IQR to set 𝒉, so it is robust against outliers and adapts to data spread. ⚠️ May over-smooth heavily skewed or multi-modal data when IQR is small. |
| Sturges’ Rule | Very simple, works well for roughly normal, moderate-sized datasets. ⚠️ Ignores outliers and underestimates bin count for large or skewed samples. |
| Rice Rule | Independent of data shape and easy to compute. ⚠️ Prone to over- or under-smoothing when the distribution is heavy-tailed or skewed. |
| Terrell–Scott | Similar approach to the Rice Rule but with asymptotically optimal MISE properties; gives more bins than Sturges and adapts better at large 𝒏. ⚠️ Still ignores skewness and outliers. |
| Square Root Rule | Simply the square root of the sample size, so it requires no distributional estimates. ⚠️ May produce too few bins for complex distributions, or too many for very noisy data. |
| Doane’s Rule | Extends Sturges’ Rule by adding a skewness correction, improving performance on asymmetric data. ⚠️ Requires estimating the third moment (skewness), which can be unstable for small 𝒏. |
| Scott’s Rule | Uses the standard deviation to minimize MISE, providing a good balance for unimodal, symmetric data. ⚠️ Sensitive to outliers (inflated $\sigma$) and may underperform on skewed distributions. |

Literature

Rubia, J.M.D.L. (2024): Rice University Rule to Determine the Number of Bins. Open Journal of Statistics, 14, 119-149. DOI: 10.4236/ojs.2024.141006

Wikipedia: Histogram / Number of bins and width https://en.wikipedia.org/wiki/Histogram#Number_of_bins_and_width

Practical Example
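A minimal end-to-end sketch: choose 𝒌 with Sturges’s Rule, then count the observations that fall into each of the 𝒌 equal-width bins. The helper histogramCounts and the sample data are illustrative assumptions and not part of the library.

```php
<?php

/**
 * Hypothetical helper: counts observations in $k equal-width bins
 * spanning [min, max]. The maximum value is placed in the last bin.
 *
 * @param array<int, int|float> $data
 * @return array<int, int>
 */
function histogramCounts(array $data, int $k): array
{
    if ($data === [] || $k < 1) {
        throw new InvalidArgumentException('Need non-empty data and k >= 1.');
    }

    $min = min($data);
    $h   = (max($data) - $min) / $k; // bin width

    $counts = array_fill(0, $k, 0);
    foreach ($data as $x) {
        $i = $h > 0 ? (int) floor(($x - $min) / $h) : 0;
        $counts[min($i, $k - 1)]++; // clamp so the maximum lands in the last bin
    }

    return $counts;
}

$data = [2.1, 2.4, 3.0, 3.3, 3.8, 4.1, 4.4, 4.9, 5.2, 7.5];

// Sturges's Rule: k = 1 + ceil(log2(10)) = 1 + 4 = 5
$k = 1 + (int) ceil(log(count($data), 2));

$counts = histogramCounts($data, $k);

foreach ($counts as $bin => $count) {
    echo 'Bin ', $bin + 1, ': ', str_repeat('#', $count), PHP_EOL;
}
```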

Error Handling

All methods throw an InvalidArgumentException for invalid inputs.
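The pattern can be illustrated with a standalone validator and a try/catch block. The function requireNumericData and its exact checks and messages are hypothetical; which inputs the library itself rejects is not specified here.

```php
<?php

/**
 * Hypothetical validator: rejects empty arrays and non-numeric values.
 *
 * @param array<int, mixed> $data
 */
function requireNumericData(array $data): void
{
    if ($data === []) {
        throw new InvalidArgumentException('Data array must not be empty.');
    }

    foreach ($data as $value) {
        if (!is_int($value) && !is_float($value)) {
            throw new InvalidArgumentException('All values must be numeric.');
        }
    }
}

try {
    requireNumericData([]);
} catch (InvalidArgumentException $e) {
    // Handle the invalid input, for example by reporting it to the caller.
    echo 'Invalid input: ', $e->getMessage(), PHP_EOL;
}
```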

Development

Clone repo and install requirements

Watch source and run various tests

This watches for changes inside the src/ and tests/ directories and runs a series of tests:

  1. Find and run the corresponding unit test with PHPUnit.
  2. Find possible bugs and documentation issues using PHPStan.
  3. Analyse code style and give hints on newer syntax using Rector.

Run PHPUnit


Dependencies

  • php ^8.3
  • markrogoyski/math-php ^2.11