Short description: PHP package that provides functions for calculating mathematical statistics of numeric data.
License: MIT
Homepage: https://github.com/hi-folks/statistics
Statistics PHP package
Introducing a PHP package enabling comprehensive mathematical statistics calculations on numeric data.
I've put together a package of useful statistical functions.
These functions originally stemmed from my exploration of FIT files, which contain a wealth of data about sports activities. Within these files, you can find detailed information on metrics such as Heart Rate, Speed, Cadence, Power, and more. I developed these statistical functions to help gain deeper insights into the numerical data and performance of these sports activities.
The functions provided by this package cover a range of measures, including mean, mode, median, range, quantiles, first quartile (25th percentile), third quartile (75th percentile), frequency tables (both cumulative and relative), standard deviation (for both populations and samples), and variance (again, for both populations and samples).
This package is inspired by the Python statistics module.
Installation
You can install the package via composer:
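A typical invocation, assuming Composer is already installed and available on your PATH, would look like this:

```shell
composer require hi-folks/statistics
```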
Usage
Stat class
The Stat class provides methods for calculating mathematical statistics of numeric data, such as an average or typical value from a population or sample. The available statistics are listed below:
Mathematical Statistic | Description |
---|---|
mean() | arithmetic mean or "average" of data |
median() | median or "middle value" of data |
medianLow() | low median of data |
medianHigh() | high median of data |
mode() | single mode (most common value) of discrete or nominal data |
multimode() | list of modes (most common values) of discrete or nominal data |
quantiles() | cut points dividing the range of a probability distribution into continuous intervals with equal probabilities |
thirdQuartile() | third quartile, the value below which 75 percent of the data falls |
firstQuartile() | first quartile, the value below which 25 percent of the data falls |
pstdev() | standard deviation for a population |
stdev() | standard deviation for a sample |
pvariance() | variance for a population |
variance() | variance for a sample |
geometricMean() | geometric mean |
harmonicMean() | harmonic mean |
correlation() | Pearson's correlation coefficient for two inputs |
covariance() | sample covariance of two inputs |
linearRegression() | slope and intercept of a simple linear regression, estimated using ordinary least squares |
Stat::mean( array $data )
Return the sample arithmetic mean of the array $data. The arithmetic mean is the sum of the data divided by the number of data points. It is commonly called “the average”, although it is only one of many mathematical averages. It is a measure of the central location of the data.
Stat::geometricMean( array $data )
The geometric mean indicates the central tendency or typical value of the data using the product of the values (as opposed to the arithmetic mean which uses their sum).
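To make the difference between the two means concrete, here is a minimal plain-PHP sketch. It is independent of the package: the function names `sketchArithmeticMean` and `sketchGeometricMean` are hypothetical, and the geometric mean is computed through logarithms on the assumption that every value is strictly positive.

```php
<?php
// Hypothetical helpers for illustration only; not the package's functions.

// Arithmetic mean: sum of the values divided by their count.
function sketchArithmeticMean(array $data): float
{
    return array_sum($data) / count($data);
}

// Geometric mean: n-th root of the product of the values.
// Computed as exp(mean(log(x))) to avoid overflow on long inputs.
// Assumes every value is strictly positive.
function sketchGeometricMean(array $data): float
{
    $logSum = array_sum(array_map('log', $data));
    return exp($logSum / count($data));
}

$data = [4, 9];
echo sketchArithmeticMean($data), PHP_EOL;          // 6.5
echo round(sketchGeometricMean($data), 6), PHP_EOL; // 6
```

For the values 4 and 9 the product is 36, so the geometric mean is its square root, 6, while the arithmetic mean is 6.5.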
Stat::harmonicMean( array $data )
The harmonic mean is the reciprocal of the arithmetic mean() of the reciprocals of the data. For example, the harmonic mean of three values a, b, and c will be equivalent to 3/(1/a + 1/b + 1/c). If one of the values is zero, the result will be zero.
You can also calculate the harmonic weighted mean. Suppose a car travels 40 km/hr for 5 km, and when traffic clears, speeds up to 60 km/hr for the remaining 30 km of the journey. What is the average speed?
In the harmonicMean() call for this example:
- 40, 60: are the elements
- 5, 30: are the weights for each element (the first weight belongs to the first element, the second weight to the second element)
- 1: is the number of decimal places to round the result to
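The arithmetic behind the car example can be checked with a short plain-PHP sketch. The helper `sketchWeightedHarmonicMean` is a hypothetical name and does not show how the package implements it; it only applies the weighted harmonic mean formula, sum of weights divided by the sum of weight/value.

```php
<?php
// Hypothetical helper for illustration only; not the package's implementation.
// Weighted harmonic mean = sum(weights) / sum(weight_i / value_i).
function sketchWeightedHarmonicMean(array $data, array $weights, int $round): float
{
    $weightSum = array_sum($weights);
    $weightedReciprocals = 0.0;
    foreach ($data as $i => $value) {
        $weightedReciprocals += $weights[$i] / $value;
    }
    return round($weightSum / $weightedReciprocals, $round);
}

// 5 km at 40 km/h takes 0.125 h, 30 km at 60 km/h takes 0.5 h:
// 35 km in 0.625 h.
echo sketchWeightedHarmonicMean([40, 60], [5, 30], 1), PHP_EOL; // 56
```

The average speed is total distance over total time, which is why the distances act as weights.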
Stat::median( array $data )
Return the median (middle value) of numeric data, using the common “mean of middle two” method.
Stat::medianLow( array $data )
Return the low median of numeric data. The low median is always a member of the data set. When the number of data points is odd, the middle value is returned. When it is even, the smaller of the two middle values is returned.
Stat::medianHigh( array $data )
Return the high median of data. The high median is always a member of the data set. When the number of data points is odd, the middle value is returned. When it is even, the larger of the two middle values is returned.
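The three median variants differ only when the number of data points is even. A minimal plain-PHP sketch of the definitions above (the function name `sketchMedians` is hypothetical and unrelated to the package's code):

```php
<?php
// Hypothetical helper for illustration only; not the package's implementation.
// Returns the median, low median and high median of a non-empty numeric array.
function sketchMedians(array $data): array
{
    sort($data);
    $n = count($data);
    $mid = intdiv($n, 2);

    if ($n % 2 === 1) {
        // Odd count: all three variants are the single middle value.
        return ['median' => $data[$mid], 'low' => $data[$mid], 'high' => $data[$mid]];
    }

    // Even count: the two middle values are at $mid - 1 and $mid.
    return [
        'median' => ($data[$mid - 1] + $data[$mid]) / 2, // mean of middle two
        'low'    => $data[$mid - 1],                     // smaller middle value
        'high'   => $data[$mid],                         // larger middle value
    ];
}

$result = sketchMedians([1, 3, 5, 7]);
echo $result['median'], ' ', $result['low'], ' ', $result['high'], PHP_EOL; // 4 3 5
```

Note that the "mean of middle two" median (4 here) need not be a member of the data set, while the low and high medians always are.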
Stat::quantiles( array $data, $n=4, $round=null )
Divide data into n continuous intervals with equal probability. Returns a list of n - 1 cut points separating the intervals. Set n to 4 for quartiles (the default). Set n to 10 for deciles. Set n to 100 for percentiles which gives the 99 cut points that separate data into 100 equal-sized groups.
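There are several conventions for computing cut points. The sketch below uses the "exclusive" interpolation method that Python's statistics.quantiles applies by default; whether the package follows exactly the same method is an assumption, and `sketchQuantiles` is a hypothetical name. It needs at least two data points.

```php
<?php
// Hypothetical helper for illustration only; not the package's implementation.
// Cut points via the "exclusive" method (default of Python's statistics.quantiles).
// Assumes count($data) >= 2 and $n >= 1.
function sketchQuantiles(array $data, int $n = 4): array
{
    sort($data);
    $count = count($data);
    $m = $count + 1;
    $cuts = [];

    for ($i = 1; $i < $n; $i++) {
        // Position of the i-th cut point, clamped to a valid pair of neighbours.
        $j = intdiv($i * $m, $n);
        $j = max(1, min($j, $count - 1));
        $delta = $i * $m - $j * $n;
        // Linear interpolation between the two neighbouring data points.
        $cuts[] = ($data[$j - 1] * ($n - $delta) + $data[$j] * $delta) / $n;
    }

    return $cuts;
}

echo implode(', ', sketchQuantiles([1, 2, 3, 4, 5, 6, 7])), PHP_EOL; // 2, 4, 6
echo implode(', ', sketchQuantiles([1, 2, 3, 4])), PHP_EOL;          // 1.25, 2.5, 3.75
```

With n = 4 the first and last cut points correspond to the first and third quartiles.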
Stat::firstQuartile( array $data, $round=null )
The lower quartile, or first quartile (Q1), is the value under which 25% of data points are found when they are arranged in increasing order.
Stat::thirdQuartile( array $data, $round=null )
The upper quartile, or third quartile (Q3), is the value under which 75% of data points are found when arranged in increasing order.
Stat::pstdev( array $data )
Return the Population Standard Deviation, a measure of the amount of variation or dispersion of a set of values. A low standard deviation indicates that the values tend to be close to the mean of the set, while a high standard deviation indicates that the values are spread out over a wider range.
Stat::stdev( array $data )
Return the Sample Standard Deviation, a measure of the amount of variation or dispersion of a set of values. A low standard deviation indicates that the values tend to be close to the mean of the set, while a high standard deviation indicates that the values are spread out over a wider range.
Stat::variance( array $data )
Variance is a measure of dispersion of data points from the mean. Low variance indicates that data points are generally similar and do not vary widely from the mean. High variance indicates that data values have greater variability and are more widely dispersed from the mean.
Use the variance method to calculate the variance from a sample.
If you need the variance of the whole population rather than of a sample, use the pvariance method instead.
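The sample and population variants differ only in the divisor: n - 1 for a sample, n for a population. The standard deviations are the square roots of the respective variances. A minimal plain-PHP sketch, with hypothetical function names that are not part of the package:

```php
<?php
// Hypothetical helpers for illustration only; not the package's implementation.

// Sum of squared deviations from the mean.
function sketchSumOfSquares(array $data): float
{
    $mean = array_sum($data) / count($data);
    $sum = 0.0;
    foreach ($data as $value) {
        $sum += ($value - $mean) ** 2;
    }
    return $sum;
}

// Population variance: divide by n.
function sketchPopulationVariance(array $data): float
{
    return sketchSumOfSquares($data) / count($data);
}

// Sample variance: divide by n - 1 (requires at least two data points).
function sketchSampleVariance(array $data): float
{
    return sketchSumOfSquares($data) / (count($data) - 1);
}

$data = [2, 4, 4, 4, 5, 5, 7, 9];
echo sketchPopulationVariance($data), PHP_EOL;       // 4
echo sqrt(sketchPopulationVariance($data)), PHP_EOL; // 2 (population standard deviation)
echo round(sketchSampleVariance($data), 4), PHP_EOL; // 4.5714
```

The sample variance is always slightly larger than the population variance of the same data, because it divides the same sum by a smaller number.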
Stat::covariance( array $x, array $y )
The static covariance method returns the sample covariance of two inputs $x and $y. Covariance is a measure of the joint variability of two inputs.
Stat::correlation( array $x, array $y )
Return Pearson's correlation coefficient for two inputs. Pearson's correlation coefficient r takes values between -1 and +1. It measures the strength and direction of the linear relationship: +1 indicates a perfect positive linear relationship, -1 a perfect negative linear relationship, and 0 no linear relationship.
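Sample covariance and Pearson's correlation are closely related: the correlation is the covariance rescaled by the spread of each input, which is what keeps it between -1 and +1. A minimal plain-PHP sketch under that textbook definition (the function names are hypothetical and not the package's code; both inputs are assumed to have the same length and to be non-constant):

```php
<?php
// Hypothetical helpers for illustration only; not the package's implementation.

// Sum of products of deviations from the respective means.
function sketchCoMoment(array $x, array $y): float
{
    $meanX = array_sum($x) / count($x);
    $meanY = array_sum($y) / count($y);
    $sum = 0.0;
    foreach ($x as $i => $xi) {
        $sum += ($xi - $meanX) * ($y[$i] - $meanY);
    }
    return $sum;
}

// Sample covariance: co-moment divided by n - 1.
function sketchCovariance(array $x, array $y): float
{
    return sketchCoMoment($x, $y) / (count($x) - 1);
}

// Pearson's r: co-moment normalised by the spread of each input.
function sketchCorrelation(array $x, array $y): float
{
    return sketchCoMoment($x, $y)
        / sqrt(sketchCoMoment($x, $x) * sketchCoMoment($y, $y));
}

$x = [1, 2, 3, 4, 5];
$y = [2, 4, 6, 8, 10]; // y is exactly 2x
echo sketchCovariance($x, $y), PHP_EOL;  // 5
echo sketchCorrelation($x, $y), PHP_EOL; // 1
```

Because y is an exact positive multiple of x here, the correlation is +1 regardless of the size of the covariance.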
Stat::linearRegression( array $x, array $y )
Return the slope and intercept of a simple linear regression, estimated using ordinary least squares. Simple linear regression describes the relationship between an independent variable $x and a dependent variable $y in terms of a linear function.
With the slope and intercept you can extrapolate from the samples: for example, the predicted value for the year 2022 is slope * 2022 + intercept.
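As an illustration of ordinary least squares and of using the result for a prediction, here is a minimal plain-PHP sketch. The function `sketchLinearRegression` is a hypothetical name, not the package's implementation, and the data is made up so that it lies exactly on the line y = 3x + 2.

```php
<?php
// Hypothetical helper for illustration only; not the package's implementation.
// Ordinary least squares for y = slope * x + intercept.
function sketchLinearRegression(array $x, array $y): array
{
    $n = count($x);
    $meanX = array_sum($x) / $n;
    $meanY = array_sum($y) / $n;

    $sxy = 0.0; // sum of (x - meanX) * (y - meanY)
    $sxx = 0.0; // sum of (x - meanX)^2
    foreach ($x as $i => $xi) {
        $sxy += ($xi - $meanX) * ($y[$i] - $meanY);
        $sxx += ($xi - $meanX) ** 2;
    }

    $slope = $sxy / $sxx;
    $intercept = $meanY - $slope * $meanX;

    return [$slope, $intercept];
}

// Made-up data lying exactly on y = 3x + 2.
[$slope, $intercept] = sketchLinearRegression([1, 2, 3, 4, 5], [5, 8, 11, 14, 17]);
echo $slope, ' ', $intercept, PHP_EOL; // 3 2

// Predict the dependent variable for a new x.
echo $slope * 10 + $intercept, PHP_EOL; // 32
```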
Freq class
With the Statistics package you can calculate frequency tables. A frequency table lists the frequency of various outcomes in a sample. Each entry in the table contains the frequency or count of the occurrences of values within a particular group or interval.
Freq::frequencies( array $data )
The frequencies method returns the frequency table as an array, where each entry holds the number of times a value occurs in the data.
Freq::relativeFrequencies( array $data )
You can retrieve the frequency table in relative format (percentage). The result is an array where each entry holds the percentage of occurrences of a value.
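To show what such tables contain, here is a minimal plain-PHP sketch built on the standard `array_count_values` function. The function names are hypothetical and this is not the package's code; it assumes the data consists of integers or strings, which is what `array_count_values` supports.

```php
<?php
// Hypothetical helpers for illustration only; not the package's implementation.

// Absolute frequencies: value => number of occurrences, sorted by value.
function sketchFrequencies(array $data): array
{
    $frequencies = array_count_values($data);
    ksort($frequencies);
    return $frequencies;
}

// Relative frequencies as percentages of the total number of data points.
function sketchRelativeFrequencies(array $data): array
{
    $total = count($data);
    // array_map keeps the keys when it is given a single array.
    return array_map(
        fn ($count) => $count * 100 / $total,
        sketchFrequencies($data)
    );
}

$data = [1, 1, 2, 3, 3, 3, 4, 4];
print_r(sketchFrequencies($data));         // 1 => 2, 2 => 1, 3 => 3, 4 => 2
print_r(sketchRelativeFrequencies($data)); // 1 => 25, 2 => 12.5, 3 => 37.5, 4 => 25
```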
Freq::frequencyTableBySize( array $data, $size )
If you want to create a frequency table based on classes (ranges of values), you can use frequencyTableBySize. The first parameter is the array, and the second one is the size of each class.
For example, with a size of 4, each class covers a range of 4 values.
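One way to picture grouping by class size is the plain-PHP sketch below. It assumes integer data, classes that start at the smallest value, and a table keyed by the lower bound of each class; these choices and the name `sketchFrequencyTableBySize` are assumptions for illustration and may not match the package's behaviour in detail.

```php
<?php
// Hypothetical helper for illustration only; not the package's implementation.
// Groups integer data into classes of width $size, starting at the minimum value.
// Each key is the lower bound of a class; each value is the number of data points in it.
function sketchFrequencyTableBySize(array $data, int $size): array
{
    $min = min($data);
    $table = [];

    foreach ($data as $value) {
        // Lower bound of the class that contains $value.
        $lower = $min + intdiv($value - $min, $size) * $size;
        $table[$lower] = ($table[$lower] ?? 0) + 1;
    }

    ksort($table);
    return $table;
}

// Classes: 1-4, 5-8, 9-12.
print_r(sketchFrequencyTableBySize([1, 2, 3, 5, 6, 9, 10, 12], 4)); // 1 => 3, 5 => 2, 9 => 3
```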
Freq::frequencyTable()
If you want to create a frequency table based on classes (ranges of values) and prefer to specify how many classes there are, you can use frequencyTable. The first parameter is the array, and the second one is the number of classes.
For example, passing 5 as the second parameter produces a frequency table with 5 classes.
Statistics class
The methods provided by the Freq and the Stat classes are mainly static methods.
If you prefer to use an object instance for calculating statistics, you can use an instance of the Statistics class and call the statistics methods on that instance.
For example, to calculate the mean, obtain a Statistics object via the make() static method, then call mean() on the resulting $stat object.
Calculate Frequency Table
The Statistics package has some methods for generating a frequency table:
- frequencies(): a frequency is the number of times a value of the data occurs;
- relativeFrequencies(): a relative frequency is the ratio (fraction or proportion) of the number of times a value of the data occurs to the total number of outcomes;
- cumulativeFrequencies(): the accumulation of the previous frequencies;
- cumulativeRelativeFrequencies(): the accumulation of the previous relative frequencies.
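The cumulative variants are running totals over a frequency table that is sorted by value. A minimal plain-PHP sketch, using a hypothetical helper `sketchCumulate` and a small made-up frequency table (not the package's code):

```php
<?php
// Hypothetical helper for illustration only; not the package's implementation.
// Running total over a table that is already sorted by key.
function sketchCumulate(array $table): array
{
    $running = 0;
    $cumulative = [];
    foreach ($table as $key => $value) {
        $running += $value;
        $cumulative[$key] = $running;
    }
    return $cumulative;
}

// Made-up frequency table: value => number of occurrences.
$frequencies = [1 => 2, 2 => 1, 3 => 3, 4 => 2];
$total = array_sum($frequencies);

// Cumulative frequencies: how many data points are <= each value.
$cumulative = sketchCumulate($frequencies);
print_r($cumulative); // 1 => 2, 2 => 3, 3 => 6, 4 => 8

// Cumulative relative frequencies, as percentages of the total.
$cumulativeRelative = array_map(fn ($count) => $count * 100 / $total, $cumulative);
print_r($cumulativeRelative); // 1 => 25, 2 => 37.5, 3 => 75, 4 => 100
```

The last cumulative frequency always equals the number of data points, and the last cumulative relative frequency is always 100 percent.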
Testing
Changelog
Please see CHANGELOG for more information on what has changed recently.
Contributing
Please see CONTRIBUTING for details.
Security Vulnerabilities
Please review our security policy on how to report security vulnerabilities.
Credits
- Roberto B.
- All Contributors
License
The MIT License (MIT). Please see License File for more information.