Sliding Window Counter
Short-lived cache-backed time series with anomaly detection
A lightweight, efficient PHP library for tracking time-based events and detecting anomalies without the overhead of databases or logs.
Table of Contents
- Overview
- Features
- How It Works
- Installation
- Quick Start
  - Setting up a counter
  - Tracking events
  - Detecting unusual activity
  - Getting more stats
- Advanced Usage
- Cache Adapters
- Technical Details
- Contributing
- License
Installation
Install the library with Composer by requiring the sanmai/sliding-window-counter package.
What's this all about?
Ever needed to track how many times something happens over time and spot when those numbers get weird? That's what this library does, and it does it efficiently.
Real-world example: Imagine you want to detect when suspicious messages from specific IP ranges suddenly spike. Instead of digging through logs or querying databases, this library uses in-memory caching to track events and spot unusual patterns before it is too late.
Features
- Lightweight - Uses your existing cache infrastructure
- Fast - No database queries or log parsing
- Robust anomaly detection - Based on standard deviations
- Flexible time windows - Configure to your needs
- Production-ready - Originally developed for Tumblr
How it works (the simple version)
- Divide time into buckets - We slice time into equal chunks (like 5-minute windows or hourly buckets)
- Count events in cache - Each event increments a counter in the appropriate time bucket
- Create time series on demand - When needed, we assemble these buckets into a continuous series
- Apply statistical analysis - We calculate mean, standard deviation, and detect outliers
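The first three steps above can be sketched in a few lines. This is a minimal illustration, not the library's code: a plain array stands in for the cache, the class and method names (`BucketCounterSketch`, `bucketStart`, `series`) are made up for this example, and it assumes PHP 8.0 or later and non-negative Unix timestamps.

```php
<?php
declare(strict_types=1);

/**
 * Illustrative only: an array stands in for the cache backend,
 * and none of these names come from the library.
 */
final class BucketCounterSketch
{
    /** @var array<int, int> bucket start timestamp => event count */
    private array $buckets = [];

    public function __construct(private int $windowSize)
    {
    }

    /** Step 1: map a timestamp to the start of the bucket that contains it. */
    public function bucketStart(int $timestamp): int
    {
        return $timestamp - ($timestamp % $this->windowSize);
    }

    /** Step 2: increment the counter of the bucket the event falls into. */
    public function increment(int $timestamp, int $step = 1): void
    {
        $key = $this->bucketStart($timestamp);
        $this->buckets[$key] = ($this->buckets[$key] ?? 0) + $step;
    }

    /**
     * Step 3: assemble a continuous series of $count buckets, oldest first,
     * ending with the bucket that contains $now. Empty buckets read as zero.
     *
     * @return array<int, int>
     */
    public function series(int $now, int $count): array
    {
        $series = [];
        $last = $this->bucketStart($now);
        for ($i = $count - 1; $i >= 0; $i--) {
            $start = $last - $i * $this->windowSize;
            $series[$start] = $this->buckets[$start] ?? 0;
        }
        return $series;
    }
}

$counter = new BucketCounterSketch(300); // 5-minute buckets
$counter->increment(1000); // lands in the bucket starting at 900
$counter->increment(1100); // same bucket
$counter->increment(1600); // lands in the bucket starting at 1500
$series = $counter->series(1700, 4);
// $series is [600 => 0, 900 => 2, 1200 => 0, 1500 => 1]
```

A real cache backend would also need expiring keys and atomic increments, which this sketch leaves out.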
The library handles all the tricky parts like:
- What happens when the current time doesn't perfectly align with your time buckets
- Calculating meaningful statistics on the fly
- Determining what counts as "unusual" activity (with adjustable sensitivity)
Quick Start
Setting up a counter
Tracking events
Detecting unusual activity
Getting more stats
Adjusting Sensitivity
You can control how sensitive the anomaly detection is by specifying the number of standard deviations that define "normal".
A quick stats refresher:
- 1 standard deviation: ~68% of normal values in this range (very sensitive)
- 2 standard deviations: ~95% of normal values in this range (still very sensitive)
- 3 standard deviations: ~99.7% of normal values in this range (fairly sensitive)
- 5 standard deviations: ~99.99994% of normal values in this range (1 in ~1.7 million chance)
A value five standard deviations from the mean is almost certainly an anomaly: if the counts are normally distributed, there is only a ~0.000057% chance that a data point this extreme occurs by random chance under the null hypothesis.
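To make the sensitivity setting concrete, here is a minimal sketch of a standard-deviation check. It is not the library's implementation: the function names are hypothetical, it uses the population standard deviation (an assumption; the library may compute it differently), and it expects a non-empty history.

```php
<?php
declare(strict_types=1);

/** Arithmetic mean of a non-empty list of numbers. */
function sketchMean(array $values): float
{
    return array_sum($values) / count($values);
}

/** Population standard deviation (divides by N, not N - 1). */
function sketchStdDev(array $values): float
{
    $mean = sketchMean($values);
    $sumOfSquares = 0.0;
    foreach ($values as $value) {
        $sumOfSquares += ($value - $mean) ** 2;
    }
    return sqrt($sumOfSquares / count($values));
}

/**
 * Flags $latest when it lies more than $sensitivity standard deviations
 * away from the mean of $history, in either direction.
 */
function sketchIsAnomaly(array $history, float $latest, float $sensitivity): bool
{
    $mean = sketchMean($history);
    $sigma = sketchStdDev($history);
    return abs($latest - $mean) > $sensitivity * $sigma;
}

$history = [10, 12, 11, 9, 13, 10, 12, 11]; // mean 11, sigma about 1.22

$atTwoSigma = sketchIsAnomaly($history, 14.0, 2.0);   // true: 3 > 2 * 1.22
$atThreeSigma = sketchIsAnomaly($history, 14.0, 3.0); // false: 3 < 3 * 1.22
```

The same observation is flagged at two standard deviations but not at three, which is the trade-off the list above describes: a lower threshold catches more, at the cost of more false alarms.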
Available Cache Adapters
The library supports multiple caching backends through a simple adapter interface, including regular Memcached.
Creating Your Own Adapter
Need to use a different cache system? Implementing a custom adapter is straightforward.
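As a rough idea of what an adapter involves, the sketch below defines a hypothetical two-method interface and an in-memory implementation of it. The interface name, method names, and signatures are assumptions made for this example; check the library's source for the actual interface before writing a real adapter. It assumes PHP 8.0 or later.

```php
<?php
declare(strict_types=1);

/** Hypothetical interface for illustration; the library's real one may differ. */
interface SketchCounterStore
{
    /** Adds $step to the counter and returns the new value. */
    public function increment(string $key, int $step, int $ttl): int;

    /** Returns the counter value, or null if it is missing or expired. */
    public function get(string $key): ?int;
}

/** In-memory store with expiry, driven by an injectable clock for testing. */
final class ArrayCounterStore implements SketchCounterStore
{
    /** @var array<string, array{value: int, expires: int}> */
    private array $items = [];

    public function __construct(private \Closure $clock)
    {
    }

    public function increment(string $key, int $step, int $ttl): int
    {
        $current = $this->get($key) ?? 0;
        // In this sketch every increment also refreshes the expiry time.
        $this->items[$key] = [
            'value' => $current + $step,
            'expires' => ($this->clock)() + $ttl,
        ];
        return $this->items[$key]['value'];
    }

    public function get(string $key): ?int
    {
        $item = $this->items[$key] ?? null;
        if ($item === null || $item['expires'] <= ($this->clock)()) {
            unset($this->items[$key]);
            return null;
        }
        return $item['value'];
    }
}

$now = 1000;
$store = new ArrayCounterStore(function () use (&$now): int {
    return $now;
});

$store->increment('login:900', 1, 60);
$store->increment('login:900', 2, 60);
$beforeExpiry = $store->get('login:900'); // 3

$now = 1100; // move the clock past the 60-second TTL
$afterExpiry = $store->get('login:900'); // null
```

An adapter for a shared cache would delegate to the backend's own atomic increment rather than the read-then-write shown here, which is only safe in a single process.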
Technical Details (for the curious)
The library uses an elegant sliding window approach to time series data. Under the hood, it distinguishes two kinds of frames:
- Material frames: The actual cached data buckets aligned to window boundaries
- Logical frames: Windows aligned to the current time (which may overlap multiple material frames)
When calculating values for logical frames that don't perfectly align with material frames, we perform weighted extrapolation to ensure smooth transitions in the time series.
Consider these two scenarios:
- Perfectly aligned frames: When the query time aligns with cache bucket boundaries, we can use the raw values directly.
- Misaligned frames: When the query time doesn't align with cache boundaries, we extrapolate values based on overlapping portions.
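The two scenarios can be illustrated with a simple overlap-proportional weighting. This is a sketch of the general idea, not the library's algorithm: the function name is hypothetical, and it assumes events are spread evenly within each material frame, so a frame contributes in proportion to how much of it the logical frame covers.

```php
<?php
declare(strict_types=1);

/**
 * Estimates the value of a logical frame [$logicalStart, $logicalStart + $windowSize)
 * from material frames of the same size.
 *
 * @param array<int, int> $materialFrames bucket start timestamp => event count
 */
function sketchLogicalFrameValue(array $materialFrames, int $windowSize, int $logicalStart): float
{
    $logicalEnd = $logicalStart + $windowSize;
    $total = 0.0;

    foreach ($materialFrames as $start => $count) {
        // Length of the intersection between the material and logical frames.
        $overlap = min($logicalEnd, $start + $windowSize) - max($logicalStart, $start);
        if ($overlap > 0) {
            $total += $count * $overlap / $windowSize;
        }
    }

    return $total;
}

$frames = [0 => 10, 60 => 20, 120 => 40]; // three 60-second material frames

// Perfectly aligned: the logical frame is exactly the frame starting at 60.
$aligned = sketchLogicalFrameValue($frames, 60, 60); // 20.0

// Misaligned: covers the last 15 s of frame 0 and the first 45 s of frame 60.
$misaligned = sketchLogicalFrameValue($frames, 60, 45); // 0.25 * 10 + 0.75 * 20 = 17.5
```

In the aligned case the raw count comes back unchanged; in the misaligned case the result blends the two neighbouring frames, which is what keeps the series smooth as the query time moves.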
For a more detailed explanation of the internal workings, check out this Cloudflare blog post which explains a similar approach.
License
This library is dual-licensed under the GNU General Public License v2.0 or later and the Apache License 2.0. You may choose either license to govern your use of this software.
- For GPL-2.0-or-later license terms, see the LICENSE-GPL file
- For Apache-2.0 license terms, see the LICENSE file
When using this library, you must comply with the terms of at least one of these licenses.
All contributions to this project have been reviewed and confirmed by the respective authors as dual-licensed. If you believe your code was included without proper attribution or license representation, please contact us and we'll address it immediately.