Download the PHP package niktomo/weighted-sample without Composer
On this page you can find all versions of the php package niktomo/weighted-sample. It is possible to download/install these versions without Composer. Possible dependencies are resolved automatically.
Download niktomo/weighted-sample
More information about niktomo/weighted-sample
Files in niktomo/weighted-sample
Package weighted-sample
Short Description Weighted random sampling for PHP 8.2+. Supports repeatable draws (WeightedPool) and box-gacha limited inventory (BoxPool) with alias-method O(1) picking and cryptographically secure randomness by default.
License MIT
Homepage https://github.com/niktomo/weighted-sample
Informations about the package weighted-sample
weighted-sample
日本語版 README はこちら
Weighted random sampling for PHP 8.2+. Two pool types cover the most common use cases: repeatable draws and box-gacha style limited inventory.
Installation
Pool Types
| Class | Draw behavior | Exhaustible |
|---|---|---|
WeightedPool |
Always draws from the full set | No |
BoxPool |
Each item has a finite stock count; stock is decremented on draw | Yes |
All pools default to SecureRandomizer (\Random\Engine\Secure) — cryptographically safe with no configuration required.
Inject SeededRandomizer(int $seed) only for tests or reproducible simulations.
WeightedPool
Immutable pool. draw() always selects from all items according to their weights.
Suitable for gacha, lotteries, and any scenario where items can be drawn repeatedly.
Items can be any shape — associative arrays, objects, or scalars. The second argument is a closure that extracts the integer weight from each item:
The closure can reference any property or method:
BoxPool
Each item has a finite stock count. Drawing decrements the stock; when it reaches zero
the item is removed from the pool. Exhaustion throws EmptyPoolException.
Use case: Box gacha — a closed set of prizes where each prize appears a fixed number of times. A single "item type" can be drawn multiple times until its stock runs out.
BoxPool requires two closures: one for the draw weight, one for the stock count. Items can be any shape — just point the closures at the right fields:
drawMany()
drawMany(int $count) draws up to $count items in one call.
If the pool empties before $count is reached, it returns only what was drawn — no exception.
Note:
BoxPoolis in-memory only. To resume across requests, persist the remaining stock counts and reconstruct the pool from those on the next request.
How BoxPool manages stock internally:
- On construction, BoxPool copies the items and records each stock count internally.
- Each
draw()decrements the internal stock counter for the selected item. - When a counter reaches zero, that item type is removed from the pool.
- The array you passed to
BoxPool::of()is never touched. BoxPoolis mutable by design: eachdraw()modifies internal state. It cannot be shared across concurrent contexts without external synchronization.
Item Filtering
Filtering is applied during of() construction — excluded items are never part of the pool
and cannot be drawn at runtime.
By default, items with weight ≤ 0 are silently excluded. BoxPool additionally excludes
items with stock ≤ 0. If all items are excluded, InvalidArgumentException is thrown.
Default: PositiveValueFilter (silent exclusion)
StrictValueFilter (throw on invalid)
CompositeFilter (combine multiple filters)
Custom filter for BoxPool
BoxPool requires CountedItemFilterInterface, which extends ItemFilterInterface with
an acceptsWithCount() method that receives the stock count as well as the weight.
Note: When CompositeFilter is used with BoxPool, inner filters that implement only
ItemFilterInterface (not CountedItemFilterInterface) will have acceptsWithCount() fall back
to accepts() — the stock count is ignored by those filters. If count-based exclusion is needed,
each inner filter must implement CountedItemFilterInterface.
Exception types
| Exception | When thrown |
|---|---|
InvalidArgumentException |
Construction: of() received no valid items (all filtered out) |
EmptyPoolException |
Runtime: draw() called on an exhausted pool |
Construction failure and runtime exhaustion are different failure modes — InvalidArgumentException means bad input was given to of(), while EmptyPoolException means the pool ran out during use.
Streaming large item sets (iterable support)
All of() methods accept any iterable — not just arrays.
Pass a generator to avoid holding the source collection and the pool in memory at the same time.
Note: the pool itself stores all accepted items internally, so peak memory after construction is proportional to the number of accepted items.
Seeded Randomizer
Inject a seeded randomizer for deterministic results — useful in tests and simulations.
The default (no seed) uses \Random\Engine\Secure for cryptographically safe randomness.
Warning:
SeededRandomizerusesMt19937, which is not cryptographically secure. Before using it, confirm all of the following:
- [ ] This code runs only in tests or offline simulations — never in a live request handler
- [ ] The seed is not derived from
time(),microtime(), or any timestamp (brute-forceable in < 1 s)- [ ] The seed is not shared across users or requests (same seed → identical sequence)
- [ ] Users cannot observe enough outputs to reconstruct the state (624 consecutive 32-bit outputs suffice)
Why 624 outputs matter: Mt19937 maintains a 624-element internal state. An attacker who collects 624 consecutive 32-bit outputs (e.g. from repeated API responses containing draw results) can fully reconstruct the internal state using the MT19937 state-recovery algorithm, and then predict every future output. Even partial observation (e.g. the item tier rather than the raw index) significantly narrows the search space.
If any box is unchecked, use
SecureRandomizer(the default) instead.
Safe seed derivation (offline simulations only): derive the seed from a high-entropy source rather than a sequential counter or timestamp.
Customization
Every key behaviour is driven by an interface — swap any component via named parameters. You can implement these interfaces yourself to plug in custom logic (feature-flag-aware filters, database-backed randomizers, ML-weighted selectors, etc.).
| Parameter | Interface | Default | Why you'd change it |
|---|---|---|---|
randomizer: |
RandomizerInterface |
SecureRandomizer (OS CSPRNG) |
Use SeededRandomizer for tests / reproducible simulations |
filter: |
ItemFilterInterface<T> |
PositiveValueFilter |
Add custom exclusion rules; combine with CompositeFilter |
selectorFactory: (WeightedPool) |
SelectorFactoryInterface |
PrefixSumSelectorFactory |
Switch to AliasTableSelectorFactory for O(1) picks on large pools with many draws |
bundleFactory: (BoxPool) |
SelectorBundleFactoryInterface |
FenwickSelectorBundleFactory |
Swap builder strategy; RebuildSelectorBundleFactory for tiny pools (N ≤ 50) |
Selector Algorithms
Three selection algorithms are available. Swap them via selectorFactory (WeightedPool) or
bundleFactory (BoxPool).
Quick decision guide
Start with the defaults. Change only if you have measured a bottleneck.
| Pool | Item count (N) | Draws per pool lifetime | Recommendation |
|---|---|---|---|
WeightedPool |
any | few (< ~60) | PrefixSumSelectorFactory — default, best all-rounder |
WeightedPool |
any | many (≥ ~60) | AliasTableSelectorFactory — O(1) pick, cost amortised over many draws |
BoxPool |
any | — | FenwickSelectorBundleFactory — default, O(log n) update scales to 10,000+ items |
BoxPool |
≤ 50 | — | RebuildSelectorBundleFactory(new AliasTableSelectorFactory()) — O(1) pick between rare exhaustions |
High-load / large N?
FenwickSelectorBundleFactoryis the right answer forBoxPool. Each item exhaustion costs O(log n) — not O(n) — so total draw cost stays O(n log n) even at N=10,000. See the FenwickBuilder speedup table below.
Complexity reference
| Selector / Factory | Build | Pick | Update (on exhaustion) |
|---|---|---|---|
PrefixSumSelectorFactory (default for WeightedPool) |
O(n) | O(log n) | O(n) full rebuild |
AliasTableSelectorFactory |
O(n) | O(1) | O(n) full rebuild |
FenwickSelectorBundleFactory (default for BoxPool) |
O(n) | O(log n) | O(log n) point update |
RebuildSelectorBundleFactory |
O(n) | O(log n)† | O(n) full rebuild |
† O(1) if paired with AliasTableSelectorFactory.
FenwickSelectorBundleFactorypairs aFenwickTreeSelectorwith aFenwickSelectorBuilderthat share the same instance. When an item is exhausted,subtract()callsupdate(index, 0)on the shared selector in O(log n) — no object creation, no array copy.
All algorithms use integer-only arithmetic — no float rounding errors.
When to use each selector
PrefixSumSelector — default for WeightedPool
Pros: Low build cost, O(log n) pick, works for any WeightedPool use case.
Cons: Pick grows slowly with N (still < 0.4 µs at N=1000).
Use this unless you have measured a need for faster picks. Rebuild on every WeightedPool::of() call is fine — build takes < 1 µs for small N.
AliasTableSelector — O(1) pick for WeightedPool with many draws
Pros: True O(1) pick regardless of N — throughput stays flat even at N=50,000.
Cons: Build cost is ~2.5× higher than PrefixSum. Only pays off after enough draws.
Build cost amortises once you exceed the break-even draw count:
| Items | Break-even draws |
|---|---|
| 50 | ~36 |
| 100 | ~63 |
| 1000 | ~434 |
If your pool is rebuilt per request (or per user session with only a handful of draws),
AliasTableSelectornever recoups its build cost — usePrefixSumSelectorinstead.Not suitable for BoxPool with large N. Each item exhaustion triggers a full O(n) rebuild (no incremental
update()support), giving O(n²) total cost. For N ≤ ~50 it is acceptable; for N ≥ 500 useFenwickSelectorBundleFactory(the default).
FenwickTreeSelector / FenwickSelectorBundleFactory — default for BoxPool
Pros: O(log n) point update on exhaustion — no full rebuild. Total draw-all cost is O(n log n), not O(n²). Scales to N=10,000+.
Cons: Pick is slightly slower than PrefixSum for immutable pools (Fenwick tree descent vs binary search).
FenwickTreeSelector is mutable — it updates its internal Fenwick tree in-place on each subtract(). This is intentional: in-place mutation enables O(log n) updates without object allocation, which is what makes BoxPool efficient at scale.
FenwickSelectorBundleFactory pairs a FenwickTreeSelector with a FenwickSelectorBuilder that share the same instance. When an item is exhausted, subtract() calls update(index, 0) on the shared tree in O(log n) — no object creation, no array copy.
Measured speedup vs RebuildSelectorBundleFactory (total draw-all time, µs per item):
| N | RebuildBuilder (µs/item) | FenwickBuilder (µs/item) | Speedup |
|---|---|---|---|
| 100 | ~3.0 | ~0.8 | ~3.7x |
| 500 | ~10.4 | ~0.9 | ~12.0x |
| 1000 | ~19.6 | ~0.9 | ~21.4x |
| 5000 | ~94.2 | ~1.0 | ~91.0x |
For immutable
WeightedPool,FenwickTreeSelectorpick is slightly slower thanPrefixSumSelector— usePrefixSumSelectororAliasTableSelectorforWeightedPool.
RebuildSelectorBundleFactory — BoxPool with tiny N
Pros: Works with any SelectorFactory including AliasTableSelectorFactory (O(1) picks between exhaustions).
Cons: Full O(n) selector rebuild on every item exhaustion. At N=500 that is ~12× slower than Fenwick.
Use only for very small pools (N ≤ ~50) where rebuild overhead is negligible,
and only if you specifically need O(1) picks from AliasTableSelectorFactory.
For everything else, FenwickSelectorBundleFactory (the default) is the right choice.
Overflow constraints:
PrefixSumSelector/FenwickTreeSelector: total weight W must not exceedPHP_INT_MAXAliasTableSelector: additionally requires(n+1) × W ≤ PHP_INT_MAX(n = item count; the +1 is headroom for Vose's algorithm intermediate values)
Speed comparison — 1,000,000 picks each (range(1, N) weights)
Measured on PHP 8.4 / Darwin (macOS). Results vary by hardware and PHP configuration. Run
php benchmark/bench.phpto generate results for your own environment.
Alias throughput stays nearly constant regardless of item count (true O(1)).
PrefixSum and Fenwick grow as O(log n). For immutable pools, PrefixSum is faster than Fenwick.
Run php benchmark/compare.php for full break-even analysis and BoxPool scaling charts.
Accuracy comparison — max deviation (N×500 draws, uniform weights, seed=99)
All three algorithms are statistically equivalent — deviations are pure sampling noise, not algorithmic error. Neither algorithm can produce a result with zero probability.
Benchmark results
Measured on PHP 8.4.19 / Darwin (macOS). Run
php benchmark/run.phpfor your own environment.
WeightedPool — 1,000,000 draws (SSR=1%, SR=9%, R=90%)
WeightedPool — 100 items (weight 1–100), 1,000,000 draws
Run the full benchmark:
Type Safety
Full PHPStan level 8 compliance. The @template T generic lets static analysis track
the exact return type of draw() through the pool.
Filter interfaces are also generic — ItemFilterInterface<T> and CountedItemFilterInterface<T>
carry the item type through static analysis.
Requirements
- PHP 8.2+
- No runtime dependencies
License
MIT