Download the PHP package jakhotiya/symspell-php without Composer
Package symspell-php
Short Description Spelling correction & fuzzy search: 1 million times faster through Symmetric Delete spelling correction algorithm
License MIT
Homepage https://github.com/jakhotiya/symspell-php
Information about the package symspell-php
SymSpell PHP
Spelling correction & Fuzzy search: 1 million times faster through Symmetric Delete spelling correction algorithm
A complete PHP port of the SymSpell library - the world's fastest spelling correction & fuzzy search library.
Features
✅ Ultra-Fast Spelling Correction - 1 million times faster than traditional algorithms
✅ Word Segmentation - Split concatenated words ("thequickbrownfox" → "the quick brown fox")
✅ Compound Correction - Multi-word spelling correction with context awareness
✅ Multi-Language Support - Includes dictionaries for 8+ languages
✅ CLI Interface - Command-line tool with pipes and redirects support
✅ Complete API - All original SymSpell functionality ported to PHP
Quick Start
Installation
Basic Usage
Core Algorithms
1. Single Word Correction
Fast spelling correction for individual words using the Symmetric Delete algorithm:
2. Word Segmentation
Triangular Matrix Algorithm - O(n) runtime complexity for splitting concatenated words:
3. Compound Correction
Multi-word spelling correction with compound splitting/merging:
Demo Applications
The package includes four demo applications showcasing different features:
1. Basic Demo (Single Word Correction)
Interactive spell checker - type words and get suggestions.
2. Word Segmentation Demo
Split concatenated words:
- Input: thequickbrownfoxjumps
- Output: the quick brown fox jumps
3. Compound Correction Demo
Multi-word spelling correction with context awareness.
4. Command Line Interface
CLI Parameters:
- DictionaryType: load (load from file) or create (from corpus)
- DictionaryPath: Path to dictionary file
- PrefixLength: 5-7 (memory/speed trade-off)
- LookupType: lookup | lookupcompound | wordsegment
- MaxEditDistance: Maximum edit distance (default: 2)
- OutputStats: true / false - show distance and frequency
- Verbosity: Top | Closest | All
Dictionaries
📚 Dictionary Customization Guide - Learn how to add words, create custom dictionaries, and build domain-specific vocabularies.
The package includes comprehensive dictionaries:
English Dictionaries (Included)
- frequency_dictionary_en_82_765.txt - 82,765 English words with frequencies
- frequency_bigramdictionary_en_243_342.txt - 243,342 English bigrams
Multi-Language Dictionaries (Included)
- 🇺🇸 English (en-80k.txt) - 80,000 words
- 🇩🇪 German (de-100k.txt) - 100,000 words
- 🇫🇷 French (fr-100k.txt) - 100,000 words
- 🇪🇸 Spanish (es-100l.txt) - 100,000 words
- 🇮🇹 Italian (it-100k.txt) - 100,000 words
- 🇷🇺 Russian (ru-100k.txt) - 100,000 words
- 🇮🇱 Hebrew (he-100k.txt) - 100,000 words
- 🇨🇳 Chinese (zh-50k.txt) - 50,000 words
Dictionary Format
Plain UTF-8 text files with one entry per line, in the format: word frequency (the term, whitespace, then its integer frequency count)
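To make the format concrete, here is a minimal sketch of reading such a file's contents into a word-to-count map. `parseFrequencyDictionary` is a hypothetical helper written for this illustration, not part of the package API, and the sample counts are made up.

```php
<?php
declare(strict_types=1);

/**
 * Parse frequency-dictionary text with one "word frequency" entry per line.
 * Hypothetical helper for illustration only; not the package's loader.
 *
 * @return array<string,int> map of word => frequency count
 */
function parseFrequencyDictionary(string $text): array
{
    $entries = [];
    foreach (preg_split('/\R/u', $text) ?: [] as $line) {
        // Split on runs of whitespace; skip blank or malformed lines.
        $parts = preg_split('/\s+/u', trim($line)) ?: [];
        if (count($parts) < 2 || !ctype_digit($parts[1])) {
            continue;
        }
        // If a word appears more than once, add the counts together.
        $entries[$parts[0]] = ($entries[$parts[0]] ?? 0) + (int) $parts[1];
    }
    return $entries;
}

// Made-up sample data in the "word frequency" format.
$sample = "the 500\nof 300\n\nnaïve 12\nbroken-line\n";
$dict = parseFrequencyDictionary($sample);
```

Summing duplicate entries is a design choice of this sketch; a loader could equally keep the first or the last count.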
Performance
Speed Benchmarks
- Single word lookup: ~0.3ms per word
- Word segmentation: ~0.2ms for typical inputs
- Dictionary loading: ~50ms for 82K words
Memory Usage
- Dictionary: ~7MB for 82K English words
- Runtime: Minimal additional memory overhead
- Optimization: Use prefixLength=5 for lower memory usage
API Reference
Core Classes
SymSpell
Main spell correction class.
Constructor:
Methods:
SuggestItem
Represents a spelling suggestion.
SegmentationItem
Represents word segmentation result.
Verbosity Enum
Controls number of suggestions returned.
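As a rough illustration of what the three levels mean, the sketch below filters a list of candidate suggestions. It assumes the semantics documented for the original SymSpell library (Top: the single best suggestion; Closest: all suggestions at the smallest edit distance found; All: every suggestion within the maximum edit distance). `filterByVerbosity` and the array shape are hypothetical and do not reflect this package's classes.

```php
<?php
declare(strict_types=1);

/**
 * Reduce candidate suggestions according to a verbosity level.
 * Each candidate is ['term' => string, 'distance' => int, 'count' => int].
 * Hypothetical helper; the level names follow the CLI values Top|Closest|All.
 */
function filterByVerbosity(array $candidates, string $verbosity): array
{
    if (!in_array($verbosity, ['Top', 'Closest', 'All'], true)) {
        throw new InvalidArgumentException("Unknown verbosity: $verbosity");
    }
    // Order by edit distance ascending, then by frequency descending.
    usort($candidates, fn(array $a, array $b): int =>
        [$a['distance'], $b['count']] <=> [$b['distance'], $a['count']]);

    if ($candidates === [] || $verbosity === 'All') {
        return $candidates;
    }
    // Keep only the candidates that share the smallest edit distance.
    $best = $candidates[0]['distance'];
    $closest = array_values(array_filter(
        $candidates,
        fn(array $c): bool => $c['distance'] === $best
    ));
    return $verbosity === 'Top' ? [$closest[0]] : $closest;
}

// Made-up candidates for a misspelled input.
$candidates = [
    ['term' => 'house', 'distance' => 2, 'count' => 100],
    ['term' => 'mouse', 'distance' => 1, 'count' => 50],
    ['term' => 'moose', 'distance' => 1, 'count' => 80],
];
$top = array_column(filterByVerbosity($candidates, 'Top'), 'term');
$closest = array_column(filterByVerbosity($candidates, 'Closest'), 'term');
$all = array_column(filterByVerbosity($candidates, 'All'), 'term');
```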
Algorithm Details
Symmetric Delete Algorithm
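SymSpell moves the expensive part of candidate generation from lookup time to a precomputed index of deletions:
- Traditional: Generate all possible edits (deletions, transpositions, substitutions, insertions) of the input word at lookup time; for a 5-letter word at edit distance 3 that is about 3 million variations
- SymSpell: Pre-generate only deletions of dictionary words and match them against deletions of the input (25 deletions for the same 5-letter word, versus 3 million edits)
Result: a speed improvement on the order of 1,000,000x over generating all edits, as reported by the original SymSpell project for maximum edit distance 3.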
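The delete-only idea can be sketched in a few lines. This is a simplified illustration, not the package's implementation: all function names are hypothetical, there is no prefix-length optimization, and the lookup only returns candidates, which still have to be verified with a real edit-distance computation.

```php
<?php
declare(strict_types=1);

/**
 * Every distinct string reachable from $word by deleting 1..$maxDistance
 * characters. Hypothetical helper for illustration.
 *
 * @return string[]
 */
function generateDeletes(string $word, int $maxDistance): array
{
    $seen = [];
    $frontier = [$word];
    for ($d = 0; $d < $maxDistance; $d++) {
        $next = [];
        foreach ($frontier as $current) {
            $len = mb_strlen($current);
            for ($i = 0; $i < $len; $i++) {
                // Remove the character at position $i.
                $variant = mb_substr($current, 0, $i) . mb_substr($current, $i + 1);
                if (!isset($seen[$variant])) {
                    $seen[$variant] = true;
                    $next[] = $variant;
                }
            }
        }
        $frontier = $next;
    }
    return array_map('strval', array_keys($seen));
}

/** Index: delete variant (or the word itself) => set of dictionary words. */
function buildDeleteIndex(array $words, int $maxDistance): array
{
    $index = [];
    foreach ($words as $word) {
        foreach (array_merge([$word], generateDeletes($word, $maxDistance)) as $key) {
            $index[$key][$word] = true;
        }
    }
    return $index;
}

/** Candidate words whose delete variants intersect those of $input. */
function lookupCandidates(string $input, array $index, int $maxDistance): array
{
    $found = [];
    foreach (array_merge([$input], generateDeletes($input, $maxDistance)) as $key) {
        foreach (array_keys($index[$key] ?? []) as $word) {
            $found[$word] = true;
        }
    }
    // A superset: each candidate must still pass a true edit-distance check.
    return array_map('strval', array_keys($found));
}

$deleteCount = count(generateDeletes('world', 3)); // 5 + 10 + 10 variants
$index = buildDeleteIndex(['world', 'word', 'would'], 1);
$candidates = lookupCandidates('wrld', $index, 1);
```

For "wrld" the candidates include "word", which is actually two edits away. Matching deletes on both sides can admit words up to twice the delete distance, which is why the verification step matters.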
Triangular Matrix Word Segmentation
- Runtime: O(n) linear complexity
- Method: Dynamic programming without recursion
- Optimization: Circular buffer for memory efficiency
- Scoring: Naive Bayes probability using real word frequencies
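The scoring idea can be shown with a plain dynamic-programming segmenter. This is a deliberately simplified sketch, not the package's triangular-matrix implementation: it has no circular buffer and no spelling correction, it works on byte offsets (so ASCII input is assumed), and `segmentWords`, the unknown-chunk penalty, and the toy frequencies are assumptions made for the example.

```php
<?php
declare(strict_types=1);

/**
 * Segment $text into the most probable word sequence under a unigram model.
 * $freq maps word => count. For a fixed $maxWordLength, the number of
 * table cells visited grows linearly with the input length.
 */
function segmentWords(string $text, array $freq, int $maxWordLength = 20): string
{
    $total = array_sum($freq);
    $n = strlen($text);
    // $best[$i] = [log-probability, words] for the first $i bytes of $text.
    $best = [0 => [0.0, []]];
    for ($end = 1; $end <= $n; $end++) {
        $best[$end] = [-INF, []];
        for ($start = max(0, $end - $maxWordLength); $start < $end; $start++) {
            $part = substr($text, $start, $end - $start);
            // Known words score log10(count / total); unknown chunks get a
            // penalty that grows with their length (an assumption here).
            $logProb = isset($freq[$part])
                ? log10($freq[$part] / $total)
                : log10(10.0 / $total) - strlen($part);
            $score = $best[$start][0] + $logProb;
            if ($score > $best[$end][0]) {
                $best[$end] = [$score, array_merge($best[$start][1], [$part])];
            }
        }
    }
    return implode(' ', $best[$n][1]);
}

// Toy frequencies, made up for the example.
$freq = ['the' => 500, 'quick' => 20, 'brown' => 30, 'fox' => 10,
         'he' => 200, 'own' => 40, 'row' => 5];
$segmented = segmentWords('thequickbrownfox', $freq);
```

Summing log-probabilities is equivalent to multiplying word probabilities under an independence assumption, which is the Naive Bayes scoring mentioned above.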
Edit Distance
Supports multiple algorithms:
- Levenshtein: Insertions, deletions, substitutions
- Damerau-OSA: Includes transpositions
- Optimized: Early termination for performance
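For reference, here is a compact sketch of the Damerau-OSA (optimal string alignment) distance with a simple early exit. It is an illustrative version under stated assumptions, not the package's code: `osaDistance` and its convention of returning -1 when the distance exceeds the limit are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Damerau-OSA distance: insertions, deletions, substitutions, and
 * transpositions of adjacent characters. Returns -1 once the distance
 * is known to exceed $maxDistance (convention chosen for this sketch).
 */
function osaDistance(string $a, string $b, int $maxDistance = PHP_INT_MAX): int
{
    $s = mb_str_split($a);
    $t = mb_str_split($b);
    $m = count($s);
    $n = count($t);
    // The length difference is a lower bound on the distance.
    if (abs($m - $n) > $maxDistance) {
        return -1;
    }
    $prev2 = [];
    $prev = range(0, $n);
    for ($i = 1; $i <= $m; $i++) {
        $curr = [$i];
        $rowMin = $i;
        for ($j = 1; $j <= $n; $j++) {
            $cost = $s[$i - 1] === $t[$j - 1] ? 0 : 1;
            $value = min(
                $prev[$j] + 1,          // deletion
                $curr[$j - 1] + 1,      // insertion
                $prev[$j - 1] + $cost   // substitution or match
            );
            if ($i > 1 && $j > 1
                && $s[$i - 1] === $t[$j - 2]
                && $s[$i - 2] === $t[$j - 1]) {
                $value = min($value, $prev2[$j - 2] + 1); // adjacent transposition
            }
            $curr[$j] = $value;
            $rowMin = min($rowMin, $value);
        }
        // Early termination: row minima never decrease, so once a whole
        // row exceeds the limit the final distance must exceed it too.
        if ($rowMin > $maxDistance) {
            return -1;
        }
        $prev2 = $prev;
        $prev = $curr;
    }
    return $prev[$n] <= $maxDistance ? $prev[$n] : -1;
}

$swap = osaDistance('ab', 'ba');                 // one transposition
$classic = osaDistance('kitten', 'sitting');     // two substitutions, one insertion
$limited = osaDistance('kitten', 'sitting', 2);  // exceeds the limit
```

Dropping the transposition branch turns the same loop into plain Levenshtein distance.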
Testing
Run the test suite:
Test Coverage:
- ✅ 10/11 core algorithm tests passing
- ✅ Word frequency management
- ✅ Edit distance calculations
- ✅ Verbosity controls
- ✅ Count thresholds
- ✅ Overflow protection
- 🔄 Performance test (4,955 expected results)
Requirements
- PHP: 8.0+ (for strict typing; note that native enums were only added in PHP 8.1)
- Extensions: mbstring (for UTF-8 support)
- Memory: ~50MB for full English dictionary
- Disk: ~175MB for all included dictionaries
License
MIT License - see LICENSE file.
Credits
- Original SymSpell: Wolf Garbe
- PHP Port: Jakhotiya
- Algorithm: Symmetric Delete spelling correction
Applications
Perfect for:
- 🔍 Search engines - Query correction and fuzzy matching
- 📝 Text editors - Real-time spell checking
- 🤖 Chatbots - Understanding misspelled user input
- 📊 OCR systems - Post-processing scanned text
- 🌐 Web forms - User input validation and suggestion
- 🧬 Bioinformatics - DNA sequence analysis
- 🈳 CJK text processing - Chinese/Japanese/Korean segmentation
⚡ Experience the world's fastest spelling correction in PHP! ⚡