Download the PHP package yetidevworks/yetisearch without Composer

This page lists all versions of the PHP package yetidevworks/yetisearch. Each version can be downloaded and installed without Composer; dependencies are resolved automatically.


Information about the package yetisearch

YetiSearch


A powerful, pure-PHP search engine library with advanced full-text search capabilities, designed for modern PHP applications.

Important: Requires SQLite FTS5 (full‑text search) support in your PHP’s SQLite library. See “Requirements” for a quick check.


Features

Requirements

Important: SQLite FTS5 required
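One way to perform such a check is to try creating a throwaway FTS5 table in an in-memory database. This is a minimal sketch, not the check shipped with the library: the function name sqliteHasFts5 is hypothetical, and it assumes the pdo_sqlite driver is the one your application uses.

```php
<?php
// Hypothetical helper (not part of YetiSearch): returns true when the SQLite
// library behind PDO is able to create an FTS5 virtual table.
function sqliteHasFts5(): bool
{
    if (!extension_loaded('pdo_sqlite')) {
        return false; // no SQLite driver for PDO at all
    }
    try {
        $pdo = new PDO('sqlite::memory:');
        $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
        // This statement fails with "no such module: fts5" when FTS5 is missing.
        $pdo->exec('CREATE VIRTUAL TABLE fts5_probe USING fts5(body)');
        return true;
    } catch (PDOException $e) {
        return false;
    }
}

echo sqliteHasFts5() ? "FTS5 is available\n" : "FTS5 is NOT available\n";
```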

Installation

Install YetiSearch via Composer by running composer require yetidevworks/yetisearch in your project directory.

Quick Start

Example Applications

The examples/ directory contains fully working demonstrations of YetiSearch features:

🏢 Apartment Search Tutorial

Complete real-world example of a property search application:

Run it:

🔍 Other Examples

Usage Examples

Basic Indexing

Advanced Indexing

Search Examples

Multi-Index Search

Search across multiple indexes simultaneously:

Document Management

Configuration

Full Configuration Example

Storage Schema: External-Content (Default)

YetiSearch defaults to an efficient external-content FTS5 schema:

Legacy indices (string id primary key with id_map) continue to work. You can migrate any legacy index to the new schema:

To explicitly create an index with/without external-content:

Advanced Features

Document Chunking

YetiSearch supports both automatic and manual (pre-chunked) document chunking:

Automatic Chunking

Large documents are automatically split into smaller chunks for better search performance:
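To illustrate the idea, the sketch below splits text into chunks of at most a given number of characters without cutting words in half. It is an assumption about how such splitting can work, not YetiSearch's own chunker; the function name chunkText is hypothetical, and the 1000-character default mirrors the chunk size mentioned under Performance Tips.

```php
<?php
// Hypothetical illustration of size-based chunking on word boundaries.
function chunkText(string $text, int $chunkSize = 1000): array
{
    $words = preg_split('/\s+/u', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    $chunks = [];
    $current = '';
    foreach ($words as $word) {
        $candidate = $current === '' ? $word : $current . ' ' . $word;
        if ($current !== '' && mb_strlen($candidate) > $chunkSize) {
            // Adding this word would exceed the limit: close the chunk.
            $chunks[] = $current;
            $current = $word;
        } else {
            $current = $candidate;
        }
    }
    if ($current !== '') {
        $chunks[] = $current; // flush the final partial chunk
    }
    return $chunks;
}
```

A single word longer than the limit becomes its own oversized chunk in this sketch; a production chunker would likely also respect sentence or paragraph boundaries.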

Pre-chunked Documents (Custom Chunking)

NEW: You can provide your own chunks for better semantic boundaries:

Benefits of pre-chunked documents:

See examples/pre-chunked-indexing.php for a complete example.

Field Boosting and Exact Match Scoring

YetiSearch provides intelligent field-weighted scoring with special handling for exact matches in high-priority fields:

How Field Boosting Works:

  1. Basic Boost Values: Each field's boost value multiplies its relevance score
  2. High-Priority Fields (boost ≥ 2.5): Get special exact match handling:

    • Exact field match: +50 point bonus (e.g., searching "Star Wars" finds a movie titled exactly "Star Wars")
    • Near-exact match: +30 point bonus (ignoring punctuation)
    • Length penalty: Shorter exact matches score higher than longer titles containing the phrase
  3. Phrase Matching: Exact phrases get 15x boost over individual word matches

Example:

This intelligent scoring ensures the most relevant results appear first, with exact matches in important fields (like titles or names) getting priority over partial matches in longer text.
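The boost multiplier and the exact-match bonuses described above can be sketched roughly as follows. This is an illustrative reading of the described behaviour, not YetiSearch's actual scoring code: the function names scoreField and normalizeForMatch are hypothetical, and the length penalty and the 15x phrase boost are left out because the text does not give enough detail to reproduce them.

```php
<?php
// Hypothetical: lower-case, strip punctuation, collapse whitespace.
function normalizeForMatch(string $s): string
{
    $s = mb_strtolower($s);
    $s = preg_replace('/[^\p{L}\p{N}\s]+/u', '', $s);
    return trim(preg_replace('/\s+/u', ' ', $s));
}

// Hypothetical: apply the boost, then the bonuses for high-priority fields.
function scoreField(string $query, string $fieldValue, float $baseScore, float $boost): float
{
    $score = $baseScore * $boost; // rule 1: boost multiplies the relevance score
    if ($boost >= 2.5) {          // rule 2: only high-priority fields get bonuses
        if (mb_strtolower(trim($fieldValue)) === mb_strtolower(trim($query))) {
            $score += 50.0;       // exact field match
        } elseif (normalizeForMatch($fieldValue) === normalizeForMatch($query)) {
            $score += 30.0;       // near-exact match, ignoring punctuation
        }
    }
    return $score;
}
```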

Enhanced Result Ranking (v1.0.3):

For more detailed information about scoring and configuration options, see the Field Boosting and Scoring Guide.

For comprehensive fuzzy search documentation, see the Fuzzy Search Guide.

Multi-language Support

Supported languages:

Custom Stop Words

You can add custom stop words to exclude specific terms from being indexed:

Custom stop words are applied in addition to the default language-specific stop words. They are case-insensitive and apply across all languages.
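The behaviour described here (custom words merged with the defaults, matched case-insensitively) can be pictured with a small token filter. This is a sketch only; removeStopWords and its parameters are hypothetical names and do not reflect the library's configuration keys.

```php
<?php
// Hypothetical token filter: custom stop words are added to the defaults,
// and matching ignores case.
function removeStopWords(array $tokens, array $defaultStopWords, array $customStopWords): array
{
    $stop = array_flip(array_map('mb_strtolower', array_merge($defaultStopWords, $customStopWords)));
    return array_values(array_filter(
        $tokens,
        fn(string $token): bool => !isset($stop[mb_strtolower($token)])
    ));
}
```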

Geo-Spatial Search

YetiSearch supports location-based searching using SQLite's R-tree spatial indexing:

Geo Utilities:

Indexing with Bounds:

Search Result Deduplication

By default, YetiSearch deduplicates results to show only the best matching chunk per document:
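Conceptually, this kind of deduplication keeps the highest-scoring chunk for each document. The sketch below assumes each hit is an array with 'route' and 'score' keys; those key names and the function bestChunkPerDocument are illustrative assumptions, not the library's result format.

```php
<?php
// Hypothetical: keep only the best-scoring chunk per document route.
function bestChunkPerDocument(array $hits): array
{
    $best = [];
    foreach ($hits as $hit) {
        $key = $hit['route'];
        if (!isset($best[$key]) || $hit['score'] > $best[$key]['score']) {
            $best[$key] = $hit;
        }
    }
    $results = array_values($best);
    // Highest score first.
    usort($results, fn(array $a, array $b): int => $b['score'] <=> $a['score']);
    return $results;
}
```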

Highlighting

Search results can include highlighted matches:
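As a rough picture of what highlighting does, the sketch below wraps whole-word, case-insensitive matches in a tag. The function name highlightTerms and the default mark tag are assumptions made for illustration; the library's own highlighter and its options may differ.

```php
<?php
// Hypothetical highlighter: wraps whole-word matches of any term in a tag.
function highlightTerms(string $text, array $terms, string $open = '<mark>', string $close = '</mark>'): string
{
    $terms = array_filter(array_map('trim', $terms), 'strlen');
    if (!$terms) {
        return $text; // nothing to highlight
    }
    $alternatives = implode('|', array_map(fn(string $t): string => preg_quote($t, '/'), $terms));
    return preg_replace_callback(
        '/\b(' . $alternatives . ')\b/iu',
        fn(array $m): string => $open . $m[0] . $close, // keep original casing
        $text
    );
}
```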

Fuzzy Search

Enable fuzzy matching for typo tolerance:

Advanced Fuzzy Search Algorithms

YetiSearch supports multiple fuzzy matching algorithms for different use cases:

Available Fuzzy Algorithms:

  1. Trigram (Default) - Best overall accuracy and performance

    • Breaks words into 3-character sequences for matching
    • Excellent for most use cases
    • Good balance of speed and accuracy
  2. Jaro-Winkler - Optimized for short strings

    • Great for names, titles, and short text
    • Favors matches with common prefixes
    • Very fast performance
  3. Levenshtein - Edit distance algorithm
    • Counts insertions, deletions, and substitutions
    • Most flexible but requires term indexing
    • Best for handling complex typos
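To make the difference between the trigram and Levenshtein approaches concrete, here is a small sketch. The trigram functions are hypothetical illustrations using Jaccard overlap, which is one common way to compare trigram sets and not necessarily the formula YetiSearch uses; levenshtein() is PHP's built-in function, which works on bytes rather than multibyte characters.

```php
<?php
// Hypothetical: break a padded, lower-cased word into unique 3-character sequences.
function trigrams(string $word): array
{
    $padded = '  ' . mb_strtolower($word) . ' ';
    $grams = [];
    for ($i = 0, $n = mb_strlen($padded) - 2; $i < $n; $i++) {
        $grams[mb_substr($padded, $i, 3)] = true;
    }
    return array_map('strval', array_keys($grams));
}

// Hypothetical: Jaccard similarity of the two trigram sets (0.0 to 1.0).
function trigramSimilarity(string $a, string $b): float
{
    $ta = trigrams($a);
    $tb = trigrams($b);
    $shared = count(array_intersect($ta, $tb));
    $total = count($ta) + count($tb) - $shared;
    return $shared / $total;
}

$typo = 'serach';
$word = 'search';
printf("trigram similarity: %.2f\n", trigramSimilarity($typo, $word));
printf("edit distance: %d\n", levenshtein($typo, $word));
```

Note how a swapped pair of letters costs two edits under Levenshtein, while the trigram score depends on how many 3-character sequences survive the typo.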

Configuration Options:

Performance Considerations:

Different algorithms have different performance characteristics:

Term indexing is only performed when fuzzy_algorithm is set to 'levenshtein'. For most use cases, 'trigram' provides the best balance of accuracy and performance.

Query Result Caching

YetiSearch includes built-in query result caching to dramatically improve performance for repeated searches:
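The general technique behind query result caching is to derive a stable key from the index name, the query, and its options, and to reuse the stored result until it expires. The class below is a self-contained, in-memory sketch of that idea; the name QueryCache, its methods, and the 300-second default are hypothetical and say nothing about how or where YetiSearch stores its cache.

```php
<?php
// Hypothetical in-memory cache with a time-to-live per entry.
final class QueryCache
{
    private array $entries = [];
    private int $ttl;

    public function __construct(int $ttlSeconds = 300)
    {
        $this->ttl = $ttlSeconds;
    }

    // Build a key that does not depend on the order of the options.
    public static function key(string $index, string $query, array $options): string
    {
        ksort($options);
        return hash('sha256', $index . "\0" . $query . "\0" . json_encode($options));
    }

    // Return the cached value if still fresh, otherwise compute and store it.
    public function remember(string $key, callable $compute)
    {
        $now = time();
        if (isset($this->entries[$key]) && $this->entries[$key]['expires'] > $now) {
            return $this->entries[$key]['value'];
        }
        $value = $compute();
        $this->entries[$key] = ['value' => $value, 'expires' => $now + $this->ttl];
        return $value;
    }
}
```

A cache like this must be invalidated whenever the index changes, otherwise repeated searches return stale results.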

Cache Features:

Cache Management:

Performance Impact:

Best Practices:

Multi-Column FTS and Field Weighting

YetiSearch now supports multi-column FTS indexing for superior field weighting and performance:

Benefits of Multi-Column FTS:

Two-Pass Search Strategy

For maximum precision, enable the optional two-pass search:

When to Use Two-Pass Search:

Migration Guide for v2.1.0

Upgrading Existing Indexes

Existing indexes continue to work but won't benefit from multi-column FTS. To upgrade:

Configuration Changes

Update your configuration to use new defaults:

Performance Comparison

Based on A/B testing with real-world data:

| Configuration | Avg Query Time | Relevance | Notes |
| --- | --- | --- | --- |
| Single-column (legacy) | 7.09ms | Good | Original implementation |
| Multi-column FTS | 6.76ms | Excellent | Default in v2.1.0 |
| Two-pass search | 16.36ms | Best | Optional for precision |
| Combined | 16.86ms | Best | Maximum precision |

Recommendation: Use the default multi-column FTS for best balance of performance and relevance. Enable two-pass search only when title/heading matches are critical.

Real-World Example: Documentation Search

Here's how the v2.1.0 improvements solve the "scheduler" ranking problem:

Key Improvements Demonstrated:

Performance Optimization Tips:

Algorithm Benchmarking:

YetiSearch includes built-in benchmarking tools to help you choose the best fuzzy algorithm for your use case:

Faceted Search

Get aggregated counts for categories, tags, and similar fields:
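What a facet computes can be shown with a plain counting function over already-matched documents. This is a conceptual sketch; countFacet and the document shape are hypothetical, and the library's facet options and result format are not shown here.

```php
<?php
// Hypothetical: count how often each value of a field occurs in the matches.
// The field may hold a single value or a list of values.
function countFacet(array $documents, string $field): array
{
    $counts = [];
    foreach ($documents as $doc) {
        foreach ((array) ($doc[$field] ?? []) as $value) {
            $counts[$value] = ($counts[$value] ?? 0) + 1;
        }
    }
    arsort($counts); // most frequent value first
    return $counts;
}
```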

Architecture

See the architecture overview diagram and component notes in docs/architecture-overview.md.

Geo Search

YetiSearch supports location filtering and sorting with SQLite R-tree and accurate distances:

Units

Example (PHP):

Note

Global Units & Composite Scoring

Guidance

Distance Facets

Bucket results by distance from a point to power UI filters.
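A minimal sketch of the bucketing idea, assuming the distance of every result is already known: each bucket counts the results within a given radius, which suits "within 5 km (12)" style filters. The function name distanceFacets, the kilometre unit, and the label format are assumptions for illustration only.

```php
<?php
// Hypothetical: cumulative distance buckets ("within N km") over known distances.
function distanceFacets(array $distancesKm, array $upperBoundsKm): array
{
    sort($upperBoundsKm);
    $facets = [];
    foreach ($upperBoundsKm as $bound) {
        $within = array_filter($distancesKm, fn($d): bool => $d <= $bound);
        $facets["<= {$bound} km"] = count($within);
    }
    return $facets;
}
```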

k‑Nearest Neighbors (k‑NN)

Return the k nearest documents by distance, optionally clamped by max distance:
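The underlying computation can be sketched with the haversine formula and a sort. This is a brute-force illustration, not the library's R-tree-backed implementation: the function names haversineKm and nearest, the 'lat' and 'lng' keys, and the 'distance_km' field are all hypothetical.

```php
<?php
// Great-circle distance in kilometres (haversine formula, mean Earth radius).
function haversineKm(float $lat1, float $lng1, float $lat2, float $lng2): float
{
    $earthRadiusKm = 6371.0;
    $dLat = deg2rad($lat2 - $lat1);
    $dLng = deg2rad($lng2 - $lng1);
    $a = sin($dLat / 2) ** 2
        + cos(deg2rad($lat1)) * cos(deg2rad($lat2)) * sin($dLng / 2) ** 2;
    return 2 * $earthRadiusKm * asin(min(1.0, sqrt($a)));
}

// Hypothetical k-NN: the k closest documents, optionally within $maxKm.
function nearest(array $docs, float $lat, float $lng, int $k, ?float $maxKm = null): array
{
    $withDistance = [];
    foreach ($docs as $doc) {
        $d = haversineKm($lat, $lng, $doc['lat'], $doc['lng']);
        if ($maxKm === null || $d <= $maxKm) {
            $withDistance[] = $doc + ['distance_km' => $d];
        }
    }
    usort($withDistance, fn(array $a, array $b): int => $a['distance_km'] <=> $b['distance_km']);
    return array_slice($withDistance, 0, $k);
}
```

A spatial index avoids computing the distance to every document; the brute-force loop here is only meant to show what "nearest" and "clamped by max distance" mean.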

YetiSearch follows a modular architecture with clear separation of concerns:

Key Components

Testing

YetiSearch includes comprehensive test coverage. Run tests using various commands:

Basic Testing

Coverage Reports

Filtered Testing

Advanced Testing

Static Analysis

API Reference

YetiSearch Class

Document Structure

Documents are represented as associative arrays with the following structure:

Content vs Metadata

Understanding the distinction between content and metadata fields:

Content Fields:

Metadata Fields:

When to use metadata:

This separation improves performance (less data to index), prevents false matches (searching "42" won't find products with 42 in stock), and keeps your search index focused on actual searchable content.

SearchQuery Model

Result Structure

Search results are returned as an associative array:

Performance Tips

  1. Index Configuration

    • Use appropriate field boosts - don't over-boost
    • Only index fields you need to search
    • Use metadata for non-searchable data
    • Configure reasonable chunk sizes (default 1000 chars works well)
  2. Search Optimization

    • Use field-specific searches when possible: inFields(['title'])
    • Enable unique_by_route (default) to avoid duplicate documents
    • Use filters instead of text queries for exact matches
    • Limit results with reasonable page sizes
  3. Storage Optimization
    • Run optimize() periodically on large indexes
    • Use WAL mode for better concurrency (default)
    • Consider separate indexes for different content types

Error Handling

Performance

YetiSearch is designed for high performance with minimal resource usage. Here are real-world benchmarks and performance characteristics.

Benchmark Results

Tested on M4 MacBook Pro with PHP 8.3, using a dataset of 32,000 movies:

Indexing Performance

| Operation | Performance | Details |
| --- | --- | --- |
| Document Indexing | ~4,360 docs/sec | Without fuzzy term indexing |
| With Levenshtein | ~1,770 docs/sec | With term indexing for fuzzy search |
| Batch Processing | 250 docs/batch | Optimal batch size |
| Memory Usage | ~60MB | For 32k documents |

Search Performance

| Query Type | Response Time | Details |
| --- | --- | --- |
| Simple Search | 2-5ms | Single term, no fuzzy |
| Phrase Search | 3-8ms | Multi-word queries |
| Fuzzy Search (Trigram) | 5-15ms | Default algorithm |
| Fuzzy Search (Levenshtein) | 10-30ms | Most accurate |
| Complex Queries | 15-50ms | With filters, facets, geo |

Real-World Example

From the movie database benchmark:

Performance Characteristics

1. Linear Scalability

2. Memory Efficiency

3. Disk I/O Optimization

Performance Tuning

For Maximum Indexing Speed

For Fastest Searches

For Best Accuracy

Bottlenecks and Solutions

| Bottleneck | Impact | Solution |
| --- | --- | --- |
| Large documents | Slow indexing | Increase chunk_size |
| Many small documents | I/O overhead | Increase batch_size |
| Complex queries | Slow searches | Add specific indexes |
| Fuzzy search | CPU intensive | Use trigram or basic algorithm |
| High concurrency | Lock contention | Enable WAL mode |

Comparison with Other Solutions

| Feature | YetiSearch | Elasticsearch | MeiliSearch | TNTSearch |
| --- | --- | --- | --- | --- |
| Setup Time | < 1 min | 10-30 min | 5-10 min | < 1 min |
| Memory Usage | 50-200MB | 1-4GB | 200MB-1GB | 100-500MB |
| Dependencies | PHP only | Java + Service | Binary/Docker | PHP only |
| Index Speed | 4,500/sec | 10,000/sec | 5,000/sec | 2,000/sec |
| Search Speed | 1-30ms | 5-50ms | 10-100ms | 5-40ms |

Best Practices for Performance

  1. Index Design

    • Create separate indexes for different content types
    • Use appropriate field boosts
    • Only index searchable content
  2. Query Optimization

    • Use field-specific searches when possible
    • Limit results appropriately
    • Enable result caching for repeated queries
  3. Maintenance

    • Run optimize() during low-traffic periods
    • Monitor index size and split if needed
    • Clear old cache entries periodically
  4. Hardware Considerations
    • SSD storage recommended for large indexes
    • More RAM allows larger caches
    • Multi-core CPUs benefit batch operations

Type-Ahead Setup

For as-you-type search, enable fuzzy matching and (optionally) last-token prefixing. Debounce input by 200–300ms on the client.
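Last-token prefixing means that only the word the user is still typing is treated as a prefix. The sketch below builds an SQLite FTS5 query string that way, using FTS5's quoted-string and trailing-asterisk syntax. Whether YetiSearch constructs its queries like this internally is an assumption; the function name toPrefixQuery is hypothetical.

```php
<?php
// Hypothetical: quote every token for FTS5 and mark only the last one as a prefix.
function toPrefixQuery(string $input): string
{
    $tokens = preg_split('/\s+/u', trim($input), -1, PREG_SPLIT_NO_EMPTY);
    if (!$tokens) {
        return '';
    }
    // Double any embedded quotes, as FTS5 string syntax requires.
    $quoted = array_map(
        fn(string $t): string => '"' . str_replace('"', '""', $t) . '"',
        $tokens
    );
    $quoted[count($quoted) - 1] .= '*'; // last token matches as a prefix
    return implode(' ', $quoted);
}
```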

CLI Demo

Try an interactive demonstration that seeds a small dataset, prints suggestions, and shows as‑you‑type results:

Or run a single query:

Notes

Weighted FTS and Prefix (Optional)

You can enable multi-column FTS5 and weighted BM25 to boost important fields (e.g., title, tags). Prefix indexing improves strict prefix matches for type-ahead.
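At the SQLite level, weighted BM25 over several FTS5 columns looks roughly like the sketch below, which talks to SQLite directly through PDO rather than through YetiSearch. The table name docs, the weights 10.0 and 1.0, the prefix sizes, and both function names are assumptions chosen for illustration; the schema the library actually creates may differ.

```php
<?php
// Hypothetical schema: two FTS5 columns plus prefix indexes for 2- and 3-char prefixes.
function createWeightedIndex(PDO $pdo): void
{
    $pdo->exec("CREATE VIRTUAL TABLE docs USING fts5(title, content, prefix='2 3')");
    $insert = $pdo->prepare('INSERT INTO docs (title, content) VALUES (?, ?)');
    foreach ([
        ['scheduler guide', 'how to configure jobs'],
        ['cron jobs', 'using the scheduler daily'],
        ['queue workers', 'process background tasks quickly'],
        ['cache layer', 'store results in memory'],
        ['mail setup', 'send messages from apps'],
    ] as $row) {
        $insert->execute($row);
    }
}

// bm25() returns lower (more negative) values for better matches, so an
// ascending sort puts the best hit first. The title column is weighted 10x.
function searchWeighted(PDO $pdo, string $term): array
{
    $stmt = $pdo->prepare(
        'SELECT title FROM docs WHERE docs MATCH :q ORDER BY bm25(docs, 10.0, 1.0)'
    );
    $stmt->execute([':q' => $term]);
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}

try {
    $pdo = new PDO('sqlite::memory:');
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    createWeightedIndex($pdo);
    echo implode("\n", searchWeighted($pdo, 'scheduler')), "\n";
} catch (PDOException $e) {
    echo 'SQLite or FTS5 is unavailable: ', $e->getMessage(), "\n";
}
```

With these weights, a document that has the term in its title outranks one that only mentions it in the body.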

Migration helper:

Suggestions

Use suggest(index, term, options) to power a dropdown for type‑ahead. Suggestions are ranked by frequency across fuzzy variants and boosted when the title contains or starts with the variant.

Tips

Synonyms

Enable query-time synonym expansion to improve recall for known aliases and abbreviations.
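Query-time expansion means the index is left untouched and each query token is widened into a set of alternatives before searching. The sketch below shows that idea with a hypothetical synonym map keyed by lower-cased term; the function name expandSynonyms and the map format are assumptions and not the library's configuration format.

```php
<?php
// Hypothetical: for each query token, return the token plus its known synonyms.
function expandSynonyms(string $query, array $synonyms): array
{
    $tokens = preg_split('/\s+/u', mb_strtolower(trim($query)), -1, PREG_SPLIT_NO_EMPTY);
    $expanded = [];
    foreach ($tokens as $token) {
        $alternatives = array_merge([$token], $synonyms[$token] ?? []);
        $expanded[] = array_values(array_unique($alternatives));
    }
    return $expanded;
}
```

Each inner list would then be combined with OR, and the lists themselves with AND, when the final search expression is built.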

Config (array or JSON file):

Behavior

DSL (Domain Specific Language)

YetiSearch now supports a powerful DSL for building complex queries with multiple syntaxes. For comprehensive documentation with migration guide and advanced examples, see docs/DSL.md.

Natural Language Query Syntax

Write queries using SQL-like syntax:

JSON API-Compliant URL Parameters

Support for standard REST API query patterns:

Example URL: ?filter[category][eq]=tech&filter[tags][in]=go,php&sort=-date&page[limit]=10
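To show how a URL of this shape breaks down, the sketch below parses it with PHP's built-in parse_str() into filters, sort order, and a limit. The function name parseJsonApiQuery and the shape of the returned array are hypothetical; the library's own parser and its output are documented in docs/DSL.md.

```php
<?php
// Hypothetical parser for filter[field][op]=value, sort=-field, page[limit]=n.
function parseJsonApiQuery(string $queryString): array
{
    parse_str(ltrim($queryString, '?'), $params);

    $filters = [];
    foreach ($params['filter'] ?? [] as $field => $conditions) {
        foreach ((array) $conditions as $operator => $value) {
            if ($operator === 'in') {
                $value = explode(',', $value); // comma-separated list of values
            }
            $filters[] = ['field' => $field, 'operator' => $operator, 'value' => $value];
        }
    }

    $sort = [];
    foreach (array_filter(explode(',', $params['sort'] ?? '')) as $entry) {
        // A leading minus sign means descending order.
        $sort[ltrim($entry, '-')] = $entry[0] === '-' ? 'desc' : 'asc';
    }

    $limit = isset($params['page']['limit']) ? (int) $params['page']['limit'] : null;

    return ['filters' => $filters, 'sort' => $sort, 'limit' => $limit];
}
```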

Fluent PHP Interface

Build queries programmatically:

DSL Features

Metadata Fields

YetiSearch distinguishes between content (searchable text) and metadata (filterable attributes):

Common metadata fields like author, status, price, views, etc. are automatically recognized. See docs/DSL.md for complete documentation.

CLI

A simple CLI is included for quick testing of search, suggestions, geo nearest, and distance facets.

Install deps if you haven't:

Examples

Synonyms example

Common flags

Future Feature Ideas

The following features are ideas for future releases:

Index Management Enhancements

Language and Analysis

Search Enhancements

Performance and Scalability

Integration Features

Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Run tests (composer test:verbose)
  4. Commit your changes (git commit -m 'Add amazing feature')
  5. Push to the branch (git push origin feature/amazing-feature)
  6. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Credits

YetiSearch is maintained by the YetiSearch Team and contributors.

Special thanks to:


All versions of yetisearch with dependencies

Requires:

  • php: ^7.4|^8.0|^8.1|^8.2|^8.3|^8.4
  • ext-json: *
  • ext-mbstring: *
  • ext-pdo: *
  • ext-sqlite3: *
  • psr/log: ^1.1
