Libraries tagged with "text extraction"
rubix/ml
1490067 Downloads
A high-level machine learning and deep learning library for the PHP language.
nojimage/twitter-text-php
1949416 Downloads
A library of PHP classes that provide auto-linking and extraction of usernames, lists, hashtags and URLs from tweets.
kreuzberg/kreuzberg
163 Downloads
High-performance document intelligence library
apache-solr-for-typo3/tika
636625 Downloads
Apache Tika for TYPO3
silverstripe/textextraction
186531 Downloads
Text Extraction API for SilverStripe CMS (mostly used with the 'fulltextsearch' module)
iamgerwin/php-pdf-to-markdown-parser
4343 Downloads
A lightweight PHP library to convert PDF documents into clean, structured Markdown. Supports text extraction, headings, lists, tables, diagrams and code blocks for easier content reuse and publishing.
nlpcloud/nlpcloud-client
25661 Downloads
NLP Cloud serves high-performance pre-trained or custom models for NER, sentiment analysis, classification, summarization, paraphrasing, grammar and spelling correction, keyword and keyphrase extraction, chatbot, product description and ad generation, intent classification, text generation, image generation, code generation, question answering, automatic speech recognition, machine translation, language detection, semantic search, semantic similarity, tokenization, POS tagging, speech synthesis, embeddings, and dependency parsing. It is ready for production and served through a REST API. This is the PHP client for the API. More details: https://nlpcloud.com. Documentation: https://docs.nlpcloud.com. GitHub: https://github.com/nlpcloud/nlpcloud-php
oxide/pdf-oxide
2 Downloads
PDF processing toolkit (Rust-backed, FFI-bound) for PHP
rosette/api
22967 Downloads
PHP Interface for Babel Street Text Analytics
robholmes/term-extractor
5834 Downloads
Term Extractor - a PHP port of Topia's Term Extractor
jcfrane/pdf-text-extractor
185 Downloads
A Laravel PDF text extraction package with multiple strategies (PdfParser, XObject, AWS Textract, Tesseract OCR). Handles Canva-generated PDFs, scanned documents, and other edge cases with automatic fallback.
keyvan/german-ocr
0 Downloads
High-performance German document OCR - Local & Cloud API
kalimeromk/rssfeed
919 Downloads
Full-Text RSS extraction package for Laravel - converts partial RSS feeds to full content
subhashladumor1/laravel-ai-docs
356 Downloads
Laravel AI Document Intelligence & OCR package for Laravel 12 AI SDK. Convert PDF to JSON, extract tables, image to text, Ask PDF with AI, audio transcription and multi-language support using GPT-5.2, Claude and Gemini.
onstage2426/fuzor
33 Downloads
Dependency-free full-text search for PHP. BM25 ranking, fuzzy and boolean modes, search-as-you-type prefix matching, stopword filtering and Snowball stemming for 62 languages, snippet extraction and result highlighting — one SQLite file, zero infrastructure.