Libraries tagged by extraction
kreuzberg/kreuzberg
161 Downloads
High-performance document intelligence for PHP. Extract text, metadata, and structured information from PDFs, Office documents, images, and 75 formats. Powered by a Rust core for 10-50x speed improvements.
farzai/color-palette
112397 Downloads
A robust PHP library for extracting, analyzing, and managing color palettes from images
timber/wp-i18n-twig
60781 Downloads
WordPress translations extraction for Twig files with WP-CLI
chamilo/pclzip
156060 Downloads
A PHP library that offers compression and extraction functions for Zip formatted archives
causal/extractor
256750 Downloads
This extension detects and extracts metadata (EXIF / IPTC / XMP / ...) from potentially thousands of different file types (such as MS Word/PowerPoint/Excel documents, PDFs and images) and brings it automatically and natively to TYPO3 when uploading assets. It works with built-in PHP functions but takes advantage of Apache Tika and other external tools for enhanced metadata extraction.
atrox/matcher
99698 Downloads
Powerful XML and HTML matching and data extraction library
ibexa/jms-translation-bundle
43416 Downloads
Puts the Symfony Translation Component on steroids
articus/data-transfer
31869 Downloads
Library for merging source data into destination data only if the destination data remains valid afterwards
slub/php-mods-reader
20334 Downloads
Read MODS metadata into PHP objects that offer some convenient data extraction methods
silverstripe/textextraction
186339 Downloads
Text Extraction API for SilverStripe CMS (mostly used with 'fulltextsearch' module)
nlpcloud/nlpcloud-client
25601 Downloads
NLP Cloud serves high-performance pre-trained or custom models for NER, sentiment analysis, classification, summarization, paraphrasing, grammar and spelling correction, keyword and keyphrase extraction, chatbot, product description and ad generation, intent classification, text generation, image generation, code generation, question answering, automatic speech recognition, machine translation, language detection, semantic search, semantic similarity, tokenization, POS tagging, speech synthesis, embeddings, and dependency parsing. It is ready for production and served through a REST API. This is the PHP client for the API. More details here: https://nlpcloud.com. Documentation: https://docs.nlpcloud.com. GitHub: https://github.com/nlpcloud/nlpcloud-php
marijnvdwerf/material-palette
53712 Downloads
Prominent image colour extraction for PHP
iamgerwin/php-pdf-to-markdown-parser
4191 Downloads
A lightweight PHP library to convert PDF documents into clean, structured Markdown. Supports text extraction, headings, lists, tables, diagrams and code blocks for easier content reuse and publishing.
robholmes/term-extractor
5823 Downloads
Term Extractor - a PHP port of Topia's Term Extractor
subhashladumor1/laravel-ai-docs
355 Downloads
Laravel AI Document Intelligence & OCR package for Laravel 12 AI SDK. Convert PDF to JSON, extract tables, image to text, Ask PDF with AI, audio transcription and multi-language support using GPT-5.2, Claude and Gemini.