Libraries tagged with "Text extractor"
smalot/pdfparser
22467283 Downloads
PDF parser library. Can read and extract information from PDF files.
spatie/pdf-to-text
4090901 Downloads
Extract text from a PDF
rubix/ml
859454 Downloads
A high-level machine learning and deep learning library for the PHP language.
nojimage/twitter-text-php
1744069 Downloads
A library of PHP classes that provide auto-linking and extraction of usernames, lists, hashtags and URLs from tweets.
vaites/php-apache-tika
1205399 Downloads
Apache Tika bindings for PHP: extracts text from documents and images (with OCR), metadata and more...
crodas/text-rank
53138 Downloads
Extract relevant keywords from a given text
apache-solr-for-typo3/tika
498849 Downloads
Apache Tika for TYPO3
helgesverre/receipt-scanner
1561 Downloads
Use OpenAI to extract structured receipt and invoice data from text, HTML, images and PDFs.
aymanrb/php-unstructured-text-parser
16737 Downloads
A PHP library to help extract text out of text documents
flow-php/etl-adapter-text
2550 Downloads
PHP ETL - Adapter - Text
aspose/pdf-sdk-php
22452 Downloads
Aspose.PDF Cloud is a REST API for creating and editing PDF files. It can also be used to convert PDF files to different formats like DOC, HTML, XPS, TIFF and many more. Aspose.PDF Cloud gives you control: create PDFs from scratch or from HTML, XML, template, database, XPS or an image. Render PDFs to image formats such as JPEG, PNG, GIF, BMP, TIFF and many others. Aspose.PDF Cloud helps you manipulate elements of a PDF file like text, annotations, watermarks, signatures, bookmarks, stamps and so on. Its REST API also allows you to manage PDF pages by using features like merging, splitting, and inserting. Add images to a PDF file or convert PDF pages to images.
silverstripe/textextraction
157226 Downloads
Text Extraction API for SilverStripe CMS (mostly used with 'fulltextsearch' module)
ottosmops/pdftotext
122103 Downloads
Extract text from PDF
cpierce/pdf2text
14119 Downloads
Client library for extracting text from PDF
hexydec/htmldoc
7121 Downloads
A token-based HTML document parser and minifier. Minify HTML documents including inline CSS, JavaScript, and SVGs on the fly. Extract document text, attributes, and fragments. Full test suite.