Libraries tagged by Text extractor
antonizer/pdfparser
437 Downloads
PDF parser library. Reads and extracts information from PDF files. Fork of https://github.com/smalot/pdfparser
hocvt/php-apache-tika
1543 Downloads
Apache Tika bindings for PHP: extracts text from documents and images (with OCR), metadata and more...
venveo/craft-documentsearch
5305 Downloads
Extract the contents of text documents and add to Craft's search index
hejunjie/address-parser
621 Downloads
An intelligent shipping-address parser that extracts name, mobile phone number, ID number, province/city/district, and detailed address from unstructured text. Suited to e-commerce, logistics, and CRM systems.
jbpapp/pdf-to-text
872 Downloads
Extract text from a PDF file using the pdf-to-text binary.
jaybizzle/doc-to-text
9813 Downloads
Extract text from a Word Doc
dragomirt/pdf-to-text
3787 Downloads
Extract text from a PDF
becklyn/search-text-transformer
930 Downloads
A library that extracts plain text from HTML for usage in search engines (like Elasticsearch)
manofstrong/sitescrapper
71 Downloads
A package that scrapes websites via their sitemaps, extracts the relevant content from each page, and uploads it to a database
watson-developer-cloud/php-sdk
244 Downloads
Client library to use the IBM Watson Services
rkw/rkw-pdf2content
131 Downloads
Extract text from PDFs and create TYPO3 sites with it!
sukohi/laravel-readability
201 Downloads
A Laravel package to extract readable text from HTML.
aramonc/docblock-parser
27 Downloads
Parses strings for docBlock-like portions and extracts the annotations, descriptions, and optional document content. This should not be used as an annotation parser for PHP code, at least not on its own. If you want to work with docBlocks in PHP code, something like https://github.com/schmittjoh/metadata is a better fit. This library is meant for getting metadata out of a plain text file. Look through the tests for examples.
xatham/text-extraction
17 Downloads
Easy text extraction for many different file types
quangvule/pdf-to-text
17 Downloads
Extract text from a PDF