Libraries tagged by HTML Extractor
j0k3r/php-readability
875448 Downloads
Automatic article extraction from HTML
vstelmakh/url-highlight
940423 Downloads
Library to parse URLs from string input
jkphl/micrometa
170933 Downloads
A meta parser for extracting micro information out of web documents, currently supporting Microformats 1+2, HTML Microdata, RDFa Lite 1.1 and JSON-LD
atrox/matcher
99901 Downloads
Powerful XML and HTML matching and data extraction library
crwlr/schema-org
26007 Downloads
Extract schema.org structured data from HTML documents.
helgesverre/receipt-scanner
8989 Downloads
Use OpenAI to extract structured receipt and invoice data from text, HTML, images, and PDFs.
lysice/php-simple-html-dom-parser
49943 Downloads
Composer adaptation of an HTML DOM parser written in PHP 5+ that lets you manipulate HTML easily. Requires PHP 5+. Supports invalid HTML. Finds tags on an HTML page with jQuery-like selectors. Extracts contents from HTML in a single line.
drnxloc/laravel-simple-html-dom
374114 Downloads
Composer adaptation of an HTML DOM parser written in PHP 5+ that lets you manipulate HTML easily. Requires PHP 5+. Supports invalid HTML. Finds tags on an HTML page with jQuery-like selectors. Extracts contents from HTML in a single line.
nilgems/laravel-textract
5707 Downloads
A Laravel package to extract text from files such as DOC, XL, images, PDF and more. Inspired by the npm package "textract".
hexydec/htmldoc
10882 Downloads
A token-based HTML document parser and minifier. Minifies HTML documents, including inline CSS, JavaScript, and SVGs, on the fly. Extracts document text, attributes, and fragments. Full test suite.
aspose/pdf-sdk-php
28517 Downloads
Aspose.PDF Cloud is a REST API for creating and editing PDF files. It can also be used to convert PDF files to different formats like DOC, HTML, XPS, TIFF and many more. Aspose.PDF Cloud gives you control: create PDFs from scratch or from HTML, XML, template, database, XPS or an image. Render PDFs to image formats such as JPEG, PNG, GIF, BMP, TIFF and many others. Aspose.PDF Cloud helps you manipulate elements of a PDF file like text, annotations, watermarks, signatures, bookmarks, stamps and so on. Its REST API also allows you to manage PDF pages by using features like merging, splitting, and inserting. Add images to a PDF file or convert PDF pages to images.
linclark/microdata-php
43151 Downloads
Extracts microdata from HTML using PHP.
grom/tube-link
17211 Downloads
Extract video/music information from any URL and render HTML
mlocati/chm-lib
392 Downloads
Read CHM (Microsoft Compiled HTML Help) files
bitandblack/document-crawler
327 Downloads
Extract different parts of an HTML or XML document.