Libraries tagged with "HTML Extractor"
atrox/matcher
93841 Downloads
Powerful XML and HTML matching and data extraction library
jkphl/micrometa
159428 Downloads
A meta parser for extracting micro information out of web documents, currently supporting Microformats 1+2, HTML Microdata, RDFa Lite 1.1 and JSON-LD
helgesverre/receipt-scanner
6748 Downloads
Use OpenAI to extract structured receipt and invoice data from text, HTML, images, and PDFs.
lysice/php-simple-html-dom-parser
30465 Downloads
Composer adaptation of: An HTML DOM parser written in PHP 5+ that lets you manipulate HTML in a very easy way. Requires PHP 5+. Supports invalid HTML. Find tags on an HTML page with selectors, just like jQuery. Extract contents from HTML in a single line.
nilgems/laravel-textract
3747 Downloads
A Laravel package to extract text from files such as DOC, XL, images, PDF, and more. Inspired by the npm package "textract".
hexydec/htmldoc
8487 Downloads
A token-based HTML document parser and minifier. Minify HTML documents, including inline CSS, JavaScript, and SVGs, on the fly. Extract document text, attributes, and fragments. Full test suite.
linclark/microdata-php
42365 Downloads
Extracts microdata from HTML using PHP.
crwlr/schema-org
13825 Downloads
Extract schema.org structured data from HTML documents.
grom/tube-link
14351 Downloads
Extract video/music information from any URL and render HTML
dotpack/php-boiler-pipe
4843 Downloads
PhpBoilerPipe. Boilerplate Removal and Fulltext Extraction from HTML pages
aspose/pdf-sdk-php
24863 Downloads
Aspose.PDF Cloud is a REST API for creating and editing PDF files. It can also be used to convert PDF files to different formats like DOC, HTML, XPS, TIFF and many more. Aspose.PDF Cloud gives you control: create PDFs from scratch or from HTML, XML, template, database, XPS or an image. Render PDFs to image formats such as JPEG, PNG, GIF, BMP, TIFF and many others. Aspose.PDF Cloud helps you manipulate elements of a PDF file like text, annotations, watermarks, signatures, bookmarks, stamps and so on. Its REST API also allows you to manage PDF pages by using features like merging, splitting, and inserting. Add images to a PDF file or convert PDF pages to images.
lavatech/php-simple-html-dom-parser
15016 Downloads
Composer adaptation of: An HTML DOM parser written in PHP 5+ that lets you manipulate HTML in a very easy way. Requires PHP 5+. Supports invalid HTML. Find tags on an HTML page with selectors, just like jQuery. Extract contents from HTML in a single line.
itul/php-simple-html-dom-parser
2193 Downloads
A version modified to work with PHP 7.4+. Composer adaptation of: An HTML DOM parser written in PHP 5+ that lets you manipulate HTML in a very easy way. Requires PHP 5+. Supports invalid HTML. Find tags on an HTML page with selectors, just like jQuery. Extract contents from HTML in a single line.
p1ho/accessibility-checker
540 Downloads
Accessibility testing suite for raw HTML extracted from content management systems
nfservice/doc-cfe
470 Downloads
Creates a CF-e statement (extrato) for export to PDF and HTML