Libraries tagged by craw
vipnytt/robotstxtparser
593519 Downloads
Robots.txt parsing library, with full support for every directive and specification.
duzun/hquery
96036 Downloads
An extremely fast web scraper that parses megabytes of HTML in a blink of an eye. No dependencies. PHP5+
gsouf/chromium
503 Downloads
Instrument headless chrome/chromium instances from PHP
stil/curl-easy
192575 Downloads
cURL wrapper for PHP. Supports parallel and non-blocking requests. For high speed crawling, see stil/curl-robot.
baba/sitemap-crawler
Downloads
smochin/instagram-php-crawler
6182 Downloads
A simple PHP Crawler for Instagram
nmure/crawler-detect-bundle
257553 Downloads
A Symfony bundle for the Crawler-Detect library (detects bots/crawlers/spiders via the user agent)
dachcom-digital/dynamic-search-data-provider-crawler
16308 Downloads
crawlbase/crawlbase
8073 Downloads
A lightweight, dependency-free PHP class that acts as a wrapper for the Crawlbase API
vipnytt/useragentparser
753426 Downloads
User-Agent parser for robot rule sets
tomverran/robots-txt-checker
35613 Downloads
Given a robots.txt file, a user agent, and a URL path, tells you whether you're allowed to access a page
spatie/http-status-check
47524 Downloads
CLI tool to crawl a website and check HTTP status codes
sleeping-owl/apist
4584 Downloads
Package to provide API-like access to foreign sites based on HTML parsing
opensearchserver/opensearchserver
62754 Downloads
PHP library for OpenSearchServer: professional search engine, crawlers (web, file, database), REST APIs, and more. This library uses OpenSearchServer's V2 API.
kiddyu/beanbun
4056 Downloads
Beanbun is a multi-process web crawler framework written in PHP, with good openness and high extensibility