Libraries tagged by scrawl
vipnytt/useragentparser
777702 Downloads
User-Agent parser for robot rule sets
tomverran/robots-txt-checker
39481 Downloads
Given a robots.txt file, a user agent, and a URL path, tells you whether you're allowed to access a page
spatie/http-status-check
47601 Downloads
CLI tool to crawl a website and check the HTTP status code of each link
sleeping-owl/apist
4712 Downloads
Package to provide api-like access to foreign sites based on html parsing
opensearchserver/opensearchserver
63273 Downloads
PHP library for OpenSearchServer: professional search engine, crawlers (web, file, database), REST APIs, and more. This library uses OpenSearchServer's V2 API.
luka-dev/headless-task-server-php
5730 Downloads
Helper for sending requests to luka-dev/headless-task-server
kiddyu/beanbun
4065 Downloads
Beanbun is a multi-process web crawler framework written in PHP, designed to be open and highly extensible
eddieace/php-simple
44580 Downloads
cyber-duck/silverstripe-seo
48603 Downloads
A SilverStripe module to optimise the Meta, crawling, indexing, and sharing of your website content
crwlr/robots-txt
10459 Downloads
Robots Exclusion Standard/Protocol Parser for Web Crawling/Scraping
spatie/laravel-link-checker
52509 Downloads
Check all links in a Laravel app
jyggen/curl
143474 Downloads
A simple and lightweight cURL library with support for asynchronous requests.
gsouf/chromium
511 Downloads
Instrument headless chrome/chromium instances from PHP
schliesser/sitecrawler
23176 Downloads
TYPO3 sitemap crawler
jaeger/querylist-puppeteer
67183 Downloads
QueryList plugin: use Puppeteer (headless Chrome) to crawl pages rendered dynamically by JavaScript