Libraries tagged by crawl
monperrus/crawler-user-agents
2033 Downloads
This repository contains a list of HTTP user-agents used by robots, crawlers, and spiders, in a single JSON file.
crawlbase/crawlbase
29016 Downloads
A lightweight, dependency-free PHP class that acts as a wrapper for the Crawlbase API
vipnytt/robotstxtparser
683860 Downloads
Robots.txt parsing library, with full support for every directive and specification.
stil/curl-easy
200193 Downloads
cURL wrapper for PHP. Supports parallel and non-blocking requests. For high speed crawling, see stil/curl-robot.
baba/sitemap-crawler
Downloads
smochin/instagram-php-crawler
8121 Downloads
A simple PHP Crawler for Instagram
nmure/crawler-detect-bundle
269557 Downloads
A Symfony bundle for the Crawler-Detect library (detects bots/crawlers/spiders via the user agent)
friends-of-hyva/magento2-crawler-session
3320 Downloads
Prevent crawlers from creating a session
dachcom-digital/dynamic-search-data-provider-crawler
24421 Downloads
aoepeople/crawler
287391 Downloads
Crawler extension for TYPO3
luka-dev/headless-task-server-php
8158 Downloads
Helper for sending requests to luka-dev/headless-task-server
vipnytt/useragentparser
858194 Downloads
User-Agent parser for robot rule sets
tomverran/robots-txt-checker
51635 Downloads
Given a robots.txt file, a user agent, and a URL path, tells you whether you're allowed to access a page
spatie/http-status-check
47740 Downloads
CLI tool to crawl a website and check HTTP status codes
sleeping-owl/apist
4951 Downloads
Package to provide API-like access to foreign sites based on HTML parsing