Download the PHP package octopoda/octopus without Composer
On this page you can find all versions of the PHP package octopoda/octopus. These versions can be downloaded and installed without Composer. Any dependencies are resolved automatically.
Information about the package octopus
Octopus Sitemap Crawler
A small PHP tool that crawls the collection of URLs in a Sitemap, using the ReactPHP library to load the URLs asynchronously. Both plain text files and XML Sitemaps are supported.
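To illustrate the two supported input formats, here is a minimal sketch of how URLs could be pulled out of either an XML Sitemap or a plain text list. The function name extractSitemapUrls is hypothetical and is not part of Octopus; the sketch assumes PHP 8 with the SimpleXML extension, and it is not how Octopus itself reads sitemaps.

```php
<?php
declare(strict_types=1);

/**
 * Extract URLs from sitemap content (hypothetical helper, not the Octopus API).
 * Content starting with "<" is parsed as an XML Sitemap; anything else is
 * treated as plain text with one URL per line.
 *
 * @return string[]
 */
function extractSitemapUrls(string $content): array
{
    $content = trim($content);
    if ($content === '') {
        return [];
    }

    $urls = [];

    if (str_starts_with($content, '<')) {
        // Collect parser errors internally instead of emitting warnings.
        $previous = libxml_use_internal_errors(true);
        $xml = simplexml_load_string($content);
        libxml_clear_errors();
        libxml_use_internal_errors($previous);

        if ($xml === false) {
            return []; // malformed XML
        }

        // Each <url> entry carries its address in a <loc> child.
        foreach ($xml->url as $entry) {
            $loc = trim((string) $entry->loc);
            if ($loc !== '') {
                $urls[] = $loc;
            }
        }

        return $urls;
    }

    // Plain text: keep every non-empty line that looks like a valid URL.
    foreach (preg_split('/\R/', $content) ?: [] as $line) {
        $line = trim($line);
        if ($line !== '' && filter_var($line, FILTER_VALIDATE_URL) !== false) {
            $urls[] = $line;
        }
    }

    return $urls;
}
```

The resulting array of URLs is what a crawler would then request one by one, or concurrently.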
Usage from the Command Line Interface (CLI)
Crawl the URLs in a Sitemap with verbose logging (-vvv).
Use 15 concurrent connections instead of the default 5:
Use an HTTP GET request instead of the default HTTP HEAD. Note that HTTP HEAD requests involve less data transfer because no response body is returned:
Use a timeout of 3 seconds instead of the default 10 seconds:
Use a specific User-Agent instead of the default Octopus/1.0, for example, to simulate a search engine crawling a sitemap:
Use the TablePresenter to display intermediate results instead of the default EchoPresenter:
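To make the request-level options above more concrete (HTTP method, timeout, User-Agent), the following sketch shows what they mean for a single URL check. It uses PHP's built-in HTTP stream wrapper rather than the ReactPHP client Octopus relies on, and the function names buildRequestContext and fetchStatusCode are hypothetical. Only the default values (HEAD, 10 seconds, Octopus/1.0) are taken from the text above.

```php
<?php
declare(strict_types=1);

/**
 * Build a stream context for one URL check (hypothetical helper).
 * Defaults mirror the documented Octopus defaults: HEAD, 10 s, Octopus/1.0.
 *
 * @return resource
 */
function buildRequestContext(
    string $method = 'HEAD',
    float $timeout = 10.0,
    string $userAgent = 'Octopus/1.0'
) {
    return stream_context_create([
        'http' => [
            'method'        => $method,    // HEAD transfers headers only, GET also the body
            'timeout'       => $timeout,   // read timeout in seconds
            'user_agent'    => $userAgent, // sent as the User-Agent header
            'ignore_errors' => true,       // still expose headers for 4xx/5xx responses
        ],
    ]);
}

/**
 * Return the status code of the first response for a URL,
 * or null when the request fails (hypothetical helper).
 *
 * @param resource $context
 */
function fetchStatusCode(string $url, $context): ?int
{
    $headers = @get_headers($url, false, $context);
    if ($headers === false || !preg_match('#^HTTP/\S+\s+(\d{3})#', $headers[0], $m)) {
        return null;
    }

    return (int) $m[1];
}
```

For example, buildRequestContext('GET', 3.0, 'ExampleBot/1.0') corresponds to combining a GET request, a 3 second timeout, and a custom User-Agent; the value ExampleBot/1.0 is a placeholder.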
Usage from your own application
You can easily integrate sitemap crawling into your own application; have a look at the Config class for all possible configuration options. If required, you can use a PSR-3 logger for logging purposes.
Limitations
Currently, Octopus is mainly an experimental / educational tool. Advanced use cases in HTTP response handling might not be supported.
Tests
Before running the test suite, clone this repository and install all dependencies using Composer:
To run the test suite, go to the project root and run:
All versions of octopus with dependencies
ext-simplexml Version *
clue/reactphp-flux Version ^1.4
psr/log Version ^2.0 || ^3.0
react/filesystem Version ^0.1
react/http Version ^1.9
symfony/console Version ^6.0 || ^7.0
teapot/status-code Version ^2.2