Download the PHP package nelexa/roach-php-bundle without Composer
On this page you can find all versions of the php package nelexa/roach-php-bundle. It is possible to download/install these versions without Composer. Possible dependencies are resolved automatically.
Package roach-php-bundle
Short Description Symfony bundle for roach-php/core
License MIT
Information about the package roach-php-bundle
roach-php-bundle
Symfony bundle for Roach PHP.
Roach is a complete web scraping toolkit for PHP. It is heavily inspired by the popular Scrapy package for Python.
The Symfony bundle mostly provides the necessary container bindings for the various services Roach uses, as well as making certain configuration options available via a config file. To learn about how to actually start using Roach itself, check out the rest of the documentation.
Installing the Symfony bundle
Add nelexa/roach-php-bundle to your composer.json file, for example by running composer require nelexa/roach-php-bundle.
Versions & Dependencies
Bundle version | roach-php/core version | Symfony version | PHP version(s) |
---|---|---|---|
0.3.0 | 0.3.0 | ^5.3 or ^6.0 | >= 8.0 |
1.0.0 | ^1.0.0 | ^6.0 | >= 8.0 |
1.1.0 | 1.1.* | ^6.0 | >= 8.0 |
Register the bundle
Register the bundle in config/bundles.php. If you use Symfony Flex, this is done automatically.
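A minimal sketch of what the registration entry in config/bundles.php might look like. The bundle class name Nelexa\RoachPhpBundle\RoachPhpBundle is an assumption inferred from the package name, so check the package source for the exact class before copying this.

```php
<?php
// config/bundles.php

return [
    // ... bundles already registered by your application ...
    Symfony\Bundle\FrameworkBundle\FrameworkBundle::class => ['all' => true],

    // Assumed class name for this bundle; 'all' enables it in every environment.
    Nelexa\RoachPhpBundle\RoachPhpBundle::class => ['all' => true],
];
```

With Symfony Flex installed, an equivalent entry is normally written for you when the package is required, so manual editing is only needed in projects without Flex.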
Available Commands
The Symfony bundle of Roach registers a few console commands to make our development experience as pleasant as possible.
Run spider
Running the command without arguments prints the entire list of available spiders.
Simply select the desired spider (▼ or ▲) or enter its number and press Enter.
You can also pass the spider's class name, or one of its aliases, as the first argument.
For example, if you have a class App\Spider\GoogleSpider, you can pass any of the following aliases: GoogleSpider, google_spider or google.
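To make the alias rule concrete, here is one plausible way the three aliases listed above could be derived from a class name. This is an illustrative sketch only: the function name spiderAliases and its exact rules are hypothetical and are not taken from the bundle's code.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: derive the aliases described above from a
 * fully qualified spider class name. Not the bundle's actual code.
 *
 * @return list<string>
 */
function spiderAliases(string $className): array
{
    // Short class name: "App\Spider\GoogleSpider" -> "GoogleSpider".
    $pos = strrpos($className, '\\');
    $short = $pos === false ? $className : substr($className, $pos + 1);

    // snake_case form: "GoogleSpider" -> "google_spider".
    $snake = strtolower((string) preg_replace('/(?<!^)[A-Z]/', '_$0', $short));

    // Drop a trailing "_spider" suffix: "google_spider" -> "google".
    $bare = (string) preg_replace('/_spider$/', '', $snake);

    // Remove duplicates (for example, when the class has no "Spider" suffix).
    return array_values(array_unique([$short, $snake, $bare]));
}

print_r(spiderAliases('App\Spider\GoogleSpider'));
```

For App\Spider\GoogleSpider this yields GoogleSpider, google_spider and google, matching the aliases named in the text.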
Sometimes it is useful to override the number of concurrent requests and the request delay. To do this, you can pass the --concurrency and --delay options.
These options override the $concurrency and $requestDelay public properties of your spider.
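For context, the sketch below shows where those two properties live on a spider. It assumes the roach-php/core classes RoachPHP\Spider\BasicSpider and RoachPHP\Http\Response; the start URL and the extraction logic are placeholders, and the delay is assumed to be expressed in seconds.

```php
<?php

declare(strict_types=1);

namespace App\Spider;

use Generator;
use RoachPHP\Http\Response;
use RoachPHP\Spider\BasicSpider;

final class GoogleSpider extends BasicSpider
{
    /** @var string[] Placeholder start URL, not from the original text. */
    public array $startUrls = ['https://example.com'];

    // Default number of concurrent requests; --concurrency overrides it.
    public int $concurrency = 2;

    // Default delay between requests (assumed seconds); --delay overrides it.
    public int $requestDelay = 1;

    public function parse(Response $response): Generator
    {
        // Placeholder extraction: yield the page title as a single item.
        yield $this->item([
            'title' => $response->filter('title')->text(),
        ]);
    }
}
```

Keeping conservative defaults in the class and raising them from the command line for one-off runs avoids editing the spider just to experiment with throughput.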
Add the --output (-o) option to save the collected data to a JSON file.
Starting the REPL
Roach ships with an interactive shell (often called a Read-Evaluate-Print Loop, or REPL for short) which makes prototyping our spiders a breeze. We can use the provided roach:shell command to launch a new REPL session.
Generator classes
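First install Symfony MakerBundle (composer require --dev symfony/maker-bundle). The bundle then adds generators for the following tasks: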
Create a new roach spider class
Create a new roach extension class
Create a new roach item processor class
Create a new roach downloader request middleware class
Create a new roach downloader response middleware class
Create a new roach spider item middleware class
Create a new roach spider request middleware class
Create a new roach spider response middleware class
Changelog
Changes are documented in the releases page.
License
The MIT License (MIT). Please see LICENSE for more information.
All versions of roach-php-bundle with dependencies
roach-php/core Version ~1.1.0
symfony/config Version ^6.0
symfony/dependency-injection Version ^6.0
symfony/http-kernel Version ^6.0
symfony/console Version ^6.0
symfony/serializer Version ^6.0