PHP download

Download the PHP package blogdaren/phpcreeper without Composer

On this page you can find all versions of the php package blogdaren/phpcreeper. It is possible to download/install these versions without Composer. Possible dependencies are resolved automatically.

Table of contents
Download blogdaren/phpcreeper
More information about blogdaren/phpcreeper
Files in blogdaren/phpcreeper

Vendor: blogdaren
Package: phpcreeper
Short description: A new generation of multi-process async event-driven spider engine based on Workerman
License: MIT
Homepage: http://www.phpcreeper.com

Keywords: event-loop, async, headless, spider, crawler, multi-process

FAQ

After the download, include the autoloader once with require_once('vendor/autoload.php');. After that, import the classes you need with use statements.

Example:
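A minimal sketch of that include step, assuming the downloaded vendor folder sits next to your script. The class name PHPCreeper\PHPCreeper is an assumption based on the package name; check the downloaded sources for the classes you actually need.

```php
<?php
// Assumption: the downloaded "vendor" folder sits next to this script.
use PHPCreeper\PHPCreeper; // assumed class name; a `use` line alone loads nothing

$autoload = __DIR__ . '/vendor/autoload.php';
if (is_file($autoload)) {
    require_once $autoload; // registers the autoloader for every downloaded package
}

// class_exists() triggers the autoloader, so this reports whether the include worked.
$loaded = class_exists(PHPCreeper::class);
echo $loaded ? "PHPCreeper is available\n" : "PHPCreeper not found, check the vendor path\n";
```

The is_file() guard only keeps the sketch from failing when the folder is missing; in a real script a plain require_once is enough.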

If you use only one package, a project is not needed. But if you use more than one package, it is not possible to import the classes with use statements without a project.

In general, it is recommended to always use a project to download your libraries, because an application normally needs more than one library.

Some PHP packages are not free to download and are therefore hosted in private repositories. Accessing such packages requires credentials. If a package comes from a private repository, enter the credentials in the auth.json textarea.

Some hosting environments cannot be reached by terminal or SSH, so Composer cannot be used there.
Composer can be complicated to use, especially for beginners.
Composer needs a lot of resources, which are sometimes not available on simple web hosting.
If you use private repositories, you don't need to share your credentials. You can set everything up on our site and then give your team members a simple download link.
Simplify your Composer build process: use our command line tool to download the vendor folder as a binary. This makes your build faster, and you don't need to expose your credentials for private repositories.

Information about the package phpcreeper

PHPCreeper

What is it

PHPCreeper (爬山虎) is a new generation of multi-process, asynchronous, event-driven PHP spider engine based on workerman. It helps to:

Focus on efficient agile development and make crawling jobs easier
Solve the performance and scalability bottlenecks of traditional PHP crawler frameworks
Take full advantage of crawling in multi-process, distributed and separated deployment environments
Support a headless browser that can execute JavaScript code, for crawling dynamically rendered pages

Documentation

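The Chinese documentation is relatively complete, and the English documentation will be kept up to date.

Official Chinese website: http://www.phpcreeper.com
Chinese development documentation (main node): http://www.phpcreeper.com/docs/
PHPCreeper is a free, open-source, easygoing crawler project. Stars are welcome, so that more people can discover it, use it and benefit from it.
The root directory of the PHPCreeper source contains a sample script, Examples/start.php. It is recommended to read it and then run it before you start developing.
If an example shipped with PHPCreeper does not work as expected, check and adjust the crawling rules, because the DOM of the source site has very likely changed.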

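Technical exchange

The green QR code is for the WeChat technical exchange group: phpcreeper [before joining, first add this dedicated WeChat account and state your purpose, or include the note: 爬山虎]
The orange QR code is for the author's Taobao shop, where you can support the author by purchasing the complete original video series 《深入PHP内核源码》 (Deep into the PHP Kernel Source).
The companion document for the 《深入PHP内核源码》 videos was typed by the author word by word alongside the lessons. It runs to nearly 30,000 characters and includes many original hand-drawn illustrations. If you reached the author through the workerman community, the author plans to give the document away for free at a suitable time, once video recording has finished and the document has been revised a second time.
The WeChat group mainly covers PHPCreeper, workerman and the PHP kernel source. The PHP kernel source videos can be watched on Bilibili.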

Features

Inherits almost all features from workerman
Supports a headless browser for crawling dynamic pages
Supports routing tasks to different parsers by task type
Supports Crontab jobs similar to Linux crontab
Supports distributed and separated deployment
Supports agile development with PHPCreeper-Application
Uses PHPQuery as an elegant content extractor
High performance and strong scalability

Prerequisites

PHP_VERSION ≥ 7.0.0 (PHP 7.4+ is preferable for compatibility reasons)
A POSIX-compatible OS (Linux, macOS, BSD)
POSIX extension for PHP (required; PHP distributions usually install it by default, otherwise compile and install it yourself)
PCNTL extension for PHP (required; PHP distributions usually install it by default, otherwise compile and install it yourself)
REDIS extension for PHP (optional; predis has been the default redis client since v1.4.2, so the extension is no longer a hard dependency)
EVENT extension for PHP (optional but strongly recommended, as it is a major contributor to performance; note that the Linux kernel also needs tuning)
In short: if workerman runs on your system, PHPCreeper will run too, because the installation requirements are identical to workerman's.
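The prerequisites can be checked from PHP itself before installing anything. This is a minimal sketch that uses only built-in functions; the variable names and the output format are illustrative and not part of PHPCreeper.

```php
<?php
// Minimum PHP version and extensions, as listed in the prerequisites.
$required = ['posix', 'pcntl'];
$optional = ['redis', 'event'];

$report = ['php' => version_compare(PHP_VERSION, '7.0.0', '>=')];
foreach (array_merge($required, $optional) as $ext) {
    $report[$ext] = extension_loaded($ext);
}

// Only the PHP version and the required extensions decide whether the engine can run.
$ok = $report['php'];
foreach ($required as $ext) {
    $ok = $ok && $report[$ext];
}

foreach ($report as $name => $present) {
    printf("%-6s %s\n", $name, $present ? 'ok' : 'missing');
}
echo $ok ? "Required prerequisites are met\n" : "Required prerequisites are missing\n";
```

Run it with the same PHP CLI binary that will run the crawler, since the CLI and web server builds can load different extensions.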

Installation

The recommended way to install PHPCreeper is through Composer, using composer require blogdaren/phpcreeper.

Usage: Without the PHPCreeper Application Framework

A matching application framework named PHPCreeper-Application is published alongside the engine for your convenience. The framework is not required, but we strongly recommend it because it greatly improves development efficiency. It is also easy to write code that does not depend on the framework.

As an example, consider capturing the 7-day weather forecast for Washington.

Save the example code to a file named weather.php, which serves as the startup script, then run it from the command line.

Usage: With the PHPCreeper Application Framework

If you want to develop an app based on the PHPCreeper Application Framework, or to see more configuration options, refer to the PHPCreeper-Application project.

How to set extractor rule

Each URL config item matches exactly one rule config item, and their rule_name keys must correspond one-to-one.

Note that the rule value must be an array.

For a single task, the corresponding rule item must be an array of depth exactly 2.

For multiple tasks, the corresponding rule item must be an array of depth exactly 3.
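The two shapes can be sketched as plain PHP arrays. The URLs, selectors and field names are hypothetical, the [selector, action] ordering of each field is an assumption, and array_depth() is a helper written for this illustration, not a PHPCreeper function.

```php
<?php
// Hypothetical helper: nesting depth of an array (a flat array has depth 1).
function array_depth(array $value): int
{
    $depth = 1;
    foreach ($value as $item) {
        if (is_array($item)) {
            $depth = max($depth, 1 + array_depth($item));
        }
    }
    return $depth;
}

// Single task: field name => [selector, action], so the rule has depth 2.
$singleRule = [
    'title' => ['h1.title', 'text'],
    'link'  => ['a.more', 'href'],
];

// Multiple tasks: rule_name => (field name => [selector, action]), so depth 3.
$multiUrl = [
    'r1' => 'https://example.com/page-1',
    'r2' => 'https://example.com/page-2',
];
$multiRule = [
    'r1' => ['title' => ['h1.title', 'text']],
    'r2' => ['title' => ['h2.headline', 'text']],
];

// The rule_name keys of the URLs and the rules must correspond one-to-one.
$namesMatch = array_keys($multiUrl) === array_keys($multiRule);

echo array_depth($singleRule), ' ', array_depth($multiRule), ' ', var_export($namesMatch, true), "\n";
```

A check like this can catch a mismatched rule_name or a wrongly nested rule before any task is created.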

rule_name
Give each task a unique rule name so that the extracted data can be indexed easily. If you leave it empty, md5($task_id) is used as the unique rule name since v1.6.0, not md5($task_url), which has potential pitfalls.
selector
Required; a rule item without a selector is ignored. It works like a jQuery selector, so its value can be #idName, .className, an HTML element name, and so on.
action
Indicates which action to take; the default value is text. The value can be one of the following:
text: gets the inner text of the HTML element
html: gets the inner HTML of the element, that is, the text including tags
attr: gets an attribute value of the HTML element
    [Attention: the actual value should be the attribute name, such as src or href, NOT the word attr itself]
css: gets the style attribute of the HTML element and returns it as an array
    [Attention: variant formats such as css: and css:prop1,prop2,...propN are also supported]
range
Narrows the entries down to only those that match. It works like a jQuery selector, so its value can be #idName, .className, an HTML element name, and so on.
callback
A callback string or a callback function can be triggered here; remember to return the expected data.

callback string: recommended; it is semantically equivalent to a PHP native callback function.
callback function: not recommended; use a callback string instead, because a PHP native callback function may behave unexpectedly when passed between processes in a multi-process environment.
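One way to see why a callback string travels between processes more reliably than a native function: PHP can serialize a string, but it refuses to serialize a Closure. The sketch below demonstrates only this general PHP behavior. How PHPCreeper itself transports and evaluates the callback is not shown, and the callback body is a hypothetical example.

```php
<?php
// A callback written as a string is ordinary data and survives serialization.
$callbackString = 'function($field_name, $data){ return trim($data); }';
$restored = unserialize(serialize($callbackString));

// A native closure cannot be serialized, so it cannot be shipped to another process this way.
$callbackFunction = function ($field_name, $data) {
    return trim($data);
};
try {
    serialize($callbackFunction);
    $closureSerializable = true;
} catch (\Throwable $e) {
    $closureSerializable = false; // PHP reports that serializing a Closure is not allowed
}

var_dump($restored === $callbackString, $closureSerializable);
```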

Use Database

PHPCreeper wraps a lightweight database layer in the style of Medoo; visit the official Medoo site to learn more about its usage. All that is needed here is to obtain the DBO, which is simple:

First configure database.php, then add the code that obtains the DBO.

With the DBO in hand, you can run queries or any other database operation.

Available commands

Note that all PHPCreeper commands can only be run on the command line, and that you must write a global entry startup script, assumed here to be named start.php, before you start any crawling job. If you develop with the PHPCreeper-Application framework, it automatically generates all the startup scripts you need, including the global one.

LICENSE

PHPCreeper is released under the MIT License.

DISCLAIMER

Please DON'T use PHPCreeper for any business that is NOT PERMITTED BY LAW in your country.
Please comply with the spider protocol (robots.txt) so that PHPCreeper is used in a friendly way. By choosing to use PHPCreeper,
you agree to comply with this agreement. I provide no warranty and accept no responsibility for this code. Use at your own risk.

All versions of phpcreeper with dependencies

Version v2.0.3, released 09 Apr 2025

Requires:
php: >=7.0.0
workerman/workerman: >=3.5.0,<5.0.0
blogdaren/configurator: ^1.0
blogdaren/logger: ^1.1.2
guzzlehttp/guzzle: ^6.4 || ^7.0
predis/predis: 2.0.*
chrome-php/chrome: ^1.11
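To read the constraints: a comma means AND, || means OR, ^1.0 allows any 1.x release, and 2.0.* allows any 2.0.x release. A minimal sketch of checking a range constraint with PHP's built-in version_compare() follows; in_range() is a hypothetical helper written for this illustration, and Composer's own resolver handles more cases, such as stability flags.

```php
<?php
// Hypothetical helper: true when $version satisfies ">=$min,<$max" (the comma means AND).
function in_range(string $version, string $min, string $max): bool
{
    return version_compare($version, $min, '>=') && version_compare($version, $max, '<');
}

// workerman/workerman must satisfy >=3.5.0,<5.0.0
var_dump(in_range('4.1.0', '3.5.0', '5.0.0')); // bool(true)
var_dump(in_range('5.0.0', '3.5.0', '5.0.0')); // bool(false)

// predis/predis 2.0.* is equivalent to >=2.0.0,<2.1.0
var_dump(in_range('2.0.3', '2.0.0', '2.1.0')); // bool(true)
```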

Composer command for our command line client (download client): this client runs in every environment, so you don't need a specific PHP version. The first 20 API calls are free.
