PHP download

Download the PHP package yurunsoft/crawler without Composer

On this page you can find all versions of the PHP package yurunsoft/crawler. You can download and install these versions without Composer. Any dependencies are resolved automatically.

Table of contents
Download yurunsoft/crawler
More information about yurunsoft/crawler
Files in yurunsoft/crawler

Vendor: yurunsoft
Package: crawler
Short Description: Yurun Crawler is a low-code, high-performance, distributed web crawling framework; it may well be the most all-in-one crawler framework around.
License: MIT

FAQ

After the download, include the autoloader once with require_once('vendor/autoload.php');. After that, import the classes you need with use statements.

Example:
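
The example that originally followed this line did not survive on the page. As a stand-in, here is a minimal sketch, assuming the downloaded archive was unpacked next to your script so that vendor/autoload.php exists. The helper name loadVendorAutoloader is hypothetical and only there to make the missing-file case explicit; the classes you import afterwards depend on the package you downloaded.

```php
<?php
/**
 * Load the Composer-style autoloader from a downloaded vendor folder.
 * Hypothetical helper: returns false instead of failing when the file is missing.
 */
function loadVendorAutoloader(string $baseDir): bool
{
    $autoload = rtrim($baseDir, '/\\') . '/vendor/autoload.php';
    if (!is_file($autoload)) {
        return false;
    }
    require_once $autoload;
    return true;
}

// Assumption: the vendor folder sits in the same directory as this script.
if (!loadVendorAutoloader(__DIR__)) {
    error_log('vendor/autoload.php not found; unpack the download next to this script');
}

// Once the autoloader is loaded, import classes with `use` statements at the
// top of your own files. The concrete class names depend on the package.
```

Wrapping the include in a function is a design choice for this sketch only; a plain require_once at the top of the script works just as well when you know the file is there.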

If you use only one package, a project is not needed. But if you use more than one package, it is not possible to import the classes with use statements without a project.

In general, it is recommended to always use a project to download your libraries, since an application normally needs more than one library.

Some PHP packages are not free to download and are therefore hosted in private repositories. In that case, credentials are needed to access them. If a package comes from a private repository, please use the auth.json textarea to enter the credentials.

Some hosting environments cannot be reached by terminal or SSH, so Composer cannot be used there.
Using Composer is sometimes complicated, especially for beginners.
Composer needs a lot of resources, which are sometimes not available on simple web hosting.
If you are using private repositories, you don't need to share your credentials. You can set up everything on our site and then provide a simple download link to your team members.
Simplify your Composer build process. Use our own command line tool to download the vendor folder as a binary. This makes your build process faster, and you don't need to expose your credentials for private repositories.



Information about the package crawler

yurun-crawler

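Introduction

Yurun Crawler is a low-code, high-performance, distributed web crawling framework; it may well be the most all-in-one crawler framework around.

Yurun Crawler is built on the imi framework and runs in Swoole's memory-resident coroutine environment.

Why was this framework developed? When crawler-related requirements came up, I surveyed the existing PHP crawler frameworks on the market, and even crawler frameworks in other languages. Their features were all very rudimentary and they required a great deal of repetitive code; none of them got the job done in one go.

Development manual: https://doc.yurunsoft.com/yurun-crawler/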

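Goal

To implement crawling and data collection quickly and conveniently with as little code as possible.

Vision

To become the number one crawler framework in the universe, so that whenever crawling comes up, people think of Yurun Crawler as the way to get it done in one go!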

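Features

Low-code: almost no code needs to be written; most of the logic is driven by annotations
High performance: built on imi + Swoole with resident memory and coroutines. Even a single downloader process can sustain a large number of concurrent download tasks.
Distributed: the crawl flow is driven by message queues, and middleware such as Redis makes it distributed by nature
Concurrency rate limiting for downloaders
Strong built-in parsing: DOM parsing, regular expressions, JSON, and Chrome Headless page rendering
Proxy IP pool with MySQL and Redis support
Scheduled crawling
Model-based storage
Easy to extend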
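
To make the idea of downloader rate limiting concrete, here is a minimal token-bucket sketch in plain PHP. It is an illustration only: the class TokenBucket and its methods are hypothetical and are not the API of Yurun Crawler or imi-rate-limit. The clock is passed in explicitly so the behaviour is easy to follow.

```php
<?php
/**
 * Hypothetical token-bucket limiter; not part of Yurun Crawler or imi-rate-limit.
 * The bucket holds at most $capacity tokens and refills at $refillPerSecond.
 */
class TokenBucket
{
    private $capacity;
    private $refillPerSecond;
    private $tokens;
    private $lastRefill;

    public function __construct(int $capacity, float $refillPerSecond, float $now)
    {
        $this->capacity = $capacity;
        $this->refillPerSecond = $refillPerSecond;
        $this->tokens = (float) $capacity; // start with a full bucket
        $this->lastRefill = $now;
    }

    /** Try to take one token at time $now (seconds); true means a download may start. */
    public function tryAcquire(float $now): bool
    {
        $elapsed = max(0.0, $now - $this->lastRefill);
        $this->tokens = min((float) $this->capacity, $this->tokens + $elapsed * $this->refillPerSecond);
        $this->lastRefill = max($this->lastRefill, $now);

        if ($this->tokens >= 1.0) {
            $this->tokens -= 1.0;
            return true;
        }
        return false;
    }
}

// Two downloads may start immediately, the third is rejected,
// and one more is allowed after a second has passed.
$bucket = new TokenBucket(2, 1.0, 0.0);
$results = [
    $bucket->tryAcquire(0.0),
    $bucket->tryAcquire(0.0),
    $bucket->tryAcquire(0.0),
    $bucket->tryAcquire(1.0),
];
```

In a real deployment the bucket state would have to live in shared storage such as Redis so that several downloader processes respect the same limit; this sketch keeps it in memory for brevity.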

Example

Demo example: https://github.com/Yurunsoft/yurun-crawler-example

The main crawling logic can be written with annotations, which is extremely simple.

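Concepts

Crawl item

Sometimes we first crawl a list page and then crawl the content pages.

The list page and the content page are each a crawl item; their download, parsing, and processing logic may all differ.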

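Downloader

Requests URLs, then downloads and stores the content.

Thanks to the multi-coroutine architecture, it can download massive amounts of data concurrently.

Supports rate limiting.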

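Data model

Defines the content attributes to be extracted from a page.

Parser

Parses the downloaded content, extracts the required information, and returns a data model.

Supports: DOM parsing, regular expressions, JSON, and Chrome Headless page rendering.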
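
The relationship between a data model and a parser can be sketched in plain PHP as follows. This is not the framework's annotation-based API: ArticleModel, parseArticleFromHtml, and parseArticleFromJson are hypothetical names, and only the regular-expression and JSON cases are shown.

```php
<?php
/** Hypothetical data model: the fields we want to extract from an article page. */
class ArticleModel
{
    public $title;
    public $views;
}

/** Parse an HTML fragment with regular expressions. */
function parseArticleFromHtml(string $html): ArticleModel
{
    $model = new ArticleModel();
    if (preg_match('/<h1[^>]*>(.*?)<\/h1>/s', $html, $m)) {
        // Drop nested tags, decode entities, and trim surrounding whitespace.
        $model->title = trim(html_entity_decode(strip_tags($m[1])));
    }
    if (preg_match('/data-views="(\d+)"/', $html, $m)) {
        $model->views = (int) $m[1];
    }
    return $model;
}

/** Parse the same fields from a JSON API response. */
function parseArticleFromJson(string $json): ArticleModel
{
    $model = new ArticleModel();
    $data = json_decode($json, true);
    if (is_array($data)) {
        $model->title = isset($data['title']) ? (string) $data['title'] : null;
        $model->views = isset($data['views']) ? (int) $data['views'] : null;
    }
    return $model;
}

$fromHtml = parseArticleFromHtml('<h1 class="t"> Hello &amp; welcome </h1><span data-views="42"></span>');
$fromJson = parseArticleFromJson('{"title": "Hello & welcome", "views": 42}');
```

Both functions return the same model type, which is the point of the separation: whatever the source format, the later steps only ever see a data model.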

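Processor

After the parser produces a data model, it is handed to the processor for processing.

Storage

Stores the parsed data. Multiple storage backends are supported, and you can freely add your own.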
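
One way to picture the hand-off from processor to storage is the sketch below. All names (StorageInterface, MemoryStorage, JsonLinesStorage, processRecord) are hypothetical and chosen for illustration; the framework's real interfaces may look different.

```php
<?php
/** Hypothetical storage contract; the real framework's interfaces may differ. */
interface StorageInterface
{
    public function save(array $record): void;
}

/** Keeps records in memory, which is handy for tests. */
class MemoryStorage implements StorageInterface
{
    public $records = [];

    public function save(array $record): void
    {
        $this->records[] = $record;
    }
}

/** Appends each record as one JSON line to a file. */
class JsonLinesStorage implements StorageInterface
{
    private $path;

    public function __construct(string $path)
    {
        $this->path = $path;
    }

    public function save(array $record): void
    {
        $line = json_encode($record, JSON_UNESCAPED_UNICODE) . "\n";
        file_put_contents($this->path, $line, FILE_APPEND | LOCK_EX);
    }
}

/** Processor step: normalise the parsed record, then pass it to every storage. */
function processRecord(array $record, array $storages): void
{
    $record['title'] = trim($record['title'] ?? '');
    foreach ($storages as $storage) {
        $storage->save($record);
    }
}

$memory = new MemoryStorage();
$file = tempnam(sys_get_temp_dir(), 'crawl');
processRecord(
    ['title' => '  Hello  ', 'url' => 'https://example.com/a'],
    [$memory, new JsonLinesStorage($file)]
);
```

Adding another backend then means writing one more class that implements the same save method, which is the kind of extension point the description above refers to.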

Proxy IP pool

Provides a proxy IP pool abstraction so that developers can easily integrate with different proxy providers.
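
A proxy pool abstraction of this kind could look roughly like the following sketch. ProxyPoolInterface and ArrayProxyPool are hypothetical names, and the in-memory round-robin pool stands in for the MySQL or Redis backed pools that the framework mentions.

```php
<?php
/** Hypothetical contract for a proxy pool; not the framework's actual interface. */
interface ProxyPoolInterface
{
    /** Return the next proxy as "host:port", or null when the pool is empty. */
    public function next(): ?string;

    /** Report a proxy that failed so it is no longer handed out. */
    public function markBad(string $proxy): void;
}

/** In-memory pool that hands out proxies in round-robin order. */
class ArrayProxyPool implements ProxyPoolInterface
{
    private $proxies;
    private $cursor = 0;

    public function __construct(array $proxies)
    {
        $this->proxies = array_values($proxies);
    }

    public function next(): ?string
    {
        if ($this->proxies === []) {
            return null;
        }
        $proxy = $this->proxies[$this->cursor % count($this->proxies)];
        $this->cursor++;
        return $proxy;
    }

    public function markBad(string $proxy): void
    {
        $this->proxies = array_values(array_diff($this->proxies, [$proxy]));
    }
}

$pool = new ArrayProxyPool(['10.0.0.1:8080', '10.0.0.2:8080']);
$first = $pool->next();   // 10.0.0.1:8080
$second = $pool->next();  // 10.0.0.2:8080
$pool->markBad('10.0.0.1:8080');
$third = $pool->next();   // 10.0.0.2:8080, the only proxy left
```

A downloader that depends only on the interface can switch between providers without any change to its own code.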

Contact us

Free technical support and discussion group: 17916227

Business cooperation QQ: 369124067

All versions of crawler with dependencies


Version v1.1.0 Release 01. Sep 2020

Download latest version of crawler from vendor yurunsoft

Requires
php >=7.1
ext-swoole >=4.4
yurunsoft/imi ~1.2
yurunsoft/yurun-http ^4.2
imiphp/imi-queue ~1.0
imiphp/imi-rate-limit ^1.0
phpdocumentor/reflection-docblock ^4.3
symfony/dom-crawler ^4.4
symfony/css-selector ^4.4
chrome-php/chrome ^0.8.1
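
Because the downloaded archive is used without Composer, nothing checks the two runtime requirements (PHP and ext-swoole) for you. The sketch below shows one way to check them by hand. The function name missingRequirements is hypothetical, and the versions are passed in as parameters so that the check can be tried on a machine without Swoole.

```php
<?php
/**
 * Return a list of unmet runtime requirements (empty when all are satisfied).
 * $swooleVersion is the extension version string, or false when it is not loaded.
 */
function missingRequirements(string $phpVersion, $swooleVersion): array
{
    $missing = [];
    if (version_compare($phpVersion, '7.1.0', '<')) {
        $missing[] = 'php >=7.1 (found ' . $phpVersion . ')';
    }
    if (!is_string($swooleVersion)) {
        $missing[] = 'ext-swoole >=4.4 (not loaded)';
    } elseif (version_compare($swooleVersion, '4.4.0', '<')) {
        $missing[] = 'ext-swoole >=4.4 (found ' . $swooleVersion . ')';
    }
    return $missing;
}

// phpversion('swoole') returns false when the extension is not loaded.
$missing = missingRequirements(PHP_VERSION, phpversion('swoole'));
foreach ($missing as $line) {
    echo 'Missing requirement: ' . $line . "\n";
}
```

The remaining entries in the list are PHP libraries that ship inside the vendor folder, so only the interpreter and the extension need to be verified on the target machine.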

Composer command for our command line client (download client): this client runs in any environment, so you don't need a specific PHP version. The first 20 API calls are free.
Standard composer command
