Download the PHP package zrashwani/arachnid without Composer

On this page you can find all versions of the php package zrashwani/arachnid. It is possible to download/install these versions without Composer. Possible dependencies are resolved automatically.

FAQ

After the download, you have to make one include require_once('vendor/autoload.php');. After that you have to import the classes with use statements.

Example:
If you use only one package a project is not needed. But if you use more then one package, without a project it is not possible to import the classes with use statements.

In general, it is recommended to use always a project to download your libraries. In an application normally there is more than one library needed.
Some PHP packages are not free to download and because of that hosted in private repositories. In this case some credentials are needed to access such packages. Please use the auth.json textarea to insert credentials, if a package is coming from a private repository. You can look here for more information.

  • Some hosting areas are not accessible by a terminal or SSH. Then it is not possible to use Composer.
  • To use Composer is sometimes complicated. Especially for beginners.
  • Composer needs much resources. Sometimes they are not available on a simple webspace.
  • If you are using private repositories you don't need to share your credentials. You can set up everything on our site and then you provide a simple download link to your team member.
  • Simplify your Composer build process. Use our own command line tool to download the vendor folder as binary. This makes your build process faster and you don't need to expose your credentials for private repositories.
Please rate this library. Is it a good library?

Informations about the package arachnid

Arachnid Web Crawler

This library will crawl all unique internal links found on a given website up to a specified maximum page depth.

This library is using symfony/panther & FriendsOfPHP/Goutte libraries to scrap site pages and extract main SEO-related info, including: title, h1 elements, h2 elements, statusCode, contentType, meta description, meta keyword and canonicalLink.

This library is based on the original blog post by Zeid Rashwani here:

http://zrashwani.com/simple-web-spider-php-goutte

Josh Lockhart adapted the original blog post's code (with permission) for Composer and Packagist and updated the syntax to conform with the PSR-2 coding standard.

Build Status codecov

Sponsored By

How to Install

You can install this library with Composer. Drop this into your composer.json manifest file:

{
    "require": {
        "zrashwani/arachnid": "dev-master"
    }
}

Then run composer install.

Getting Started

Basic Usage:

Here's a quick demo to crawl a website:

Enabling Headless Browser mode:

Headless browser mode can be enabled, so it will use Chrome engine in the background which is useful to get contents of JavaScript-based sites.

enableHeadlessBrowserMode method set the scraping adapter used to be PantherChromeAdapter which is based on Symfony Panther library:

In order to use this, you need to have chrome-driver installed on your machine, you can use dbrekelmans/browser-driver-installer to install chromedriver locally:

Advanced Usage:

Set additional options to underlying http client, by specifying array of options in constructor or creating Http client scrapper with desired options:

You can inject a PSR-3 compliant logger object to monitor crawler activity (like Monolog):

You can set crawler to visit only pages with specific criteria by specifying callback closure using filterLinks method:

You can use LinksCollection class to get simple statistics about the links, as following:

How to Contribute

  1. Fork this repository
  2. Create a new branch for each feature or improvement
  3. Apply your code changes along with corresponding unit test
  4. Send a pull request from each feature branch

It is very important to separate new features or improvements into separate feature branches, and to send a pull request for each branch. This allows me to review and pull in new features or improvements individually.

All pull requests must adhere to the PSR-2 standard.

System Requirements

Authors

License

MIT Public License


All versions of arachnid with dependencies

PHP Build Version
Package Version
Requires php Version >=7.2.0
ext-spl Version *
tightenco/collect Version ^v8.34
guzzlehttp/psr7 Version ^1.4
symfony/panther Version ^1.0
fabpot/goutte Version ^4.0
psr/log Version ^1.1
Composer command for our command line client (download client) This client runs in each environment. You don't need a specific PHP version etc. The first 20 API calls are free. Standard composer command

The package zrashwani/arachnid contains the following files

Loading the files please wait ....