Download the PHP package juanparati/phpscraper without Composer

This page lists all versions of the PHP package juanparati/phpscraper. You can download and install these versions without Composer; dependencies are resolved automatically.

FAQ

After the download, add a single include, require_once('vendor/autoload.php');, and then import the classes you need with use statements.

If you use only one package, a project is not needed. But if you use more than one package, you cannot import the classes with use statements unless you create a project.

In general, it is recommended to always use a project to download your libraries, because an application normally needs more than one library.

Some PHP packages are not free to download and are therefore hosted in private repositories. Credentials are needed to access such packages. If a package comes from a private repository, use the auth.json textarea to insert your credentials.

  • Some hosting environments cannot be accessed through a terminal or SSH, so Composer cannot be used there.
  • Composer can be complicated to use, especially for beginners.
  • Composer needs a lot of resources, which are sometimes not available on simple web hosting.
  • If you use private repositories, you don't need to share your credentials. You can set up everything on our site and then give your team members a simple download link.
  • Simplify your Composer build process. Use our command line tool to download the vendor folder as a binary. This makes your build process faster, and you don't need to expose your credentials for private repositories.

Information about the package phpscraper

PHPSCRAPER

1. What is it?

A command line tool for extracting and formatting content from web pages. It is suitable for extracting content such as:

The output is formatted as JSON lines.
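
Because every output line is an independent JSON document, the result can be consumed one record at a time. The following is a minimal Python sketch of reading such output, assuming it was captured as text. The field names follow the Amazon recipe shown in section 3, and the sample records are invented for illustration, not real scraped data.

```python
import json


def parse_json_lines(text):
    """Parse JSON Lines text: one JSON document per non-empty line."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        records.append(json.loads(line))
    return records


# Hypothetical sample output (made-up values).
sample_output = (
    '{"product": "Example product", "name": "Alice", "rating": 4.0, "comment": "Good"}\n'
    '{"product": "Example product", "name": "Bob", "rating": 2.5, "comment": "Average"}\n'
)

records = parse_json_lines(sample_output)
average_rating = sum(r["rating"] for r in records) / len(records)
print(average_rating)
```

Parsing line by line means a large result can also be streamed from a file without loading it all into memory.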

2. How does it work?

  1. Create a scraper recipe (see Recipes)
  2. Type:

    phpscraper config url

The following example will extract all the reviews with the user name, comment and rating from Amazon:

    phpscraper recipes/amazon.yml https://www.amazon.de/product-reviews/B000J34HN4/ref=acr_dpx_hist_3?ie=UTF8

To see additional options, just type:

    phpscraper --help       

3. Recipes

Recipes are YAML files that describe, in a structured way, how to extract content from pages. Recipes use XPath expressions to specify which elements are extracted.

Example of a recipe that extracts comments from Amazon reviews:

    project: "Amazon reviews extractor"
    pagination:
      next_xpath: "//li[@class='a-last']/a/@href"
    extraction:
      product:
        xpath: "//h1/a[@class='a-link-normal']"
        extract_as: "product"
        in_memory: true
      comments:
        xpath: "//div[@class='a-section celwidget']"
        subelements:
          product:
            from_memory: "product"
            extract_as: "product"
          name:
            xpath: "//span[@class='a-profile-name']"
            extract_as: "name"
          rate:
            xpath: "//i[contains(@class, 'a-icon-star')]/span"
            extract_as: "rating"
            extract_regex: "/^.{0,3}/"
            cast_as: float
          comment:
            xpath: "//span[@class='a-size-base review-text review-text-content']"
            extract_as: "comment"
          verified:
            xpath: "//span[@class='a-size-mini a-color-state a-text-bold']"
            extract_as: "verified"
            cast_as: boolean
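
The "rate" entry above combines extract_regex and cast_as. The Python sketch below illustrates the apparent intent of that combination: keep only the part of the text matched by the pattern, then convert it to a number. How phpscraper applies these two instructions internally is an assumption here, the helper name is hypothetical, and the sample rating text is invented.

```python
import re


def extract_and_cast(raw_text, pattern, cast):
    """Keep the first regex match of raw_text, then cast it.

    Returns None when nothing usable is matched.
    """
    match = re.search(pattern, raw_text)
    if match is None or match.group(0) == "":
        return None
    return cast(match.group(0))


# Hypothetical text of a rating element.
rating_text = "4.0 out of 5 stars"

# "^.{0,3}" keeps at most the first three characters, here "4.0".
rating = extract_and_cast(rating_text, r"^.{0,3}", float)
print(rating)
```

The pattern only works while the rating occupies the first three characters, so a recipe written this way depends on the exact text layout of the page.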

3.1 The pagination section

It defines where the "next page" link is located. If this element is not found, the scraper finishes the process once the current page has been extracted.
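
The pagination behaviour can be pictured as a simple loop. This is a minimal Python sketch, not the tool's actual implementation: fetch_page is a hypothetical callable that returns the extracted items plus the next page URL (or None when the next-page element is missing), and the in-memory site replaces real HTTP requests. The guard against revisiting a URL is an addition for safety and is not described in the source.

```python
def crawl(start_url, fetch_page):
    """Follow "next page" links until a page has none.

    fetch_page(url) is a hypothetical callable returning (items, next_url);
    next_url is None when the next-page element is not found.
    """
    url = start_url
    visited = set()
    while url is not None and url not in visited:
        visited.add(url)
        items, next_url = fetch_page(url)
        yield from items  # the current page is always extracted
        url = next_url    # None ends the loop after this page


# Hypothetical three-page site standing in for real requests.
fake_site = {
    "/reviews?page=1": (["review 1", "review 2"], "/reviews?page=2"),
    "/reviews?page=2": (["review 3"], "/reviews?page=3"),
    "/reviews?page=3": (["review 4"], None),
}

all_reviews = list(crawl("/reviews?page=1", fake_site.__getitem__))
print(all_reviews)
```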

3.2 The extraction section

It defines which elements are going to be extracted. It uses a cascading structure, so it's possible to define parent and child elements.
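
The parent and child idea can be sketched in Python with the standard library: a parent XPath selects each block, and the child XPaths are then evaluated inside that block to build one output record. This is only an illustration under stated assumptions. The markup, class names and field names are invented, and xml.etree.ElementTree supports only a subset of XPath, so the expressions start with "." instead of "//" as in the recipe.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup standing in for a real review page.
PAGE = """
<html><body>
  <div class="review"><span class="name">Alice</span><span class="text">Good</span></div>
  <div class="review"><span class="name">Bob</span><span class="text">Average</span></div>
</body></html>
""".strip()

# Simplified stand-in for a recipe: one parent XPath plus child XPaths.
PARENT_XPATH = ".//div[@class='review']"
CHILD_XPATHS = {
    "name": ".//span[@class='name']",
    "comment": ".//span[@class='text']",
}


def extract(page, parent_xpath, child_xpaths):
    """Return one dict per parent element, filled from its child elements."""
    root = ET.fromstring(page)
    rows = []
    for parent in root.findall(parent_xpath):
        row = {}
        for field, xpath in child_xpaths.items():
            node = parent.find(xpath)  # searched relative to the parent
            row[field] = node.text if node is not None else None
        rows.append(row)
    return rows


rows = extract(PAGE, PARENT_XPATH, CHILD_XPATHS)
for row in rows:
    print(json.dumps(row))  # one JSON document per line
```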

The possible instructions for the extraction section are:

4. Installation

PHPscraper can be installed in different ways:

A) Download the latest build from GitHub, or
B) Just type "composer global require juanparati/phpscraper".

5. How to build my own package:


All versions of phpscraper with dependencies

Requires:

    Package              Version
    ext-json             *
    amphp/amp            ^2.3
    league/climate       ^3.5
    symfony/dom-crawler  ^4.3
    symfony/yaml         ^4.3
    amphp/artax          ^3.0
    ext-dom              *