Download the PHP package malahierba-lab/web-harvester without Composer

On this page you can find all versions of the PHP package malahierba-lab/web-harvester. You can download and install these versions without Composer. Any dependencies are resolved automatically.

FAQ

After the download, include the autoloader once with require_once('vendor/autoload.php'); and then import the classes with use statements.

Example:
If you use only one package, a project is not needed. But if you use more than one package, it is not possible to import the classes with use statements without a project.

In general, it is recommended to always use a project to download your libraries, because an application normally needs more than one library.
Some PHP packages are not free to download and are therefore hosted in private repositories. Credentials are needed to access such packages. If a package comes from a private repository, please use the auth.json textarea to insert the credentials.

  • Some hosting environments cannot be accessed through a terminal or SSH, so Composer cannot be used there.
  • Composer can be complicated to use, especially for beginners.
  • Composer needs a lot of resources, which are sometimes not available on simple webspace.
  • If you are using private repositories, you don't need to share your credentials. You can set up everything on our site and then provide a simple download link to your team members.
  • Simplify your Composer build process. Use our own command line tool to download the vendor folder as a binary. This makes your build process faster, and you don't need to expose your credentials for private repositories.

Information about the package web-harvester

Laravel Web Harvester

A tool for getting information from external websites. Powered by PhantomJS and the malahierba.cl dev team.

Installation

Add to your composer.json:

{
    "require": {
        "malahierba-lab/web-harvester": "1.*"
    }
}

Then run the composer update command.

After installation, register the service provider by adding it to the providers section of config/app.php:

Malahierba\WebHarvester\WebHarvesterServiceProvider::class
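
For orientation, the registration might look like the excerpt below. It is only a sketch that assumes the standard Laravel 5 layout of config/app.php, where providers is an array of class names; every other entry in your file stays as it is.

```php
<?php
// config/app.php (excerpt): a sketch assuming the standard Laravel 5 layout.

return [
    // ...other configuration keys...

    'providers' => [
        // ...framework and application providers already listed here...

        // Registers the Web Harvester package with the application.
        Malahierba\WebHarvester\WebHarvesterServiceProvider::class,
    ],
];
```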

Now publish the config file by running php artisan vendor:publish.

Configuration

Laravel Web Harvester runs on the PhantomJS headless WebKit browser. PhantomJS is bundled as a binary, so before you can use this package you need to specify your OS. Do this in the config file config/webharvester.php.

Set the environment option to one of the supported values:

Example: 'environment' => 'macosx'

Use

Important: For documentation purposes, the examples below always assume that you import the library into your namespace with use Malahierba\WebHarvester;

Get WebPage Components

$url = 'http://someurl';
$webharvester = new WebHarvester;

//Check if we can process the URL and Load it
if ($webharvester->load($url)) {

    //Page Title
    $title                   = $webharvester->getTitle();

    //Page Description
    $description             = $webharvester->getDescription();

    //Get Status Code (if the URL redirects to another page, this returns the status code of the final page)
    $status_code             = $webharvester->getStatusCode();

    //Page Featured Image as URL
    $featured_image_url      = $webharvester->getFeaturedImage();

    //Page Featured Image as Base64
    $featured_image_base_64  = $webharvester->getFeaturedImage('base64');

    //Page real URL (if $url redirects to another URL, this returns the final one)
    $real_url                = $webharvester->getRealURL();

    //Site Name
    $sitename                = $webharvester->getSiteName();
}
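
When several of these values are needed together, it can be convenient to collect them in one array. The helper below is a minimal sketch: harvestPageSummary() and the array keys are hypothetical names, not part of the package, and it relies only on the methods shown above.

```php
<?php
/**
 * Collect the documented page components into one array.
 * harvestPageSummary() is a hypothetical helper, not part of the package.
 *
 * @param object $webharvester An object exposing the methods shown above.
 * @param string $url          The URL to load.
 * @return array|null Null when the URL could not be loaded.
 */
function harvestPageSummary($webharvester, $url)
{
    // load() reports whether the URL could be processed.
    if (!$webharvester->load($url)) {
        return null;
    }

    return [
        'title'          => $webharvester->getTitle(),
        'description'    => $webharvester->getDescription(),
        'status_code'    => $webharvester->getStatusCode(),
        'featured_image' => $webharvester->getFeaturedImage(),
        'real_url'       => $webharvester->getRealURL(),
        'site_name'      => $webharvester->getSiteName(),
    ];
}
```

A call such as harvestPageSummary(new WebHarvester, $url) would then return either the array or null, so the caller only has to check one value.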

Get expected behavior of the Robot (based on meta name="robots")

$url = 'http://someurl';
$webharvester = new WebHarvester;

//Check if we can process the URL and Load it
if ($webharvester->load($url)) {

    //check for index
    if ($webharvester->isIndexable()) {

        //...some code

    }

    //check for follow
    if ($webharvester->isFollowable()) {

        //...some code

    }

}

Get found links in WebPage (useful for web crawlers, web spiders, etc.)

$url = 'http://someurl';
$webharvester = new WebHarvester;

//Check if we can process the URL and Load it
if ($webharvester->load($url)) {

    //all full links as array

    $links = $webharvester->getLinks();  //retrieve an array with found links

    //all links as array, but query component removed (from the character "?" onwards)

    $links = $webharvester->getLinks([
        'remove' => ['query']
    ]);

    //retrieve links as array of objects (properties: url, follow)
    //if follow is false, the link is marked as nofollow (rel='nofollow') by the source website

    $links = $webharvester->getLinks(['only_urls' => false]); //default true

}

Important: For security reasons, all links with embedded JavaScript are excluded from the output array.
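
For a crawler, the object form of the links can be filtered before queueing. The following is a minimal sketch under two assumptions: the links are absolute URLs, and each object carries the url and follow properties described above. selectCrawlableLinks() is a hypothetical helper, not part of the package.

```php
<?php
/**
 * Keep only the links a polite crawler would queue next.
 * selectCrawlableLinks() is a hypothetical helper, not part of the package.
 *
 * @param array  $links Objects with "url" and "follow" properties.
 * @param string $host  Host the crawl is restricted to, e.g. "example.com".
 * @return string[] Unique URLs on $host that are not marked nofollow.
 */
function selectCrawlableLinks(array $links, $host)
{
    $selected = [];

    foreach ($links as $link) {
        // Respect rel='nofollow' as reported by the source website.
        if (!$link->follow) {
            continue;
        }

        // Skip links that point to another host or cannot be parsed.
        $linkHost = parse_url($link->url, PHP_URL_HOST);
        if (!is_string($linkHost) || strcasecmp($linkHost, $host) !== 0) {
            continue;
        }

        // Use the URL as array key to drop duplicates.
        $selected[$link->url] = true;
    }

    return array_keys($selected);
}
```

It would typically be fed with the result of getLinks(['only_urls' => false]).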

Get WebPage Raw Content

$url = 'http://someurl';
$webharvester = new WebHarvester;

//Check if we can process the URL and Load it
if ($webharvester->load($url)) {
    $raw = $webharvester->content();
}
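
The raw content can be handed to any parser. As an illustration, the sketch below pulls the text of all h1 elements out of an HTML string with PHP's DOM extension. It assumes that content() returns the page markup as a string; extractMainHeadings() is a hypothetical helper, not part of the package.

```php
<?php
/**
 * Extract the text of all <h1> elements from an HTML string.
 * extractMainHeadings() is a hypothetical helper, not part of the package.
 *
 * @param string $html Raw HTML markup.
 * @return string[] Trimmed heading texts in document order.
 */
function extractMainHeadings($html)
{
    if (trim($html) === '') {
        return [];
    }

    // Real-world markup is rarely valid; collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    // Discard the collected warnings and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $headings = [];
    foreach ($dom->getElementsByTagName('h1') as $node) {
        $headings[] = trim($node->textContent);
    }

    return $headings;
}
```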

Take a Screenshot of a WebPage

$url = 'http://someurl';
$webharvester = new WebHarvester;

//Check if we can process the URL and Load it
if ($webharvester->takeScreenshot($url)) {
    $image_base_64 = $webharvester->content();  //return a base64 string
}
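
To keep the screenshot, the base64 string has to be decoded and written to disk. The sketch below assumes the string is plain base64 without a data: URI prefix. The image format is not stated here, so the file extension in $path is left to the caller. saveBase64Image() is a hypothetical helper, not part of the package.

```php
<?php
/**
 * Decode a base64 string and write the resulting bytes to a file.
 * saveBase64Image() is a hypothetical helper, not part of the package.
 *
 * @param string $base64 Base64-encoded image data.
 * @param string $path   Destination file path.
 * @return bool True when the file was written, false otherwise.
 */
function saveBase64Image($base64, $path)
{
    // Strict mode rejects characters outside the base64 alphabet.
    $binary = base64_decode($base64, true);
    if ($binary === false || $binary === '') {
        return false;
    }

    return file_put_contents($path, $binary) !== false;
}
```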

Setup Options

You can customize the webharvester with some functions:

$webharvester = new WebHarvester;

//Custom User Agent
$webharvester->setUserAgent('your user agent');

//Ignore SSL Errors
$webharvester->setIgnoreSSLErrors(true);

//Resource Timeout (in milliseconds)
$webharvester->setResourceTimeout(3000);

//Wait after load (in milliseconds)
$webharvester->setWaitAfterLoad(3000);  // <- useful for getting async content
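
If the same settings are reused in several places, they could be kept in one array and applied in a loop. This is only a sketch: applyHarvesterOptions() and the option keys are hypothetical names, and the helper calls nothing beyond the four setters shown above.

```php
<?php
/**
 * Apply a plain options array through the documented setters.
 * applyHarvesterOptions() and the option keys are hypothetical names.
 *
 * @param object $webharvester An object exposing the setters shown above.
 * @param array  $options      Option key => value pairs.
 * @return object The same object, for convenience.
 */
function applyHarvesterOptions($webharvester, array $options)
{
    // Map each (hypothetical) option key to a documented setter.
    $setters = [
        'user_agent'        => 'setUserAgent',
        'ignore_ssl_errors' => 'setIgnoreSSLErrors',
        'resource_timeout'  => 'setResourceTimeout', // milliseconds
        'wait_after_load'   => 'setWaitAfterLoad',   // milliseconds
    ];

    foreach ($options as $key => $value) {
        // Fail loudly on typos instead of silently ignoring them.
        if (!isset($setters[$key])) {
            throw new InvalidArgumentException("Unknown option: $key");
        }
        $webharvester->{$setters[$key]}($value);
    }

    return $webharvester;
}
```

Rejecting unknown keys is a design choice of this sketch, so that a misspelled option is noticed immediately.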

Licence

This project is released under the MIT licence. For more information, please read the LICENCE file.


All versions of web-harvester with dependencies

Requires:
  • php: >=5.5.18
  • laravel/framework: 5.*
