Larascraper - A Simple Scraper for Laravel
Introduction
Larascraper lets you scrape any URL from Laravel. It uses Puppeteer under the hood. Unlike Spatie Crawler or Browsershot, this scraper focuses on simplicity. While Spatie Crawler can leave many Chromium instances open, filling your server's memory, Larascraper starts the scraping process using Node and makes sure the Chromium instance is closed before exiting.
Unlike Spatie Crawler, it supports proxy authentication and is generally faster.
Install
Run this command via Composer:
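The command itself is not shown here. Assuming the standard Composer form and the package name edulazaro/larascraper, it would look like this (run from the root of your Laravel project):

```shell
# Assumes Composer is installed and available on your PATH
composer require edulazaro/larascraper
```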
Then install the required Node dependencies:
These packages are required for the internal Puppeteer script to run.
Please note that when you run the scraper via a scheduled task, a non-interactive terminal is likely to be used. Node is usually available there, but it may not be if Node was installed via NVM. In that case, see the Issues section at the end.
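As a quick way to see whether Node would be found by a scheduled task, the sketch below runs a lookup in a stripped-down environment. The function name `check_node` and the minimal `PATH` value are assumptions made for illustration; a real cron environment may use a different `PATH`.

```shell
# Hypothetical helper: look for node with a minimal environment,
# roughly like the one a cron job receives (assumed PATH below).
check_node() {
  env -i HOME="$HOME" PATH="/usr/bin:/bin" /bin/sh -c '
    if command -v node >/dev/null 2>&1; then
      echo "node found: $(command -v node)"
    else
      echo "node not found"
    fi'
}

check_node
```

If this prints "node not found" while `node` works in your normal terminal, Node is probably only being added to the `PATH` by your interactive shell setup, which is the NVM situation described in the Issues section.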
Basic Usage
Create a scraper class (manually or via the built-in command):
This generates a file like:
You can now scrape a URL like this:
You can pass parameters to the run method as long as they are handled:
And then you can do:
Proxy Support
Larascraper supports proxies with or without authentication:
Or if using authentication:
Timeout
To set a custom timeout (the default is 20000 ms):
Headers
To append custom headers:
Retry logic
You can set the number of attempts and the number of seconds to wait between attempts:
For example, retry 3 times and wait 5 seconds between attempts. Please note that only the status codes 408, 429, 500, 502, 503 and 504 are retried.
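To make the retry policy concrete, here is a minimal shell sketch of the same idea. It is not Larascraper's implementation. The functions `is_retryable` and `run_with_retry` are hypothetical, the wrapped command is assumed to print an HTTP status code, and the attempt count is assumed to mean the total number of tries.

```shell
# Status codes listed above as retryable
is_retryable() {
  case "$1" in
    408|429|500|502|503|504) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage: run_with_retry ATTEMPTS WAIT_SECONDS COMMAND [ARGS...]
# COMMAND must print an HTTP status code on stdout (assumed contract).
# Prints the last status code seen.
run_with_retry() {
  attempts=$1
  wait_seconds=$2
  shift 2
  i=1
  while :; do
    status=$("$@")
    # Stop on a non-retryable status or when attempts are used up
    if ! is_retryable "$status" || [ "$i" -ge "$attempts" ]; then
      echo "$status"
      return 0
    fi
    i=$((i + 1))
    sleep "$wait_seconds"
  done
}
```

With this sketch, a 404 is returned immediately, while a 503 triggers another attempt after the wait until the attempt limit is reached.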
Artisan Commands
You can generate a scraper class with:
List all scrapers in the app/Scrapers directory:
Testing a scraper
You can easily test a scraper with Tinker:
And then running:
Issues
This section contains common configuration issues.
Using Node via NVM
If you use Node via NVM and try to run the scraper via a scheduled task, Node may not be available. To make it available, edit your bash_profile file (typically ~/.bash_profile) with an editor such as Vi, Vim or Nano:
Then make sure this is included at the top:
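The exact lines are not shown here. As a sketch, the following is equivalent to the loader that NVM's installer normally adds, assuming NVM is installed in its default location (~/.nvm):

```shell
# Assumes NVM lives in the default ~/.nvm directory
export NVM_DIR="$HOME/.nvm"

# Load NVM only if its script exists and is not empty
if [ -s "$NVM_DIR/nvm.sh" ]; then
  . "$NVM_DIR/nvm.sh"
fi
```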
Save the file and run:
Now Node will be available for non-interactive terminals and the scraping process should run successfully.
In general, using NVM in production environments is not recommended.