Download the PHP package giauphan/goutte without Composer
On this page you can find all versions of the PHP package giauphan/goutte. You can download and install any of these versions without Composer; dependencies are resolved automatically.
Package: goutte
Short description: A simple PHP Web Scraper
License: MIT
Homepage: https://github.com/Giauphan/Goutte
Information about the package goutte
Goutte, a simple PHP Web Scraper
Goutte is a screen scraping and web crawling library for PHP.
Goutte provides a nice API to crawl websites and extract data from the HTML/XML responses.
WARNING: This library is deprecated. As of v4, Goutte became a simple proxy to the HttpBrowser class <https://symfony.com/doc/current/components/browser_kit.html#making-external-http-requests> from the Symfony BrowserKit component. To migrate, replace Goutte\Client with Symfony\Component\BrowserKit\HttpBrowser in your code.
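The migration described in the warning can be sketched as follows. This is a minimal illustration, assuming symfony/browser-kit, symfony/css-selector, symfony/dom-crawler and symfony/http-client are installed with Composer and autoloaded from vendor/autoload.php. The canned MockHttpClient response and the example.test URL are placeholders introduced here so the snippet runs without network access; in real code you would pass HttpClient::create() or omit the argument.

```php
<?php
// Assumption: dependencies were installed with Composer in this project.
require __DIR__ . '/vendor/autoload.php';

// Before the migration this line was: use Goutte\Client;
use Symfony\Component\BrowserKit\HttpBrowser;
use Symfony\Component\HttpClient\MockHttpClient;
use Symfony\Component\HttpClient\Response\MockResponse;

// Hypothetical canned page so the sketch needs no real HTTP request.
$html = '<html><body><h2><a href="/post">Hello</a></h2></body></html>';
$httpClient = new MockHttpClient(
    new MockResponse($html, ['response_headers' => ['content-type' => 'text/html']])
);

// Before the migration this line was: $client = new Client($httpClient);
$client = new HttpBrowser($httpClient);

// request() is called the same way and still returns a DomCrawler Crawler.
$crawler = $client->request('GET', 'https://example.test/');
$title = $crawler->filter('h2 > a')->text();

print $title."\n";
```

Only the import and the class name change; calls such as request(), click() and submit() are inherited from HttpBrowser, so the rest of the calling code should not need to be touched.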
Requirements
Goutte depends on PHP 7.1+.
Installation
Add giauphan/goutte as a require dependency in your composer.json file:
composer require giauphan/goutte
Usage
Create a Goutte Client instance (which extends Symfony\Component\BrowserKit\HttpBrowser):
use Goutte\Client;
$client = new Client();
Make requests with the request() method:
// Go to the symfony.com website
$crawler = $client->request('GET', 'https://www.symfony.com/blog/');
The method returns a Crawler object (Symfony\Component\DomCrawler\Crawler).
To use your own HTTP settings, create an HttpClient instance and pass it to Goutte. For example, to add a 60-second request timeout:
use Goutte\Client;
use Symfony\Component\HttpClient\HttpClient;
$client = new Client(HttpClient::create(['timeout' => 60]));
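Other HttpClient defaults can be set in the same array. The sketch below is illustrative only: it assumes giauphan/goutte and symfony/http-client are installed with Composer, and the specific values (a 120-second overall cap and an Accept-Language header) are arbitrary choices made for this example rather than recommendations from the package.

```php
<?php
// Assumption: dependencies were installed with Composer in this project.
require __DIR__ . '/vendor/autoload.php';

use Goutte\Client;
use Symfony\Component\HttpClient\HttpClient;

$httpClient = HttpClient::create([
    // Idle timeout: how long to wait while the connection is inactive.
    'timeout' => 60,
    // Upper bound for the request and response as a whole (illustrative value).
    'max_duration' => 120,
    // Default header sent with every request (illustrative value).
    'headers' => ['Accept-Language' => 'en'],
]);

// No request is sent here; the options apply to later request() calls.
$client = new Client($httpClient);
```

Creating the client does not contact any server, so the options only take effect once request(), click() or submit() is called.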
Click on links:
// Click on the "Security Advisories" link
$link = $crawler->selectLink('Security Advisories')->link();
$crawler = $client->click($link);
Extract data:
// Get the latest post in this category and display the titles
$crawler->filter('h2 > a')->each(function ($node) {
    print $node->text()."\n";
});
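The same extraction can be tried offline by handing a Crawler a string of HTML directly. This is a minimal sketch, assuming symfony/dom-crawler and symfony/css-selector are installed with Composer; the markup and the $posts variable are made up for illustration and stand in for a fetched page. It also relies on each() returning an array of the closure's return values, which makes it easy to collect results instead of printing them.

```php
<?php
// Assumption: dependencies were installed with Composer in this project.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// Hypothetical markup standing in for a fetched blog page.
$html = <<<'HTML'
<html><body>
  <h2><a href="/blog/first">First post</a></h2>
  <h2><a href="/blog/second">Second post</a></h2>
</body></html>
HTML;

$crawler = new Crawler($html);

// Collect one array per matching link: its text and its href attribute.
$posts = $crawler->filter('h2 > a')->each(function (Crawler $node) {
    return ['title' => $node->text(), 'href' => $node->attr('href')];
});

foreach ($posts as $post) {
    print $post['title'].' -> '.$post['href']."\n";
}
```

With a live page, the $crawler returned by $client->request() can be used in place of the one built from the string.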
Submit forms:
$crawler = $client->request('GET', 'https://github.com/');
$crawler = $client->click($crawler->selectLink('Sign in')->link());
$form = $crawler->selectButton('Sign in')->form();
$crawler = $client->submit($form, ['login' => 'giauphan', 'password' => 'xxxxxx']);
$crawler->filter('.flash-error')->each(function ($node) {
    print $node->text()."\n";
});
More Information
Read the documentation of the BrowserKit, DomCrawler, and HttpClient Symfony Components for more information about what you can do with Goutte.
Pronunciation
Goutte is pronounced goot
i.e. it rhymes with boot
and not out
.
Technical Information
Goutte is a thin wrapper around the following Symfony Components: BrowserKit, CssSelector, DomCrawler, and HttpClient.
License
Goutte is licensed under the MIT license.
All versions of goutte with dependencies
symfony/deprecation-contracts Version ^3.4
symfony/browser-kit Version *
symfony/css-selector Version *
symfony/dom-crawler Version *
symfony/http-client Version *
symfony/mime Version *