PHP download

Download the PHP package spekulatius/phpscraper without Composer

On this page you can find all versions of the PHP package spekulatius/phpscraper. You can download and install these versions without Composer. Any dependencies are resolved automatically.

Table of contents
Download spekulatius/phpscraper
More information about spekulatius/phpscraper
Files in spekulatius/phpscraper

Vendor: spekulatius
Package: phpscraper
Short description: PHPScraper, built with simplicity in mind. See tests/ for more examples.
License: GPL-3.0-or-later
Homepage: https://phpscraper.de

Keywords PHP Library php crawler web scraping PHP scraper PHP scraping web-access xpath scraper

FAQ

How can I use the PHP package after the download?

After the download, you need a single include: require_once('vendor/autoload.php'); After that, import the classes you need with use statements.
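A minimal sketch of the include and import described above. It assumes that the downloaded vendor/ folder sits next to the script, and it takes the class name Spekulatius\PHPScraper\PHPScraper from the package's documentation rather than from this page, so adjust both to your setup.

```php
<?php

// Load the autoloader shipped in the downloaded vendor/ folder.
// Assumption: vendor/ is in the same directory as this script.
require_once __DIR__ . '/vendor/autoload.php';

// Import the class so it can be referenced by its short name.
// Assumption: this is the package's main class, as named in its documentation.
use Spekulatius\PHPScraper\PHPScraper;

$web = new PHPScraper();

// Prints the fully qualified class name once autoloading has resolved it.
echo get_class($web), PHP_EOL;
```

If the class cannot be found, the usual cause is a wrong path to vendor/autoload.php.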

Do I need to create a project on this site?

If you use only one package, a project is not needed. If you use more than one package, however, you need a project: without one it is not possible to import the classes with use statements.

In general, it is recommended to always use a project to download your libraries, since an application normally needs more than one library.

When is it necessary to insert some auth.json content?

Some PHP packages are not free to download and are therefore hosted in private repositories. Credentials are needed to access such packages. If a package comes from a private repository, please use the auth.json textarea to insert the credentials.

What is the advantage of using this site for my Composer projects?

Some hosting environments are not accessible via a terminal or SSH, so Composer cannot be used there.
Composer can be complicated to use, especially for beginners.
Composer needs a lot of resources, which are sometimes not available on simple web space.
If you are using private repositories, you don't need to share your credentials. You can set up everything on our site and then provide a simple download link to your team members.
Simplify your Composer build process. Use our command line tool to download the vendor folder as a binary. This makes your build process faster, and you don't need to expose your credentials for private repositories.

Information about the package phpscraper

For full documentation, visit phpscraper.de.

PHPScraper is a versatile web-utility for PHP. Its primary objective is to streamline the process of extracting information from websites, allowing you to focus on accomplishing tasks without getting caught up in the complexities of selectors, data structure preparation, and conversion.

Under the hood, it uses

- [BrowserKit](https://symfony.com/doc/current/components/browser_kit.html) (formerly [Goutte](https://github.com/FriendsOfPHP/Goutte)) to access the web
- [League/URI](https://github.com/thephpleague/uri) to process URLs
- [donatello-za/rake-php-plus](https://github.com/donatello-za/rake-php-plus) to extract and analyze keywords

See [composer.json](https://github.com/spekulatius/PHPScraper/blob/master/composer.json) for more details.

:timer_clock: PHPScraper in 5 Minutes explained
-----------------------------------------------

Here are a few impressions of the way the library works. More examples are on the [project website](https://phpscraper.de/examples/scrape-website-title.html).

### Basics: Flexible Calling as an Attribute or Method

All scraping functionality can be accessed either as a function call or a property call. For example, the title can be read as a property or through a method call.

### :battery: Batteries included: Meta data, Links, Images, Headings, Content, Keywords, ...

Many common use cases are covered already. You can find prepared extractors for various HTML tags, including interesting attributes. You can filter and combine these to your needs. In some cases there is an option to get a simple or a detailed version, as with `linksWithDetails`.

If there aren't any matching elements (here links) on the page, an empty array will be returned. If a method normally returns a string it might return `null`. Details such as `follow_redirects`, etc. are optional configuration parameters (see below).
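
The two calling styles and the detailed link extractor described above might look like the following sketch. The `go()` method and the class namespace are taken from the project's documentation rather than from the text above, and the URL is only a placeholder. What is printed depends entirely on the page being fetched.

```php
<?php

require_once __DIR__ . '/vendor/autoload.php';

$web = new \Spekulatius\PHPScraper\PHPScraper();

// Fetch and parse a page (placeholder URL, swap in your own target).
$web->go('https://phpscraper.de');

// The same extractor, once as a property and once as a method call.
$titleFromProperty = $web->title;
$titleFromMethod = $web->title();
var_dump($titleFromProperty === $titleFromMethod);

// Detailed variant: one entry per link, or an empty array if the page has no links.
foreach ($web->linksWithDetails as $link) {
    print_r($link);
}
```

Because a string extractor such as `title` may return `null`, check the value before passing it on.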

Most of the DOM should be covered using these methods:

- several [meta-tags](https://phpscraper.de/examples/scrape-meta-tags.html) and other [`<head>`-information](https://phpscraper.de/examples/scrape-header-tags.html)
- [Social-Media information](https://phpscraper.de/examples/scrape-social-media-meta-tags.html) like Twitter Card and Facebook Open Graph
- Content: [Headings](https://phpscraper.de/examples/headings.html), [Outline](https://phpscraper.de/examples/outline.html), [Texts](https://phpscraper.de/examples/paragraphs.html) and [Lists](https://phpscraper.de/examples/lists.html)
- [Images](https://phpscraper.de/examples/scrape-images.html)
- [Links](https://phpscraper.de/examples/scrape-links.html)
- [Keywords](https://phpscraper.de/examples/extract-keywords.html)

**A full list of methods with example code can be found on [phpscraper.de](https://phpscraper.de). Further examples are in the [tests](https://github.com/spekulatius/PHPScraper/tree/master/tests).**

### Download Files

Besides processing the content on the page itself, you can download files using `fetchAsset`. You then only need to write the returned content to a file or cloud storage.

### Process the RSS feeds, `sitemap.xml`, etc.

PHPScraper can assist in collecting feeds such as [RSS feeds, `sitemap.xml`-entries and static search indexes](https://phpscraper.de/examples/scrape-feeds.html). This can be useful when deciding on the next page to crawl or building up a list of pages on a website.

For example, a sitemap can be processed into a set of [`FeedEntry`-DTOs](https://github.com/spekulatius/PHPScraper/blob/master/src/DataTransferObjects/FeedEntry.php). Whenever post-processing is applied, you can fall back to the underlying `*Raw`-methods.

### Process CSV-, XML- and JSON files and URLs

PHPScraper comes out of the box with file / URL processing methods for CSV-, XML- and JSON:

- `parseJson`
- `parseXml`
- `parseCsv`
- `parseCsvWithHeader` (generates an associative array using the first row)

Each method can process strings as well as URLs. Additional CSV parsing parameters such as separator, enclosure and escape are possible.

### There is more!

There are plenty of examples on the [PHPScraper website](https://phpscraper.de) and in the [tests](https://github.com/spekulatius/PHPScraper/tree/master/tests).

Check the [`playground.php`](https://github.com/spekulatius/PHPScraper/blob/master/playground.php) if you prefer learning by doing.

:muscle: Roadmap
----------------

The future development is organized into [milestones](https://github.com/spekulatius/PHPScraper/milestones?direction=asc&sort=title). Releases follow [semver](https://semver.org/).

### v1: [Building the first stable version](https://github.com/spekulatius/PHPScraper/milestone/4?closed=1)

- Improve documentation and examples.
- Organize code better (move websites into separate repos, etc.)
- Add support for feeds and some typical file types.

### v2: Service Upgrade:

- Switch from Goutte to [Symfony BrowserKit](https://symfony.com/doc/current/components/browser_kit.html). Goutte has been archived.

### v3: [Expand the functionality and cover more 'types'](https://github.com/spekulatius/PHPScraper/milestone/5)

- Expand to parse a wider range of types, elements, embeds, etc.
- Improve performance with caching and concurrent fetching of assets
- Minor improvements for parsing methods

### v4: [Expand to provide more guidance on building custom scrapers on top of PHPScraper](https://github.com/spekulatius/PHPScraper/milestone/6)

TBC.

:heart_eyes: Sponsors
---------------------

With your support, PHPScraper can become the *PHP swiss army knife for the web*. If you find PHPScraper useful to your work, please consider a [sponsorship](https://github.com/sponsors/spekulatius) or [donation](https://www.buymeacoffee.com/spekulatius). Thank you :muscle:

:gear: Configuration (optional)
-------------------------------

If needed, you can use the following configuration options:

### User Agent

You can set the browser agent using `setConfig`. It defaults to `Mozilla/5.0 (compatible; PHP Scraper/1.x; +https://phpscraper.de)`.

### Proxy Support

You can configure proxy support with `setConfig`.

### Timeout

You can set the `timeout` using `setConfig`. Setting the timeout to zero will disable it.

### Disabling SSL

While not recommended, it might be required to disable SSL checks. This is also done through `setConfig`.

You can call `setConfig` multiple times. It stores the config and merges it with previous settings. Keep this in mind in the unlikely case that you need to unset a value.

:rocket: Installation with Composer
-----------------------------------

After the installation, the package will be picked up by the Composer autoloader. If you are using a common PHP application or framework such as Laravel or Symfony, you can start scraping now :rocket:

If not, or if you are building a standalone scraper, please include the autoloader from `vendor/` at the top of your file.

You can now use any of the examples on the documentation website or from the [`tests/`-folder](https://github.com/spekulatius/PHPScraper/tree/master/tests).

Please consider supporting PHPScraper with a star or [sponsorship](https://github.com/sponsors/spekulatius). Thank you :muscle:

:white_check_mark: Testing
--------------------------

The library comes with a PHPUnit test suite, which is run from the project folder.

You can find the tests [here](https://github.com/spekulatius/PHPScraper/tree/master/tests). The test pages are [publicly available](https://github.com/spekulatius/phpscraper-test-pages).

## MISC: [Issues](https://github.com/spekulatius/PHPScraper/issues), [Ideas](https://github.com/spekulatius/PHPScraper/milestones), [Contributing](https://github.com/spekulatius/PHPScraper/blob/master/CONTRIBUTING.md), [CHANGELOG](https://github.com/spekulatius/PHPScraper/blob/master/CHANGELOG.md), [UPGRADING](https://github.com/spekulatius/PHPScraper/blob/master/UPGRADING.md), [LICENSE](https://github.com/spekulatius/PHPScraper/blob/master/LICENSE.md)
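
As a sketch of the optional configuration described in the README, the snippet below sets a timeout through `setConfig` before fetching a page. Only the `timeout` option is named in the text. The array-style argument, the `go()` call and the placeholder URL are assumptions based on the project's documentation.

```php
<?php

require_once __DIR__ . '/vendor/autoload.php';

$web = new \Spekulatius\PHPScraper\PHPScraper();

// Assumption: setConfig() takes an associative array of options.
// A timeout of 0 would disable the timeout, as noted in the README.
$web->setConfig(['timeout' => 15]);

// Fetch a page with the configured timeout (placeholder URL).
$web->go('https://phpscraper.de');

// The title may be null if the page has no <title> element.
echo $web->title, PHP_EOL;
```

Since repeated `setConfig` calls are merged with earlier settings, further options can be added later without repeating the timeout.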

All versions of phpscraper with dependencies

Version 3.0.0 Release 09. Apr 2024

Requires:
php: ^8.1
ext-intl: *
symfony/dom-crawler: ^5.4 || ^6.0 || ^7.0
donatello-za/rake-php-plus: ^1.0.15
league/uri: ^7.0
symfony/browser-kit: ^6.0 || ^7.0
symfony/http-client: ^6.0 || ^7.0
symfony/css-selector: ^6.0 || ^7.0

