Download the PHP package layered/url-preview without Composer
On this page you can find all versions of the php package layered/url-preview. It is possible to download/install these versions without Composer. Possible dependencies are resolved automatically.
Package url-preview
Short Description Get detailed info for any URL on the internet! Scraper for HTML, OpenGraph, Schema data
License MIT
Information about the package url-preview
Page Meta 🕵
Page Meta is a PHP library that can retrieve detailed info on any URL from the internet! It uses data from HTML meta tags and OpenGraph, with a fallback to detailed HTML scraping.
Highlights
- Works for any valid URL on the internet!
- Follows page redirects
- Uses all scraping methods available: HTML tags, OpenGraph, Schema data
Potential use cases
- Display Info Cards for links in an article
- Rich preview for links in messaging apps
- Extract info from a user-submitted URL

How to use
Installation
Add layered/page-meta as a dependency in your project's composer.json file, for example by running composer require layered/page-meta in the project directory.
Usage
Create a UrlPreview instance, then call the loadUrl($url) method with your URL as the first argument. Preview data is retrieved with the get($section) or getAll() methods.
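A minimal usage sketch built only from the method names listed on this page. The Layered\PageMeta namespace and the vendor/autoload.php path are assumptions, so the call into the real library is left as a comment. previewSections() is a hypothetical helper that accepts any object exposing loadUrl(), get() and getAll().

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: load a URL, then collect the full result
 * plus the requested sections.
 */
function previewSections(object $preview, string $url, array $sections = ['site', 'page']): array
{
    // loadUrl() is documented as returning the UrlPreview instance,
    // so the returned object can be used for the follow-up calls.
    $loaded = $preview->loadUrl($url);

    $result = ['all' => $loaded->getAll()];
    foreach ($sections as $section) {
        $result[$section] = $loaded->get($section);
    }

    return $result;
}

// Usage with the real library (namespace and autoload path are assumptions):
// require 'vendor/autoload.php';
// $preview = new \Layered\PageMeta\UrlPreview([]);
// $data = previewSections($preview, 'https://example.com/');
// echo $data['page']['title'] ?? '';
```

Keeping the helper independent of the concrete class makes it easy to exercise with a stub object, without network access.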
Behind the scenes
The library downloads the HTML source of the url you provided, then uses specialized scrapers to extract pieces of information.
Core scrapers can be seen in src/scrapers/, and they extract general info for a page: title, author, description, page type, main image, etc.
If you would like to extract a new field, see Extending the library section.
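To make the scraping idea concrete, here is a rough sketch of the OpenGraph-first, HTML-fallback approach described above. It is not the library's scraper code: it uses PHP's built-in DOM extension, works on an HTML string instead of downloading a page, and scrapeBasicMeta() is a hypothetical name.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: extract title, description and image from an HTML
 * string, preferring OpenGraph tags and falling back to plain HTML tags.
 */
function scrapeBasicMeta(string $html): array
{
    $empty = ['title' => null, 'description' => null, 'image' => null];
    if (trim($html) === '') {
        return $empty; // DOMDocument::loadHTML() rejects empty input
    }

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Content of the first <meta> tag whose attribute matches the given value.
    $meta = function (string $attr, string $value) use ($xpath): ?string {
        $nodes = $xpath->query(sprintf('//meta[@%s="%s"]/@content', $attr, $value));
        return $nodes->length > 0 ? trim($nodes->item(0)->nodeValue) : null;
    };

    $titleNodes = $xpath->query('//title');
    $htmlTitle = $titleNodes->length > 0 ? trim($titleNodes->item(0)->textContent) : null;

    return [
        'title'       => $meta('property', 'og:title') ?? $htmlTitle,
        'description' => $meta('property', 'og:description') ?? $meta('name', 'description'),
        'image'       => $meta('property', 'og:image'),
    ];
}
```

Each field tries the richer source first and only falls back when it is missing, which mirrors the layered approach the library describes.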
User Agent or extra headers can make a big difference when downloading HTML from a website. There are some websites that forbid scraping and hide the content when they detect a tool like this one. Make sure to read their dev docs & TOS.
The default User Agent is blocked on sites like Twitter, Instagram, Facebook and others. A workaround is to use this one (thanks for the tip PVGrad):
'HTTP_USER_AGENT' => 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'
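As a sketch of how such a header could be prepared for the constructor, the hypothetical helper below merges the User Agent shown above with caller-supplied overrides. The HTTP_ACCEPT_LANGUAGE key in the usage comment is an assumption about the header naming scheme, as is the Layered\PageMeta namespace.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: build the headers array passed to UrlPreview,
 * letting callers override or extend the default User Agent.
 */
function buildPreviewHeaders(array $overrides = []): array
{
    $defaults = [
        'HTTP_USER_AGENT' => 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
    ];

    // Later keys win, so overrides replace the defaults.
    return array_merge($defaults, $overrides);
}

// Usage with the real library (namespace and extra header key are assumptions):
// $preview = new \Layered\PageMeta\UrlPreview(
//     buildPreviewHeaders(['HTTP_ACCEPT_LANGUAGE' => 'en'])
// );
```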
Returned data
Returned data is an Array with four sections: site, page, author and app_links.
See UrlPreview::getAll() for info on each returned field.
Public API
UrlPreview class provides the following public methods:
__construct(array $headers): UrlPreview
Start the UrlPreview instance. Pass extra headers to send when requesting the page URL
loadUrl(string $url): UrlPreview
Load and start the scrape process for any valid URL
getAll(): array
Get all data scraped from page
Return: Array with scraped data in following format:
- site - info about the website
  - url - main site URL
  - name - site name, ex: 'Instagram' or 'Medium'
  - secure - Boolean true|false depending on whether the connection uses HTTPS
  - responsive - Boolean true|false. True if the site has a viewport meta tag present. Basic check for responsiveness
  - icon - site icon
  - language - ISO 639-1 language code, ex: en, es
- page - info about the page at the current URL
  - type - page type, ex: website, article, profile, video, etc
  - url - canonical URL for the page
  - title - page title
  - description - page description
  - image - Array containing image info, if present:
    - url - image URL
    - width - image width
    - height - image height
  - video - Array containing video info, if found on page:
    - url - video URL
    - width - video width
    - height - video height
- author - info about the content author, ex:
  - name - Author's name on a blog, person's name on social network sites
  - handle - Social media site username
  - url - Author URL for more articles or Profile URL on social network sites
- app_links - Array containing apps linked to page, like:
  - ios - iOS app
    - url - link for in-app action, ex: 'nflx://www.netflix.com/title/80014749'
    - app_store_id - Apple AppStore app ID
    - app_name - name of the app
    - store_url - link to installable app
  - android - Android app
    - url - link for in-app action, ex: 'nflx://www.netflix.com/title/80014749'
    - package - Android PlayStore app ID
    - app_name - name of the app
    - store_url - link to installable app
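Because several fields are only present when found on the page, reading nested values defensively is useful. The sketch below uses an illustrative sample array (keys follow the format above, all values are placeholders, and how the library represents missing fields is an assumption) together with a hypothetical previewValue() helper that walks a dot-separated path.

```php
<?php
declare(strict_types=1);

// Illustrative sample only: keys follow the documented format, values are placeholders.
$sample = [
    'site' => [
        'url'        => 'https://example.com',
        'name'       => 'Example',
        'secure'     => true,
        'responsive' => true,
        'icon'       => 'https://example.com/favicon.ico',
        'language'   => 'en',
    ],
    'page' => [
        'type'        => 'article',
        'url'         => 'https://example.com/post',
        'title'       => 'Example post',
        'description' => 'Placeholder description',
        'image'       => ['url' => 'https://example.com/cover.jpg', 'width' => 1200, 'height' => 630],
    ],
    'author'    => ['name' => 'Jane Doe', 'handle' => 'janedoe', 'url' => 'https://example.com/jane'],
    'app_links' => [],
];

/**
 * Hypothetical helper: read a nested value by dot path, returning
 * $default when any key along the path is missing.
 */
function previewValue(array $data, string $path, $default = null)
{
    $current = $data;
    foreach (explode('.', $path) as $key) {
        if (!is_array($current) || !array_key_exists($key, $current)) {
            return $default;
        }
        $current = $current[$key];
    }

    return $current;
}

// Example reads: an existing field, and a missing one with a fallback.
$imageWidth = previewValue($sample, 'page.image.width');
$iosLink    = previewValue($sample, 'app_links.ios.url', 'none');
```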
get(string $section): array
Get data for one scraped section: site, page, profile or app_links
Return: Array with section scraped data. See UrlPreview::getAll() for data format
addListener(string $eventName, callable $listener, int $priority = 0): UrlPreview
Attach an event on UrlPreview for data processing or scrape process. Arguments:
- $eventName - on which event to listen. Available:
  - page.scrape - fired when the scraping process starts
  - data.filter - fired when data is requested by getData() or getAll() methods
- $listener - a callable reference, which will get the $event parameter with available data
- $priority - order on which the callable should be executed
Extending the library
If more scraped data is needed for a URL, extra functionality can be attached to the PageMeta library (see addListener above). One example is returning the 'Terms and Conditions' link from pages.
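A hedged sketch of the extraction step for such an extension, using PHP's built-in DOM extension. findTermsLink() is a hypothetical helper, and the wiring into addListener() is left as a comment because the accessors available on $event are not documented on this page.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: return the href of the first link whose
 * visible text mentions "terms", or null when none is found.
 */
function findTermsLink(string $html): ?string
{
    if (trim($html) === '') {
        return null; // DOMDocument::loadHTML() rejects empty input
    }

    $doc = new DOMDocument();
    // Collect parser warnings silently; scraped HTML is often invalid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('a') as $link) {
        $text = strtolower(trim($link->textContent));
        if (strpos($text, 'terms') !== false && $link->hasAttribute('href')) {
            return $link->getAttribute('href');
        }
    }

    return null;
}

// Possible wiring (sketch only, the $event accessors are not documented here):
// $preview->addListener('page.scrape', function ($event) {
//     // obtain the page HTML from $event, call findTermsLink(),
//     // then store the result alongside the other scraped data
// });
```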
More
Please report any issues here on GitHub.
Any contributions are welcome