Download the PHP package wikimedia/deadlinkchecker without Composer
On this page you can find all versions of the php package wikimedia/deadlinkchecker. It is possible to download/install these versions without Composer. Possible dependencies are resolved automatically.
Package: deadlinkchecker
Short Description: Library for checking if a given link is dead or alive
License: GPL-3.0-or-later
Homepage: https://www.mediawiki.org/wiki/DeadlinkChecker
Information about the package deadlinkchecker
DeadlinkChecker
Maintainer: Cyberpower678
REQUIRES: PHP 7.3 or higher
This is a PHP library for detecting whether URLs on the internet are alive or dead via cURL. It includes the following features:
- Supports HTTP, HTTPS, FTP, MMS, and RTSP URLs
- Supports TOR
- Supports internationalized domain names
- Basic detection for soft 404s
- For optimized performance, it initially performs a header-only page request (CURLOPT_NOBODY). If that request fails, it falls back to a normal full-body page request.
- Concurrently checks batches of URLs for efficiency
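The header-only-then-full-body strategy from the feature list can be sketched in plain PHP. This is a minimal illustration, not the library's implementation: probeUrl, curlRequest, and the rule that a status of 400 or above counts as a failed header-only request are all assumptions made for the example.

```php
<?php
declare(strict_types=1);

/**
 * Try a cheap header-only request first; fall back to a full request on failure.
 *
 * $request is a hypothetical callable: fn(string $url, bool $headerOnly): ?int
 * returning the HTTP status code, or null if the transfer itself failed.
 */
function probeUrl(string $url, callable $request): ?int
{
    $status = $request($url, true);            // header-only attempt
    if ($status !== null && $status < 400) {
        return $status;                        // header-only request was enough
    }
    return $request($url, false);              // retry with a full-body request
}

/** One possible $request implementation built on PHP's cURL extension. */
function curlRequest(string $url, bool $headerOnly): ?int
{
    $ch = curl_init($url);
    if ($ch === false) {
        return null;                           // handle could not be created
    }
    curl_setopt_array($ch, [
        CURLOPT_NOBODY         => $headerOnly, // true: do not download the body
        CURLOPT_RETURNTRANSFER => true,        // do not echo the response
        CURLOPT_FOLLOWLOCATION => true,        // follow redirects
    ]);
    if (curl_exec($ch) === false) {
        return null;                           // transfer-level error
    }
    return (int) curl_getinfo($ch, CURLINFO_RESPONSE_CODE);
}
```

A call such as probeUrl('https://example.org', 'curlRequest') would issue at most two requests. Passing the request function in as a callable keeps the fallback logic testable without network access.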
Overview
The checkIfDead library is a PHP library designed for assessing the status of URLs on the web and dark web. It takes one or more URLs as input and checks them concurrently to reduce the overall response time.
It can handle both properly and improperly formatted URLs and performs basic sanity checking and error correction on malformed inputs. All inputs are normalized through the sanitizer to ensure the curl library communicates properly with the target.
When left at its defaults, the library emulates a web browser request and follows redirects to their destination.
Installation
Using Composer: add wikimedia/deadlinkchecker to the require section of the composer.json file for your project, and then run 'composer update'.
Or using git: clone the library's source repository into your project.
Basic Usage
The library can check either a single link or an array of links.
Note that these functions will return null if they are unable to determine whether a link is alive or dead.
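Because null signals an undetermined result, callers need to compare strictly rather than rely on truthiness (null and false are both falsy in PHP). The sketch below groups a batch result into three buckets. The input shape, an array mapping each URL to true (dead), false (alive), or null (unknown), is an assumption for illustration, and groupResults is a hypothetical helper, not part of the library.

```php
<?php
declare(strict_types=1);

/**
 * Group a batch result into dead, alive, and undetermined URLs.
 * Assumed input shape: [url => true (dead) | false (alive) | null (unknown)].
 */
function groupResults(array $results): array
{
    $groups = ['dead' => [], 'alive' => [], 'unknown' => []];
    foreach ($results as $url => $isDead) {
        if ($isDead === null) {
            $groups['unknown'][] = $url;   // the checker could not decide
        } elseif ($isDead === true) {
            $groups['dead'][] = $url;
        } else {
            $groups['alive'][] = $url;
        }
    }
    return $groups;
}
```

URLs in the unknown bucket are natural candidates for a later retry rather than for being reported as dead.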
Advanced Usage
You can control how long it takes before page requests time out by passing parameters to the constructor. For example, the header-only page requests can be given a 10-second timeout and the full page requests a 20-second timeout.
In addition to controlling query timeouts, you can pass a custom user agent to the library.
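At the cURL level, separate timeouts and a custom user agent come down to a few options per request. The following is a minimal sketch of that mapping, assuming PHP's cURL extension is loaded; buildCurlOptions, its parameters, and the 'ExampleBot/1.0' user agent are illustrative names and values, not the library's constructor signature.

```php
<?php
declare(strict_types=1);

/**
 * Build cURL options for one request.
 * Function and parameter names are illustrative, not the library's API.
 */
function buildCurlOptions(
    bool $headerOnly,
    int $headerTimeout,
    int $fullTimeout,
    string $userAgent
): array {
    return [
        CURLOPT_NOBODY         => $headerOnly,
        // Timeout in seconds, chosen by request type
        CURLOPT_TIMEOUT        => $headerOnly ? $headerTimeout : $fullTimeout,
        CURLOPT_USERAGENT      => $userAgent,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_RETURNTRANSFER => true,
    ];
}

// 10 seconds for header-only requests, 20 seconds for full page requests
$head = buildCurlOptions(true, 10, 20, 'ExampleBot/1.0');
$full = buildCurlOptions(false, 10, 20, 'ExampleBot/1.0');
```

Either array can be applied to a handle with curl_setopt_array().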
By default, multiple URLs on the same domain are queued sequentially to be respectful to the hosts. However, this can be disabled so that all URLs are queried concurrently.
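One way to picture per-domain sequential queuing is to split a batch into rounds in which each host appears at most once; the URLs within a round can then be fetched concurrently. How the library schedules its requests internally is not described here, so this is only a sketch of the idea, and planRounds is a hypothetical helper.

```php
<?php
declare(strict_types=1);

/**
 * Split URLs into rounds so that no round contains two URLs of the same host.
 */
function planRounds(array $urls): array
{
    $nextRound = [];   // host => index of the next free round for that host
    $rounds = [];
    foreach ($urls as $url) {
        $host = strtolower((string) parse_url($url, PHP_URL_HOST));
        $round = $nextRound[$host] ?? 0;
        $rounds[$round][] = $url;
        $nextRound[$host] = $round + 1;
    }
    return $rounds;
}

$rounds = planRounds([
    'https://a.example/1',
    'https://a.example/2',
    'https://b.example/1',
]);
// Round 0 holds one URL per host; round 1 holds the second a.example URL.
```

Disabling the per-domain queuing corresponds to putting every URL into a single round.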
You can increase the verbosity of the output to follow what the library is doing as it's doing it.
Finally, because the library supports TOR requests, the environment will need a working SOCKS5 proxy to make the requests. The library looks for the SOCKS5 proxy using system defaults, but the proxy can be specified manually.
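For orientation, routing a request through a SOCKS5 proxy with PHP's cURL extension takes two options. The sketch below applies them only to .onion hosts. It assumes a Tor daemon listening on 127.0.0.1:9050 (Tor's default SOCKS port); proxyOptionsFor is a hypothetical helper and says nothing about how the library itself locates or configures the proxy.

```php
<?php
declare(strict_types=1);

/**
 * Return extra cURL options for a URL, routing .onion hosts through SOCKS5.
 * The default proxy address is an assumption; adjust it to your environment.
 */
function proxyOptionsFor(
    string $url,
    string $proxyHost = '127.0.0.1',
    int $proxyPort = 9050
): array {
    $host = strtolower((string) parse_url($url, PHP_URL_HOST));
    if (substr($host, -6) !== '.onion') {
        return [];   // clearnet URL: no proxy options needed
    }
    return [
        CURLOPT_PROXY     => $proxyHost . ':' . $proxyPort,
        // SOCKS5_HOSTNAME makes the proxy resolve the .onion name itself
        CURLOPT_PROXYTYPE => CURLPROXY_SOCKS5_HOSTNAME,
    ];
}
```

Letting the proxy resolve the hostname matters here because .onion names cannot be resolved through ordinary DNS.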
Getting details about the last batch of URLs checked
After a batch of URLs has been checked, you can use $deadLinkChecker->getErrors() to get the curl errors encountered during the process, and $deadLinkChecker->getRequestDetails() to get the curl request details of all URLs checked in the last batch.
Other functions
The library's sanitizer can also be used on its own to clean up dirty URLs and normalize them so that they work correctly with varying HTTP clients. Its $stripFragment parameter is false by default; when set to true, URL fragments are dropped.
Because PHP has a tendency to fail parsing URLs containing UTF-8 characters, you can use the library's parseURL method.
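To illustrate the underlying problem, one common workaround is to percent-encode all non-ASCII bytes before calling PHP's built-in parse_url(), so the parser only ever sees ASCII. This is a sketch of that workaround, not the library's parseURL method; parseUtf8Url is a hypothetical name, and the returned parts stay percent-encoded.

```php
<?php
declare(strict_types=1);

/**
 * Percent-encode non-ASCII bytes, then hand the ASCII-only string to parse_url().
 * Returns the URL parts (still percent-encoded), or false if parsing fails.
 */
function parseUtf8Url(string $url)
{
    $ascii = preg_replace_callback(
        '/[^\x00-\x7F]+/',                       // any run of non-ASCII bytes
        static function (array $m): string {
            return rawurlencode($m[0]);
        },
        $url
    ) ?? $url;                                   // keep the input on regex error
    return parse_url($ascii);
}

// "\u{00E9}" is the UTF-8 encoded character e-acute
$parts = parseUtf8Url("https://example.org/wiki/Caf\u{00E9}?lang=fr");
```

This sketch treats every component the same way. A non-ASCII host name would normally need IDN conversion (for example with idn_to_ascii() from the intl extension) rather than percent-encoding.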
License
This code is distributed under GNU GPLv3+
All versions of deadlinkchecker with dependencies
ext-intl: version *
lib-curl: version >=7.43.0
php: version >=7.3.0