Download the PHP package edulazaro/urlnormalizer without Composer
On this page you can find all versions of the php package edulazaro/urlnormalizer. It is possible to download/install these versions without Composer. Possible dependencies are resolved automatically.
Package urlnormalizer
Short Description: A package to normalize and filter URLs
Information about the package urlnormalizer
URL Normalizer
Introduction
URL Normalizer normalizes URLs. A normalized URL is one that has been rewritten into a standard form according to a set of rules. The purpose of URL normalization (or URL canonicalization) is to transform a URL into that normalized, or canonical, form, so that URLs which are equivalent but written differently are treated as equal by web servers and applications.
Normalization reduces duplication where multiple URLs point to the same content but are written differently.
For example, the URLs http://edulazaro.com?a=1&b=2 and http://edulazaro.com?b=2&a=1 are equivalent, and both can be normalized to http://edulazaro.com?a=1&b=2.
Likewise, http://edulazaro.com/ and http://edulazaro.com are equivalent, since the trailing slash makes no difference.
The same applies to dot segments such as /../ or /./, and to percent-encoded unreserved characters such as %61, which can be written as a.
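The rules described above can be illustrated with a minimal sketch in plain PHP. This is not the package's implementation: the function names sketchNormalizeUrl and sketchDecodeUnreserved are hypothetical, and the sketch assumes that the fragment and user info are dropped, that empty path segments are collapsed, and that the trailing slash is always removed.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: decodes percent-encoded unreserved characters
// (letters, digits, "-", ".", "_", "~") and uppercases the hex digits
// of every other escape.
function sketchDecodeUnreserved(string $value): string
{
    return (string) preg_replace_callback(
        '/%([0-9A-Fa-f]{2})/',
        static function (array $match): string {
            $char = chr((int) hexdec($match[1]));
            return preg_match('/^[A-Za-z0-9._~-]$/', $char) === 1
                ? $char
                : '%' . strtoupper($match[1]);
        },
        $value
    );
}

// Hypothetical function: an illustration of the rules, not the package's code.
function sketchNormalizeUrl(string $url): string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        throw new InvalidArgumentException("Not an absolute URL: $url");
    }

    // Scheme and host are case-insensitive, so lowercase them.
    $scheme = strtolower($parts['scheme']);
    $host = strtolower($parts['host']);
    $port = isset($parts['port']) ? ':' . $parts['port'] : '';

    // Resolve "." and ".." segments; empty segments and the trailing
    // slash are dropped (an assumption of this sketch).
    $segments = [];
    foreach (explode('/', sketchDecodeUnreserved($parts['path'] ?? '')) as $segment) {
        if ($segment === '..') {
            array_pop($segments);
        } elseif ($segment !== '.' && $segment !== '') {
            $segments[] = $segment;
        }
    }
    $path = $segments === [] ? '' : '/' . implode('/', $segments);

    // Sort query parameters so that their order no longer matters.
    $query = '';
    if (isset($parts['query']) && $parts['query'] !== '') {
        $pairs = explode('&', sketchDecodeUnreserved($parts['query']));
        sort($pairs, SORT_STRING);
        $query = '?' . implode('&', $pairs);
    }

    return $scheme . '://' . $host . $port . $path . $query;
}

echo sketchNormalizeUrl('http://edulazaro.com?b=2&a=1'), PHP_EOL;
echo sketchNormalizeUrl('http://edulazaro.com/x/../%61/./b'), PHP_EOL;
```

Sorting the raw key=value pairs as strings is the simplest way to make parameter order irrelevant; a fuller implementation would also need to decide how to treat repeated keys and empty values.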
Installation
To install the package, run this command: composer require edulazaro/urlnormalizer
Usage
Import the class URLNormalizer and call its normalize method.
You can also get the top-level domain of a URL with the getURLTopLevelDomain method.
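What such a lookup involves can be sketched in plain PHP. The source does not say exactly what getURLTopLevelDomain returns, so this sketch assumes it means the last two labels of the host (for example edulazaro.com for www.edulazaro.com). The function name sketchTopLevelDomain is hypothetical, and the two-label heuristic is deliberately naive: it gives the wrong answer for multi-label public suffixes such as co.uk and for IP addresses.

```php
<?php
declare(strict_types=1);

// Hypothetical function, not part of the package: returns the last two
// labels of the URL's host, or null when the URL has no host.
function sketchTopLevelDomain(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null;
    }

    // Lowercase the host and drop a trailing root dot before splitting.
    $labels = explode('.', strtolower(rtrim($host, '.')));

    // Assumption: the "top-level domain" is the final two labels.
    return implode('.', array_slice($labels, -2));
}

echo sketchTopLevelDomain('http://www.edulazaro.com/page?a=1'), PHP_EOL;
```

A robust version would consult the Public Suffix List rather than counting labels.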
The class URLCounter is also included, so you can count the number of unique normalized URLs in an array. You can also count them per top-level domain.
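The counting idea can be sketched independently of the package. The functions sketchCountUnique and sketchCountUniquePerDomain below are hypothetical and do not reflect URLCounter's actual interface; they take the normalizer and the domain lookup as callables, and the example uses toy stand-ins for both (lowercasing plus trailing-slash removal, and the raw host).

```php
<?php
declare(strict_types=1);

// Hypothetical: counts distinct URLs after normalization.
function sketchCountUnique(array $urls, callable $normalize): int
{
    return count(array_unique(array_map($normalize, $urls)));
}

// Hypothetical: counts distinct normalized URLs, grouped by domain.
// Returns an array of domain => count.
function sketchCountUniquePerDomain(array $urls, callable $normalize, callable $domainOf): array
{
    $seen = [];
    foreach ($urls as $url) {
        $normalized = $normalize($url);
        // Using the normalized URL as a key deduplicates it within its domain.
        $seen[$domainOf($normalized)][$normalized] = true;
    }

    // With a single input array, array_map keeps the domain keys.
    return array_map('count', $seen);
}

// Toy stand-ins (assumptions for this example only).
$toyNormalize = static fn (string $url): string => rtrim(strtolower($url), '/');
$hostOf = static fn (string $url): string => (string) parse_url($url, PHP_URL_HOST);

$urls = ['http://edulazaro.com/', 'http://edulazaro.com', 'http://example.org/a'];

echo sketchCountUnique($urls, $toyNormalize), PHP_EOL;
print_r(sketchCountUniquePerDomain($urls, $toyNormalize, $hostOf));
```

Passing the normalizer in as a callable keeps the counting logic separate from the normalization rules, so the same functions work with whichever normalizer is in use.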
Testing
To test the package, run composer test.