Download the PHP package onnov/detect-encoding without Composer
On this page you can find all versions of the php package onnov/detect-encoding. It is possible to download/install these versions without Composer. Possible dependencies are resolved automatically.
Package: detect-encoding
Short description: Text encoding detection class, intended as a replacement for mb_detect_encoding. Detects: utf-8, windows-1251, koi8-r, iso-8859-5, ibm866, .....
License: MIT
Homepage: https://github.com/onnov/detect-encoding
Information about the package detect-encoding
Detect encoding
A text encoding detection class based on the ranges of character codes used by each code page.
As of PHP 7.x, the built-in mb_detect_encoding function still does not work reliably, so the problem has to be solved some other way.
This class is one such solution.
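The idea of range-based detection can be illustrated with a minimal sketch. This is not the library's implementation: the function name `guessCyrillicEncoding`, the range table, and the 3:1 weighting of lowercase over uppercase letters are assumptions made for illustration. Only the byte ranges themselves (the positions of А-Я and а-я in each code page) are standard facts about these encodings.

```php
<?php
// Byte ranges of the basic Cyrillic letters in each code page.
// (Ё/ё sit outside these ranges and are ignored in this sketch.)
const CYRILLIC_RANGES = [
    'windows-1251' => ['upper' => [0xC0, 0xDF], 'lower' => [0xE0, 0xFF]],
    'koi8-r'       => ['upper' => [0xE0, 0xFF], 'lower' => [0xC0, 0xDF]],
    'iso-8859-5'   => ['upper' => [0xB0, 0xCF], 'lower' => [0xD0, 0xEF]],
];

/**
 * Hypothetical helper: returns the most likely encoding of $text,
 * or null when no byte falls into any known letter range.
 */
function guessCyrillicEncoding(string $text): ?string
{
    // An empty pattern with the /u flag only matches valid UTF-8 input.
    if (preg_match('//u', $text) === 1) {
        return 'utf-8';
    }

    // Mode 1 returns [byteValue => occurrences] for bytes that occur.
    $histogram = count_chars($text, 1);

    $best = null;
    $bestScore = 0;
    foreach (CYRILLIC_RANGES as $name => $ranges) {
        $score = 0;
        foreach ($histogram as $byte => $count) {
            if ($byte >= $ranges['lower'][0] && $byte <= $ranges['lower'][1]) {
                // Assumed weight: lowercase letters dominate ordinary text.
                $score += 3 * $count;
            } elseif ($byte >= $ranges['upper'][0] && $byte <= $ranges['upper'][1]) {
                $score += $count;
            }
        }
        if ($score > $bestScore) {
            $bestScore = $score;
            $best = $name;
        }
    }

    return $best;
}

// "Привет" encoded as windows-1251 and as koi8-r.
var_dump(guessCyrillicEncoding("\xCF\xF0\xE8\xE2\xE5\xF2"));
var_dump(guessCyrillicEncoding("\xF0\xD2\xC9\xD7\xC5\xD4"));
```

Because the letter ranges of the code pages overlap, very short inputs give the scorer little to separate them, which is consistent with the accuracy figures reported for 5-letter texts.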
Built-in encodings and detection accuracy (%), by number of letters in the text:
letters -> | 5 | 15 | 30 | 60 | 120 | 180 | 270 |
---|---|---|---|---|---|---|---|
windows-1251 | 99.13 | 98.83 | 98.54 | 99.04 | 99.73 | 99.93 | 100.0 |
koi8-r | 99.89 | 99.98 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 |
iso-8859-5 | 81.79 | 99.27 | 99.98 | 100.0 | 100.0 | 100.0 | 100.0 |
ibm866 | 99.81 | 99.99 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 |
MacCyrillic | 12.79 | 47.49 | 73.48 | 92.15 | 99.30 | 99.94 | 100.0 |
MacCyrillic has the worst accuracy: at least 60 letters are needed to detect it with 92.15% accuracy. Windows-1251 is also less accurate than the remaining encodings. This is because the character codes of these two encodings overlap heavily in the code tables.
Fortunately, MacCyrillic and ibm866 are not used to encode web pages. They are disabled in the script by default, but you can enable them if necessary. With both disabled, the accuracy is:
letters -> | 5 | 10 | 15 | 30 | 60 |
---|---|---|---|---|---|
windows-1251 | 99.40 | 99.69 | 99.86 | 99.97 | 100.0 |
koi8-r | 99.89 | 99.98 | 99.98 | 100.0 | 100.0 |
iso-8859-5 | 81.79 | 96.41 | 99.27 | 99.98 | 100.0 |
Detection accuracy is high even for short phrases of 5 to 10 letters, and for phrases of 60 letters or more it reaches 100%.
Detection is also very fast: for example, a text of more than 1,300,000 Cyrillic characters is checked in 0.00096 seconds (on the author's computer).
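The source does not say how this timing was taken. One way to take a comparable measurement on your own machine is sketched here; `timeCall` is a hypothetical helper, and the workload is a stand-in (a single byte-histogram pass), not the library's detector, so the numbers it prints are not comparable to the figure quoted above until you substitute the real call.

```php
<?php
/**
 * Hypothetical helper: runs $fn once and returns elapsed seconds,
 * using the monotonic high-resolution timer (PHP 7.3+).
 */
function timeCall(callable $fn): float
{
    $start = hrtime(true);
    $fn();
    return (hrtime(true) - $start) / 1e9;
}

// About 1.3 million bytes of windows-1251 text ("Привет " repeated).
$sample = str_repeat("\xCF\xF0\xE8\xE2\xE5\xF2 ", 190000);

$seconds = timeCall(function () use ($sample) {
    // Stand-in workload: replace with the detection call you want to time.
    count_chars($sample, 1);
});

printf("%d bytes scanned in %.5f s\n", strlen($sample), $seconds);
```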
Link to the idea: http://patttern.blogspot.com/2012/07/php-python.html
Installation
Composer (recommended)
Use Composer to install this library from Packagist: onnov/detect-encoding
Run the following command from your project directory to add the dependency:
composer require onnov/detect-encoding
Alternatively, add the dependency directly to your composer.json file:
The classes in the project are structured according to the PSR-4 standard, so you can also use your own autoloader or require the needed files directly in your code.
Usage
- Detecting the encoding of a text:
- Method for converting text of an unknown encoding into a given encoding (utf-8 by default), with optional parameters:
- Method to enable detection of an encoding:
- Method to disable detection of an encoding:
- Method for adding a custom encoding:
- Method to get a custom encoding range:
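The conversion step in the list above can be sketched with PHP's built-in iconv function, which this package lists as a dependency (ext-iconv). The function name `convertToEncoding` and its parameters are assumptions for illustration and are not taken from the library's API; the source encoding is passed in explicitly here, whereas the library detects it first.

```php
<?php
/**
 * Hypothetical helper: converts $text from a known source encoding
 * to $to (utf-8 by default).
 *
 * $extra is an optional iconv suffix such as '//TRANSLIT' or '//IGNORE';
 * it is empty by default because not every iconv build supports suffixes.
 */
function convertToEncoding(
    string $text,
    string $from,
    string $to = 'utf-8',
    string $extra = ''
): string {
    $converted = iconv($from, $to . $extra, $text);
    if ($converted === false) {
        throw new RuntimeException("Cannot convert from {$from} to {$to}");
    }
    return $converted;
}

// "Привет" stored as windows-1251 bytes.
$cp1251 = "\xCF\xF0\xE8\xE2\xE5\xF2";
echo convertToEncoding($cp1251, 'windows-1251'), "\n"; // prints: Привет
```

Throwing on failure rather than returning false keeps a failed conversion from silently propagating as an empty or boolean value.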
Tests and examples for the project
Usage with Symfony
Add to the services.yaml file:
All versions of detect-encoding with dependencies
ext-iconv Version *