Download the PHP package nitotm/efficient-language-detector without Composer

On this page you can find all versions of the php package nitotm/efficient-language-detector. It is possible to download/install these versions without Composer. Possible dependencies are resolved automatically.

FAQ

After the download, you have to make one include require_once('vendor/autoload.php');. After that you have to import the classes with use statements.

Example:
If you use only one package a project is not needed. But if you use more then one package, without a project it is not possible to import the classes with use statements.

In general, it is recommended to use always a project to download your libraries. In an application normally there is more than one library needed.
Some PHP packages are not free to download and because of that hosted in private repositories. In this case some credentials are needed to access such packages. Please use the auth.json textarea to insert credentials, if a package is coming from a private repository. You can look here for more information.

  • Some hosting areas are not accessible by a terminal or SSH. Then it is not possible to use Composer.
  • To use Composer is sometimes complicated. Especially for beginners.
  • Composer needs much resources. Sometimes they are not available on a simple webspace.
  • If you are using private repositories you don't need to share your credentials. You can set up everything on our site and then you provide a simple download link to your team member.
  • Simplify your Composer build process. Use our own command line tool to download the vendor folder as binary. This makes your build process faster and you don't need to expose your credentials for private repositories.
Please rate this library. Is it a good library?

Informations about the package efficient-language-detector

Efficient Language Detector

![supported PHP versions](https://img.shields.io/badge/PHP-%3E%3D%207.4-blue) [![license](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://www.apache.org/licenses/LICENSE-2.0) [![supported languages](https://img.shields.io/badge/supported%20languages-60-brightgreen.svg)](#languages)

Efficient language detector (Nito-ELD or ELD) is a fast and accurate natural language detection software, written in PHP, with a speed comparable to existent fast C++ compiled detectors, and accuracy within the range of the heaviest and slowest detectors.

It has no dependencies, 100% PHP, easy installation, all it's needed is PHP with the mb extension.
ELD is also available in Javascript and Python.

  1. Installation
  2. How to use
  3. Benchmarks
  4. Testing
  5. Languages

Installation

Alternatively, download / clone the files will work just fine.

How to use?

detect() expects a UTF-8 string, returns an object, with a value (ISO 639-1 code or null) named language

If needed, we can get the current status of eld: languages, database type and subset

Benchmarks

I compared ELD with a different variety of detectors, since the interesting part is the algorithm.

URL Version Language
https://github.com/nitotm/efficient-language-detector/ 1.0.0 PHP
https://github.com/pemistahl/lingua-py 1.3.2 Python
https://github.com/CLD2Owners/cld2 Aug 21, 2015 C++
https://github.com/google/cld3 Aug 28, 2020 C++
https://github.com/wooorm/franc 6.1.0 Javascript
https://github.com/patrickschur/language-detection 5.2.0 PHP

Benchmarks: Tweets: 760KB, short sentences of 140 chars max.; Big test: 10MB, sentences in all 60 languages supported; Sentences: 8MB, this is the Lingua sentences test, minus unsupported languages.
Short sentences is what ELD and most detectors focus on, as very short text is unreliable, but I included the Lingua Word pairs 1.5MB, and Single words 880KB tests to see how they all compare beyond their reliable limits.

These are the results, first, execution time and then accuracy.

time table accuracy table

1. Lingua could have a small advantage as it participates with 54 languages, 6 less.
2. CLD2 and CLD3, return a list of languages, the ones not included in this test where discarded, but usually they return one language, I believe they have a disadvantage. Also, I confirm the results of CLD2 for short text are correct, contrary to the test on the Lingua page, they did not use the parameter "bestEffort = True", their benchmark for CLD2 is unfair.

Lingua is the average accuracy winner, but at what cost, the same test that in ELD or CLD2 lasts 2 seconds, in Lingua takes more than 5 hours! It acts like a brute-force software. Also, its lead comes from single and pair words, which are unreliable regardless.

I added ELD-L for comparison, which has a 2.3x bigger database, but only increases execution time marginally, a testament to the efficiency of the algorithm. ELD-L is not the main database as it does not improve language detection in sentences.

Here is the average, per benchmark, of Tweets, Big test & Sentences.

Sentences tests average

Testing

Languages

These are the ISO 639-1 codes of the 60 supported languages for Nito-ELD v1

am, ar, az, be, bg, bn, ca, cs, da, de, el, en, es, et, eu, fa, fi, fr, gu, he, hi, hr, hu, hy, is, it, ja, ka, kn, ko, ku, lo, lt, lv, ml, mr, ms, nl, no, or, pa, pl, pt, ro, ru, sk, sl, sq, sr, sv, ta, te, th, tl, tr, uk, ur, vi, yo, zh

Full name languages:

Amharic, Arabic, Azerbaijani (Latin), Belarusian, Bulgarian, Bengali, Catalan, Czech, Danish, German, Greek, English, Spanish, Estonian, Basque, Persian, Finnish, French, Gujarati, Hebrew, Hindi, Croatian, Hungarian, Armenian, Icelandic, Italian, Japanese, Georgian, Kannada, Korean, Kurdish (Arabic), Lao, Lithuanian, Latvian, Malayalam, Marathi, Malay (Latin), Dutch, Norwegian, Oriya, Punjabi, Polish, Portuguese, Romanian, Russian, Slovak, Slovene, Albanian, Serbian (Cyrillic), Swedish, Tamil, Telugu, Thai, Tagalog, Turkish, Ukrainian, Urdu, Vietnamese, Yoruba, Chinese

Future improvements

Donate / Hire
If you wish to Donate for open source improvements, Hire me for private modifications / upgrades, or to Contact me, use the following link: https://linktr.ee/nitotm


All versions of efficient-language-detector with dependencies

PHP Build Version
Package Version
Requires php Version ^7.4 || ^8.0
ext-mbstring Version *
Composer command for our command line client (download client) This client runs in each environment. You don't need a specific PHP version etc. The first 20 API calls are free. Standard composer command

The package nitotm/efficient-language-detector contains the following files

Loading the files please wait ....