LanguageDetector
LanguageDetector is a PHP library that detects the language of a text
string.
Features
- More than 50 supported languages, including Klingon
- Very fast, no database needed
- Packaged with a 2MB dataset
- Learning steps are already done, library is ready to use
- Small code, small footprint
- N-grams algorithm
- Supports PHP 5.4+, 7+ and 8+ and HHVM
The latest release 1.4.x only supports PHP>=7.4
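To make the "N-grams algorithm" item more concrete, here is a minimal sketch of how a character n-gram frequency profile can be built from a text. It is an illustration only and is not taken from the library's source: the function name `ngramProfile` and its normalization steps are assumptions.

```php
<?php
// Hypothetical helper, NOT part of the LanguageDetector API.
// Counts how often each character n-gram occurs in a text.
function ngramProfile(string $text, int $n = 3): array
{
    // Simplified normalization (assumption): ASCII-only lowercasing
    // and whitespace runs collapsed to a single space.
    $text = strtolower(trim($text));
    $text = preg_replace('/\s+/', ' ', $text);

    // Split into UTF-8 characters so multi-byte letters stay intact.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        return []; // the input was not valid UTF-8
    }

    $profile = [];
    for ($i = 0, $last = count($chars) - $n; $i <= $last; $i++) {
        $gram = implode('', array_slice($chars, $i, $n));
        $profile[$gram] = ($profile[$gram] ?? 0) + 1;
    }

    // Most frequent n-grams first, keys preserved.
    arsort($profile);

    return $profile;
}

$profile = ngramProfile('the cat and the hat');
print_r(array_slice($profile, 0, 3, true));
```

A detector built on this idea would compare such a profile against profiles learned for each language in advance. That matches the "learning steps are already done" item, but the exact comparison the library uses is not described here.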
Install
composer require landrok/language-detector
Quick usage
Detect language
Instantiate a detector, pass it a text, and get the detected language.
require_once 'vendor/autoload.php';
$text = 'My tailor is rich and Alison is in the kitchen with Bob.';
$detector = new LanguageDetector\LanguageDetector();
$language = $detector->evaluate($text)->getLanguage();
echo $language; // Prints something like 'en'
Once it's instantiated, you can evaluate multiple texts.
require_once 'vendor/autoload.php';
// An array of texts to evaluate
$texts = [
'My tailor is rich and Alison is in the kitchen with Bob.',
'Mon tailleur est riche et Alison est dans la cuisine avec Bob'
];
$detector = new LanguageDetector\LanguageDetector();
foreach ($texts as $key => $text) {
$language = $detector->evaluate($text)->getLanguage();
echo sprintf(
"Text %d, language=%s\n",
$key,
$language
);
}
This would output something like:
Text 0, language=en
Text 1, language=fr
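Because one detector can be reused across texts, a natural next step is to group texts by their detected language. The sketch below is a minimal illustration: `groupByLanguage` is a hypothetical helper, and a stub callable stands in for the detector so the snippet runs without the library installed.

```php
<?php
// Hypothetical helper, NOT part of the LanguageDetector API.
// $detect is any callable that maps a text to a language code.
function groupByLanguage(array $texts, callable $detect): array
{
    $groups = [];
    foreach ($texts as $text) {
        $groups[$detect($text)][] = $text;
    }

    return $groups;
}

// Stub detector (assumption): a crude stand-in for a real evaluation.
$stub = function (string $text): string {
    return strpos($text, ' est ') !== false ? 'fr' : 'en';
};

$groups = groupByLanguage([
    'My tailor is rich and Alison is in the kitchen with Bob.',
    'Mon tailleur est riche et Alison est dans la cuisine avec Bob',
], $stub);

print_r(array_keys($groups));
```

With the library installed, the stub could be replaced by a closure that calls `$detector->evaluate($text)->getLanguage()` on a single shared detector.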
Additionally, you can use a LanguageDetector instance as a string.
require_once 'vendor/autoload.php';
$text = 'My tailor is rich and Alison is in the kitchen with Bob.';
$detector = new LanguageDetector\LanguageDetector();
echo $detector->evaluate($text); // Prints something like 'en'
echo $detector; // Prints something like 'en' after an evaluate()
API Methods
evaluate()
Type \LanguageDetector\LanguageDetector
Performs an evaluation on a given text.
Example
After an evaluate() call, the result is stored and available for later use.
$detector->evaluate('My tailor is rich and Alison is in the kitchen with Bob.');
// Then you have access to the detected language
$detector->getLanguage(); // Returns 'en'
You can also make a one-line call.
$detector->evaluate('My tailor is rich and Alison is in the kitchen with Bob.')
->getLanguage(); // Returns 'en'
It's possible to print the evaluate() output directly.
// Prints 'en'
echo $detector->evaluate('My tailor is rich and Alison is in the kitchen with Bob.');
getLanguage()
Type string
The detected language.
Example
$detector->getLanguage(); // Returns 'en'
getLanguages()
Type array
A list of loaded models that will be evaluated.
Example
$detector->getLanguages(); // Returns something like ['de', 'en', 'fr']
getScores()
Type array
A list of scores by language, for all evaluated languages.
Example
$detector->getScores();
// Returns something like
Array
(
[en] => 0.43950135722745
[nl] => 0.40898789832569
[...]
[ja] => 0
[fa] => 0
)
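The scores array can be post-processed to judge how clear-cut a detection is. The sketch below ranks languages and measures the gap between the top two candidates. It assumes that a higher score means a better match, as the sample output suggests. `topLanguages` and `scoreMargin` are hypothetical helpers, and the input is sample data shaped like the output above rather than a live result.

```php
<?php
// Sample data shaped like the getScores() output (language code => score).
$scores = [
    'en' => 0.43950135722745,
    'nl' => 0.40898789832569,
    'ja' => 0,
    'fa' => 0,
];

// Hypothetical helper: keep the $limit best-scoring languages.
function topLanguages(array $scores, int $limit = 3): array
{
    arsort($scores); // highest score first, keys preserved

    return array_slice($scores, 0, $limit, true);
}

// Hypothetical helper: gap between the best and second-best score.
function scoreMargin(array $scores): float
{
    $values = array_values($scores);
    rsort($values);

    return (float) (($values[0] ?? 0) - ($values[1] ?? 0));
}

$top = topLanguages($scores, 2);
$margin = scoreMargin($scores);

printf(
    "Best guess: %s (margin over runner-up: %.4f)\n",
    array_key_first($top),
    $margin
);
```

A small margin, as between 'en' and 'nl' here, may be a hint to treat the result with caution, for example when the text is very short.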
getSupportedLanguages()
Type array
A list of supported languages that will be evaluated.
Example
$detector->getSupportedLanguages();
// Returns something like
Array
(
[0] => af
[1] => ar
[...]
[51] => zh-cn
[52] => zh-tw
)
getText()
Type string
Returns the last string that was evaluated.
Example
$detector->getText();
// Returns 'My tailor is rich and Alison is in the kitchen with Bob.'
Options
Type \LanguageDetector\LanguageDetector
For even better performance, you can explicitly specify which models to load.
Example
$text = 'My tailor is rich and Alison is in the kitchen with Bob.';
$detector = new LanguageDetector\LanguageDetector(null, ['en', 'fr', 'de']);
$language = $detector->evaluate($text);
echo $language; // Prints something like 'en'
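When the language subset comes from user input or configuration, it may be worth checking it against the supported codes before building the detector. The sketch below shows one way to do that. `filterLanguages` is a hypothetical helper, and the `$supported` array is a shortened stand-in for what getSupportedLanguages() returns.

```php
<?php
// Hypothetical helper, NOT part of the LanguageDetector API.
// Keeps only the requested codes that appear in the supported list.
function filterLanguages(array $requested, array $supported): array
{
    return array_values(array_intersect($requested, $supported));
}

// Shortened stand-in (assumption) for $detector->getSupportedLanguages().
$supported = ['af', 'ar', 'de', 'en', 'fr', 'zh-cn', 'zh-tw'];

// 'xx' is not a supported code, so it is dropped.
$subset = filterLanguages(['en', 'fr', 'xx'], $supported);

print_r($subset);
```

The filtered array could then be passed as the second constructor argument, as in the example above.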
For one-liners only
Type \LanguageDetector\LanguageDetector
With a static call to the detect() method, you can evaluate a given text
in one line.
Example
echo LanguageDetector\LanguageDetector::detect(
'My tailor is rich and Alison is in the kitchen with Bob.'
); // Prints 'en'
All API methods remain available on the returned instance.
$detector = LanguageDetector\LanguageDetector::detect(
'My tailor is rich and Alison is in the kitchen with Bob.'
);
// en
echo $detector;
// en
echo $detector->getLanguage();
// An array of all scores, see API method
print_r($detector->getScores());
// An array of all supported languages, see API method
print_r($detector->getSupportedLanguages());
// The last evaluated string
echo $detector->getText();
// Limit loaded languages for even better performance
echo LanguageDetector\LanguageDetector::detect(
'My tailor is rich and Alison is in the kitchen with Bob.',
['en', 'de', 'fr', 'es']
); // en