Download the PHP package yooper/php-text-analysis without Composer
On this page you can find all versions of the php package yooper/php-text-analysis. It is possible to download/install these versions without Composer. Possible dependencies are resolved automatically.
Package: php-text-analysis
Short Description: PHP Text Analysis is a library for performing Information Retrieval (IR) and Natural Language Processing (NLP) tasks using the PHP language.
License: MIT
Information about the package php-text-analysis
php-text-analysis
PHP Text Analysis is a library for performing Information Retrieval (IR) and Natural Language Processing (NLP) tasks using the PHP language. There are tools in this library that can perform:
- document classification
- sentiment analysis
- document comparison
- frequency analysis
- tokenization
- stemming
- collocations with Pointwise Mutual Information
- lexical diversity
- corpus analysis
- text summarization
All the documentation for this project can be found in the book and wiki.
PHP Text Analysis Book & Wiki
A book is in the works and your contributions are needed. You can find the book at https://github.com/yooper/php-text-analysis-book
Documentation for the library also resides in the wiki: https://github.com/yooper/php-text-analysis/wiki
Installation Instructions
Add PHP Text Analysis to your project with Composer: composer require yooper/php-text-analysis
Tokenization
To use a different tokenizer, pass in the name of the tokenizer class.
The default tokenizer is \TextAnalysis\Tokenizers\GeneralTokenizer::class. Some tokenizers require parameters to be set upon instantiation.
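The class-name pattern described here can be sketched in plain PHP. The interface, the two tokenizer classes, and the `sketch_tokenize` helper below are hypothetical stand-ins written for this illustration, not the library's own classes or functions. Only the idea of choosing a tokenizer by class name, with a default, is taken from the text.

```php
<?php
// Hypothetical stand-ins; these are NOT the library's tokenizer classes.
interface SketchTokenizer
{
    public function tokenize(string $text): array;
}

// Splits on runs of whitespace and keeps punctuation attached to words.
class SketchWhitespaceTokenizer implements SketchTokenizer
{
    public function tokenize(string $text): array
    {
        return preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    }
}

// Keeps only runs of letters and digits, dropping punctuation.
class SketchWordTokenizer implements SketchTokenizer
{
    public function tokenize(string $text): array
    {
        preg_match_all('/[\p{L}\p{N}]+/u', $text, $matches);
        return $matches[0];
    }
}

// The caller picks a tokenizer by class name; a default is used otherwise.
function sketch_tokenize(string $text, string $tokenizerClass = SketchWhitespaceTokenizer::class): array
{
    $tokenizer = new $tokenizerClass();
    return $tokenizer->tokenize($text);
}

$text = 'Hello, world! PHP rocks.';
$default = sketch_tokenize($text);
$wordsOnly = sketch_tokenize($text, SketchWordTokenizer::class);
```

This sketch instantiates the class with no arguments. A tokenizer that requires constructor parameters would need to be built by the caller instead.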
Normalization
By default, normalize_tokens uses the function strtolower to lowercase all the tokens. To customize the normalize function, pass in either a function or a string to be used by array_map.
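Because the normalizer is described as being handed to array_map, its effect can be previewed with array_map alone. The following is a minimal plain-PHP sketch that does not call the library. The sample tokens and the punctuation-trimming closure are assumptions made for illustration.

```php
<?php
$tokens = ['PHP', 'Text', 'ANALYSIS'];

// Default behaviour described in the text: lowercase every token.
$lowercased = array_map('strtolower', $tokens);

// A function name given as a string is a valid array_map callback.
$uppercased = array_map('strtoupper', $tokens);

// A closure works too; this one lowercases and trims surrounding punctuation.
$cleaned = array_map(
    function (string $token): string {
        return trim(strtolower($token), '.,!?;:');
    },
    ['Hello,', 'World!']
);
```

For multibyte text, a string such as 'mb_strtolower' is usually a safer choice than 'strtolower'.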
Frequency Distributions
The call to freq_dist returns a FreqDist instance.
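The text does not list the methods of the FreqDist instance, so none are guessed here. As a rough plain-PHP sketch of what a frequency distribution over tokens contains, the built-in array functions are enough. The sample tokens and variable names are assumptions for illustration.

```php
<?php
$tokens = ['the', 'cat', 'sat', 'on', 'the', 'mat', 'the', 'end'];

// Count how often each distinct token occurs.
$counts = array_count_values($tokens);

// Sort by descending count, keeping the token => count association.
arsort($counts);

$total = count($tokens);               // number of tokens overall
$unique = count($counts);              // number of distinct tokens
$topToken = array_key_first($counts);  // most frequent token

// Hapaxes are tokens that occur exactly once.
$hapaxes = array_keys(array_filter($counts, function (int $n): bool {
    return $n === 1;
}));
```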
Ngram Generation
By default, bigrams are generated.
The ngrams can be customized, for example by changing their size.
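To make the idea concrete, here is a minimal plain-PHP sketch of n-gram generation. The `sketch_ngrams` function, its parameter names, and its separator argument are hypothetical and written for this illustration; they are not the library's API. The default size of 2 mirrors the bigram default mentioned in the text.

```php
<?php
// Hypothetical helper, not the library's function: joins each window of
// $size consecutive tokens with $separator.
function sketch_ngrams(array $tokens, int $size = 2, string $separator = ' '): array
{
    if ($size < 1) {
        throw new InvalidArgumentException('Size must be at least 1.');
    }

    $tokens = array_values($tokens); // re-index so slicing is by position
    $ngrams = [];
    for ($i = 0; $i + $size <= count($tokens); $i++) {
        $ngrams[] = implode($separator, array_slice($tokens, $i, $size));
    }
    return $ngrams;
}

$tokens = ['one', 'two', 'three', 'four'];
$bigrams = sketch_ngrams($tokens);          // size defaults to 2
$trigrams = sketch_ngrams($tokens, 3);
$piped = sketch_ngrams($tokens, 2, '|');
```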
Stemming
By default, the stem method uses the Porter Stemmer.
To use a different stemmer, pass in the name of the stemmer class.
Keyword Extraction with Rake
There is a shortcut method for using the Rake algorithm. You will need to clean your data before using it. The second parameter is the ngram size of the keywords to extract.
Sentiment Analysis with Vader
Need sentiment analysis with PHP? Use Vader (https://github.com/cjhutto/vaderSentiment). The PHP implementation can be invoked easily. Just normalize your data beforehand.
Document Classification with Naive Bayes
Need to do some document classification with PHP? Try using the Naive Bayes implementation. An example of classifying movie reviews can be found in the unit tests.
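For readers unfamiliar with the technique, the following is a minimal sketch of a multinomial Naive Bayes classifier with add-one smoothing in plain PHP. The `SketchNaiveBayes` class, its method names, and the tiny review-style training data are hypothetical. This is not the library's implementation, and the scores it returns are log-probabilities rather than whatever the library reports.

```php
<?php
// Minimal multinomial Naive Bayes with add-one (Laplace) smoothing.
// Hypothetical class for illustration; not the library's implementation.
class SketchNaiveBayes
{
    private array $docCounts = [];   // label => number of training documents
    private array $tokenCounts = []; // label => [token => count]
    private array $vocabulary = [];  // token => true

    public function train(string $label, array $tokens): void
    {
        $this->docCounts[$label] = ($this->docCounts[$label] ?? 0) + 1;
        foreach ($tokens as $token) {
            $this->tokenCounts[$label][$token] = ($this->tokenCounts[$label][$token] ?? 0) + 1;
            $this->vocabulary[$token] = true;
        }
    }

    // Returns label => log-probability score, best label first.
    public function predict(array $tokens): array
    {
        $totalDocs = array_sum($this->docCounts);
        $vocabSize = count($this->vocabulary);
        $scores = [];
        foreach ($this->docCounts as $label => $docCount) {
            $labelTokens = $this->tokenCounts[$label] ?? [];
            $labelTotal = array_sum($labelTokens);
            $score = log($docCount / $totalDocs); // class prior
            foreach ($tokens as $token) {
                $count = $labelTokens[$token] ?? 0;
                // Smoothed likelihood of the token given the label.
                $score += log(($count + 1) / ($labelTotal + $vocabSize));
            }
            $scores[$label] = $score;
        }
        arsort($scores);
        return $scores;
    }
}

$classifier = new SketchNaiveBayes();
$classifier->train('positive', ['great', 'fun', 'movie']);
$classifier->train('negative', ['boring', 'dull', 'movie']);

$scores = $classifier->predict(['fun', 'movie']);
$bestLabel = array_key_first($scores);
```

Tokens should be normalized the same way for training and prediction, otherwise 'Fun' and 'fun' are counted as different words.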
All versions of php-text-analysis with dependencies
yooper/stop-words Version ~1
symfony/console Version >= 4.4
wamania/php-stemmer Version ^1.0 || ^2.0 || ^3.0
yooper/nicknames Version ~1