Download the PHP package codeinc/query-tokens-extractor without Composer
On this page you can find all versions of the php package codeinc/query-tokens-extractor. It is possible to download/install these versions without Composer. Possible dependencies are resolved automatically.
Package: query-tokens-extractor
Short description: Extract tokens from a query
License: MIT
Homepage: https://github.com/codeinchq/query-tokens-extractor
Information about the package query-tokens-extractor
Query tokens extractor
Extracts tokens from a query using regex-defined token types. The library is written in PHP 8.2.
Installation
The package is available on Packagist and can be installed using Composer by running: composer require codeinc/query-tokens-extractor
Usage
The above example will generate the following output:
Token types
Available token types
WordType: extracts words from the query
YearType: extracts years from the query
FrenchPhoneNumberType: extracts French phone numbers from the query
FrenchPostalCodeType: extracts French postal codes from the query
HashtagType: extracts hashtags from the query
RegexTokenType: extracts tokens from the query using a regex
Token type priority
The token type priority is determined by the order in which the token types are passed to the QueryTokensExtractor constructor: a type passed earlier has a higher priority than one passed later.
The priority determines the order in which the token types are tried against the query. The higher the priority, the sooner the token will be extracted.
⚠️ The WordType should always be passed last, as it will match any string.
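The effect of the ordering can be illustrated with a small standalone sketch. It does not use the library and is not its implementation: the function sketchExtractTokens() and the three patterns are hypothetical, chosen only to show that the first pattern that matches wins, and why a catch-all word pattern placed first swallows every other token.

```php
<?php
declare(strict_types=1);

/**
 * Standalone sketch (hypothetical, not the library's code): tries each
 * pattern in array order at the start of the remaining query and keeps
 * the first one that matches.
 *
 * @param array<string, string> $patterns type name => regex anchored with ^
 * @return list<array{type: string, value: string}>
 */
function sketchExtractTokens(string $query, array $patterns): array
{
    $tokens = [];
    $rest = ltrim($query);
    while ($rest !== '') {
        $matched = false;
        foreach ($patterns as $type => $regex) {
            if (preg_match($regex, $rest, $m) === 1 && $m[0] !== '') {
                // Prefer the named group "value", fall back to the whole match.
                $tokens[] = ['type' => $type, 'value' => $m['value'] ?? $m[0]];
                $rest = ltrim(substr($rest, strlen($m[0])));
                $matched = true;
                break; // earlier patterns take priority over later ones
            }
        }
        if (!$matched) {
            // No pattern matched: drop one byte so the loop always terminates.
            $rest = ltrim(substr($rest, 1));
        }
    }
    return $tokens;
}

// Assumed patterns for illustration only; the catch-all "word" comes last.
$patterns = [
    'year'    => '/^(?<value>(?:19|20)\d{2})\b/',
    'hashtag' => '/^#(?<value>[a-z0-9_]+)/ui',
    'word'    => '/^\S+/u',
];

$tokens = sketchExtractTokens('paris 2024 #travel', $patterns);

// Same patterns with "word" moved to the front: every token becomes a word.
$wordFirst = ['word' => $patterns['word']] + $patterns;
$wordFirstTokens = sketchExtractTokens('paris 2024 #travel', $wordFirst);
```

With the catch-all last, the query yields a word, a year, and a hashtag. With it first, all three pieces are classified as words, which is the situation the warning above is meant to prevent.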
Creating custom token types
Custom token types can be created by instantiating or extending RegexTokenType. The constructor of RegexTokenType takes the following arguments:
string $name: the name of the token type
string $regex: the regex used to extract the token
\Closure $valueFormatter: a closure used to format the extracted value (optional)
The capturing group named value in the regex is used as the extracted value (for instance, the HashtagType type uses the regex '/^#(?<value>.[a-z0-9_]+)/ui'). If no group named value is defined, the whole match is used as the token value.
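The fallback from the named group to the whole match can be shown with plain preg_match() calls. This is a minimal sketch using only PHP's PCRE functions; the second pattern, without a value group, is a hypothetical variant added for comparison.

```php
<?php
declare(strict_types=1);

// Regex quoted above for HashtagType.
$hashtagRegex = '/^#(?<value>.[a-z0-9_]+)/ui';
// Hypothetical variant without a "value" group, for comparison only.
$plainRegex = '/^#[a-z0-9_]+/ui';

preg_match($hashtagRegex, '#php8 rocks', $withGroup);
preg_match($plainRegex, '#php8 rocks', $withoutGroup);

// Use the named group when it exists, otherwise the whole match.
$valueWithGroup = $withGroup['value'] ?? $withGroup[0];          // "php8"
$valueWithoutGroup = $withoutGroup['value'] ?? $withoutGroup[0]; // "#php8"
```

The named group lets the pattern require the leading # while keeping it out of the token value.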
The regex should always start with ^ and should not anchor the end of the string with $, because the query is split into tokens using the preg_replace_callback() function.
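To see why the anchors matter, here is a rough sketch of consuming one token from the front of a query with preg_replace_callback(). The helper sketchConsume() and the postal-code-like patterns are assumptions made for this illustration; the library's internal logic may differ.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: removes the first match of $regex from $query.
 *
 * @return array{0: ?string, 1: string} [matched token or null, remaining query]
 */
function sketchConsume(string $regex, string $query): array
{
    $token = null;
    $rest = preg_replace_callback(
        $regex,
        function (array $m) use (&$token): string {
            $token = $m[0];
            return ''; // cut the matched text out of the query
        },
        $query,
        1 // replace at most one occurrence
    );
    return [$token, trim((string) $rest)];
}

// Anchored at the start only: the leading token is consumed.
[$t1, $rest1] = sketchConsume('/^\d{5}/', '75001 paris');
// Anchored at both ends: fails as soon as anything follows the token.
[$t2, $rest2] = sketchConsume('/^\d{5}$/', '75001 paris');
// Not anchored at all: a token is pulled from the middle of the query.
[$t3, $rest3] = sketchConsume('/\d{5}/', 'paris 75001');
```

Only the first pattern behaves as a front-of-query extractor. The second never matches a multi-token query, and the third extracts tokens out of order.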
Token value formatting
The extracted token value can be formatted using the valueFormatter closure. The closure takes the extracted value as its argument and must return the formatted value.
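As an illustration, the sketch below applies an optional formatter closure to an extracted value. SketchTokenType is a hypothetical stand-in that mirrors the three documented arguments (name, regex, optional formatter); it is not the library's RegexTokenType, and the extractValue() method is an assumption made for this example.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in mirroring the documented constructor arguments.
 * Not the library's class.
 */
final class SketchTokenType
{
    public function __construct(
        public readonly string $name,
        public readonly string $regex,
        private readonly ?\Closure $valueFormatter = null,
    ) {
    }

    /** Returns the (optionally formatted) value if the query starts with a match. */
    public function extractValue(string $query): ?string
    {
        if (preg_match($this->regex, $query, $m) !== 1) {
            return null;
        }
        $value = $m['value'] ?? $m[0];
        if ($this->valueFormatter === null) {
            return $value; // no formatter: return the raw extracted value
        }
        return (string) ($this->valueFormatter)($value);
    }
}

// A formatter that normalizes hashtags to lowercase.
$hashtag = new SketchTokenType(
    'hashtag',
    '/^#(?<value>[a-z0-9_]+)/ui',
    fn (string $value): string => strtolower($value),
);

$formatted = $hashtag->extractValue('#PHP8 rocks');
$raw = (new SketchTokenType('hashtag', '/^#(?<value>[a-z0-9_]+)/ui'))
    ->extractValue('#PHP8 rocks');
```

Here the formatter turns the extracted PHP8 into php8, while the instance without a formatter returns the value unchanged.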
License
This library is published under the MIT license (see the LICENSE file).