Download the PHP package s9e/regexp-builder without Composer
On this page you can find all versions of the PHP package s9e/regexp-builder. These versions can be downloaded and installed without Composer. Dependencies, if any, are resolved automatically.
Package: regexp-builder
Short description: Library that generates regular expressions that match a list of strings.
License: MIT
Homepage: https://github.com/s9e/RegexpBuilder/
Information about the package regexp-builder
s9e\RegexpBuilder is a library that generates a regular expression that matches a given list of strings. It is best suited for efficiently finding a list of keywords inside a text.
In practical terms, given ['foo', 'bar', 'baz'] as input, the library will generate ba[rz]|foo, a regular expression that can match any of the strings foo, bar, or baz. It can generate regular expressions for different regexp engines used in various programming languages such as PHP, JavaScript, and others.
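As a quick sanity check of the claim above, the following self-contained sketch does not use the library at all. It takes the expression quoted in the text and tests it with PHP's preg_match() and preg_match_all(). The delimiters, anchors and word boundaries are additions made here for illustration only.

```php
<?php
// The expression quoted in the text above. The delimiters, anchors and
// non-capturing group are added here for illustration only.
$generated = 'ba[rz]|foo';
$anchored  = '/^(?:' . $generated . ')$/';

// Each of the three input strings matches; an unrelated string does not.
foreach (['foo', 'bar', 'baz', 'bat'] as $string) {
    $verdict = preg_match($anchored, $string) === 1 ? 'match' : 'no match';
    printf("%s: %s\n", $string, $verdict);
}

// Finding the keywords inside a longer text with the same expression.
preg_match_all('/\b(?:' . $generated . ')\b/', 'a foo walks into a bar', $matches);
print_r($matches[0]);
```

The non-capturing group matters once anchors or boundaries are added around the expression: without it, they would bind only to the first and last alternatives.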
Installation
Add s9e/regexp-builder to your Composer dependencies, for example by running composer require s9e/regexp-builder.
Usage
The simplest way to use the library is to obtain a Builder instance from one of the existing factories. The builder's build() method accepts a list of strings as input and returns a regular expression that matches them.
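A minimal sketch of that workflow might look as follows. It assumes the package has been installed with Composer and that the factories live in the s9e\RegexpBuilder\Factory namespace; the namespace and the autoloader path are assumptions rather than something stated on this page. It also assumes that build() returns the expression without delimiters, so they are added by hand.

```php
<?php
// Assumption: the PHP factory is s9e\RegexpBuilder\Factory\PHP
use s9e\RegexpBuilder\Factory\PHP;

// Assumption: the package was installed with Composer in this directory
require __DIR__ . '/vendor/autoload.php';

// Obtain a Builder from the factory, then build a regexp from a list of strings
$builder = PHP::getBuilder();
$regexp  = $builder->build(['foo', 'bar', 'baz']);

// Assumption: the returned expression carries no delimiters of its own
echo '/', $regexp, '/', "\n";
```

If the assumptions hold, the printed expression should be equivalent to the ba[rz]|foo example given earlier.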
Factories
A factory is a static class that creates a Builder instance configured for a specific use case. All of the factories have a static getBuilder() method. Some of them accept optional arguments.
The following factories can be used to generate regular expressions for the corresponding programming language. The Builder instance will generate a regexp using only printable ASCII characters, while other characters will be escaped according to the regexp engine's syntax. The list of factories along with their optional arguments (with their default values) is as follows:
- PHP
  - modifiers: '' - Pattern modifiers used for the regexp, e.g. isu
  - delimiter: '/' - Delimiter(s) used for the regexp, e.g. # or ()
- Java
- JavaScript
  - flags: '' - Flags used for the RegExp object
- RE2
In addition, two factories, RawBytes and RawUTF8, exist. They can be used to generate smaller regexps without any restrictions on the characters used, with bytes and UTF-8 characters as the base unit, respectively. The resulting regexp should be treated as binary and is not recommended for use in human-readable code.
Examples
Create a PHP (PCRE2) regexp
A PHP regexp that matches ☺ (U+263A) or ☹ (U+2639) can be created with or without the u flag.
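To illustrate why the u flag matters, the self-contained sketch below compares two hand-written patterns for these characters. They are not output copied from the library. Without the u modifier, PCRE works on bytes, so each character is a three-byte UTF-8 sequence; with it, each character is a single code point.

```php
<?php
// ☺ (U+263A) and ☹ (U+2639), stored as UTF-8 strings
$strings = ["\u{263A}", "\u{2639}"];

// Byte-oriented pattern (no "u" modifier): the characters are encoded as
// E2 98 BA and E2 98 B9, so only the last byte differs.
$bytePattern = '/^\xE2\x98[\xB9\xBA]$/';

// Code point pattern ("u" modifier): each character is a single unit.
$unicodePattern = '/^[\x{2639}\x{263A}]$/u';

foreach ($strings as $string) {
    printf(
        "%d bytes, byte pattern: %d, unicode pattern: %d\n",
        strlen($string),
        preg_match($bytePattern, $string),
        preg_match($unicodePattern, $string)
    );
}
```

Both patterns use only printable ASCII characters, which is the property described for the language-specific factories above.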
Create a JavaScript regexp
Replacing the PHP factory with the JavaScript factory creates JavaScript regexps instead, again with or without the u flag.
Using meta sequences
User-defined sequences can be used to represent arbitrary expressions in the input strings. A sequence can be composed of one or more characters. The expression it represents must be valid on its own. For example, .* is valid but + is not.
For example, Bash-style wildcards can be emulated by mapping ? to . and * to .*.
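To make the idea concrete, here is a minimal hand-rolled sketch of what such a mapping amounts to for a single input string. It is neither the library's implementation nor its API: the function name compileWithMeta() and its signature are hypothetical, and it assumes a non-empty map in which no sequence is a prefix of another.

```php
<?php
/**
 * Hypothetical helper: turns one input string into a regexp fragment,
 * replacing meta sequences with their expression and quoting the rest.
 *
 * @param array<string, string> $meta Map of meta sequence => expression
 */
function compileWithMeta(string $input, array $meta): string
{
    // Build a pattern that matches any of the meta sequences literally
    $sequences = array_map(
        fn ($sequence) => preg_quote((string) $sequence, '/'),
        array_keys($meta)
    );
    $splitter = '/(' . implode('|', $sequences) . ')/';

    // Split the input around the meta sequences, keeping them in the result
    $flags = PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY;
    $parts = preg_split($splitter, $input, -1, $flags);

    $regexp = '';
    foreach ($parts as $part) {
        // Meta sequences are replaced as-is, everything else is a literal
        $regexp .= $meta[$part] ?? preg_quote($part, '/');
    }

    return $regexp;
}

$meta = ['?' => '.', '*' => '.*'];
echo compileWithMeta('foo?', $meta), "\n";
echo compileWithMeta('b*r', $meta), "\n";
echo compileWithMeta('a.b', $meta), "\n";
```

The last call shows the difference between a literal and a meta sequence: the dot in the input is escaped because it is not in the map, whereas ? and * are replaced by unescaped expressions.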
Similarly, \d in the input can be mapped to \d in the output to emulate a regular expression escape sequence. The two do not have to be identical: we may choose to map * to \d, or \d to [0-9], instead.
Alternatively, the meta property can be set through a promoted constructor parameter.
Using regular expressions in input
As an alternative to meta sequences, it is possible to identify parts of the input that are meant to be interpreted as a regular expression rather than a literal. This is done by passing an array instead of a string literal when building a regexp. The array must contain 0 or more string literals for the literal parts, and 0 or more instances of s9e\RegexpBuilder\Expression for the regular expressions.
In the following example, we build a regexp to be used for URL routing. We want to match the following routes, expressed here as regexps:
/(*:home)
/admin(*:admin_index)
/admin/login(*:admin_login)
/admin/logout(*:admin_logout)
/admin/product(*:admin_product_store)
/admin/product/(\d+)(*:admin_product_show)
/admin/product/(\d+)/edit(*:admin_product_edit)
/shop(*:shop_index)
/shop/product(*:shop_product_index)
/shop/product/(\d+)(*:shop_product_show)
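To show how routes written this way can be consumed, the following self-contained sketch joins the routes listed above into a plain alternation by hand and dispatches on the (*:NAME) marker, which preg_match() exposes under the MARK key. It does not use the library, and a generated regexp would presumably merge common prefixes rather than repeat them. The matchRoute() helper, the ~ delimiter and the branch reset group are choices made here for illustration.

```php
<?php
// Hand-written alternation of the routes listed above.
// (?| ... ) is a branch reset group, so (\d+) is always capture group 1.
$regexp = '~^(?|'
    . '/(*:home)'
    . '|/admin(*:admin_index)'
    . '|/admin/login(*:admin_login)'
    . '|/admin/logout(*:admin_logout)'
    . '|/admin/product(*:admin_product_store)'
    . '|/admin/product/(\d+)(*:admin_product_show)'
    . '|/admin/product/(\d+)/edit(*:admin_product_edit)'
    . '|/shop(*:shop_index)'
    . '|/shop/product(*:shop_product_index)'
    . '|/shop/product/(\d+)(*:shop_product_show)'
    . ')$~D';

/**
 * Hypothetical helper: returns the route name and the captured id, if any,
 * or null if the path matches none of the routes.
 */
function matchRoute(string $regexp, string $path): ?array
{
    if (preg_match($regexp, $path, $m) !== 1) {
        return null;
    }

    // MARK holds the name of the last (*:NAME) on the matching path
    return ['route' => $m['MARK'], 'id' => $m[1] ?? null];
}

var_dump(matchRoute($regexp, '/admin/product/42/edit'));
var_dump(matchRoute($regexp, '/shop'));
var_dump(matchRoute($regexp, '/unknown'));
```

Using a marker instead of one capture group per route keeps the lookup to a single preg_match() call regardless of how many routes are registered.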
All versions of regexp-builder with dependencies
php Version >=8.1