Package php-regex
Short Description PHP Regular Expressions Builder
License LGPL-3.0
Homepage https://github.com/lucleroy/php-regex
FluentRegex
A PHP library with a fluent interface for building regular expressions.
Table of contents
- Introduction
- Requirements
- Installation
- Usage
  - Workflow
  - Literal Characters
  - Character Sets
  - Match any character
  - Anchors
  - Alternation
  - Quantifiers
  - Greedy, Lazy, Possessive Quantifiers
  - Grouping and Capturing
  - Backreferences
  - Atomic grouping
  - Lookahead, Lookbehind
  - Conditionals
  - Case sensitivity
  - Recursion
  - Special Expressions
Introduction
Here is a simple example that creates a regular expression to recognize a PHP hexadecimal number (example: 0x1ff).
This code is equivalent to:
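As a rough sketch in plain PCRE (using PHP's built-in preg_match, not this library), a hand-written pattern for a PHP hexadecimal literal could look like the following. The variable names are illustrative, the anchors are added only to make the check strict, and the exact string the builder generates may differ.

```php
<?php
// Hand-written pattern (an assumption, not the library's actual output):
// a literal "0", then "x" or "X", then one or more hexadecimal digits.
$hexPattern = '/^0[xX][0-9a-fA-F]+$/';

$isHex    = preg_match($hexPattern, '0x1ff'); // well-formed literal
$isNotHex = preg_match($hexPattern, '0x');    // prefix without digits
```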
Requirements
PHP 5.5 or later.
Installation (with Composer)
Add lucleroy/php-regex to the require section of your composer.json file, then run composer update.
Usage
Workflow
Create a Regex object with Regex::create:
Build the regular expression:
Retrieve the PHP Regular Expression string:
By default, the resulting string is delimited with '/'. You can change this character:
The chosen character is automatically escaped:
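To show why the delimiter has to be escaped, here is a minimal sketch with PHP's built-in preg_quote rather than this library. The delimiter '#' and the sample text are arbitrary choices for illustration.

```php
<?php
// Illustration with PHP's built-in preg_quote(), not this library.
$delimiter = '#';
$literal   = 'issue#42';

// The second argument makes preg_quote() escape the delimiter as well,
// so the "#" inside the text cannot terminate the pattern early.
$pattern = $delimiter . preg_quote($literal, $delimiter) . $delimiter;

$matched = preg_match($pattern, 'see issue#42 for details');
```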
Note: when you convert a Regex instance to a string, you get the raw regular expression string. With the preceding example:
Literal Characters
Use Regex::literal to match literal characters. Special characters are automatically escaped:
The expression created by Regex::literal is indivisible: when you put a quantifier next to it, it applies to the whole expression and not only to the last character:
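The difference between quantifying a whole literal and only its last character can be seen in plain PCRE. This sketch uses PHP's built-in functions, not this library, and the patterns are hand-written assumptions rather than the builder's output.

```php
<?php
// preg_quote() escapes the metacharacters in the literal text.
$quoted = preg_quote('1+1=2', '/');
$literalMatches = preg_match('/^' . $quoted . '$/', '1+1=2');

// Quantifying a grouped literal repeats the whole text ...
$wholeRepeated = preg_match('/^(?:ab)+$/', 'abab');

// ... whereas an ungrouped quantifier binds only to the last character.
$lastCharOnly = preg_match('/^ab+$/', 'abab');
```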
Character Sets
Use Regex::chars to match characters in a character set. Use two dots to specify a range of characters.
If you want to match characters that are not in a specified set, use Regex::notChars:
If you need to add special characters to a character set, you can pass an instance of Charset to Regex::chars and Regex::notChars. For example, the following code matches letters and tabs:
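For orientation, these are the kinds of PCRE character classes such sets typically translate to. This is a sketch with hand-written patterns and PHP's built-in preg_match. It does not show the library's two-dot range syntax or its generated output.

```php
<?php
// A positive set with ranges: lowercase letters, digits, underscore.
$inSet = preg_match('/^[a-z0-9_]+$/', 'user_42');

// A negated set: anything except digits.
$noDigits   = preg_match('/^[^0-9]+$/', 'abc');
$hasDigit   = preg_match('/^[^0-9]+$/', 'abc1');

// Letters plus the tab character (\t is interpreted by PCRE).
$lettersAndTabs = preg_match('/^[a-zA-Z\t]+$/', "ab\tcd");
```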
You can use the following methods to match non-printable characters:
Character | ASCII | Method |
---|---|---|
tab | 0x09 | tab |
carriage return | 0x0D | cr |
line feed | 0x0A | lf |
bell | 0x07 | bell |
escape | 0x1B | esc |
form feed | 0x0C | ff |
vertical tab | 0x0B | vtab |
backspace | 0x08 | backspace |
You can use shorthands for common character classes:
Character Class | Method |
---|---|
digit | digit |
word character | wordChar |
whitespace character | whitespace |
not digit | notDigit |
not word character | notWordChar |
not whitespace character | notWhitespace |
In addition, you can pass a base (from 2 to 26) to Charset::digit and Charset::notDigit:
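In plain PCRE, the shorthand classes are written \d, \w, \s and their negations \D, \W, \S. A digit class for another base is simply a narrower range. The sketch below uses hand-written patterns, and the octal class is an assumed equivalent of a base-8 digit, not confirmed library output.

```php
<?php
$digit    = preg_match('/^\d+$/', '2024');
$word     = preg_match('/^\w+$/', 'snake_case1');
$space    = preg_match('/^\s+$/', " \t\n");
$notDigit = preg_match('/^\D+$/', 'abc');

// Assumed equivalent of a base-8 digit: only 0 through 7 are allowed.
$octal    = preg_match('/^[0-7]+$/', '755');
$notOctal = preg_match('/^[0-7]+$/', '758');
```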
You can match control characters (ASCII codes from 1 to 26) with Charset::control:
You can match an ANSI character with Charset::ansi:
You can match a range of ANSI characters with Charset::ansiRange:
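PCRE itself offers \cX for control characters and \xHH for a character given by its hexadecimal code. Whether the methods above emit exactly these escapes is an assumption. The sketch only demonstrates the underlying PCRE syntax.

```php
<?php
// \cA is Control-A, i.e. the character with code 1.
$ctrlA = preg_match('/^\cA$/', "\x01");

// \x41 is the character with hexadecimal code 41 ("A").
$hexA = preg_match('/^\x41$/', 'A');

// A range between two hexadecimal codes ("A" to "Z").
$upper = preg_match('/^[\x41-\x5A]+$/', 'HELLO');
$lower = preg_match('/^[\x41-\x5A]+$/', 'hello');
```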
Finally, Charset provides some methods to work with Unicode characters.
Use Charset::extendedUnicode to match a Unicode grapheme:
Use Charset::unicodeChar to match a specific Unicode code point:
Use Charset::unicodeCharRange to match a range of Unicode code points:
Use Charset::unicode to match a Unicode class or category. For your convenience, a Unicode class with Unicode properties is provided:
Note: all the methods of Charset are also available in Regex:
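The corresponding PCRE constructs are \x{...} for a code point, \p{...} for a property or category, and \X for an extended grapheme cluster. The sketch below uses hand-written patterns with the /u modifier. The "\u{...}" string escapes require PHP 7.0 or later, which is an assumption beyond the requirement stated for the library.

```php
<?php
// The /u modifier makes PCRE treat pattern and subject as UTF-8.
$codePoint = preg_match('/^\x{E9}$/u', "\u{E9}");            // U+00E9

// Range of code points: Greek small letters alpha to omega.
$range = preg_match('/^[\x{3B1}-\x{3C9}]+$/u', "\u{3B1}\u{3B2}\u{3B3}");

// Unicode category Lu (uppercase letter), tested with U+00C9.
$category = preg_match('/^\p{Lu}$/u', "\u{C9}");

// \X matches one grapheme: "e" followed by a combining acute accent.
$grapheme = preg_match('/^\X$/u', "e\u{301}");
```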
Match any character
If you want to match any character, use Regex::anyChar:
Note that the regular expression generated by this method also matches newlines.
If you don't want to match newlines, use Regex::notNewline:
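In raw PCRE the same distinction depends on the /s modifier: the dot matches a newline only when /s is set. This sketch uses hand-written patterns and does not show what the library generates.

```php
<?php
// With /s the dot matches any character, including a newline.
$anyChar = preg_match('/^a.b$/s', "a\nb");

// Without /s the dot refuses to match a newline.
$dotNoNewline = preg_match('/^a.b$/', "a\nb");

// An explicit negated class excludes newlines regardless of modifiers.
$classNoNewline = preg_match('/^a[^\n]b$/s', "a\nb");
```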
Anchors
To match at the start or at the end of the string, use Regex::startOfString and Regex::endOfString.
These methods match only at the very start and end of the string. If you want to match at the start or at the end of a line, use Regex::startOfLine and Regex::endOfLine.
You can match at a word boundary with Regex::wordLimit. To match a position that is not a word boundary, use Regex::notWordLimit:
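The PCRE anchors behind these ideas are \A and \z for the string ends, ^ and $ with the /m modifier for line ends, and \b and \B for word boundaries. The sketch below uses hand-written patterns, and the mapping to the methods above is assumed, not confirmed.

```php
<?php
$subject = "bar\nfoo";

// \A only matches at the very start of the string.
$startOfString = preg_match('/\Afoo/', $subject);

// With /m, ^ also matches right after a newline.
$startOfLine = preg_match('/^foo/m', $subject);

// \b requires a word boundary, \B requires the absence of one.
$wholeWord  = preg_match('/\bcat\b/', 'a cat sat');
$insideWord = preg_match('/\bcat\b/', 'concatenate');
$notLimit   = preg_match('/\Bcat\B/', 'concatenate');
```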
Alternation
Use Regex::alt to create an alternation. There are several ways to provide the choices.
First, you can pass the choices as arguments:
Second, you can pass the number of choices, which are then taken from the preceding expressions:
Finally, you can mark the position of the first choice with Regex::start and call Regex::alt with no argument:
If you want to create an alternation of literals only, you can use Regex::literalAlt:
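In PCRE terms an alternation is a group of branches separated by |. For literal choices, each branch has to be escaped first. The following sketch builds such a pattern by hand with PHP built-ins. The variable names are illustrative and the result is not claimed to be the library's output.

```php
<?php
// A plain alternation inside a non-capturing group.
$pet = preg_match('/^(?:cat|dog|bird)$/', 'dog');

// Literal alternation: escape every choice before joining with "|".
$choices = ['a.b', 'c+d'];
$quoted  = array_map(function ($choice) {
    return preg_quote($choice, '/');
}, $choices);
$literalAlt = '/^(?:' . implode('|', $quoted) . ')$/';

$exact    = preg_match($literalAlt, 'a.b'); // the dot is literal
$wildcard = preg_match($literalAlt, 'axb'); // so "x" is rejected
```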
Quantifiers
Use Regex::optional to match an optional expression:
Use Regex::anyTimes to match any number of consecutive occurrences of the previous expression:
Use Regex::atLeastOne to match one or more occurrences of the previous expression:
Use Regex::atLeast to match a minimum number of occurrences of the previous expression:
Use Regex::between to match a number of occurrences of the previous expression between two bounds:
Use Regex::times to match an exact number of occurrences of the previous expression:
Note: instead of adding the quantifier to the previous expression, you can pass a Regex instance as the last argument of each of these methods.
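These quantifiers presumably map onto the standard PCRE forms ?, *, +, {n,}, {n,m} and {n}. The sketch below exercises each form with hand-written patterns. The pairing with the method names is an assumption based on their descriptions.

```php
<?php
$optional   = preg_match('/^colou?r$/', 'color');  // ?     zero or one
$anyTimes   = preg_match('/^ab*$/', 'a');          // *     zero or more
$atLeastOne = preg_match('/^ab+$/', 'a');          // +     one or more
$atLeast    = preg_match('/^\d{2,}$/', '123');     // {2,}  two or more
$between    = preg_match('/^\d{2,4}$/', '12345');  // {2,4} two to four
$times      = preg_match('/^\d{3}$/', '123');      // {3}   exactly three
```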
Greedy, Lazy, Possessive Quantifiers
In the previous examples, the quantifiers are greedy, which is the default behavior. More precisely, a quantifier can have one of four modes: GREEDY, LAZY, POSSESSIVE, and UNDEFINED. When the regular expression string is generated, a quantifier in UNDEFINED mode is treated as GREEDY. UNDEFINED is the default mode, but you can call Regex::greedy, Regex::lazy, or Regex::possessive on an empty Regex (just after creation) to change the default behavior:
The same methods can be used after a quantifier to change its behavior:
You can also change the behavior of all quantifiers of a group:
In the previous example, notice that the behavior does not apply to the optional quantifier. Use Regex::greedyRecursive, Regex::lazyRecursive, and Regex::possessiveRecursive to apply the behavior recursively:
When applied to a group, all these methods modify the behavior of a quantifier only if it is in UNDEFINED mode. In the example, if the optional quantifier is set to GREEDY, it retains its behavior:
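The practical effect of the three modes is easiest to see in raw PCRE, where a lazy quantifier takes a trailing ? and a possessive one a trailing +. This sketch uses hand-written patterns only and leaves the library's mode handling aside.

```php
<?php
// Greedy: .+ grabs as much as possible, then backtracks to the last ">".
preg_match('/<.+>/', '<a><b>', $greedy);

// Lazy: .+? stops at the first ">" it can.
preg_match('/<.+?>/', '<a><b>', $lazy);

// Possessive: a++ keeps every "a" and never gives one back,
// so the final "a" in the pattern has nothing left to match.
$possessive   = preg_match('/^a++a$/', 'aaa');
$backtracking = preg_match('/^a+a$/', 'aaa');
```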
Grouping and Capturing
By default, when the library needs to create a group, it is not captured. To capture an expression, you must use Regex::capture:
To create a named group, pass a name to Regex::capture:
You can group several expressions with Regex::group. As with Regex::alt, you can specify the expressions to group by using the Regex::start method, by giving the number of expressions to group, or by passing the expression directly (a Regex instance):
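In PCRE, (?:...) groups without capturing, (...) captures by number, and (?<name>...) captures by name. The following sketch shows how each kind appears in the match array. The pattern is hand-written, not produced by the library.

```php
<?php
// One named group, one numbered group, one non-capturing group.
preg_match('/(?<year>\d{4})-(\d{2})-(?:\d{2})/', '2024-05-17', $m);

$year  = $m['year']; // named groups are also available as index 1
$month = $m[2];

// The non-capturing group leaves no entry behind.
$hasThirdGroup = isset($m[3]);
```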
Backreferences
Use Regex::ref to make a backreference:
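A backreference matches the same text that an earlier capturing group matched. As a sketch in raw PCRE (hand-written, independent of the library), this pattern finds a word that is immediately repeated:

```php
<?php
// \1 must match exactly what group 1 captured.
$found = preg_match('/\b(\w+) \1\b/', 'it is is fine', $m);

$repeated = $m[0]; // the doubled word, including the space
$word     = $m[1]; // the captured word itself
```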
Atomic grouping
Use Regex::atomic to make an atomic group:
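An atomic group, written (?>...) in PCRE, discards its backtracking positions once it has matched. The sketch below contrasts it with an ordinary group, using hand-written patterns.

```php
<?php
// Ordinary group: a+ gives back one "a" so that "ab" can still match.
$normal = preg_match('/^(?:a+)ab$/', 'aaab');

// Atomic group: a+ keeps all three "a"s, so "ab" can never match.
$atomic = preg_match('/^(?>a+)ab$/', 'aaab');
```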
Lookahead, Lookbehind
Use Regex::after, Regex::notAfter, Regex::before, and Regex::notBefore:
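PCRE has four lookaround assertions: (?=...) and (?!...) look ahead, (?<=...) and (?<!...) look behind. The sketch below demonstrates each one with hand-written patterns. Which of the four methods above produces which assertion is not asserted here.

```php
<?php
// Positive lookahead: digits that are followed by " USD".
preg_match('/\d+(?= USD)/', '100 USD', $ahead);

// Negative lookahead: "foo" that is not followed by "bar".
$notAhead = preg_match('/\bfoo(?!bar)/', 'foobar');

// Positive lookbehind: digits that are preceded by "$".
preg_match('/(?<=\$)\d+/', '$42', $behind);

// Negative lookbehind: "happy" that is not preceded by "un".
$notBehind = preg_match('/(?<!un)happy/', 'unhappy');
```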
Conditionals
Create a conditional with Regex::cond. This method must be preceded by a condition, an expression to match when the condition is true, and an optional expression to match when the condition is false.
Use Regex::match to check whether a captured group has matched:
Regex::match can also be used outside of a conditional. In this case, the regular expression fails if the captured group did not match:
The other allowed conditions are Regex::after, Regex::notAfter, Regex::before, and Regex::notBefore:
If you want the 'else' expression to match nothing, you can omit it:
If you want the 'then' expression to match nothing, you can use Regex::notCond to invert the condition:
You can also use Regex::nothing:
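In raw PCRE a conditional is written (?(condition)then|else), and the 'else' branch may be omitted. As a hand-written sketch (not the library's output), the pattern below requires a closing parenthesis only when an opening one was captured:

```php
<?php
// Group 1 optionally captures "(". The conditional (?(1)\)) demands
// a ")" only if group 1 took part in the match; there is no else branch.
$pattern = '/^(\()?\d+(?(1)\))$/';

$bare        = preg_match($pattern, '12');
$wrapped     = preg_match($pattern, '(12)');
$missingOpen = preg_match($pattern, '12)');
$missingEnd  = preg_match($pattern, '(12');
```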
Case sensitivity
By default, the regular expression is case sensitive. Use Regex::caseSensitive or Regex::caseInsensitive to change this behavior. Each of these methods accepts an optional boolean argument. If this argument is false, the behavior is inverted: $regex->caseSensitive(false) is equivalent to $regex->caseInsensitive().
These methods change the behavior of the last expression:
When used at the beginning of the Regex, the whole expression is affected:
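PCRE expresses the same two scopes with an inline group (?i:...) for part of a pattern and the /i modifier for all of it. The sketch uses hand-written patterns, and how the library encodes the setting in its output is not shown.

```php
<?php
// Only the "b" is case-insensitive.
$partial      = preg_match('/^a(?i:b)c$/', 'aBc');
$partialFails = preg_match('/^a(?i:b)c$/', 'ABc');

// The /i modifier applies to the whole pattern.
$whole = preg_match('/^abc$/i', 'ABC');
```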
Recursion
Use Regex::matchRecursive to recursively match the whole pattern. This example matches balanced parentheses:
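A hand-written PCRE counterpart is sketched below. Because the pattern is anchored for the test, it recurses into group 1 with (?1) instead of the whole pattern with (?R). That choice is an adaptation for this illustration, not the library's generated expression.

```php
<?php
// Group 1: "(", then any mix of non-parenthesis runs or nested
// occurrences of group 1 itself, then ")".
$balanced = '/^(\((?:[^()]++|(?1))*\))$/';

$nested     = preg_match($balanced, '(a(b)c)');
$unbalanced = preg_match($balanced, '(a(b)c');
```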
Special Expressions
Regex::crlf matches a Carriage Return followed by a Line Feed (Windows line breaks):
Regex::unsignedIntRange matches a nonnegative integer in a given range. The third parameter specifies how leading zeros are handled:
Note that in any case, the number of digits cannot exceed the number of digits of the maximum value.
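To give an idea of what an integer-range expression involves, here is a hand-written PCRE sketch for the range 0 to 255 with leading zeros disallowed. The library's own output and its leading-zero options are not reproduced. The pattern and variable names are assumptions for illustration.

```php
<?php
// Branches: 250-255, 200-249, 100-199, then 0-99 without leading zero.
$byte = '/^(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)$/';

// Every integer in the range must match.
$allInRange = true;
for ($i = 0; $i <= 255; $i++) {
    if (preg_match($byte, (string) $i) !== 1) {
        $allInRange = false;
    }
}

$tooLarge    = preg_match($byte, '256');
$leadingZero = preg_match($byte, '007');
```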