Download the PHP package kuria/parser without Composer
Package: parser
Short description: Character-by-character string parsing library
License: MIT
Information about the package parser
Parser
Character-by-character string parsing library.
Features
- line number tracking (can be disabled for performance)
- supports CR, LF and CRLF line endings
- verbose exceptions
- many methods to navigate and operate the parser
  - forward / backward peeking and seeking
  - forward / backward character consumption
  - state stack
  - character types
  - expectations
Requirements
- PHP 7.1+
Usage
Creating a parser
Create a new parser instance with string input.
The parser begins at the first character.
    $input = 'foo bar baz';
    $parser = new Parser($input);
Parser properties
The parser has several public properties that can be used to inspect its current state:
- $parser->i - current position
- $parser->char - current character (or NULL at the end of input)
- $parser->lastChar - last character (or NULL at the start of input)
- $parser->line - current line (or NULL if line tracking is disabled)
- $parser->end - end of input indicator (TRUE at the end, FALSE otherwise)
- $parser->vars - user-defined variables attached to the current state
Parser method overview
Refer to doc comments of the respective methods for more information.
Also see Character types.
Static methods
- getCharType($char): int - determine character type
- getCharTypeName($charType): string - get human-readable character type name
Instance methods
- getInput(): string - get the input string
- setInput($input): void - replace the input string (this also resets the parser)
- getLength(): int - get length of the input string
- isTrackingLineNumbers(): bool - see if line number tracking is enabled
- type(): int - get type of the current character
- is(...$types): bool - check whether the current character is of one of the specified types
- atNewline(): bool - see if the parser is at the start of a newline sequence
- eat(): ?string - go to the next character and return the current one (returns NULL at the end)
- spit(): ?string - go to the previous character and return the current one (returns NULL at the beginning)
- shift(): ?string - go to the next character and return it (returns NULL at the end)
- unshift(): ?string - go to the previous character and return it (returns NULL at the beginning)
- peek($offset, $absolute = false): ?string - get character at the given offset or absolute position (does not affect state)
- seek($offset, $absolute = false): void - alter current position
- reset(): void - reset states, vars and rewind to the beginning
- rewind(): void - rewind to the beginning
- eatChar($char): ?string - consume specific character and return the next character
- tryEatChar($char): bool - attempt to consume specific character and return success state
- eatType($type): string - consume all characters of the specified type
- eatTypes($typeMap): string - consume all characters of the specified types
- eatWs(): string - consume whitespace, if any
- eatUntil($delimiterMap, $skipDelimiter = true, $allowEnd = false): string - consume all characters until the specified delimiters
- eatUntilEol($skip = true): string - consume all characters until end of line or input
- eatEol(): string - consume end of line sequence
- eatRest(): string - consume remaining characters
- getChunk($start, $end): string - get chunk of the input (does not affect state)
- detectEol(): ?string - find and return the next end of line sequence (does not affect state)
- countStates(): int - get number of stored states
- pushState(): void - store the current state
- revertState(): void - revert to the last stored state and pop it
- popState(): void - pop the last stored state without reverting to it
- clearStates(): void - throw away all stored states
- expectEnd(): void - ensure that the parser is at the end
- expectNotEnd(): void - ensure that the parser is not at the end
- expectChar($expectedChar): void - ensure that the current character matches the expectation
- expectCharType($expectedType): void - ensure that the current character is of the given type
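The difference between eat() / shift() and spit() / unshift() is easy to mix up, so here is a minimal standalone sketch of the semantics described in the list above. MiniCursor is a hypothetical stand-in written for this illustration, not the library's class: it ignores line tracking, vars and exceptions, and its behavior at the very ends of the input is an assumption.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for illustration only; NOT the kuria/parser implementation.
final class MiniCursor
{
    private $input;
    private $length;
    private $i = 0;
    private $states = [];

    public function __construct(string $input)
    {
        $this->input = $input;
        $this->length = strlen($input);
    }

    /** Current character, or null at the end of input. */
    public function char(): ?string
    {
        return $this->i < $this->length ? $this->input[$this->i] : null;
    }

    /** Go to the next character and return the one we were on. */
    public function eat(): ?string
    {
        $current = $this->char();
        if ($current !== null) {
            ++$this->i;
        }
        return $current;
    }

    /** Go to the next character and return it. */
    public function shift(): ?string
    {
        if ($this->i < $this->length) {
            ++$this->i;
        }
        return $this->char();
    }

    /** Go to the previous character and return the one we were on. */
    public function spit(): ?string
    {
        if ($this->i === 0) {
            return null;
        }
        $current = $this->char();
        --$this->i;
        return $current;
    }

    /** Go to the previous character and return it. */
    public function unshift(): ?string
    {
        if ($this->i === 0) {
            return null;
        }
        --$this->i;
        return $this->char();
    }

    /** Look at a character relative to the current position without moving. */
    public function peek(int $offset): ?string
    {
        $pos = $this->i + $offset;
        return ($pos >= 0 && $pos < $this->length) ? $this->input[$pos] : null;
    }

    /** Store the current position. */
    public function pushState(): void
    {
        $this->states[] = $this->i;
    }

    /** Go back to the last stored position and drop it from the stack. */
    public function revertState(): void
    {
        if (!$this->states) {
            throw new LogicException('No stored states');
        }
        $this->i = array_pop($this->states);
    }
}

$c = new MiniCursor('abc');
$c->eat();     // returns 'a', cursor is now on 'b'
$c->shift();   // returns 'c', cursor is now on 'c'
$c->spit();    // returns 'c', cursor is now on 'b'
$c->unshift(); // returns 'a', cursor is now on 'a'
```

The state stack follows the same idea: pushState() before a speculative read, then revertState() to return to where the read started.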
Example INI parser implementation
    /**
     * INI parser (example)
     */
    class IniParser
    {
        /**
         * Parse an INI string
         */
        public function parse(string $string): array
        {
            // create parser
            $parser = new Parser($string);

            // prepare variables
            $data = [];
            $currentSection = null;

            // parse
            while (!$parser->end) {
                // skip whitespace
                $parser->eatWs();
                if ($parser->end) {
                    break;
                }

                // parse the current thing
                if ($parser->char === '[') {
                    // a section
                    $currentSection = $this->parseSection($parser);
                } elseif ($parser->char === ';') {
                    // a comment
                    $this->skipComment($parser);
                } else {
                    // a key=value pair
                    [$key, $value] = $this->parseKeyValue($parser);

                    // add to output
                    if ($currentSection === null) {
                        $data[$key] = $value;
                    } else {
                        $data[$currentSection][$key] = $value;
                    }
                }
            }

            return $data;
        }

        /**
         * Parse a section and return its name
         */
        private function parseSection(Parser $parser): string
        {
            // we should be at the [ character now, eat it
            $parser->eatChar('[');

            // eat everything until ]
            $sectionName = $parser->eatUntil(']');

            return $sectionName;
        }

        /**
         * Skip a commented-out line
         */
        private function skipComment(Parser $parser): void
        {
            // we should be at the ; character now, eat it
            $parser->eatChar(';');

            // eat everything until the end of line
            $parser->eatUntilEol();
        }

        /**
         * Parse a key=value pair
         */
        private function parseKeyValue(Parser $parser): array
        {
            // we should be at the first character of the key
            // eat characters until = is found
            $key = $parser->eatUntil('=');

            // eat everything until the end of line
            // that is our value
            $value = trim($parser->eatUntilEol());

            return [$key, $value];
        }
    }
Using the parser
    $iniString = <<<INI
    ; An example comment
    name=Foo
    type=Bar

    [options]
    size=150x100
    onload=
    INI;

    $iniParser = new IniParser();
    $data = $iniParser->parse($iniString);

    print_r($data);
Output:
    Array
    (
        [name] => Foo
        [type] => Bar
        [options] => Array
            (
                [size] => 150x100
                [onload] =>
            )

    )
Character types
The table below lists the default character types.
These types are available as constants on the Parser class:
- Parser::C_NONE - no character (NULL)
- Parser::C_WS - whitespace (tab, linefeed, vertical tab, form feed, carriage return and space)
- Parser::C_NUM - numeric character (0-9)
- Parser::C_STR - string character (a-z, A-Z, _ and any 8-bit char)
- Parser::C_CTRL - control character (ASCII 127 and ASCII < 32 except whitespace)
- Parser::C_SPECIAL - !"#$%&'()*+,-./:;<=>?@[\]^`{|}~
#         Character                          Type
NULL      none                               C_NONE
0-8       0x00-0x08                          C_CTRL
9-13      \t \n \v \f \r                     C_WS
14-31     0x0e-0x1f                          C_CTRL
32        0x20                               C_WS
33-47     ! " # $ % & ' ( ) * + , - . /      C_SPECIAL
48-57     0 1 2 3 4 5 6 7 8 9                C_NUM
58-64     : ; < = > ? @                      C_SPECIAL
65-90     A-Z                                C_STR
91-94     [ \ ] ^                            C_SPECIAL
95        _                                  C_STR
96        `                                  C_SPECIAL
97-122    a-z                                C_STR
123-126   { | } ~                            C_SPECIAL
127       0x7f                               C_CTRL
128-255   0x80-0xff                          C_STR
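The mapping in the table can be read as a handful of range checks. The sketch below restates it as a standalone function; classifyChar() is a hypothetical helper written for this illustration (it returns the constant names as strings and expects a single byte or NULL), and it is not how the library itself stores or looks up character types.

```php
<?php
declare(strict_types=1);

// Hypothetical helper mirroring the table above; not part of kuria/parser.
function classifyChar(?string $char): string
{
    if ($char === null) {
        return 'C_NONE';
    }

    $code = ord($char); // byte value 0-255 of the first byte

    // tab, linefeed, vertical tab, form feed, carriage return, space
    if (($code >= 9 && $code <= 13) || $code === 32) {
        return 'C_WS';
    }
    // remaining ASCII < 32 and ASCII 127
    if ($code < 32 || $code === 127) {
        return 'C_CTRL';
    }
    // 0-9
    if ($code >= 48 && $code <= 57) {
        return 'C_NUM';
    }
    // A-Z, a-z, underscore and any 8-bit char
    if (
        ($code >= 65 && $code <= 90)
        || ($code >= 97 && $code <= 122)
        || $code === 95
        || $code >= 128
    ) {
        return 'C_STR';
    }

    // everything else is punctuation
    return 'C_SPECIAL';
}
```

For example, classifyChar('_') gives 'C_STR' while classifyChar('-') gives 'C_SPECIAL', which matches rows 95 and 45 of the table.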
Customizing character types
Character types can be customized by extending the base Parser class.
The following example changes "-" and "." from C_SPECIAL to C_STR and inherits everything else.
    $parser = new CustomParser('foo-bar.baz');
    var_dump($parser->eatType(CustomParser::C_STR));
Output:
string(11) "foo-bar.baz"
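One way to express "override two entries and inherit everything else" in plain PHP is the array union operator on class constants, where keys on the left take precedence. The sketch below shows that pattern in isolation. BaseTypes, CustomTypes, TYPE_MAP, typeOf() and takeType() are all hypothetical names invented for this illustration; whether the library exposes its map this way is an assumption, so check the Parser class before relying on it.

```php
<?php
declare(strict_types=1);

// Hypothetical classes for illustration; names and the map constant are
// assumptions, not the kuria/parser API.
class BaseTypes
{
    const T_STR = 1;
    const T_SPECIAL = 2;

    // Deliberately tiny map; anything not listed is treated as T_STR.
    const TYPE_MAP = [
        '-' => self::T_SPECIAL,
        '.' => self::T_SPECIAL,
        ' ' => self::T_SPECIAL,
    ];

    public static function typeOf(string $char): int
    {
        // static:: resolves to the map of the class the method was called on
        return static::TYPE_MAP[$char] ?? self::T_STR;
    }

    /** Return the leading run of characters that have the given type. */
    public static function takeType(string $input, int $type): string
    {
        $out = '';
        $length = strlen($input);
        for ($i = 0; $i < $length && static::typeOf($input[$i]) === $type; ++$i) {
            $out .= $input[$i];
        }
        return $out;
    }
}

class CustomTypes extends BaseTypes
{
    // Entries on the left win; everything else comes from the parent map.
    const TYPE_MAP = [
        '-' => self::T_STR,
        '.' => self::T_STR,
    ] + parent::TYPE_MAP;
}

var_dump(BaseTypes::takeType('foo-bar.baz', BaseTypes::T_STR));     // string(3) "foo"
var_dump(CustomTypes::takeType('foo-bar.baz', CustomTypes::T_STR)); // string(11) "foo-bar.baz"
```

With the base map the run stops at the first "-", while the customized map treats "-" and "." as string characters and consumes the whole input.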