Download the PHP package affinity4/tokenizer without Composer

On this page you can find all versions of the PHP package affinity4/tokenizer. You can download and install these versions without Composer. Any dependencies are resolved automatically.

FAQ

After the download, you need a single include: require_once('vendor/autoload.php');. After that, import the classes with use statements.

Example:
If you use only one package, a project is not needed. But if you use more than one package, it is not possible to import the classes with use statements without a project.

In general, it is recommended to always use a project to download your libraries. An application normally needs more than one library.
Some PHP packages are not free to download and are therefore hosted in private repositories. In this case, credentials are needed to access such packages. If a package comes from a private repository, please use the auth.json textarea to insert your credentials. You can look here for more information.

  • Some hosting environments are not accessible by a terminal or SSH, so it is not possible to use Composer there.
  • Composer can be complicated to use, especially for beginners.
  • Composer needs a lot of resources. Sometimes they are not available on simple web hosting.
  • If you are using private repositories, you don't need to share your credentials. You can set up everything on our site and then give your team members a simple download link.
  • Simplify your Composer build process. Use our own command line tool to download the vendor folder as a binary. This makes your build process faster, and you don't need to expose your credentials for private repositories.

Information about the package tokenizer

Tokenizer

Affinity4

A zero-dependency tokenizer written in PHP. Returns an easily navigable Stream object of Token objects with public type, value, offset and length properties.

Simply pass an associative array [match_pattern => type], for example ['\s+' => 'T_WHITESPACE', '[a-zA-Z]\w+' => 'T_STRING'], and the Tokenizer will return all matches as an array of Token objects.
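To make the [match_pattern => type] idea concrete, here is a minimal, self-contained sketch written with plain preg_match(). It is not the library's implementation or API: the function name sketch_tokenize() and the array-based token shape are assumptions made for illustration only. They mirror the type, value, offset and length properties described above.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (NOT part of affinity4/tokenizer).
 * Walks the input left to right and tries each lexicon pattern at the
 * current offset, recording type, value, offset and length per token.
 */
function sketch_tokenize(array $lexicon, string $input): array
{
    $tokens = [];
    $offset = 0;
    $length = strlen($input);

    while ($offset < $length) {
        $matched = false;
        foreach ($lexicon as $pattern => $type) {
            // \G anchors the match at $offset. "~" is the delimiter, so
            // patterns in this sketch must not contain an unescaped "~".
            $regex = '~\G(?:' . $pattern . ')~';
            if (preg_match($regex, $input, $m, 0, $offset) === 1 && $m[0] !== '') {
                $tokens[] = [
                    'type'   => $type,
                    'value'  => $m[0],
                    'offset' => $offset,
                    'length' => strlen($m[0]),
                ];
                $offset += strlen($m[0]);
                $matched = true;
                break;
            }
        }
        if (!$matched) {
            // Nothing in the lexicon covers this character.
            throw new RuntimeException("No lexicon entry matches at offset $offset");
        }
    }

    return $tokens;
}

$lexicon = [
    '\s+'         => 'T_WHITESPACE',
    '[a-zA-Z]\w+' => 'T_STRING',
];

$tokens = sketch_tokenize($lexicon, 'hello world');
// Three tokens: T_STRING "hello", T_WHITESPACE " ", T_STRING "world".
```

Note that '[a-zA-Z]\w+' requires at least two characters, so a single-letter word would need its own lexicon entry in this sketch.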

Installation

Composer

composer require affinity4/tokenizer

Basic Example

Let's assume we want to create a DSL (domain-specific language) for a template engine that looks more like code than markup.

Example template snippet:

Now we define our "lexicon", which is passed to the tokenizer:

NOTE:
The lexicon must supply all characters and patterns you expect to encounter in your grammar. Currently you cannot skip any characters. Everything must be tokenized, whether you use it later or not.
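Because every character must be covered, it can help to check a lexicon against sample input before tokenizing. The sketch below is an assumption-laden illustration, not a feature of the library: the helper name sketch_uncovered() is hypothetical. It removes everything the patterns match and returns whatever is left over.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (NOT part of affinity4/tokenizer).
 * Returns the characters of $input that no pattern in $patterns matches.
 */
function sketch_uncovered(array $patterns, string $input): string
{
    // Join all patterns into one alternation; "~" is the delimiter, so the
    // patterns in this sketch must not contain an unescaped "~".
    $combined = '~(?:' . implode(')|(?:', $patterns) . ')~';
    $leftover = preg_replace($combined, '', $input);

    if ($leftover === null) {
        throw new RuntimeException('The combined pattern failed to run');
    }

    return $leftover;
}

$patterns = ['\s+', '[a-zA-Z]\w+'];

// "=" is not covered by either pattern, so it is reported as leftover.
$leftover = sketch_uncovered($patterns, 'foo = bar');
```

An empty return value means the sample input is fully covered. Anything else lists the characters that still need a lexicon entry.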

We pass the lexicon to the tokenizer...

From here you just need to write your "finite automaton" and/or your parser.

TIPS

debug()

The Tokenizer has a debug() method, which will return the compiled regex, for you to examine.

TIP:
A good website for testing PHP regexes is: https://regexr.com/

The debug method will by default return the regex as a string, however, you can also echo, var_dump and "dump and die" (or dd() for you Laravel users).

Constants are defined for each of these output modes, so you do not have to use the raw switches.

preg_match_all(): Compilation failed: missing closing parenthesis at offset x

Attempting to match backslashes, or newline chars (e.g. \r|\n|\r\n) is most likely the cause of your troubles.

You will need to double escape backslashes. To help you avoid needing to figure this out I have provided the correct regex patterns for T_ESCAPE_CHAR.

Newlines

Newlines will need to be replaced with a token before they can be matched. By default the T_NEWLINE_ALL constant will match ;T_NEWLINE;

If you need to match individual newline characters for a specific environment you can use the following constants

See the following section, Matching Backslashes and Special Characters, for more information.

Matching Backslashes and Special Characters

As mentioned above, backslashes must be double escaped.

So to match a single backslash you must use the regex '\\\\' (I know, it sucks, but you have to).

To match special characters (tabs, newlines, carriage returns etc.) you will need to replace them with another token first, and then add a token for the replacement string.

I am working on some better detection internally for these patterns and will attempt to provide better error messages when these errors are encountered (I'll go real meta and regex the regex before it's run or something).
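The following sketch uses plain preg_match() to show why the double escape is needed. It assumes the pattern ends up wrapped in a capture group, which is a guess based on the "missing closing parenthesis" error above rather than a confirmed detail of how the Tokenizer compiles its regex.

```php
<?php
declare(strict_types=1);

// Single-quoted: this string holds ONE backslash (C:\temp).
$subject = 'C:\\temp';

// Single escape: the PHP string '\\' is one backslash. Inside a group it
// escapes the closing parenthesis, so the group never closes and the
// regex fails to compile. "@" only hides the warning for this demo.
$broken = '~(' . '\\' . ')~';
$brokenResult = @preg_match($broken, $subject);

// Double escape: the PHP string '\\\\' is two backslashes, which the
// regex engine reads as one literal backslash.
$working = '~(' . '\\\\' . ')~';
$workingResult = preg_match($working, $subject, $m, PREG_OFFSET_CAPTURE);

// $brokenResult is false; $workingResult is 1 and the captured group
// holds a single backslash found at offset 2.
```

There are two layers of escaping here: PHP string syntax consumes one level, and the regex engine consumes the other.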
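As a rough illustration of the replace-then-match approach, the sketch below swaps every newline style for the ;T_NEWLINE; placeholder mentioned earlier and then matches the placeholder. Whether the Tokenizer performs this replacement for you is not stated here, so treat the manual str_replace() call as an assumption about what the preparation step could look like.

```php
<?php
declare(strict_types=1);

$placeholder = ';T_NEWLINE;';

// Input mixing Windows (\r\n) and Unix (\n) line endings.
$input = "a\r\nb\nc";

// "\r\n" is listed first so a Windows line ending becomes one placeholder
// rather than two.
$prepared = str_replace(["\r\n", "\r", "\n"], $placeholder, $input);

// The placeholder is now ordinary text that a pattern can match.
$count = preg_match_all('~;T_NEWLINE;~', $prepared);

// $prepared is "a;T_NEWLINE;b;T_NEWLINE;c" and $count is 2.
```

The lexicon would then need an entry for the placeholder string, since every character must be tokenized.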


All versions of tokenizer with dependencies

Requires php Version >=8.0
