PHP download

Download the PHP package infoxy/utext without Composer

On this page you can find all versions of the PHP package infoxy/utext. You can download and install these versions without Composer. Any dependencies are resolved automatically.

Vendor infoxy
Package utext
Short Description Tiny set of PHP text utility classes.
License MIT
Homepage https://github.com/infoxy/utext

Keywords drupal

FAQ

After the download, you only need one include statement: require_once('vendor/autoload.php'); After that, import the classes with use statements.

Example:
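
A minimal sketch of the include-then-import pattern described above. It assumes the downloaded vendor folder sits next to the script; the class name is taken from the package's class list, and nothing here calls into the package itself.

```php
<?php
// Import the class by its fully qualified name (the namespace is \infoxy\utext).
use infoxy\utext\PlainFilter;

// Assumed location: vendor/ was unpacked next to this script.
$autoload = __DIR__ . '/vendor/autoload.php';
if (is_file($autoload)) {
    require_once $autoload;
}

// ::class is resolved at compile time, so this line works even without the autoloader.
$utextClass = PlainFilter::class;

// class_exists() triggers autoloading and reports whether the package was found.
echo $utextClass, class_exists($utextClass) ? " is available\n" : " is not loaded\n";
```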

If you use only one package, a project is not needed. But if you use more than one package, it is not possible to import the classes with use statements without a project.

In general, it is recommended to always use a project to download your libraries, because an application normally needs more than one library.

Some PHP packages are not free to download and are therefore hosted in private repositories. Credentials are needed to access such packages. If a package comes from a private repository, use the auth.json textarea to insert the credentials.

Some hosting environments are not accessible through a terminal or SSH, so it is not possible to use Composer there.
Using Composer is sometimes complicated, especially for beginners.
Composer needs a lot of resources, which are not always available on simple web space.
If you are using private repositories, you don't need to share your credentials. You can set up everything on our site and then provide a simple download link to your team members.
Simplify your Composer build process. Use our command line tool to download the vendor folder as a binary. This makes your build process faster, and you don't need to expose your credentials for private repositories.

Information about the package utext

utext

Tiny set of PHP text utility classes.

Requirements:

PHP intl extension or polyfills for Normalizer class and idna functions.

Class list (all classes are placed in the \infoxy\utext namespace):

PlainFilter: Plain text filter and related utilities.
PlainSimpler: Filter that simplifies plain Unicode text.
HtmlBase: Collection of static functions for DOMDocument manipulations.
- String to DOM and back
- Class checking
- DOM elements manipulations
IdnaURL: International domain names normalization and humanization class.

The first two classes can be used standalone; the latter two are based on them.

Purposes and intro

Editors, copywriters and users all have different skills in HTML and Unicode. Some type text in a plain-text editor, others in word processors or advanced publishing platforms, and anyone can copy and paste from foreign sources.

As a result, in real life many pieces of simple UTF-8 text (in a site's database, for example) can differ widely in formatting and technical quality. They:

can contain invalid UTF-8 byte sequences;
can be a mixture of composed and decomposed Unicode characters;
can come with or without encoded HTML entities;
can come with or without denormalized whitespace;
can contain special spaces (spations, fixed-width spaces) that look nice on printed paper but cause problems when copy-pasted onto web pages;
can contain special dashes, hyphens or other symbols that may be missing from the fonts in use.

This makes pieces of text harder to search and ugly to look at. PlainFilter can be used to transform plain text into a more normalized and clean form (based on the specified options), and it also provides additional services such as tag stripping and pattern usage. See the PlainFilter section for details.

PlainFilter

Basic filtration

The filter options are listed below in "logical pipeline" order:

filter_utf8 Pass through only valid UTF-8 characters and strip out any invalid byte sequences.

Note: the static method PlainFilter::filter_utf8($s) can also be called explicitly.
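
To illustrate what stripping invalid byte sequences means, here is a minimal sketch built on the mbstring extension. The helper name strip_invalid_utf8 is hypothetical, and this is not necessarily how PlainFilter::filter_utf8() is implemented.

```php
<?php
// Hypothetical helper: keep valid UTF-8, drop every invalid byte sequence.
function strip_invalid_utf8(string $s): string
{
    $previous = mb_substitute_character();   // remember the current substitution setting
    mb_substitute_character('none');         // drop invalid sequences instead of writing '?'
    $clean = mb_convert_encoding($s, 'UTF-8', 'UTF-8');
    mb_substitute_character($previous);      // restore the setting for other code
    return $clean;
}

// 0xFF can never occur in valid UTF-8, so it is removed.
echo strip_invalid_utf8("ab\xFFcd"), "\n";
```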

newline_tags Insert \r\n before every <. Useful with strip_tags to produce plain text from HTML or XML without joining adjacent words. For example, "<em>yellow</em><b>green</b>" will produce "yellow green" instead of "yellowgreen".

strip_tags Strip tags. Can be used with newline_tags.
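
The interplay of the two options can be sketched with PHP's built-in strip_tags(). The function name html_to_spaced_text is hypothetical, and the final whitespace collapse stands in for the collapse_spaces and trim options described later.

```php
<?php
// Hypothetical helper: separate tags with line breaks, strip them, tidy the whitespace.
function html_to_spaced_text(string $html): string
{
    $separated = str_replace('<', "\r\n<", $html);   // newline_tags step
    $text = strip_tags($separated);                  // strip_tags step
    return trim(preg_replace('/\s+/', ' ', $text));  // collapse and trim whitespace
}

echo html_to_spaced_text('<em>yellow</em><b>green</b>'), "\n"; // prints "yellow green"
```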

decode_entities Decode HTML entities. All encoded entities such as &mdash; or &int; will be decoded to the corresponding Unicode characters (u+2014 and u+222B).
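
PHP's built-in html_entity_decode() performs this kind of decoding. The sketch below only demonstrates the effect; whether PlainFilter uses the same call and flags is an assumption.

```php
<?php
// Named and numeric entities are decoded to the characters they stand for.
$decoded = html_entity_decode('1 &mdash; 2 &amp; &#x222B;', ENT_QUOTES | ENT_HTML5, 'UTF-8');

// The result contains u+2014 (em dash), a plain ampersand and u+222B (integral sign).
var_dump($decoded === "1 \u{2014} 2 & \u{222B}");
```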

lang_quotes Replace double quotes with language-specific ones. This is a simple language-based quotation mark substitution.

Supported language ids are: en, de, ru, fr. Any other language falls back to en in the current release.

Note 1: In this release lang_quotes only looks at word boundaries, so some nested and spaced double quotes can be handled incorrectly. Even so, it can be a really helpful text authoring tool.

Note 2: This is the only option that needs a language-specific setting via setLangId().

Note 3: The lang_quotes option can produce additional spaces around quotes when language rules require them.

simplify_dashes Simplify dashes to hyphen-minus (u+2D), en-dash (u+2013) or em-dash (u+2014). Many fonts do not have full unicode set of dashes. This option can be used to produce:

hyphen-minus from u+2010 (hyphen), u+2011 (non-breaking hyphen)
en-dash from u+2012 (figure dash),
em-dash from u+2015 (horizontal bar), u+2E3A (two-em dash), u+2E3B (three-em dash).

Note: The mathematical minus u+2212 (minus sign) and language-specific hyphens/dashes are left unchanged.
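
The mapping above can be written as a single strtr() table. This is a sketch for PHP 7 or later (it uses \u{...} escapes) with a hypothetical function name, not the library's own code.

```php
<?php
// Hypothetical helper: fold rare dashes into hyphen-minus, en dash or em dash.
function simplify_dashes_sketch(string $s): string
{
    return strtr($s, [
        "\u{2010}" => '-',          // hyphen              -> hyphen-minus
        "\u{2011}" => '-',          // non-breaking hyphen -> hyphen-minus
        "\u{2012}" => "\u{2013}",   // figure dash         -> en dash
        "\u{2015}" => "\u{2014}",   // horizontal bar      -> em dash
        "\u{2E3A}" => "\u{2014}",   // two-em dash         -> em dash
        "\u{2E3B}" => "\u{2014}",   // three-em dash       -> em dash
    ]);
}

// u+2212 (minus sign) is not in the table, so it passes through unchanged.
var_dump(simplify_dashes_sketch("5 \u{2212} 3") === "5 \u{2212} 3");
```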

shy_pattern Replace the "TeX"-style shy pattern \- with u+AD (soft hyphen).

dash_patterns Replace dash patterns. This option enables the "TeX"-style en-dash and em-dash patterns:

-- to u+2013 (en-dash),
--- to u+2014 (em-dash).
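
A sketch of both pattern options in one pass. When given an array, strtr() tries the longest key first at each position, so --- is matched before --. The function name is hypothetical and the library may order or combine these steps differently.

```php
<?php
// Hypothetical helper: expand TeX-style patterns into the characters they denote.
function apply_tex_patterns(string $s): string
{
    return strtr($s, [
        '---' => "\u{2014}",   // em dash
        '--'  => "\u{2013}",   // en dash
        '\\-' => "\u{00AD}",   // backslash + hyphen -> soft hyphen
    ]);
}

$out = apply_tex_patterns('pages 1--2 --- see co\-op');
var_dump($out === "pages 1\u{2013}2 \u{2014} see co\u{00AD}op");
```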

replace_triple_dots Replace triple dots with u+2026 (ellipsis). It can be used in conjunction with trim_dots.

replace_quotes Replace straight single and double quotes with curly quotes.

' to u+2019 (hi-9 quote-mark),
" to u+201D (hi-99 quote-mark).

replace_specials Replace special chars with safe fallbacks. A bit ugly yet bulletproof solution against HTML special chars. Replace:

& to + (plus sign),
< to u+2039 (left-pointing single angular quote-mark),
> to u+203A (right-pointing single angular quote-mark).

Note: Fullwidth chars (like the fullwidth ampersand) are not widely supported by fonts, so they are not used as replacements.
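
The replace_quotes and replace_specials tables above can be combined in one illustrative strtr() call. The function name is hypothetical, and in the library the two options are independent flags.

```php
<?php
// Hypothetical helper: apply the replace_quotes and replace_specials mappings together.
function replace_quotes_and_specials(string $s): string
{
    return strtr($s, [
        "'" => "\u{2019}",   // straight single quote -> right single quotation mark
        '"' => "\u{201D}",   // straight double quote -> right double quotation mark
        '&' => '+',          // ampersand             -> plus sign
        '<' => "\u{2039}",   // less-than             -> single left-pointing angle quote
        '>' => "\u{203A}",   // greater-than          -> single right-pointing angle quote
    ]);
}

// The result no longer contains any character that HTML treats as special.
$safe = replace_quotes_and_specials('<b> & "q"');
var_dump(strpbrk($safe, '&<>"\'') === false);
```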

simplify_spaces Simplify spaces. Replace whitespaces, u+A0 (nbsp), u+2000 to u+200A (fixed width spaces), u+202F (narrow nbsp), u+205F (medium math space) with simple spaces.

collapse_spaces Replace a sequence of whitespace characters with a single space.

zebra_spaces Replace a pair of spaces with nbsp+space.

Note: By its logic, collapse_spaces takes precedence over zebra_spaces.
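
The three space options can be sketched with preg_replace() and str_replace(). The function names are hypothetical and the character class simply mirrors the list given for simplify_spaces. The last lines show why collapse_spaces wins: once runs of spaces are collapsed, no pair is left for zebra_spaces to act on.

```php
<?php
// Hypothetical helpers illustrating the three space options.
function simplify_spaces_sketch(string $s): string
{
    // Whitespace, nbsp, fixed-width spaces, narrow nbsp, medium math space -> plain space.
    return preg_replace('/[\s\x{00A0}\x{2000}-\x{200A}\x{202F}\x{205F}]/u', ' ', $s) ?? $s;
}

function collapse_spaces_sketch(string $s): string
{
    return preg_replace('/ {2,}/', ' ', $s) ?? $s;
}

function zebra_spaces_sketch(string $s): string
{
    return str_replace('  ', "\u{00A0} ", $s);   // pair of spaces -> nbsp + space
}

$collapsed = collapse_spaces_sketch('a  b');
var_dump(zebra_spaces_sketch($collapsed) === $collapsed);   // nothing left to replace
```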

trim Trim leading and trailing whitespaces.

trim_dots Trim leading and trailing dots. Can be useful for titles and description fields in some scenarios. Use in conjunction with replace_triple_dots to trim dots, but not triple dots.
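
A sketch of why the two options work well together: once triple dots have become a single ellipsis character, trimming plain dots no longer touches them. The function name is hypothetical.

```php
<?php
// Hypothetical helper: replace_triple_dots followed by trim_dots.
function trim_dots_sketch(string $s): string
{
    $s = str_replace('...', "\u{2026}", $s);   // triple dots -> ellipsis (u+2026)
    return trim($s, '.');                      // strip leading and trailing single dots
}

// The leading dot is trimmed; the trailing ellipsis survives.
var_dump(trim_dots_sketch('.Intro...') === "Intro\u{2026}");
```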

normalize Normalize the Unicode string to one of the normalization forms. Normalization Form C (NFC) is the default.

Most of the filtration options are flags for PlainFilter::setOptions(). Some additional settings are made with dedicated setters and getters:

setLangId($lang_id = NULL): set the language for language-specific options; the default is 'en'.
getLangId(): get the language for language-specific options.

setNormalForm($nf = 'NFC'): set the Unicode normalization form. Can be 'NFC', 'NFD', 'NFKC' or 'NFKD'. The default is 'NFC'.
getNormalForm(): get the current Unicode normalization form.
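
For reference, the four form names correspond to constants of the intl extension's Normalizer class. The helper below is a hypothetical sketch of such a mapping and assumes intl (or a polyfill) is loaded; it does not show how PlainFilter stores or applies the setting.

```php
<?php
// Hypothetical helper: normalize $s to the form named by $nf ('NFC', 'NFD', 'NFKC', 'NFKD').
function normalize_sketch(string $s, string $nf = 'NFC'): string
{
    $forms = [
        'NFC'  => Normalizer::FORM_C,
        'NFD'  => Normalizer::FORM_D,
        'NFKC' => Normalizer::FORM_KC,
        'NFKD' => Normalizer::FORM_KD,
    ];
    $result = Normalizer::normalize($s, $forms[$nf] ?? Normalizer::FORM_C);
    return $result === false ? $s : $result;   // keep the input if normalization fails
}

if (class_exists('Normalizer')) {
    // 'e' + combining acute accent composes to the single character u+E9 under NFC.
    var_dump(normalize_sketch("e\u{0301}") === "\u{00E9}");
}
```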

Filter escaping

Filter escaping makes it possible to filter a string multiple times, for example in edit-by-user scenarios. The main tasks of escaping are restoring the &amp; entity for the ampersand and restoring patterns when needed, so that a re-filtered string is not damaged by double decoding.
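
The double-decoding problem can be shown with plain html_entity_decode(). The sketch below illustrates the hazard and the ampersand-restoring idea only; it does not use the library's escaping API, which is not documented here.

```php
<?php
// One decode_entities pass, modelled with PHP's built-in decoder.
function decode_once(string $s): string
{
    return html_entity_decode($s, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

$typed   = '&amp;lt;';              // the author wants readers to see the literal text "&lt;"
$stored  = decode_once($typed);     // first filtering: "&lt;"
$damaged = decode_once($stored);    // naive re-filtering: "<" (the text has changed)

// Restoring the ampersand entity before re-filtering keeps the text stable.
$escaped = str_replace('&', '&amp;', $stored);
$safe    = decode_once($escaped);   // "&lt;" again

var_dump($damaged === '<', $safe === $stored);
```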

PlainSimpler

Unicode plain text simplifier.

PlainSimpler::simplify($s, $lang)

Simplify Unicode plain text. Typically the result is not something you want to expose to end users. Simplified text can be used to improve search queries and string comparison.

In more detail, simplify() does the following:

Decode HTML entities.
Decompose digraphs and ligatures by normalizing to NFKD.
Apply additional language-based decomposition for umlauts, AE, etc. (Latin-based).
Preserve some specific diacritic combinations (Cyrillic 'Й').
Remove all other diacritics.
Finally, normalize to NFC.
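
The steps above can be approximated with the intl Normalizer class and a Unicode-aware regular expression. This is a sketch under stated assumptions, not the library's implementation: the function name is hypothetical, the language-based decomposition step is omitted, and only the Cyrillic short i is preserved.

```php
<?php
// Hypothetical helper approximating the simplify() pipeline (requires intl or a polyfill).
function simplify_sketch(string $s): string
{
    $s = html_entity_decode($s, ENT_QUOTES | ENT_HTML5, 'UTF-8');   // decode entities
    $s = Normalizer::normalize($s, Normalizer::FORM_KD);            // split ligatures and accents

    // Re-compose Cyrillic short i (base letter + breve) so the breve is not stripped below.
    $s = strtr($s, [
        "\u{0418}\u{0306}" => "\u{0419}",
        "\u{0438}\u{0306}" => "\u{0439}",
    ]);

    $s = preg_replace('/\p{Mn}+/u', '', $s);                        // drop remaining combining marks
    return Normalizer::normalize($s, Normalizer::FORM_C);           // final NFC pass
}

if (class_exists('Normalizer')) {
    // Accents are removed and the "fi" ligature (u+FB01) becomes two letters.
    echo simplify_sketch("Cr\u{00E8}me \u{FB01}n"), "\n";
}
```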

Note: PlainSimpler can be used as next stage after PlainFilter.

PlainSimpler example

HtmlBase

Collection of static functions for DOMDocument manipulation. So you do not need to create HtmlBase objects to use methods.

Note: toText() and toDom() are focused on importing/exporting in-body HTML tags, not full documents with embedded scripts, styles and CDATA sections.

String to DOM and back

HtmlBase::toDom($s) Create an HTML DOMDocument from string $s, which defines the body content of the created document. Returns the DOMElement body of the created document.

HtmlBase::toText($e) Export the content of DOMElement $e into a string. Returns the HTML as a string.
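
The round trip can be sketched with PHP's DOM extension. The helper names body_from_html and inner_html are hypothetical, the code targets PHP 7.1 or later, and it shows one common way to do this rather than the actual bodies of toDom() and toText().

```php
<?php
// Hypothetical helper: parse an HTML fragment as the body of a fresh document.
function body_from_html(string $html): ?DOMElement
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    $previous = libxml_use_internal_errors(true);   // silence warnings about sloppy markup
    $doc->loadHTML(
        '<!DOCTYPE html><html><head>'
        . '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
        . '</head><body>' . $html . '</body></html>'
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $doc->getElementsByTagName('body')->item(0);
}

// Hypothetical helper: serialize the children of an element back to an HTML string.
function inner_html(DOMElement $e): string
{
    $out = '';
    foreach ($e->childNodes as $child) {
        $out .= $e->ownerDocument->saveHTML($child);
    }
    return $out;
}

$body = body_from_html('<p class="a">Hi <em>there</em></p>');
echo inner_html($body), "\n";
```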

Class checking

HtmlBase::classCheck($s) Check that string $s is acceptable as a class list. In the current version this means that $s contains only a mixture of alphanumerics, '-', underscores and spaces. Returns TRUE if the check passes, FALSE otherwise.

HtmlBase::classArray($s) Explode string $s into class names. Returns an array of strings (or an empty array if there are no classes).

HtmlBase::classPat($classes) Generate a pattern to match against the specified classes. $classes: an array of class names or a string of class names.
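
The first two checks can be expressed with regular expressions along these lines. The function names are hypothetical, and details such as whether an empty string counts as a valid class list are assumptions rather than documented behaviour.

```php
<?php
// Hypothetical helper: only alphanumerics, '-', '_' and spaces are allowed.
function class_list_ok(string $s): bool
{
    return preg_match('/\A[A-Za-z0-9_ -]*\z/', $s) === 1;
}

// Hypothetical helper: split a class list on runs of spaces, dropping empty items.
function class_list_to_array(string $s): array
{
    return preg_split('/ +/', $s, -1, PREG_SPLIT_NO_EMPTY);
}

var_dump(class_list_ok('btn btn-primary'));     // bool(true)
var_dump(class_list_to_array(' btn  btn-primary '));
```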

Usage example

DOM elements manipulations

HtmlBase::tagStrip($e) Strip tag (DOMElement) $e and reattach its children to its parent. Returns the first reattached child (DOMNode), or NULL if $e has no children or no parent.

HtmlBase::tagWrap($e, $tag) Wrap (DOMElement) $e in a new DOMElement with the (string) $tag name. Returns the newly created DOMElement.

HtmlBase::tagReplace($e, $tag) Replace (DOMElement) $e with a new DOMElement with the (string) $tag name and reattach the children to it. The id, class, lang and dir attributes are also copied to the new element. Returns the newly created DOMElement.

HtmlBase::contentWrap($e, $tag) Wrap the children of (DOMElement) $e in the specified tag. Returns the newly created DOMElement.
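
Wrapping and stripping come down to a few standard DOM calls. The sketch below uses hypothetical function names, targets PHP 7.1 or later, and follows the behaviour described for tagWrap() and tagStrip(); it is not the library's code, and wrap_element() assumes the element has a parent.

```php
<?php
// Hypothetical helper: put a new <$tag> element where $e was and move $e inside it.
function wrap_element(DOMElement $e, string $tag): DOMElement
{
    $wrapper = $e->ownerDocument->createElement($tag);
    $e->parentNode->replaceChild($wrapper, $e);   // wrapper takes the place of $e
    $wrapper->appendChild($e);
    return $wrapper;
}

// Hypothetical helper: remove $e but keep its children in the same position.
function unwrap_element(DOMElement $e): ?DOMNode
{
    $parent = $e->parentNode;
    if ($parent === null) {
        return null;                              // nothing to reattach the children to
    }
    $first = $e->firstChild;                      // null when $e has no children
    while ($e->firstChild !== null) {
        $parent->insertBefore($e->firstChild, $e);
    }
    $parent->removeChild($e);
    return $first;
}

$doc  = new DOMDocument();
$p    = $doc->appendChild($doc->createElement('p'));
$em   = $p->appendChild($doc->createElement('em', 'x'));
$span = wrap_element($em, 'span');                // <p><span><em>x</em></span></p>
$first = unwrap_element($span);                   // back to <p><em>x</em></p>
echo $doc->saveHTML($p), "\n";
```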

IdnaURL

... in progress ...

All versions of utext with dependencies

Version 1.1.2 Release 06. Sep 2021

Download

Download latest version of utext from vendor infoxy

Requires php Version >=5.5.0

Composer command for our command line client (download client): this client runs in every environment, and you do not need a specific PHP version. The first 20 API calls are free.
