Download the PHP package muchrm/php-html-parser without Composer

On this page you can find all versions of the PHP package muchrm/php-html-parser. These versions can be downloaded and installed without Composer, and their dependencies are resolved automatically.

FAQ

After the download, you have to add one include: require_once('vendor/autoload.php');. After that, you have to import the classes with use statements.

Example:
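A minimal sketch of that include and import. It assumes the downloaded vendor folder sits next to your script and that this fork keeps the upstream PHPHtmlParser namespace; neither is confirmed on this page.

```php
<?php
// Load the autoloader shipped in the downloaded vendor folder
// (the path relative to this script is an assumption).
require_once __DIR__ . '/vendor/autoload.php';

// Import the class with a use statement; the namespace is assumed from upstream.
use PHPHtmlParser\Dom;

$dom = new Dom;
```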
If you use only one package, a project is not needed. If you use more than one package, however, you cannot import the classes with use statements without a project.

In general, it is recommended to always use a project to download your libraries, since an application normally needs more than one library.
Some PHP packages are not free to download and are therefore hosted in private repositories. Credentials are needed to access such packages. If a package comes from a private repository, please use the auth.json textarea to insert your credentials.

  • Some hosting environments are not accessible via a terminal or SSH, so Composer cannot be used there.
  • Composer can be complicated to use, especially for beginners.
  • Composer needs a lot of resources, which are sometimes not available on simple web hosting.
  • If you are using private repositories, you don't need to share your credentials. You can set up everything on our site and then provide a simple download link to your team members.
  • Simplify your Composer build process. Use our command line tool to download the vendor folder as a binary. This makes your build process faster, and you don't need to expose your credentials for private repositories.

Information about the package php-html-parser

PHP Html Parser

Version 1.7.0


PHPHtmlParser is a simple, flexible HTML parser that lets you select tags using any CSS selector, like jQuery. The goal is to assist in the development of tools that require a quick, easy way to scrape HTML, whether it's valid or not! This project was originally supported by sunra/php-simple-html-dom-parser, but that support seems to have stopped, so this project is my adaptation of his previous work.

Install

This package can be found on Packagist and is best loaded using Composer. We support PHP 5.6, 7.0, and 7.1.

Usage

You can find many examples of how to use the dom parser and any of its parts (which you will most likely never touch) in the tests directory. The tests are done using PHPUnit and are very small, a few lines each, and are a great place to start. Given that, I'll still be showing a few examples of how the package should be used. The following example is a very simplistic usage of the package.
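A minimal sketch of such a usage, assuming the package is autoloaded from vendor/autoload.php and exposes the PHPHtmlParser\Dom class; the HTML string is only an illustration.

```php
<?php
require_once 'vendor/autoload.php';

use PHPHtmlParser\Dom;

$dom = new Dom;
// Parse an HTML string (illustrative markup).
$dom->load('<div class="all"><p>Hello, <a href="google.com">click here</a><br /> :)</p></div>');

// find() takes a CSS selector and returns a collection of matching nodes.
$a = $dom->find('a')[0];

// The text property holds the text content of the node.
echo $a->text;
```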

The above will output "click here". Simple, no? There are many ways to get the same result from the dom, such as $dom->getElementsByTag('a')[0] or $dom->find('a', 0), all of which can be found in the tests or in the code itself.

Loading Files

You may also seamlessly load a file into the dom instead of a string, which is much more convenient and is how I expect most developers will be loading the HTML. The following example is taken from our tests and uses the "big.html" file found there.
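A sketch of loading from a file, assuming big.html lives at tests/big.html relative to the working directory (the exact path is an assumption). The hasChildren() guard is added here for safety and is not something the text above requires.

```php
<?php
require_once 'vendor/autoload.php';

use PHPHtmlParser\Dom;

$dom = new Dom;
// Path is an assumption; point it at wherever big.html is stored.
$dom->loadFromFile('tests/big.html');

// Select every element carrying the content-border class.
$contents = $dom->find('.content-border');
echo count($contents), "\n";

foreach ($contents as $content) {
    // Read an attribute of the node.
    $class = $content->getAttribute('class');

    // Get the markup inside the node.
    $html = $content->innerHtml;

    // Walk further into the tree, but only if the node has children.
    if ($content->hasChildren()) {
        $child = $content->firstChild();
    }
}
```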

This example loads the html from big.html, a real page found online, and gets all the content-border classes to process. It also shows a few things you can do with a node but it is not an exhaustive list of methods that a node has available.

Alternatively, you can always use the load() method to load the file. It will attempt to find the file using file_exists and, if successful, will call loadFromFile() for you. The same applies to a URL and the loadFromUrl() method.

Loading Url

Loading a url is very similar to the way you would load the html from a file.
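A short sketch, assuming network access and the default downloader; the URL is only an example.

```php
<?php
require_once 'vendor/autoload.php';

use PHPHtmlParser\Dom;

$dom = new Dom;
// Fetch and parse the page in one step.
$dom->loadFromUrl('http://google.com');
$html = $dom->outerHtml;

// load() is described as delegating to loadFromUrl() when given a URL.
$dom->load('http://google.com');
$html = $dom->outerHtml;
```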

What makes the loadFromUrl method noteworthy is the PHPHtmlParser\CurlInterface parameter, an optional parameter. By default, we use the PHPHtmlParser\Curl class to get the contents of the URL. You can, however, inject your own implementation of CurlInterface, and we will attempt to load the URL using whatever tool and settings you choose.
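A sketch of injecting a custom implementation. The Connector class is hypothetical, and the sketch assumes that CurlInterface declares a single untyped get($url) method and that the connector is passed after the options array, as in the upstream 1.x releases; check the interface in your downloaded copy before relying on this.

```php
<?php
require_once 'vendor/autoload.php';

use PHPHtmlParser\CurlInterface;
use PHPHtmlParser\Dom;

// Hypothetical connector that fetches pages with file_get_contents
// instead of cURL (requires allow_url_fopen to be enabled).
class Connector implements CurlInterface
{
    // Assumed interface method: return the body of the given URL.
    public function get($url)
    {
        $content = file_get_contents($url);
        if ($content === false) {
            throw new RuntimeException("Could not fetch $url");
        }

        return $content;
    }
}

$dom = new Dom;
// Assumed argument order: URL, options array, connector.
$dom->loadFromUrl('http://google.com', [], new Connector);
$html = $dom->outerHtml;
```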

As long as the Connector object implements the PHPHtmlParser\CurlInterface interface properly it will use that object to get the content of the url instead of the default PHPHtmlParser\Curl class.

Loading Strings

Loading a string directly, without the checks in load(), is also easily done.
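A minimal sketch, assuming the method is named loadStr() and takes the options array as its second argument (passed explicitly here in case it is not optional in this version):

```php
<?php
require_once 'vendor/autoload.php';

use PHPHtmlParser\Dom;

$dom = new Dom;
// Parse the string directly, skipping the file and URL detection in load().
$dom->loadStr('<html>String</html>', []);
$html = $dom->outerHtml;
echo $html;
```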

If the string is too long, depending on your file system, the load() method will throw a warning. If this happens, you can just call the above method to bypass the is_file() check in the load() method.

Options

You can also set parsing options that will affect the behavior of the parsing engine. You can set a global option array using the setOptions method in the Dom object, or an instance-specific option by adding it to the load method as an extra (optional) parameter.
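A sketch of both ways to set options, assuming the option keys are written exactly as the option names listed on this page; the HTML strings are illustrative.

```php
<?php
require_once 'vendor/autoload.php';

use PHPHtmlParser\Dom;

$dom = new Dom;

// Global option: applies to every subsequent load on this Dom object.
$dom->setOptions([
    'strict' => true,
]);

// Per-load option: only applies to this call.
$dom->load('<div class="all"> <p>Hello</p> </div>', [
    'whitespaceTextNode' => false,
]);

// This load uses the global options only; whitespaceTextNode is back to its default.
$dom->load('<div class="all"> <p>World</p> </div>');
```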

At the moment we support 8 options.

Strict

Strict, by default false, will throw a StrictException if it finds that the HTML is not strictly compliant (all tags must have a closing tag, no attribute without a value, etc.).

whitespaceTextNode

The whitespaceTextNode option, by default true, tells the parser to save text nodes even if the content of the node is empty (only whitespace). Setting it to false will ignore all whitespace-only text nodes found in the document.

enforceEncoding

The enforceEncoding option, by default null, enforces a character set to be used for reading the content and returning the content in that encoding. Setting it to null will trigger an attempt to figure out the encoding from within the content of the string given instead.

cleanupInput

Set this to false to skip the entire clean up phase of the parser. If this is set to false, the next 3 options will be ignored. Defaults to true.

removeScripts

Set this to false to skip removing the script tags from the document body. This might have adverse effects. Defaults to true.

removeStyles

Set this to false to skip removing of style tags from the document body. This might have adverse effects. Defaults to true.

preserveLineBreaks

Preserves Line Breaks if set to true. If set to false line breaks are cleaned up as part of the input clean up process. Defaults to false.

removeDoubleSpace

Set this to false if you want to preserve whitespace inside of text nodes. It is set to true by default.

Static Facade

You can also mount a static facade for the Dom object.
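A sketch of the facade, assuming it is provided by a PHPHtmlParser\StaticDom class whose mount() method registers a global Dom alias, as in upstream. The real Dom class is deliberately not imported here so that the alias name stays free.

```php
<?php
require_once 'vendor/autoload.php';

// Register the static facade under the global class name Dom (assumed default).
\PHPHtmlParser\StaticDom::mount();

// Static calls are forwarded to an underlying Dom object.
\Dom::load('<div class="all"><p>Hello, <a href="google.com">click here</a><br /> :)</p></div>');
$a = \Dom::find('a')[0];
echo $a->text;
```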

The above PHP block does the same find and load as the first example, but it is done using the static facade, which supports all public methods found in the Dom object.

Modifying The Dom

You can always modify the dom that was created from any loading method. To change the attribute of any node you can just call the setAttribute method.
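For instance, a minimal sketch that sets a class attribute on a link and reads it back (the markup is illustrative):

```php
<?php
require_once 'vendor/autoload.php';

use PHPHtmlParser\Dom;

$dom = new Dom;
$dom->load('<div class="all"><p>Hello, <a href="google.com">click here</a><br /> :)</p></div>');

$a = $dom->find('a')[0];
// Add (or overwrite) the class attribute on the node.
$a->setAttribute('class', 'foo');
echo $a->getAttribute('class');
```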

You may also get the PHPHtmlParser\Dom\Tag class directly and manipulate it as you see fit.
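A sketch of working with the tag object, assuming the node exposes it through a getTag() method and that the tag offers the same setAttribute() call as the node:

```php
<?php
require_once 'vendor/autoload.php';

use PHPHtmlParser\Dom;

$dom = new Dom;
$dom->load('<div class="all"><p>Hello, <a href="google.com">click here</a><br /> :)</p></div>');

$a = $dom->find('a')[0];
// Fetch the underlying tag object and modify it directly.
$tag = $a->getTag();
$tag->setAttribute('class', 'foo');

// The change is visible through the node as well.
echo $a->getAttribute('class');
```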

It is also possible to remove a node from the tree. Simply call the delete method on any node to remove it. Note that you should unset the node after removing it from the DOM; it will still take up memory as long as it is not unset.
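A minimal sketch of deleting a node and then releasing the variable (the markup is illustrative):

```php
<?php
require_once 'vendor/autoload.php';

use PHPHtmlParser\Dom;

$dom = new Dom;
$dom->load('<div class="all"><p>Hello, <a href="google.com">click here</a><br /> :)</p></div>');

$a = $dom->find('a')[0];
// Detach the node from the tree.
$a->delete();
// Drop the last reference so the node's memory can be reclaimed.
unset($a);

// The link no longer appears in the serialized document.
$output = (string) $dom;
echo $output;
```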

You can modify the text of TextNode objects easily. Please note that, if you set an encoding, the new text will be encoded using the existing encoding.
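A sketch of changing the text, assuming the text node is the first child of the link and exposes a setText() method:

```php
<?php
require_once 'vendor/autoload.php';

use PHPHtmlParser\Dom;

$dom = new Dom;
$dom->load('<div class="all"><p>Hello, <a href="google.com">click here</a><br /> :)</p></div>');

$a = $dom->find('a')[0];
// The link's first child is its text node; replace that node's text.
$a->firstChild()->setText('biz baz');

$output = (string) $dom;
echo $output;
```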


All versions of php-html-parser with dependencies

Requires:
  • php: >=5.4
  • paquettg/string-encode: ~0.1.0
