HTML Table
bakame/html-table is a small PHP package that allows you to parse, import and manipulate
tabular data represented as an HTML table. Once installed, you will be able to do the following:
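A minimal sketch of that kind of usage, assuming the package is installed through Composer in the current directory. The sample table, its column names and the autoloader path are illustrative assumptions; filter, sorted and fetchPairs are methods of the league/csv tabular data API that the returned object implements.

```php
<?php

declare(strict_types=1);

use Bakame\HtmlTable\Parser;

// Assumption: Composer's autoloader lives in ./vendor.
require 'vendor/autoload.php';

// Illustrative data only; any HTML document containing a table works.
$html = <<<HTML
<table>
  <thead>
    <tr><th>team</th><th>points</th></tr>
  </thead>
  <tbody>
    <tr><td>Lions</td><td>30</td></tr>
    <tr><td>Tigers</td><td>8</td></tr>
    <tr><td>Bears</td><td>12</td></tr>
  </tbody>
</table>
HTML;

$table = Parser::new()->parseHtml($html);

// Keep teams with at least 10 points, highest first, as team => points pairs.
$pairs = $table
    ->filter(fn (array $row): bool => (int) $row['points'] >= 10)
    ->sorted(fn (array $a, array $b): int => (int) $b['points'] <=> (int) $a['points'])
    ->fetchPairs('team', 'points');

$result = iterator_to_array($pairs);
```

Cell values are returned as strings, which is why the callbacks cast points to an integer before comparing.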
System Requirements
The league/csv library, version 9.25.0, is required (since version 0.6.0).
Installation
Use Composer:
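Assuming the composer executable is available on your PATH, the package can be required by the name shown on this page:

```shell
composer require bakame/html-table
```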
Documentation
The Parser can convert a file (a PHP stream or a path with an optional context, like fopen)
or an HTML document into an object implementing League\Csv\TabularData. Once converted, you
can use all the methods and features made available by the interface (see ResultSet
for more information).
The Parser itself is immutable: whenever you change a configuration option, a new instance is returned.
The Parser constructor is private; to instantiate the object you are required to use the new method instead.
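The sketch below illustrates both points: the named constructor and the fact that configuration calls leave the original instance untouched. The caption text and the autoloader path are illustrative assumptions.

```php
<?php

declare(strict_types=1);

use Bakame\HtmlTable\Parser;

// Assumption: Composer's autoloader lives in ./vendor.
require 'vendor/autoload.php';

// The constructor is private, so the named constructor is the entry point.
$default = Parser::new();

// Each configuration call returns a new Parser; $default keeps its settings.
$configured = $default
    ->tableCaption('Generated caption')
    ->ignoreTableHeader();

$isSameInstance = $configured === $default;
```

Because of this, the result of a configuration call must always be assigned or chained; calling it and discarding the return value has no effect.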
parseHtml and parseFile
To extract and parse your table use either the parseHtml or parseFile methods.
If parsing is not possible a ParseError exception will be thrown.
parseHtml parses an HTML page represented by:
- a string,
- a Stringable object,
- a DOMDocument,
- a DOMElement,
- or a SimpleXMLElement
whereas parseFile works with:
- a filepath,
- or a PHP readable stream.
Both methods return a Table instance which implements the League\Csv\TabularDataReader
interface and also gives access to the table caption, if present, via the getCaption method.
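As a sketch, the same document can be parsed from a string and from a file path. The sample table, the temporary file and the autoloader path are illustrative assumptions.

```php
<?php

declare(strict_types=1);

use Bakame\HtmlTable\Parser;

// Assumption: Composer's autoloader lives in ./vendor.
require 'vendor/autoload.php';

$html = <<<HTML
<table>
  <caption>Top scorers</caption>
  <thead>
    <tr><th>player</th><th>goals</th></tr>
  </thead>
  <tbody>
    <tr><td>Ada</td><td>12</td></tr>
    <tr><td>Grace</td><td>9</td></tr>
  </tbody>
</table>
HTML;

$parser = Parser::new();

// Parse directly from a string.
$fromString = $parser->parseHtml($html);

// Parse the same markup from a file path (a temporary file here).
$path = tempnam(sys_get_temp_dir(), 'html-table-');
file_put_contents($path, $html);
$fromFile = $parser->parseFile($path);

$caption = $fromFile->getCaption();
$header = $fromFile->getHeader();
$records = iterator_to_array($fromFile->getRecords(), false);

unlink($path);
```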
Default configuration
By default, when calling Parser::new(), the parser will:
- try to parse the first table found in the page.
- expect the table header row to be the first tr found in the thead section of your table.
- exclude the table thead section when extracting the table content.
- ignore XML errors.
- have no formatter attached.
- have no default caption to be used if none is present in the table.
Each of the following settings can be changed to improve the conversion against your business rules:
tablePosition and tableXpathPosition
Selecting the table to parse in the HTML page can be done using two (2) methods:
Parser::tablePosition and Parser::tableXpathPosition.
If you know the table position in the page as an integer offset, or if
you know its id attribute value, you should use Parser::tablePosition; otherwise
favor Parser::tableXpathPosition, which expects an XPath expression.
If the expression is valid and a list of tables is found, the first result will be returned.
Parser::tableXpathPosition and Parser::tablePosition override each other. It is
recommended to use one or the other but not both at the same time.
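The three ways of targeting a table can be sketched as follows. The markup, the id values and the XPath expression are illustrative, and the sketch assumes integer offsets start at 0.

```php
<?php

declare(strict_types=1);

use Bakame\HtmlTable\Parser;

// Assumption: Composer's autoloader lives in ./vendor.
require 'vendor/autoload.php';

$html = <<<HTML
<div>
  <table id="teams">
    <thead><tr><th>team</th></tr></thead>
    <tbody><tr><td>Lions</td></tr></tbody>
  </table>
  <div class="stats">
    <table id="players">
      <thead><tr><th>player</th></tr></thead>
      <tbody><tr><td>Ada</td></tr></tbody>
    </table>
  </div>
</div>
HTML;

// By integer offset (assumed 0-based, so 1 targets the second table).
$byOffset = Parser::new()->tablePosition(1)->parseHtml($html);

// By the value of the table id attribute.
$byId = Parser::new()->tablePosition('players')->parseHtml($html);

// By XPath expression; the first matching table is used.
$byXpath = Parser::new()
    ->tableXpathPosition('//div[@class="stats"]/table')
    ->parseHtml($html);
```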
tableCaption
You can optionally define a caption for your table if none is present or found during parsing.
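A short sketch, with an illustrative caption and table. Passing null to remove a previously configured caption is an assumption about the method signature.

```php
<?php

declare(strict_types=1);

use Bakame\HtmlTable\Parser;

// Assumption: Composer's autoloader lives in ./vendor.
require 'vendor/autoload.php';

// This table has no caption element of its own.
$html = '<table><thead><tr><th>team</th></tr></thead>'
    . '<tbody><tr><td>Lions</td></tr></tbody></table>';

$parser = Parser::new()->tableCaption('League standings');
$caption = $parser->parseHtml($html)->getCaption();

// Assumption: null resets the parser to "no default caption".
$withoutDefault = $parser->tableCaption(null);
$missingCaption = $withoutDefault->parseHtml($html)->getCaption();
```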
tableHeader, tableHeaderPosition, ignoreTableHeader and resolveTableHeader
The following settings configure the Parser in relation to the table header. By default,
the parser will try to parse the first tr tag found in the thead section of the table.
But you can override this behaviour using one of these settings:
tableHeaderPosition
Tells the parser where to locate and resolve the table header.
The method uses the Bakame\HtmlTable\Section enum to designate which table section to use
to resolve the header.
If Section::tr is used, tr tags will be used independently of their section.
The second argument is the table header tr offset; it defaults to 0 (i.e. the first row).
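As a sketch, a table whose rows are not wrapped in any section can use its first tr as the header. Section::tr is spelled as in the text above; Section::thead is assumed to follow the same spelling, so check the enum in your installed version.

```php
<?php

declare(strict_types=1);

use Bakame\HtmlTable\Parser;
use Bakame\HtmlTable\Section;

// Assumption: Composer's autoloader lives in ./vendor.
require 'vendor/autoload.php';

// Illustrative table without thead, tbody or tfoot.
$html = <<<HTML
<table>
  <tr><th>city</th><th>country</th></tr>
  <tr><td>Lagos</td><td>Nigeria</td></tr>
</table>
HTML;

// Use the first tr of the table, whatever its section, as the header.
$table = Parser::new()
    ->tableHeaderPosition(Section::tr, 0)
    ->parseHtml($html);

$header = $table->getHeader();

// Assumed case name: take the second row of the thead section instead.
$secondTheadRow = Parser::new()->tableHeaderPosition(Section::thead, 1);
```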
ignoreTableHeader and resolveTableHeader
Instruct the parser whether or not to resolve the table header using the tableHeaderPosition configuration.
If no resolution is done, no header will be included in the returned Table instance.
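The difference between the two settings can be sketched on an illustrative table:

```php
<?php

declare(strict_types=1);

use Bakame\HtmlTable\Parser;

// Assumption: Composer's autoloader lives in ./vendor.
require 'vendor/autoload.php';

$html = <<<HTML
<table>
  <thead><tr><th>player</th><th>goals</th></tr></thead>
  <tbody><tr><td>Ada</td><td>12</td></tr></tbody>
</table>
HTML;

// No header is resolved: records are plain lists.
$ignoring = Parser::new()->ignoreTableHeader();
$withoutHeader = $ignoring->parseHtml($html);

// Header resolution is switched back on from the previous configuration.
$withHeader = $ignoring->resolveTableHeader()->parseHtml($html);
```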
tableHeader
You can directly specify the header of your table with this setting.
If you specify a non-empty array as the table header, it will take precedence over any other table header related option.
Because it is tabular data, each header cell MUST be unique; otherwise an exception will be thrown.
You can skip source columns by omitting their offsets from the array keys, and re-arrange them by re-ordering the offsets.
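A sketch of both forms, with illustrative column names. In the second call the array keys are assumed to be 0-based source column offsets.

```php
<?php

declare(strict_types=1);

use Bakame\HtmlTable\Parser;

// Assumption: Composer's autoloader lives in ./vendor.
require 'vendor/autoload.php';

$html = <<<HTML
<table>
  <thead><tr><th>team</th><th>played</th><th>points</th></tr></thead>
  <tbody><tr><td>Lions</td><td>10</td><td>30</td></tr></tbody>
</table>
HTML;

// Replace the header found in the document with your own labels.
$renamed = Parser::new()
    ->tableHeader(['club', 'games', 'pts'])
    ->parseHtml($html);

// Keep only the third and first source columns, in that order.
$reordered = Parser::new()
    ->tableHeader([2 => 'pts', 0 => 'club'])
    ->parseHtml($html);

$first = iterator_to_array($reordered->getRecords(), false)[0];
```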
includeSection and excludeSection
Tells the parser which sections should be parsed, based on the Section enum.
By default, the thead section is not parsed. If a thead row is selected to be the header, it will
be parsed independently of this setting.
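As a sketch, excluding tfoot drops a totals row from the records. The case name Section::tfoot is assumed to follow the Section::tr spelling used earlier; the table content is illustrative.

```php
<?php

declare(strict_types=1);

use Bakame\HtmlTable\Parser;
use Bakame\HtmlTable\Section;

// Assumption: Composer's autoloader lives in ./vendor.
require 'vendor/autoload.php';

$html = <<<HTML
<table>
  <thead><tr><th>item</th><th>qty</th></tr></thead>
  <tbody>
    <tr><td>apple</td><td>3</td></tr>
    <tr><td>pear</td><td>5</td></tr>
  </tbody>
  <tfoot><tr><td>total</td><td>8</td></tr></tfoot>
</table>
HTML;

// Default configuration: every section except thead is parsed.
$default = Parser::new()->parseHtml($html);
$defaultCount = iterator_count($default->getRecords());

// Without tfoot, only the tbody rows remain.
$withoutFooter = Parser::new()
    ->excludeSection(Section::tfoot)
    ->parseHtml($html);
$bodyCount = iterator_count($withoutFooter->getRecords());
```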
⚠️Tips: to be sure of which sections will be modified, first remove all previous settings before applying your configuration as shown below:
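A sketch of the two calls being compared, assuming excludeSection accepts several sections at once and that the case is spelled Section::tbody. Section::cases() is the built-in PHP enum method listing every case.

```php
<?php

declare(strict_types=1);

use Bakame\HtmlTable\Parser;
use Bakame\HtmlTable\Section;

// Assumption: Composer's autoloader lives in ./vendor.
require 'vendor/autoload.php';

// First call: tbody is added on top of the default configuration.
$additive = Parser::new()
    ->includeSection(Section::tbody);

// Second call: every section is excluded first, then tbody alone is included.
$tbodyOnly = Parser::new()
    ->excludeSection(...Section::cases())
    ->includeSection(Section::tbody);
```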
The first call will still include the tfoot and the tr sections, whereas the second call
removes any previous setting, guaranteeing that only the tbody, if present, will be parsed.
withFormatter and withoutFormatter
Add or remove a record formatter applied to the data extracted from the table before you can access it. The header is not affected by the formatter if it is defined.
The formatter closure signature should be:
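In practice this is assumed to mean a callable that receives one record as an array and returns an array. The identity formatter below is a placeholder used only to show the shape and how it is attached or detached.

```php
<?php

declare(strict_types=1);

use Bakame\HtmlTable\Parser;

// Assumption: Composer's autoloader lives in ./vendor.
require 'vendor/autoload.php';

// Assumed shape: one record in, one record out.
$formatter = function (array $record): array {
    return $record;
};

$formatting = Parser::new()->withFormatter($formatter);

// Detach the formatter again; records are then returned as extracted.
$plain = $formatting->withoutFormatter();
```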
If a header was defined or specified, the submitted record will have the header definition set; otherwise an array list is provided.
The following formatter will work on any table content as long as it is defined as a string.
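One possible formatter of that kind lowercases every cell. The sample table is illustrative, and the first-class callable syntax requires PHP 8.1 or later.

```php
<?php

declare(strict_types=1);

use Bakame\HtmlTable\Parser;

// Assumption: Composer's autoloader lives in ./vendor.
require 'vendor/autoload.php';

$html = <<<HTML
<table>
  <thead><tr><th>Name</th></tr></thead>
  <tbody><tr><td>ALICE</td></tr></tbody>
</table>
HTML;

// Lowercase every cell; it does not rely on any column name.
$formatter = fn (array $record): array => array_map(strtolower(...), $record);

$table = Parser::new()->withFormatter($formatter)->parseHtml($html);

$header = $table->getHeader();
$first = iterator_to_array($table->getRecords(), false)[0];
```

The header keeps its original casing because the formatter is only applied to the records.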
The following formatter will only work if the table has a header attached to it with
a column named count.
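A sketch of such a header-dependent formatter, casting the count column to an integer. The table content is illustrative.

```php
<?php

declare(strict_types=1);

use Bakame\HtmlTable\Parser;

// Assumption: Composer's autoloader lives in ./vendor.
require 'vendor/autoload.php';

$html = <<<HTML
<table>
  <thead><tr><th>item</th><th>count</th></tr></thead>
  <tbody><tr><td>apple</td><td>3</td></tr></tbody>
</table>
HTML;

// Relies on the record being keyed by the header, with a "count" column.
$formatter = function (array $record): array {
    $record['count'] = (int) $record['count'];

    return $record;
};

$table = Parser::new()->withFormatter($formatter)->parseHtml($html);
$first = iterator_to_array($table->getRecords(), false)[0];
```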
ignoreXmlErrors and failOnXmlErrors
Tells whether the parser should ignore or throw in case of malformed HTML content.
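A sketch of both modes, assuming neither method takes an argument. The markup is deliberately sloppy, but whether libxml reports an error for it is not guaranteed, so the strict call is wrapped in a generic try/catch rather than naming a specific exception class.

```php
<?php

declare(strict_types=1);

use Bakame\HtmlTable\Parser;

// Assumption: Composer's autoloader lives in ./vendor.
require 'vendor/autoload.php';

// Illustrative markup containing a tag that is not valid HTML.
$html = '<table><thead><tr><th>a</th></tr></thead>'
    . '<tbody><tr><td><unknown-tag>1</unknown-tag></td></tr></tbody></table>';

// Default behaviour, made explicit.
$lenient = Parser::new()->ignoreXmlErrors();

// Strict mode: malformed content should make parsing fail.
$strict = Parser::new()->failOnXmlErrors();

try {
    $strict->parseHtml($html);
    $outcome = 'parsed';
} catch (Throwable $exception) {
    $outcome = 'rejected: ' . $exception->getMessage();
}
```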
Testing
The library:
- has a PHPUnit test suite
- has a coding style compliance test suite using PHP CS Fixer.
- has a code analysis compliance test suite using PHPStan.
To run the tests, run the following command from the project folder.
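Assuming the project's composer.json defines a script named test and the development dependencies are installed, the command would look like this:

```shell
composer test
```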
Security
If you discover any security related issues, please email [email protected] instead of using the issue tracker.
Credits
License
The MIT License (MIT). Please see License File for more information.
All versions of html-table with dependencies
ext-libxml Version *
ext-mbstring Version *
ext-simplexml Version *
bakame/aide-enums Version ^0.1.0
bakame/aide-error Version ^0.2.0
league/csv Version ^9.23.0