Download the PHP package jpuck/etl without Composer

On this page you can find all versions of the php package jpuck/etl. It is possible to download/install these versions without Composer. Possible dependencies are resolved automatically.


Information about the package etl

PHP tools for ETL

This is a collection of PHP 7 classes useful for extracting, transforming, and loading data between sources.

Hierarchical XML and JSON can be automatically converted to relational SQL. Support includes extracting data documents from a file system or REST API, and then loading the data into a DBMS like Microsoft SQL Server.

Values are surveyed for datatypes, numeric cardinality, and unique natural key candidates. Then this information is used to create a normalized multi-table database structure suited to insert the data.


Requirements

PHP >= 7.0

Installation

This library is registered on Packagist and can be installed into your project using Composer:

composer require jpuck/etl

Getting Started

There are 3 basic groups of interrelated classes: Sources provide Data which have Schemata.

  1. Sources

    Sources extend the abstract Source class and transport Datum objects. In particular, the abstract DB class has concrete class implementations such as MicrosoftSQLServer.

  2. Data

    Data classes extend Datum and must implement a valid parser, satisfied by ParseValidator. Each Datum uses the Schematizer to build its Schema from the raw data; this can be overridden by passing an existing Schema.

  3. Schemata

    A Schema is a concrete class with a Validator to enforce structure. The Merger class combines Schemas to create super-set Schemas. The DDL trait is used by the DB class to generate SQL Data Definition Language; it declares abstract methods to be implemented for a specific database management system.

Schematizer

The Schematizer class is for surveying the structure of the data. It includes node names, the count of distinct element groupings, numeric cardinality for relationships between subnodes, and descriptive statistics about the values including uniqueness. Categorically, it recognizes datetime, integer, and decimal datatypes. Decimals will include scale and precision measurements suitable for SQL.

Schematizer::getPrecision returns the scale and precision of numeric values, suitable for the SQL DECIMAL(precision,scale) datatype. Note one behavior in particular: trailing zeros are discarded when a value is passed as a raw PHP float, but preserved in the scale and precision when the value is passed as a string. See SchematizerUtilitiesTest::precisionDataProvider for examples. In the XML class, parsed values are represented as PHP strings, so trailing zeros are reflected in the scale and precision values.
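The float-versus-string difference comes from PHP itself and can be reproduced without the library. The sketch below is a hypothetical helper (sketchPrecision is not part of jpuck/etl and is not how Schematizer::getPrecision is implemented); it assumes plain decimal notation without exponents and simply counts digits on each side of the decimal point.

```php
<?php
// Hypothetical helper for illustration; NOT the library's getPrecision.
// Assumes plain decimal notation (no exponent such as 1.0E+25).
function sketchPrecision($value): array
{
    // Casting a float to string drops trailing zeros: 12.50 becomes "12.5".
    // A string such as '12.50' is left untouched.
    $text  = ltrim((string) $value, '+-');
    $parts = explode('.', $text, 2);

    // Leading zeros in the integer part carry no precision.
    $integerDigits = strlen(ltrim($parts[0], '0'));
    $scale         = isset($parts[1]) ? strlen($parts[1]) : 0;

    // SQL precision counts all digits; it must be at least 1.
    return [
        'precision' => max(1, $integerDigits + $scale),
        'scale'     => $scale,
    ];
}

$fromFloat  = sketchPrecision(12.50);   // zeros lost before counting
$fromString = sketchPrecision('12.50'); // zeros preserved

printf("float:  DECIMAL(%d,%d)\n", $fromFloat['precision'], $fromFloat['scale']);
printf("string: DECIMAL(%d,%d)\n", $fromString['precision'], $fromString['scale']);
```

With these inputs the float yields a precision of 3 and a scale of 1, while the string yields 4 and 2, which is why values that must keep their trailing zeros should stay strings.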

node name
├── count
│   ├── max
│   │   ├── measure
│   │   └── value
│   └── min
│       ├── measure
│       └── value
├── unique (all values)
├── primaryKey
├── varchar          ────┐
│   ├── max              │
│   │   ├── measure      │
│   │   └── value        │
│   └── min              │
│       ├── measure      │
│       └── value        │
├── datetime             ├ datatypes
│   ├── max              │
│   │   └── value        │
│   └── min              │
│       └── value        │
├── int/decimal          │
│   ├── max              │
│   │   └── value        │
│   └── min              │
│       └── value    ────┘
├── scale            ────┐
│   ├── max              │
│   │   ├── measure      │
│   │   └── value        │
│   └── min              │
│       ├── measure      │
│       └── value        │
├── precision            ├ if decimal
│   ├── max              │
│   │   ├── measure      │
│   │   └── value        │
│   └── min              │
│       ├── measure      │
│       └── value    ────┘
├── children
│   ├── distinct (count of children)
│   └── count
│       ├── max
│       │   └── measure
│       └── min
│           └── measure
├── attributes
│   └── ... (excluding count, which must be 1)
└── elements
    └── ...
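To make the tree above more concrete, here is a minimal sketch of how a few of its branches (unique, varchar, and int) could be surveyed for a flat list of string values. The function name surveyValues and its exact output keys are assumptions made for illustration; the library's Schematizer covers far more (datetime, decimal, children, attributes) and may compute these differently.

```php
<?php
// Hypothetical survey of one node's values; NOT the library's Schematizer.
function surveyValues(array $values): array
{
    if ($values === []) {
        throw new InvalidArgumentException('At least one value is required.');
    }

    // "unique" is true only when no value repeats.
    $survey = ['unique' => count(array_unique($values)) === count($values)];

    // varchar: record the shortest and longest strings and their lengths.
    $byLength = $values;
    usort($byLength, function ($a, $b) {
        return strlen($a) <=> strlen($b);
    });
    $shortest = $byLength[0];
    $longest  = $byLength[count($byLength) - 1];
    $survey['varchar'] = [
        'max' => ['measure' => strlen($longest),  'value' => $longest],
        'min' => ['measure' => strlen($shortest), 'value' => $shortest],
    ];

    // int: only reported when every value looks like an integer.
    $integers = array_filter($values, function ($value) {
        return preg_match('/^-?\d+$/', $value) === 1;
    });
    if (count($integers) === count($values)) {
        $numbers = array_map('intval', $values);
        $survey['int'] = [
            'max' => ['value' => max($numbers)],
            'min' => ['value' => min($numbers)],
        ];
    }

    return $survey;
}

$survey = surveyValues(['7', '42', '1000']);
print_r($survey);
```

A column that is unique across all surveyed values is a natural key candidate, and the measured maximum length gives a lower bound for the varchar size of the generated column.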

Database Connections

The DB class requires an instance of PDO in the constructor to connect, but it is possible to pass a null value if only utilizing the class for DDL.
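The pattern of accepting either a PDO instance or null can be sketched with a stand-in class. DdlOnlySketch and its methods are hypothetical and are not the library's DB class or its signatures; the point is only that generating DDL text needs no connection, while executing statements does.

```php
<?php
// Stand-in class for illustration; NOT the jpuck/etl DB class.
class DdlOnlySketch
{
    private $pdo;

    /** @param PDO|null $pdo Connection, or null for DDL generation only. */
    public function __construct($pdo)
    {
        if ($pdo !== null && !$pdo instanceof PDO) {
            throw new InvalidArgumentException('Expected a PDO instance or null.');
        }
        $this->pdo = $pdo;
    }

    // Building DDL is pure string work, so no connection is required.
    public function createTable(string $table, array $columns): string
    {
        $definitions = [];
        foreach ($columns as $name => $type) {
            $definitions[] = "    [$name] $type";
        }
        return "CREATE TABLE [$table] (\n" . implode(",\n", $definitions) . "\n);";
    }

    // Running SQL does require a connection.
    public function execute(string $sql)
    {
        if ($this->pdo === null) {
            throw new LogicException('No PDO connection was supplied.');
        }
        return $this->pdo->exec($sql);
    }
}

$db  = new DdlOnlySketch(null);
$ddl = $db->createTable('orders', ['id' => 'INT', 'total' => 'DECIMAL(6,2)']);
echo $ddl, "\n";
```

Passing null is convenient when the generated SQL is only reviewed or saved to a file rather than run against a server.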

SQL Data Definition Language

When one-to-many XML nodes are used to represent one-to-one relationships, the Schematizer recognizes this and a DDL class flattens them into columns on a table. If a node name occurs more than once, or the node has grandchildren, the one-to-many relationship is preserved in a separate normalized table. Surrogate keys are created to maintain primary/foreign key referential integrity.

If the Schema has a primaryKey set, then that field will be used for DDL generation instead of the surrogate. However, this Schema must also be passed to the Datum constructor prior to being used with DB::insert, otherwise the surrogate keys will be used by default and will result in a failed insertion if the surrogate columns don't exist.

Saving Schemas

Generating Schemas can take a long time and may require customization, such as adding primaryKey flags or removing unwanted fields to be ignored. Here are some examples for exporting and importing:

By simply echoing the object, output can be redirected to a file from console:

php script.php > myschema.json

Use file_put_contents to write to disk. Schema::toJSON accepts all the json_encode options.

Import any of those formats the same way by passing the filename, an array, or a JSON string to the constructor.
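The export and import round trip described above can be sketched with a stand-in class. SchemaSketch is hypothetical and only mimics the behavior named in the text (echoing the object, a toJSON method that forwards json_encode options, and a constructor that accepts a filename, an array, or a JSON string); the real Schema class also validates its structure and may differ in detail.

```php
<?php
// Stand-in for illustration only; NOT the jpuck/etl Schema class.
class SchemaSketch
{
    private $schema;

    // Accepts an array, a JSON string, or the name of a file holding JSON.
    public function __construct($source)
    {
        if (is_string($source)) {
            $json   = is_file($source) ? file_get_contents($source) : $source;
            $source = json_decode($json, true);
        }
        if (!is_array($source)) {
            throw new InvalidArgumentException('Expected array, JSON, or filename.');
        }
        $this->schema = $source;
    }

    // Forwards json_encode options such as JSON_PRETTY_PRINT.
    public function toJSON(int $options = 0): string
    {
        return json_encode($this->schema, $options);
    }

    // Lets "echo $schema;" emit JSON so console output can be redirected.
    public function __toString(): string
    {
        return $this->toJSON(JSON_PRETTY_PRINT);
    }
}

// Export: write the schema to disk with pretty printing.
$schema = new SchemaSketch(['name' => 'order', 'primaryKey' => 'id']);
$file   = sys_get_temp_dir() . '/myschema.json';
file_put_contents($file, $schema->toJSON(JSON_PRETTY_PRINT));

// Import: from the saved file, or from a JSON string.
$fromFile = new SchemaSketch($file);
$fromJson = new SchemaSketch((string) $schema);
```

Saving the schema once and reloading it on later runs avoids repeating a slow survey, and the saved JSON can be edited by hand before it is imported again.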

Override the internal Schematizer by passing a Schema to the Datum constructor.

You can also pass the Schema override through Source::fetch.


Development

The development dependencies can be installed by running Composer with or without the --dev option (enabled by default):

composer install --dev

Testing

Tests are written for PHPUnit, which is included as a Composer dev-dependency. To run the whole test suite, execute this command from the shell:

php vendor/bin/phpunit

You might also be interested in an easy to read checklist output:

php vendor/bin/phpunit --testdox

When stepping through breakpoints in an IDE such as NetBeans, it's helpful to see the current test name in the output by setting the run configuration to debug:

php vendor/bin/phpunit --debug

A code coverage report is available if you have the Xdebug extension installed. In addition to the console text summary, a full HTML report is generated in the coverage folder. The easiest way to view it is to boot up a dev server:

php -S localhost:8080 -t coverage/

Database Testing

You must create the file (or symbolic link) tests/data/pdos/pdo.php in order to run the database tests. This should simply return a PDO instance, for example:
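A minimal sketch of such a file follows. Every connection value here is a placeholder, and the sqlsrv DSN assumes the pdo_sqlsrv driver is installed; other drivers use a different DSN prefix.

```php
<?php
// tests/data/pdos/pdo.php
// All connection values below are placeholders, not project defaults.
$dsn      = 'sqlsrv:Server=localhost;Database=etl_test';
$username = 'etl_tester';
$password = 'change-me';

$pdo = new PDO($dsn, $username, $password);

// Surface SQL errors as exceptions so failing tests are easy to diagnose.
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

return $pdo;
```

Because the file contains credentials, it is sensible to keep it out of version control or to point a symbolic link at a file stored elsewhere.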


All versions of etl with dependencies

Requires:
  • php ^7.0
  • sabre/xml ^1.4
  • jpuck/phpdev ^1.3