Ddeboer Data Import library
This library has been renamed to PortPHP and will be deprecated. Please use PortPHP instead.
Introduction
This PHP library offers a way to read data from, and write data to, a range of file formats and media. Additionally, it includes tools to manipulate your data.
Features
- Read from and write to CSV files, Excel files, databases, and more.
- Convert between charsets, dates, strings and objects on the fly.
- Build reusable and extensible import workflows.
- Decoupled components that you can use on their own, such as a CSV and Excel reader and writer.
- Well-tested code.
Documentation
- Installation
- Usage
  - The workflow
  - The workflow result
  - Readers
    - ArrayReader
    - CsvReader
    - DbalReader
    - DoctrineReader
    - ExcelReader
    - OneToManyReader
    - Create a reader
  - Writers
    - ArrayWriter
    - CsvWriter
    - DoctrineWriter
    - PdoWriter
    - ExcelWriter
    - ConsoleTableWriter
    - ConsoleProgressWriter
    - CallbackWriter
    - AbstractStreamWriter
    - StreamMergeWriter
    - Create a writer
  - Filters
    - CallbackFilter
    - OffsetFilter
    - DateTimeThresholdFilter
    - ValidatorFilter
  - Converters
    - Item converters
      - MappingItemConverter
      - Create an item converter
      - CallbackItemConverter
    - Value converters
      - StringToDateTimeValueConverter
      - DateTimeToStringValueConverter
      - ObjectConverter
      - StringToObjectConverter
      - ArrayValueConverterMap
      - CallbackValueConverter
      - MappingValueConverter
  - Examples
    - Import CSV file and write to database
    - Export to CSV file
- Running the tests
- License
Installation
This library is available on Packagist as ddeboer/data-import. The recommended way to install it is through Composer, by requiring that package.
Then include Composer’s autoloader (vendor/autoload.php) in your script.
For integration with Symfony2 projects, the DdeboerDataImportBundle is available.
Usage
Broadly speaking, you can use this library in two ways:
- organize your import process around a workflow, or
- use one or more of the components on their own, such as readers, writers or converters.
The workflow
Each data import revolves around the workflow and takes place along the following lines:
- Construct a reader.
- Construct a workflow and pass the reader to it. Optionally, pass a logger as the second argument. Add at least one writer to the workflow.
- Optionally, add filters, item converters and value converters to the workflow.
- Process the workflow. This will read the data from the reader, filter and convert the data, and write the output to each of the writers. The process method also returns a Result object which contains various information about the import.
In other words, the workflow acts as a mediator between a reader and one or more writers, filters and converters.
Optionally, you can skip items on failure by calling $workflow->setSkipItemOnFailure(true). Errors will be logged if you have passed a logger to the workflow constructor.
Schematically:
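As a rough, library-free illustration of this flow, the sketch below passes each item from a reader through filters, then converters, then every writer. The runWorkflow() function and the plain callables standing in for filters, converters and writers are hypothetical and only mirror the steps described above; they are not the library's Workflow class.

```php
<?php
// Library-free sketch of the data flow; runWorkflow() is a hypothetical
// helper, not part of ddeboer/data-import.
function runWorkflow(iterable $reader, array $filters, array $converters, array $writers): int
{
    $written = 0;
    foreach ($reader as $item) {
        // A filter returning false rejects the item.
        foreach ($filters as $filter) {
            if (!$filter($item)) {
                continue 2; // move on to the next item from the reader
            }
        }
        // Each converter receives the item and returns the converted item.
        foreach ($converters as $converter) {
            $item = $converter($item);
        }
        // Every writer receives the same converted item.
        foreach ($writers as $writer) {
            $writer($item);
        }
        $written++;
    }

    return $written;
}

$output = [];
$count = runWorkflow(
    [['name' => 'james'], ['name' => ''], ['name' => 'miles']],
    [fn (array $item): bool => $item['name'] !== ''],
    [fn (array $item): array => ['name' => ucfirst($item['name'])]],
    [function (array $item) use (&$output): void { $output[] = $item; }]
);
```

Here the item with an empty name is filtered out, so two items reach the writer.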
The workflow Result
The Workflow Result object exposes various methods which you can use to decide what to do after an import.
The result will be an instance of Ddeboer\DataImport\Result. It is automatically created and populated by the Workflow, and is returned to you after calling the process() method on the Workflow.
The Result provides the following methods:
Example use cases:
- You want to send an e-mail with the results of the import
- You want to send a text alert if a particular file failed
- You want to move an import file to a failed directory if there were errors
- You want to log how long imports are taking
Readers
Readers read data that will be imported by iterating over it. This library includes a handful of readers. Additionally, you can easily implement your own.
You can use readers on their own, or construct a workflow from them:
ArrayReader
Reads arrays. Most useful for testing your workflow.
CsvReader
Reads CSV files, and is optimized to use as little memory as possible.
Optionally construct with different delimiter, enclosure and/or escape character:
Then iterate over the CSV file:
Column headers
If one of your rows contains column headers, you can read them to make the rows associative arrays:
Strict mode
The CSV reader operates in strict mode by default. If the reader encounters a row where the number of values differs from the number of column headers, an error is logged and the row is skipped. Retrieve the errors with getErrors().
To disable strict mode, call $reader->setStrict(false) after you instantiate the reader.
Disabling strict mode means:
- Any rows that contain fewer values than the column headers are simply padded with null values.
- Any additional values in a row that contains more values than the column headers are ignored.
Examples where this is useful:
- Outlook 2010, which omits trailing blank values
- Google Contacts, which exports more values than there are column headers
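To make the non-strict behaviour concrete, here is a minimal plain-PHP sketch of the padding and truncation described above. The normalizeRow() helper and the sample data are hypothetical; this only imitates the documented result and is not how CsvReader is implemented.

```php
<?php
// normalizeRow() is a hypothetical helper illustrating non-strict mode.
function normalizeRow(array $headers, array $row): array
{
    $count = count($headers);
    // Pad short rows with null, then drop any surplus trailing values.
    $row = array_slice(array_pad($row, $count, null), 0, $count);

    return array_combine($headers, $row);
}

$headers = ['id', 'name', 'email'];
$short = normalizeRow($headers, [1, 'James']);
$long = normalizeRow($headers, [2, 'Miles', 'miles@example.com', 'extra']);
```

The short row gains a null email, and the surplus value in the long row is dropped.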
Duplicate headers
Sometimes a CSV file contains duplicate column headers, for instance:
| id | details | details |
|---|---|---|
| 1 | bla | more bla |
By default, a DuplicateHeadersException will be thrown if you call setHeaderRowNumber(0) on this file. You can handle duplicate columns in one of three ways:
- call setColumnHeaders(['id', 'details', 'details_2']) to specify your own headers
- call setHeaderRowNumber with the CsvReader::DUPLICATE_HEADERS_INCREMENT flag to generate incremented headers; in this case: id, details and details1
- call setHeaderRowNumber with the CsvReader::DUPLICATE_HEADERS_MERGE flag to merge duplicate values into arrays; in this case, the first row’s values will become: [ 'id' => 1, 'details' => [ 'bla', 'more bla' ] ].
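The two flag-based strategies can be pictured with the following plain-PHP sketch. Both helper functions are hypothetical and only reproduce the results stated above for the sample file; they are not the reader's internal code.

```php
<?php
// Both helpers are hypothetical; they mimic the documented results only.
function incrementDuplicateHeaders(array $headers): array
{
    $seen = [];
    $result = [];
    foreach ($headers as $header) {
        if (!isset($seen[$header])) {
            $seen[$header] = 0;
            $result[] = $header; // first occurrence keeps its name
        } else {
            $seen[$header]++;
            $result[] = $header . $seen[$header]; // details1, details2, ...
        }
    }

    return $result;
}

function mergeDuplicateHeaders(array $headers, array $row): array
{
    $counts = array_count_values($headers);
    $item = [];
    foreach ($headers as $index => $header) {
        if ($counts[$header] > 1) {
            $item[$header][] = $row[$index]; // collect duplicates in an array
        } else {
            $item[$header] = $row[$index];
        }
    }

    return $item;
}

$headers = ['id', 'details', 'details'];
$incremented = incrementDuplicateHeaders($headers);
$merged = mergeDuplicateHeaders($headers, [1, 'bla', 'more bla']);
```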
DbalReader
Reads data through Doctrine’s DBAL. Your project should include Doctrine’s DBAL package:
DoctrineReader
Reads data through the Doctrine ORM:
ExcelReader
Acts as an adapter for the PHPExcel library. Make sure to include that library in your project:
Then use the reader to open an Excel file:
To set the row number that headers will be read from, pass a number as the second argument.
To read a specific sheet:
OneToManyReader
Merges two data sources (using existing readers), for example when you have one CSV with orders and another with order items.
Imagine two CSVs like the following:
You want to associate the items with their order. Using the OneToManyReader, you can nest the item rows inside the order under a key that you specify.
The code would look something like:
The third parameter is the key under which the order item data will be nested. This will be an array of order items. The fourth and fifth parameters are the "primary" and "foreign" keys of the data. The OneToManyReader will try to match the data using these keys. Taking the CSVs given above as an example, you would expect Order "1" to have the first two order items associated with it, because their order IDs are also "1".
Note: You can omit the last parameter if both files use the same field name. For example, if parameter 4 is 'OrderId' and you don't specify parameter 5, the reader will look for the foreign key using 'OrderId'.
The resulting data will look like:
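With made-up order data, the nesting can be sketched in plain PHP without the library. The nestOneToMany() helper, the field names other than 'OrderId', and all values are hypothetical; the function only imitates the matching rule described above, including the fallback to the same key name when the last parameter is omitted.

```php
<?php
// nestOneToMany() is a hypothetical helper; the data is illustrative only.
function nestOneToMany(
    array $left,
    array $right,
    string $nestKey,
    string $leftKey,
    ?string $rightKey = null
): array {
    $rightKey = $rightKey ?? $leftKey; // same field name on both sides by default
    $result = [];
    foreach ($left as $row) {
        $row[$nestKey] = [];
        foreach ($right as $child) {
            if ($child[$rightKey] == $row[$leftKey]) {
                $row[$nestKey][] = $child; // nest matching child rows
            }
        }
        $result[] = $row;
    }

    return $result;
}

$orders = [
    ['OrderId' => 1, 'Price' => 30],
    ['OrderId' => 2, 'Price' => 15],
];
$items = [
    ['OrderId' => 1, 'Name' => 'Item A'],
    ['OrderId' => 1, 'Name' => 'Item B'],
    ['OrderId' => 2, 'Name' => 'Item C'],
];
$nested = nestOneToMany($orders, $items, 'items', 'OrderId');
```

In this sketch, the first order ends up with two nested items and the second order with one.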
Create a reader
You can create your own data reader by implementing the Reader Interface.
Writers
ArrayWriter
Resembles the ArrayReader. Probably most useful for testing your workflow.
CsvWriter
Writes CSV files:
DoctrineWriter
Writes data through Doctrine:
By default, DoctrineWriter will truncate your data before running the workflow. Call disableTruncate() if you don't want this.
If you are not truncating data, DoctrineWriter will try to find an entity whose primary key is set to the value of the first column of the item. If it finds one, the entity will be updated; otherwise, a new entity is inserted. You can tell DoctrineWriter to look up the entity using different columns of your item by passing a third parameter to its constructor.
or
The DoctrineWriter will also search out associations automatically and link them by an entity reference. For example, suppose you are importing a Product entity that must be associated with a Category. If there is a field in the import file named 'Category' with an id, the writer will use metadata to get the association class and create a reference so that it can be associated properly. The DoctrineWriter will skip any association fields that are already objects, which covers cases where a converter was used to retrieve the association.
PdoWriter
Use the PDO writer for importing data into a relational database (such as MySQL, SQLite or MS SQL) without using Doctrine.
ExcelWriter
Writes data to an Excel file. It requires the PHPExcel package:
You can specify the name of the sheet to write to:
You can open an already existing file and add a sheet to it:
If you wish to overwrite an existing sheet instead, specify the name of the existing sheet:
ConsoleTableWriter
This writer displays items as a table on console output for debugging purposes when you start the workflow from the command line. It requires Symfony’s Console component 2.5 or higher:
ConsoleProgressWriter
This writer displays import progress when you start the workflow from the command-line. It requires Symfony’s Console component:
There are various optional arguments you can pass to the ConsoleProgressWriter. These include the output format and the redraw frequency. You can read more about the options here.
You might want to set the redraw rate higher than the default, because updating the console text after every record processed by the Workflow can slow down the import/export process considerably.
Above we set the output format to 'debug' and the redraw rate to 100. This will only redraw the console progress text after every 100 records.
The debug format is the default as it displays ETAs and memory usage. You can use a simpler formatter if you wish:
CallbackWriter
Instead of implementing your own writer, you can use the quick solution the CallbackWriter offers:
AbstractStreamWriter
Instead of implementing your own writer from scratch, you can use AbstractStreamWriter as a basis: implement the method and you're done:
StreamMergeWriter
Suppose you have two stream writers that handle items differently depending on the value of one of the fields. You can then use StreamMergeWriter to call the appropriate writer for you.
The default field name is but could be changed with the method.
Create a writer
Build your own writer by implementing the Writer Interface.
Filters
A filter decides whether data input is accepted into the import process.
CallbackFilter
The CallbackFilter wraps your callback function that determines whether data should be accepted. The data input is accepted only if the function returns true.
OffsetFilter
OffsetFilter allows you to
- skip a certain number of items from the beginning
- process only a specified number of items (and skip the rest)
You can combine these two parameters to process a slice from the middle of the data, like rows 5-7 of a CSV file with ten rows.
OffsetFilter is configured by its constructor: new OffsetFilter($offset = 0, $limit = null). Note: $offset is a 0-based index.
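The offset and limit arithmetic for the "rows 5-7 of ten" case can be checked with plain PHP. This sketch uses SPL's LimitIterator as a stand-in, on the assumption that OffsetFilter's $offset and $limit behave the same way (skip $offset items, then accept $limit items); it does not use the OffsetFilter class itself.

```php
<?php
// Plain-PHP illustration using SPL's LimitIterator, not the OffsetFilter class.
$rows = range(1, 10); // stand-in for ten CSV rows, numbered 1 to 10
$offset = 4;          // 0-based: skip the first four rows
$limit = 3;           // then accept three rows

$slice = iterator_to_array(
    new LimitIterator(new ArrayIterator($rows), $offset, $limit),
    false // re-index the result from 0
);
```

With an offset of 4 and a limit of 3, the slice contains rows 5, 6 and 7.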
DateTimeThresholdFilter
This filter is useful if you want to do incremental imports. Specify a threshold DateTime instance, a column name (defaults to updated_at), and a DateTimeValueConverter that will be used to convert values read from the filtered items. Items strictly older than the threshold will be discarded.
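The "strictly older" rule means an item whose timestamp equals the threshold is kept. A minimal plain-PHP sketch of that comparison follows; the isFreshEnough() helper and the sample items are hypothetical, and the date parsing here uses PHP's own DateTimeImmutable rather than the library's converter.

```php
<?php
// isFreshEnough() is a hypothetical helper mimicking the filter's rule.
function isFreshEnough(array $item, DateTimeInterface $threshold, string $column = 'updated_at'): bool
{
    $value = new DateTimeImmutable($item[$column]);

    // Only items strictly older than the threshold are rejected.
    return $value >= $threshold;
}

$threshold = new DateTimeImmutable('2014-01-01 00:00:00');
$items = [
    ['id' => 1, 'updated_at' => '2013-12-31 23:59:59'],
    ['id' => 2, 'updated_at' => '2014-01-01 00:00:00'],
    ['id' => 3, 'updated_at' => '2014-06-15 12:00:00'],
];
$kept = array_values(array_filter(
    $items,
    fn (array $item): bool => isFreshEnough($item, $threshold)
));
```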
ValidatorFilter
It’s a common use case to validate the data before you save it to the database. This is exactly what the ValidatorFilter does. To use it, include Symfony’s Validator component in your project:
The ValidatorFilter works as follows:
The default behaviour for the validator is to collect all violations and skip each invalid row. If you want to stop on the first failing row you can call ValidatorFilter::throwExceptions(), which throws a ValidationException containing the line number and the violation list.
Item converters
MappingItemConverter
Use the MappingItemConverter to add mappings to your workflow. Your keys from the input data will be renamed according to these mappings. Say you have input data:
You can map the keys foo and baz in the following way:
Your output data will now be:
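As a library-free illustration of key renaming, the sketch below maps foo and baz to new names. The renameKeys() helper, the target names fooloo and bazinga, and the sample values are all made up for this example; they are not taken from the library.

```php
<?php
// renameKeys() is a hypothetical helper; the target names are made up.
function renameKeys(array $item, array $mappings): array
{
    $result = [];
    foreach ($item as $key => $value) {
        // Keys without a mapping are kept under their original name.
        $result[$mappings[$key] ?? $key] = $value;
    }

    return $result;
}

$input = ['foo' => 'bar', 'baz' => ['some' => 'value']];
$output = renameKeys($input, ['foo' => 'fooloo', 'baz' => 'bazinga']);
```

Only the top-level keys change; the values, including nested arrays, are carried over untouched.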
NestedMappingItemConverter
Use the NestedMappingItemConverter to add mappings to your workflow if the input data contains nested arrays. Your keys from the input data will be renamed according to these mappings. Say you have input data:
You can map the key another in the following way:
Your output data will now be:
Create an item converter
Implement ItemConverterInterface to create your own item converter:
CallbackItemConverter
Instead of implementing your own item converter, you can use a callback:
Value converters
Value converters are used to convert specific fields (e.g., columns in a database).
DateTimeValueConverter
There are two uses for the DateTimeValueConverter:
- Convert a date representation in a format you specify into a DateTime object.
- Convert a date representation in a format you specify into a different format.
Convert a date into a DateTime object.
If your date string is in one of the formats listed at http://www.php.net/manual/en/datetime.formats.date.php, you can omit the format parameter.
Convert a date string into a differently formatted date string.
If your date is in one of the formats listed at http://www.php.net/manual/en/datetime.formats.date.php, you can pass null as the first argument.
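Both uses boil down to PHP's own date handling. The following sketch shows the two conversions with DateTime::createFromFormat() and DateTime::format(); the convertDate() helper and the sample formats are hypothetical and stand in for the converter, whose constructor arguments are not shown here.

```php
<?php
// convertDate() is a hypothetical helper built on PHP's own DateTime API.
function convertDate(string $input, ?string $inputFormat = null, ?string $outputFormat = null)
{
    $date = $inputFormat === null
        ? new DateTime($input)                              // let PHP parse a standard format
        : DateTime::createFromFormat($inputFormat, $input); // parse a custom format

    if ($date === false) {
        throw new UnexpectedValueException("Cannot parse date: $input");
    }

    // Without an output format, hand back the DateTime object itself.
    return $outputFormat === null ? $date : $date->format($outputFormat);
}

$object = convertDate('14/10/2008 09:40:20', 'd/m/Y H:i:s');
$string = convertDate('14/10/2008 09:40:20', 'd/m/Y H:i:s', 'd-M-Y');
$parsed = convertDate('2008-10-14', null, 'd-M-Y');
```

The last call passes null as the input format, relying on PHP to recognise the standard date format on its own.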
DateTimeToStringValueConverter
The main use of DateTimeToStringValueConverter is to convert a DateTime object into its string representation in a given format. The default format is 'Y-m-d H:i:s'.
ObjectConverter
Converts an object into a scalar value. To use this converter, you must include Symfony’s PropertyAccess component in your project:
Using __toString()
If your object has a __toString() method, that value will be used:
Using object accessors
If your object has no __toString() method, its accessors will be called instead:
StringToObjectConverter
Looks up an object in the database based on a string value:
CallbackValueConverter
Use this if you want to save yourself the trouble of writing a dedicated class:
MappingValueConverter
Looks for a key in a hash you must provide in the constructor:
Examples
Import CSV file and write to database
This example shows how you can read data from a CSV file and write that to the database.
Assume we have the following CSV file:
And we want to write this data to a Doctrine entity:
Then you can import the CSV and save it as your entity in the following way.
Export to CSV file
This example shows how you can export data to a CSV file.
This will write a CSV file output.csv where the first names are capitalized:
ArrayValueConverterMap
The ArrayValueConverterMap is used to convert values inside a multi-level array.
The converters defined in the list are applied to the value of every data item whose key matches one of the defined array keys.
Running the tests
Clone this repository:
Install dev dependencies:
And run PHPUnit:
License
DataImport is released under the MIT license. See the LICENSE file for details.