Download the PHP package keboola/db-import-export without Composer
On this page you can find all versions of the php package keboola/db-import-export. It is possible to download/install these versions without Composer. Possible dependencies are resolved automatically.
Package db-import-export
Short Description: Package for importing files to Snowflake from multiple cloud storages
License MIT
Information about the package db-import-export
DB Import export library
Supported operations
- Load/Import CSV from ABS to Snowflake or Synapse
- Load/Import CSV from GCS to BigQuery
- Unload/Export table from Snowflake or Synapse to ABS
Features
Import
- Full load - destination table is truncated before load
- Incremental load - data are merged
- Primary key dedup for all engines
- Convert empty values to NULL (using convertEmptyValuesToNull option)
Export
- Full unload - destination CSV is always rewritten
Development
Docker
Prepare .env (a copy of .env.dist) and set up AWS keys that have access to the keboola-drivers bucket in order to build this image. Also add this user to the ci-php-import-export-lib group, which will allow you to work with the newly created bucket for tests. The user can be created in Dev - Main legacy, which also contains the groups for keboola-drivers and ci-php-import-export-lib.
If you don't have access to keboola-drivers, you have to change the Dockerfile.
- Comment out the first stage, which downloads the Teradata driver and tools, and supply your own copies downloaded from the Teradata site:
  - Tools: https://downloads.teradata.com/download/tools/teradata-tools-and-utilities-linux-installation-package-0
  - Driver: https://downloads.teradata.com/download/connectivity/odbc-driver/linux
- Replace the COPY --from=td commands in the Dockerfile with copies of your local Teradata packages
Then run docker compose build
The AWS credentials must also have access to the bucket specified in AWS_S3_BUCKET. This bucket has to contain the testing data. Run docker compose run --rm dev composer loadS3 to load it.
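Before building, it can help to confirm that .env actually contains the values the steps above rely on. The following is a minimal sketch and not part of the library: missing_env_vars is a hypothetical helper name, and it assumes .env uses plain NAME=value lines without quoting or export prefixes.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the repository).
# Prints the name of every requested variable that is absent from the
# env file or present with an empty value, one name per line.
missing_env_vars() {
    env_file=$1
    shift
    for name in "$@"; do
        # A variable counts as set only if a NAME=value line with a
        # non-empty value exists in the file.
        if ! grep -Eq "^${name}=.+" "$env_file"; then
            printf '%s\n' "$name"
        fi
    done
}

# Example usage (commented out so that sourcing this file runs nothing):
#   missing_env_vars .env AWS_S3_BUCKET
#   docker compose build
#   docker compose run --rm dev composer loadS3
```

If the function prints nothing, every listed variable has a value; any other variable name from this README can be passed in the same way.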
Preparation
Azure
- Create a storage account; a template can be found in provisioning ABS create template
- Create a container in the storage account: Blob service -> Containers. Note: for tests this step can be skipped; the container is created by the loadAbs command
- Fill env variables in the .env file
- Upload test fixtures to ABS: docker compose run --rm dev composer loadAbs
Google cloud storage
- Create a bucket in GCS and set the bucket name in the .env variable GCS_BUCKET_NAME
- Create a service account in IAM
- In the bucket permissions, grant the service account admin access to the bucket
- Create a new service account key
- Convert the key to a string: awk -v RS= '{$1=$1}1' <key_file>.json >> .env (or cat file.json | jq -c | jq -R)
- Set the content on the last line of .env as the variable GCS_CREDENTIALS
- Upload test fixtures to GCS: docker compose run --rm dev composer loadGcs-bigquery or docker compose run --rm dev composer loadGcs-snowflake (depending on backend)
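The two manual key steps above (flatten the key, then name the last line of .env) can be combined into one. This is a minimal sketch, assuming the key file contains no blank lines and that .env accepts an unquoted NAME=value line; append_key_var is a hypothetical helper, not something shipped with the repository.

```shell
#!/bin/sh
# Hypothetical helper: flatten a JSON key file to a single line and
# append it to an env file as NAME=<json>.
# Arguments: <key_file> <variable_name> <env_file>
append_key_var() {
    key_file=$1
    var_name=$2
    env_file=$3
    # Same awk call as in the step above: paragraph mode (RS=) reads the
    # whole file as one record, and $1=$1 rebuilds it with single spaces.
    one_line=$(awk -v RS= '{$1=$1}1' "$key_file")
    printf '%s=%s\n' "$var_name" "$one_line" >> "$env_file"
}

# Example usage (commented out so that sourcing this file runs nothing):
#   append_key_var key.json GCS_CREDENTIALS .env
```

Whether the value needs extra quoting depends on how .env is parsed in your setup; the jq -c | jq -R variant mentioned above produces a quoted JSON string instead.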
SNOWFLAKE
Role, user, database and warehouse are required for tests. You can create them:
SYNAPSE
Create a Synapse server in the Azure portal or using the CLI.
Set up the env variables SYNAPSE_UID, SYNAPSE_PWD, SYNAPSE_DATABASE and SYNAPSE_SERVER.
Run query:
this will create master key for polybase.
Managed Identity
Managed Identity is required when using ABS in a VNet. How to set up and use Managed Identity is described in the docs.
TL;DR: In the IAM of ABS, add the role assignment "Storage Blob Data {Reader or Contributor}" to your Synapse server principal.
Exasol
You can run Exasol locally in Docker or you can use SaaS.
Exasol locally in Docker
Run Exasol on your local machine in docker (for this case .env is preconfigured)
Run Exasol server somewhere else and set up env variables:
Exasol in SaaS
Log in to the SaaS UI (or use a local client) and create a user with the following grants.
Obtain the host (with port), username and password (from the previous step) and fill them in .env as described above. Make sure that your account has network access enabled for your IP.
Teradata
Prepare Teradata servers on AWS/Azure and set following properties. See
create new database for tests:
Bigquery
Install Google Cloud client (via Brew), initialize it and log in to generate default credentials.
To prepare the backend you can use the Terraform template. You must have the resourcemanager.folders.create permission for the organization.
Run terraform apply with the following variables:
- folder_id: Go to GCP Resource Manager and select your team dev folder ID (e.g. find 'KBC Team Dev' and copy ID)
- backend_prefix: your_name, all resources will be created with this prefix
- billing_account_id: Go to Billing and copy your Billing account ID
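One way to supply these variables without retyping them on each run is a terraform.tfvars file, which Terraform loads automatically from the working directory. This is a minimal sketch, assuming all three variables are declared as strings in the template; write_tfvars is a hypothetical helper and the values in the usage comment are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: write the three variables named above to a tfvars file.
# Arguments: <out_file> <folder_id> <backend_prefix> <billing_account_id>
write_tfvars() {
    out_file=$1
    # Unquoted heredoc delimiter, so $2..$4 are expanded; the double
    # quotes are written literally (assumes string-typed variables).
    cat > "$out_file" <<EOF
folder_id          = "$2"
backend_prefix     = "$3"
billing_account_id = "$4"
EOF
}

# Example usage from the directory holding the Terraform template
# (commented out so that sourcing this file runs nothing):
#   write_tfvars terraform.tfvars 123456789 your_name 000000-000000-000000
#   terraform apply
```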
For missing pieces see Connection repository. After terraform apply ends go to the service project in folder created by terraform.
- Convert the key to a string and save it to the .env file: awk -v RS= '{$1=$1}1' <key_file>.json >> .env
- Set the content on the last line of .env as the variable BQ_KEY_FILE
- Set the env variable BQ_BUCKET_NAME to the file_storage_bucket_id value generated by the TF template
Tests
Run tests with the following command.
Note: Azure credentials must be provided and fixtures uploaded.
Unit and functional tests can be run separately
Code quality check
Full CI workflow
This command will run all checks, load fixtures and run tests
Usage
Snowflake
ABS -> Snowflake import/load
Snowflake -> Snowflake copy
Snowflake -> ABS export/unload
Synapse next (experimental)
Import to Synapse
Internals/Extending
The library consists of a few simple interfaces.
Create new backend
The Importer and Exporter interfaces must be implemented in the new backend.
For each backend there is a corresponding adapter which supports its own combination of SourceInterface and DestinationInterface. Custom adapters can be set with the setAdapters method.
Create new storage
Storage is currently either file storage (ABS, with S3 planned for the future) or table storage (Snowflake|Synapse).
Storage can have a Source and a Destination, which must implement SourceInterface or DestinationInterface. These interfaces are empty and it is up to the adapter to support its own combination.
In general there is one Import/Export adapter per FileStorage <=> TableStorage combination.
Adapter must implement:
- Keboola\Db\ImportExport\Backend\BackendImportAdapterInterface for import
- Keboola\Db\ImportExport\Backend\BackendExportAdapterInterface for export
A backend can require its own extended AdapterInterface (Synapse and Snowflake currently do).
License
MIT licensed, see LICENSE file.
All versions of db-import-export with dependencies
ext-json Version *
ext-pdo Version *
doctrine/dbal Version ^3.3
google/cloud-bigquery Version ^1.23
google/cloud-storage Version ^1.27
keboola/csv-options Version ^1
keboola/php-csv-db-import Version ^6
keboola/php-datatypes Version ^7.6
keboola/php-file-storage-utils Version ^0.2.2
keboola/php-temp Version ^2.0
keboola/table-backend-utils Version >=2.7
microsoft/azure-storage-blob Version ^1.4
symfony/process Version ^4.4|^5.0|^6.0