PostgreSQL (transferwise
variant)
Table of Contents
- Alternative variants
- Getting Started
- Settings
-
Host (
host
) -
Port (
port
) -
User (
user
) -
Password (
password
) -
Database Name (
dbname
) -
SSL (
ssl
) -
Default Target Schema (
default_target_schema
) -
Batch Size Rows (
batch_size_rows
) -
Flush All Streams (
flush_all_streams
) -
Parallelism (
parallelism
) -
Max Parallelism (
parallelism_max
) -
Default Target Schema Select Permission (
default_target_schema_select_permission
) -
Schema Mapping (
schema_mapping
) -
Add Metadata Columns (
add_metadata_columns
) -
Hard Delete (
hard_delete
) -
Data Flattening Max Level (
data_flattening_max_level
) -
Primary Key Required (
primary_key_required
) -
Validate Records (
validate_records
) -
Temporary Directory (
temp_dir
)
-
Host (
- Looking for help?
The
target-postgres
Meltano loader
sends data into
PostgreSQL
after it was pulled from a source using an
extractor.
- Repository: https://github.com/transferwise/pipelinewise-target-postgres
- Maintainer: Wise
- Last update at: 26 Apr 2022
- Open issues and PRs : 16
- Stars: 12
-
-
Alternative variants #
Multiple
variants
of target-postgres
are available.
This document describes the transferwise
variant.
Alternative variants are:
Getting Started #
Prerequisites #
If you haven't already, follow the initial steps of the Getting Started guide:
Installation and configuration #
Using the Command Line Interface #
-
Add the
target-postgres
loader to your project usingmeltano add
:meltano add loader target-postgres --variant transferwise
-
Configure the settings below using
meltano config
.
Using Meltano UI #
-
Start Meltano UI using
meltano ui
:meltano ui
- Open the Loaders interface at http://localhost:5000/loaders.
- Click the “Add to project” button for “PostgreSQL”.
- Choose "Add variant 'transferwise'".
- Configure the settings below in the “Configuration” interface that opens automatically.
Next steps #
Follow the remaining steps of the Getting Started guide:
If you run into any issues, learn how to get help.
Settings #
target-postgres
requires the
configuration
of the following settings:
These and other supported settings are documented below. To quickly find the setting you're looking for, use the Table of Contents at the top of the page.
Host (host
)
#
-
Environment variable:
TARGET_POSTGRES_HOST
, alias:PG_ADDRESS
- Default:
localhost
PostgreSQL host
How to use #
Manage this setting using Meltano UI,
meltano config
, or an
environment variable:
meltano config target-postgres set host <host>
export TARGET_POSTGRES_HOST=<host>
Port (port
)
#
-
Environment variable:
TARGET_POSTGRES_PORT
, alias:PG_PORT
- Default:
5432
PostgreSQL port
How to use #
Manage this setting using Meltano UI,
meltano config
, or an
environment variable:
meltano config target-postgres set port 5432
export TARGET_POSTGRES_PORT=5432
User (user
)
#
-
Environment variable:
TARGET_POSTGRES_USER
, alias:PG_USERNAME
PostgreSQL user
How to use #
Manage this setting using Meltano UI,
meltano config
, or an
environment variable:
meltano config target-postgres set user <user>
export TARGET_POSTGRES_USER=<user>
Password (password
)
#
-
Environment variable:
TARGET_POSTGRES_PASSWORD
, alias:PG_PASSWORD
PostgreSQL password
How to use #
Manage this setting using Meltano UI,
meltano config
, or an
environment variable:
meltano config target-postgres set password <password>
export TARGET_POSTGRES_PASSWORD=<password>
Database Name (dbname
)
#
-
Environment variable:
TARGET_POSTGRES_DBNAME
, alias:PG_DATABASE
PostgreSQL database name
How to use #
Manage this setting using Meltano UI,
meltano config
, or an
environment variable:
meltano config target-postgres set dbname <dbname>
export TARGET_POSTGRES_DBNAME=<dbname>
SSL (ssl
)
#
-
Environment variable:
TARGET_POSTGRES_SSL
- Default:
false
How to use #
Manage this setting using Meltano UI,
meltano config
, or an
environment variable:
meltano config target-postgres set ssl true
export TARGET_POSTGRES_SSL=true
Default Target Schema (default_target_schema
)
#
-
Environment variable:
TARGET_POSTGRES_DEFAULT_TARGET_SCHEMA
, alias:TARGET_POSTGRES_SCHEMA
PG_SCHEMA
- Default:
$MELTANO_EXTRACT__LOAD_SCHEMA
Name of the schema where the tables will be created. If schema_mapping
is not defined then every stream sent by the tap is loaded into this schema.
How to use #
Manage this setting using Meltano UI,
meltano config
, or an
environment variable:
meltano config target-postgres set default_target_schema <default_target_schema>
export TARGET_POSTGRES_DEFAULT_TARGET_SCHEMA=<default_target_schema>
Batch Size Rows (batch_size_rows
)
#
-
Environment variable:
TARGET_POSTGRES_BATCH_SIZE_ROWS
- Default:
100000
Maximum number of rows in each batch. At the end of each batch, the rows in the batch are loaded into Postgres.
How to use #
Manage this setting using Meltano UI,
meltano config
, or an
environment variable:
meltano config target-postgres set batch_size_rows 100000
export TARGET_POSTGRES_BATCH_SIZE_ROWS=100000
Flush All Streams (flush_all_streams
)
#
-
Environment variable:
TARGET_POSTGRES_FLUSH_ALL_STREAMS
- Default:
false
Flush and load every stream into Postgres when one batch is full. Warning: This may trigger the COPY command to use files with low number of records.
How to use #
Manage this setting using Meltano UI,
meltano config
, or an
environment variable:
meltano config target-postgres set flush_all_streams true
export TARGET_POSTGRES_FLUSH_ALL_STREAMS=true
Parallelism (parallelism
)
#
-
Environment variable:
TARGET_POSTGRES_PARALLELISM
- Default:
0
The number of threads used to flush tables. 0 will create a thread for each stream, up to parallelism_max. -1 will create a thread for each CPU core. Any other positive number will create that number of threads, up to parallelism_max.
How to use #
Manage this setting using Meltano UI,
meltano config
, or an
environment variable:
meltano config target-postgres set parallelism 0
export TARGET_POSTGRES_PARALLELISM=0
Max Parallelism (parallelism_max
)
#
-
Environment variable:
TARGET_POSTGRES_PARALLELISM_MAX
- Default:
16
Max number of parallel threads to use when flushing tables.
How to use #
Manage this setting using Meltano UI,
meltano config
, or an
environment variable:
meltano config target-postgres set parallelism_max 16
export TARGET_POSTGRES_PARALLELISM_MAX=16
Default Target Schema Select Permission (default_target_schema_select_permission
)
#
-
Environment variable:
TARGET_POSTGRES_DEFAULT_TARGET_SCHEMA_SELECT_PERMISSION
Grant USAGE privilege on newly created schemas and grant SELECT privilege on newly created tables to a specific role or a list of roles. If schema_mapping
is not defined then every stream sent by the tap is granted accordingly.
How to use #
Manage this setting using Meltano UI,
meltano config
, or an
environment variable:
meltano config target-postgres set default_target_schema_select_permission <default_target_schema_select_permission>
export TARGET_POSTGRES_DEFAULT_TARGET_SCHEMA_SELECT_PERMISSION=<default_target_schema_select_permission>
Schema Mapping (schema_mapping
)
#
-
Environment variable:
TARGET_POSTGRES_SCHEMA_MAPPING
Useful if you want to load multiple streams from one tap to multiple Postgres schemas.
If the tap sends the stream_id
in <schema_name>-<table_name>
format then this option overwrites the default_target_schema
value. Note, that using schema_mapping
you can overwrite the default_target_schema_select_permission
value to grant SELECT permissions to different groups per schemas or optionally you can create indices automatically for the replicated tables.
How to use #
Manage this setting using Meltano UI,
meltano config
, or an
environment variable:
meltano config target-postgres set schema_mapping '{...}'
export TARGET_POSTGRES_SCHEMA_MAPPING='{...}'
Add Metadata Columns (add_metadata_columns
)
#
-
Environment variable:
TARGET_POSTGRES_ADD_METADATA_COLUMNS
- Default:
false
Metadata columns add extra row level information about data ingestions, (i.e. when was the row read in source, when was inserted or deleted in postgres etc.) Metadata columns are creating automatically by adding extra columns to the tables with a column prefix _SDC_
. The column names are following the stitch naming conventions documented at https://www.stitchdata.com/docs/data-structure/integration-schemas#sdc-columns. Enabling metadata columns will flag the deleted rows by setting the _SDC_DELETED_AT
metadata column. Without the add_metadata_columns
option the deleted rows from singer taps will not be recongisable in Postgres.
How to use #
Manage this setting using Meltano UI,
meltano config
, or an
environment variable:
meltano config target-postgres set add_metadata_columns true
export TARGET_POSTGRES_ADD_METADATA_COLUMNS=true
Hard Delete (hard_delete
)
#
-
Environment variable:
TARGET_POSTGRES_HARD_DELETE
- Default:
false
When hard_delete
option is true then DELETE SQL commands will be performed in Postgres to delete rows in tables. It’s achieved by continuously checking the _SDC_DELETED_AT
metadata column sent by the singer tap. Due to deleting rows requires metadata columns, hard_delete
option automatically enables the add_metadata_columns
option as well.
How to use #
Manage this setting using Meltano UI,
meltano config
, or an
environment variable:
meltano config target-postgres set hard_delete true
export TARGET_POSTGRES_HARD_DELETE=true
Data Flattening Max Level (data_flattening_max_level
)
#
-
Environment variable:
TARGET_POSTGRES_DATA_FLATTENING_MAX_LEVEL
- Default:
0
Object type RECORD items from taps can be transformed to flattened columns by creating columns automatically. When value is 0 (default) then flattening functionality is turned off.
How to use #
Manage this setting using Meltano UI,
meltano config
, or an
environment variable:
meltano config target-postgres set data_flattening_max_level 0
export TARGET_POSTGRES_DATA_FLATTENING_MAX_LEVEL=0
Primary Key Required (primary_key_required
)
#
-
Environment variable:
TARGET_POSTGRES_PRIMARY_KEY_REQUIRED
- Default:
true
Log based and Incremental replications on tables with no Primary Key cause duplicates when merging UPDATE events. When set to true, stop loading data if no Primary Key is defined.
How to use #
Manage this setting using Meltano UI,
meltano config
, or an
environment variable:
meltano config target-postgres set primary_key_required false
export TARGET_POSTGRES_PRIMARY_KEY_REQUIRED=false
Validate Records (validate_records
)
#
-
Environment variable:
TARGET_POSTGRES_VALIDATE_RECORDS
- Default:
false
Validate every single record message to the corresponding JSON schema. This option is disabled by default and invalid RECORD messages will fail only at load time by Postgres. Enabling this option will detect invalid records earlier but could cause performance degradation.
How to use #
Manage this setting using Meltano UI,
meltano config
, or an
environment variable:
meltano config target-postgres set validate_records true
export TARGET_POSTGRES_VALIDATE_RECORDS=true
Temporary Directory (temp_dir
)
#
-
Environment variable:
TARGET_POSTGRES_TEMP_DIR
(Default: platform-dependent) Directory of temporary CSV files with RECORD messages.
How to use #
Manage this setting using Meltano UI,
meltano config
, or an
environment variable:
meltano config target-postgres set temp_dir <temp_dir>
export TARGET_POSTGRES_TEMP_DIR=<temp_dir>
Looking for help? #
If you're having trouble getting the
target-postgres
loader to work, look for an
existing issue in its repository, file a new issue,
or
join the Meltano Slack community
and ask for help in the #plugins-general
channel.
Found an issue on this page? #
This page is generated from a YAML file that you can contribute changes to!