The target-bigquery loader sends data into Google BigQuery after it has been pulled from a source using an extractor.
Available Variants
Getting Started
Prerequisites
If you haven't already, follow the initial steps of the Getting Started guide:
Installation and configuration
- Add the target-bigquery loader to your project using meltano add
- Configure the target-bigquery settings using meltano config
meltano add loader target-bigquery --variant z3z1ma
meltano config target-bigquery set --interactive
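For reference, a minimal meltano.yml entry for this loader might look like the sketch below; the project, dataset, and credentials path are placeholder values, not defaults:

plugins:
  loaders:
    - name: target-bigquery
      variant: z3z1ma
      config:
        project: my-gcp-project                           # placeholder GCP project ID
        dataset: raw                                      # placeholder target dataset
        credentials_path: /path/to/service-account.json   # placeholder credentials file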
Next steps
Follow the remaining steps of the Getting Started guide.
If you run into any issues, learn how to get help.
Capabilities
The current capabilities for target-bigquery may have been automatically set when originally added to the Hub. Please review the capabilities when using this loader. If you find they are out of date, please consider updating them by making a pull request to the YAML file that defines the capabilities for this loader.
This plugin has the following capabilities:
- about
- stream-maps
- schema-flattening
You can override these capabilities or specify additional ones in your meltano.yml by adding the capabilities key.
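As a sketch, an override in meltano.yml could look like the following, simply repeating the capability list documented above:

plugins:
  loaders:
    - name: target-bigquery
      variant: z3z1ma
      capabilities:        # overrides the default capability list for this plugin
        - about
        - stream-maps
        - schema-flattening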
Settings
The target-bigquery settings that are known to Meltano are documented below. To quickly find the setting you're looking for, click on any setting name from the list:
credentials_path
credentials_json
project
dataset
batch_size
timeout
denormalized
method
generate_view
bucket
partition_granularity
cluster_on_key_properties
column_name_transforms.lower
column_name_transforms.quote
column_name_transforms.add_underscore_when_invalid
column_name_transforms.snake_case
options.storage_write_batch_mode
options.process_pool
options.max_workers
stream_maps
stream_map_config
flattening_enabled
flattening_max_depth
You can also list these settings using the meltano config list subcommand:
meltano config target-bigquery list
You can override these settings or specify additional ones in your meltano.yml by adding the settings key.
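As an illustrative sketch, a locally defined extra setting could be declared under the settings key like this; my_custom_setting is a hypothetical name used only for illustration:

plugins:
  loaders:
    - name: target-bigquery
      variant: z3z1ma
      settings:
        - name: my_custom_setting   # hypothetical setting, not defined on the Hub
          kind: string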
Please consider adding any settings you have defined locally to this definition on MeltanoHub by making a pull request to the YAML file that defines the settings for this plugin.
Credentials Path (credentials_path)
- Environment variable: TARGET_BIGQUERY_CREDENTIALS_PATH
The path to a GCP credentials JSON file.
Configure this setting directly using the following Meltano command:
meltano config target-bigquery set credentials_path [value]
Credentials Json (credentials_json)
- Environment variable: TARGET_BIGQUERY_CREDENTIALS_JSON
A JSON string of your service account JSON file.
Configure this setting directly using the following Meltano command:
meltano config target-bigquery set credentials_json [value]
Project (project)
- Environment variable: TARGET_BIGQUERY_PROJECT
The target GCP project to materialize data into.
Configure this setting directly using the following Meltano command:
meltano config target-bigquery set project [value]
Dataset (dataset)
- Environment variable: TARGET_BIGQUERY_DATASET
The target dataset to materialize data into.
Configure this setting directly using the following Meltano command:
meltano config target-bigquery set dataset [value]
Batch Size (batch_size)
- Environment variable: TARGET_BIGQUERY_BATCH_SIZE
The maximum number of rows to send in a single batch or commit.
Configure this setting directly using the following Meltano command:
meltano config target-bigquery set batch_size [value]
Timeout (timeout)
- Environment variable: TARGET_BIGQUERY_TIMEOUT
Default timeout for batch_job and gcs_stage derived LoadJobs.
Configure this setting directly using the following Meltano command:
meltano config target-bigquery set timeout [value]
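As an illustration, the two load-tuning settings above might be combined in meltano.yml as follows; the values are arbitrary, and the timeout unit is assumed to be seconds:

plugins:
  loaders:
    - name: target-bigquery
      config:
        batch_size: 100000   # arbitrary example: maximum rows per batch or commit
        timeout: 600         # arbitrary example: timeout for derived LoadJobs (assumed seconds)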
Denormalized (denormalized)
- Environment variable: TARGET_BIGQUERY_DENORMALIZED
Determines whether to denormalize the data before writing to BigQuery. A false value will write data using a fixed JSON column based schema, while a true value will write data using a dynamic schema derived from the tap. Denormalization is only supported for the batch_job, streaming_insert, and gcs_stage methods.
Configure this setting directly using the following Meltano command:
meltano config target-bigquery set denormalized [value]
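For example, a sketch of enabling a denormalized schema together with one of the methods that supports it:

plugins:
  loaders:
    - name: target-bigquery
      config:
        denormalized: true   # dynamic schema derived from the tap
        method: batch_job    # denormalization is supported for batch_job, streaming_insert, and gcs_stage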
Method (method)
- Environment variable: TARGET_BIGQUERY_METHOD
The method to use for writing to BigQuery: batch_job, streaming_insert, gcs_stage, or storage_write_api.
Configure this setting directly using the following Meltano command:
meltano config target-bigquery set method [value]
Generate View (generate_view)
- Environment variable: TARGET_BIGQUERY_GENERATE_VIEW
Determines whether to generate a view based on the SCHEMA message parsed from the tap. Only valid if denormalized=false, meaning you are using the fixed JSON column based schema.
Configure this setting directly using the following Meltano command:
meltano config target-bigquery set generate_view [value]
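For instance, a sketch that keeps the fixed JSON column based schema and generates a view on top of it:

plugins:
  loaders:
    - name: target-bigquery
      config:
        denormalized: false   # fixed JSON column based schema
        generate_view: true   # generate a view from the tap's SCHEMA message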
Bucket (bucket)
- Environment variable: TARGET_BIGQUERY_BUCKET
The GCS bucket to use for staging data. Only used if method is gcs_stage.
Configure this setting directly using the following Meltano command:
meltano config target-bigquery set bucket [value]
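For example, pairing the gcs_stage method with a staging bucket might look like this; the bucket name is a placeholder:

plugins:
  loaders:
    - name: target-bigquery
      config:
        method: gcs_stage           # stage batches in GCS before loading
        bucket: my-staging-bucket   # placeholder bucket name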
Partition Granularity (partition_granularity)
- Environment variable: TARGET_BIGQUERY_PARTITION_GRANULARITY
The granularity of the partitioning strategy. Defaults to month.
Configure this setting directly using the following Meltano command:
meltano config target-bigquery set partition_granularity [value]
Cluster On Key Properties (cluster_on_key_properties)
- Environment variable: TARGET_BIGQUERY_CLUSTER_ON_KEY_PROPERTIES
Determines whether to cluster on the key properties from the tap. Defaults to false.
Configure this setting directly using the following Meltano command:
meltano config target-bigquery set cluster_on_key_properties [value]
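Taken together, a partitioning and clustering sketch might look like the following; day is assumed here to be an accepted granularity alongside the documented default of month:

plugins:
  loaders:
    - name: target-bigquery
      config:
        partition_granularity: day        # assumed valid value; the default is month
        cluster_on_key_properties: true   # cluster tables on the tap's key properties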
Column Name Transforms Lower (column_name_transforms.lower)
- Environment variable: TARGET_BIGQUERY_COLUMN_NAME_TRANSFORMS_LOWER
Lowercase column names.
Configure this setting directly using the following Meltano command:
meltano config target-bigquery set column_name_transforms lower [value]
Column Name Transforms Quote (column_name_transforms.quote)
- Environment variable: TARGET_BIGQUERY_COLUMN_NAME_TRANSFORMS_QUOTE
Quote columns during DDL generation.
Configure this setting directly using the following Meltano command:
meltano config target-bigquery set column_name_transforms quote [value]
Column Name Transforms Add Underscore When Invalid (column_name_transforms.add_underscore_when_invalid)
- Environment variable: TARGET_BIGQUERY_COLUMN_NAME_TRANSFORMS_ADD_UNDERSCORE_WHEN_INVALID
Add an underscore when a column name starts with a digit.
Configure this setting directly using the following Meltano command:
meltano config target-bigquery set column_name_transforms add_underscore_when_invalid [value]
Column Name Transforms Snake Case (column_name_transforms.snake_case)
- Environment variable: TARGET_BIGQUERY_COLUMN_NAME_TRANSFORMS_SNAKE_CASE
Convert columns to snake case.
Configure this setting directly using the following Meltano command:
meltano config target-bigquery set column_name_transforms snake_case [value]
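A sketch combining several of these transforms in meltano.yml:

plugins:
  loaders:
    - name: target-bigquery
      config:
        column_name_transforms:
          lower: true                         # lowercase column names
          add_underscore_when_invalid: true   # prefix columns that start with a digit
          snake_case: false                   # leave casing otherwise untouched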
Options Storage Write Batch Mode (options.storage_write_batch_mode)
- Environment variable: TARGET_BIGQUERY_OPTIONS_STORAGE_WRITE_BATCH_MODE
By default, the storage_write_api load method uses the default stream (Committed mode), which streams records so they are immediately available and is generally the fastest option. If this is set to true, application-created streams (Committed mode) are used instead to transactionally batch data on STATE messages and at the end of the pipe.
Configure this setting directly using the following Meltano command:
meltano config target-bigquery set options storage_write_batch_mode [value]
Options Process Pool (options.process_pool)
- Environment variable: TARGET_BIGQUERY_OPTIONS_PROCESS_POOL
By default, an autoscaling thread pool is used to write to BigQuery. If set to true, a process pool is used instead.
Configure this setting directly using the following Meltano command:
meltano config target-bigquery set options process_pool [value]
Options Max Workers (options.max_workers)
- Environment variable: TARGET_BIGQUERY_OPTIONS_MAX_WORKERS
By default, each sink type has a preconfigured max worker limit. This sets an override for the maximum number of workers per stream.
Configure this setting directly using the following Meltano command:
meltano config target-bigquery set options max_workers [value]
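For example, the three options settings above might be combined as follows; the worker count is an arbitrary illustrative value:

plugins:
  loaders:
    - name: target-bigquery
      config:
        options:
          storage_write_batch_mode: true   # batch transactionally on STATE messages
          process_pool: false              # keep the default autoscaling thread pool
          max_workers: 8                   # arbitrary override of the per-stream worker limit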
Stream Maps (stream_maps)
- Environment variable: TARGET_BIGQUERY_STREAM_MAPS
Config object for the stream maps capability. For more information, check out Stream Maps.
Configure this setting directly using the following Meltano command:
meltano config target-bigquery set stream_maps [value]
Stream Map Config (stream_map_config)
- Environment variable: TARGET_BIGQUERY_STREAM_MAP_CONFIG
User-defined config values to be used within map expressions.
Configure this setting directly using the following Meltano command:
meltano config target-bigquery set stream_map_config [value]
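As a sketch of the Singer SDK stream maps syntax, the following hypothetical map drops one column from a stream named users and derives another; the stream and field names are illustrative only:

plugins:
  loaders:
    - name: target-bigquery
      config:
        stream_maps:
          users:                                        # hypothetical stream name
            email: __NULL__                             # drop the email column
            full_name: "first_name + ' ' + last_name"   # derived field (simple Python expression)
        stream_map_config:
          source_label: production                      # available to map expressions via config['source_label']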
Flattening Enabled (flattening_enabled)
- Environment variable: TARGET_BIGQUERY_FLATTENING_ENABLED
Set to 'True' to enable schema flattening and automatically expand nested properties.
Configure this setting directly using the following Meltano command:
meltano config target-bigquery set flattening_enabled [value]
Flattening Max Depth (flattening_max_depth)
- Environment variable: TARGET_BIGQUERY_FLATTENING_MAX_DEPTH
The max depth to flatten schemas.
Configure this setting directly using the following Meltano command:
meltano config target-bigquery set flattening_max_depth [value]
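For example, enabling flattening with a depth limit might look like this:

plugins:
  loaders:
    - name: target-bigquery
      config:
        flattening_enabled: true   # expand nested properties into flattened columns
        flattening_max_depth: 2    # flatten at most two levels of nesting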
Something missing?
This page is generated from a YAML file that you can contribute changes to. Edit it on GitHub!
Looking for help?
If you run into any issues, chat with us in the #plugins-general channel.