The tap-spreadsheets-anywhere extractor pulls data from CSV files and Excel spreadsheets on cloud or local storage.

Getting Started

Prerequisites

If you haven’t already, follow the initial steps of the Getting Started guide:

  1. Install Meltano
  2. Create your Meltano project

Installation and configuration

  1. Add the tap-spreadsheets-anywhere extractor to your project using meltano add:

     meltano add extractor tap-spreadsheets-anywhere
    
  2. Configure the settings below using meltano config.

Next steps

Follow the remaining steps of the Getting Started guide:

  1. Select entities and attributes to extract
  2. Add a loader to send data to a destination
  3. Run a data integration (EL) pipeline

If you run into any issues, learn how to get help.

Settings

tap-spreadsheets-anywhere requires one setting, tables, which is described below.

Minimal configuration

A minimal configuration of tap-spreadsheets-anywhere in your meltano.yml project file will look like this:

plugins:
  extractors:
  - name: tap-spreadsheets-anywhere
    variant: ets
    config:
      tables:
        - path: s3://my-s3-bucket
          name: target_table_name
          pattern: "subfolder/common_prefix.*"
          start_date: "2017-05-01T00:00:00Z"
          key_properties: []
          format: csv
        - path: file:///home/user/Downloads/xls_files
          name: another_table_name
          pattern: "subdir/.*User.*"
          start_date: "2017-05-01T00:00:00Z"
          key_properties: [id]
          format: excel
          worksheet_name: Names
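
To get a feel for which files a pattern value would select, it can help to try the expression against a few file names first. The sketch below assumes that pattern is a regular expression searched (unanchored) against each file key relative to path; the file keys are hypothetical, and grep -E only approximates the tap's own matching.

```shell
# Hypothetical file keys, relative to the configured path.
keys='subdir/ActiveUsers.xlsx
subdir/archive/UserList.xls
other/Users.xlsx
subdir/accounts.xlsx'

# The pattern from the second table in the configuration above.
pattern='subdir/.*User.*'

# Keep only the keys the pattern matches (unanchored search is an assumption).
matches=$(printf '%s\n' "$keys" | grep -E "$pattern")
printf '%s\n' "$matches"
```

Under these assumptions only the two keys under subdir/ that contain "User" are kept.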

Tables

An array of objects, each describing a set of source files to extract under one table name.

For the full list of per-table options, see https://github.com/ets/tap-spreadsheets-anywhere#configuration.

How to use

Manage this setting directly in your meltano.yml project file:

plugins:
  extractors:
  - name: tap-spreadsheets-anywhere
    variant: ets
    config:
      tables:
        - path: <path>
          name: <table_name>
          pattern: "<pattern>"
          start_date: "YYYY-MM-DDTHH:MM:SSZ"
          key_properties: [<key>]
          format: <csv|excel>
        # ...

Alternatively, manage this setting using meltano config or an environment variable:

meltano config tap-spreadsheets-anywhere set tables '[{"path": "<path>", ...}, ...]'

export TAP_SPREADSHEETS_ANYWHERE_TABLES='[{"path": "<path>", ...}, ...]'
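
As an illustration, the placeholder value above could be filled in with the first table from the minimal configuration. This is only a sketch: the bucket, table name, and pattern are the example values from this page, not real resources. The whole array is wrapped in single quotes so the shell passes the JSON through unchanged.

```shell
# Example values copied from the minimal configuration; replace with your own.
export TAP_SPREADSHEETS_ANYWHERE_TABLES='[{"path": "s3://my-s3-bucket", "name": "target_table_name", "pattern": "subfolder/common_prefix.*", "start_date": "2017-05-01T00:00:00Z", "key_properties": [], "format": "csv"}]'

# Print the value to confirm it was set as intended.
printf '%s\n' "$TAP_SPREADSHEETS_ANYWHERE_TABLES"
```

The same JSON string can be passed as the last argument to the meltano config command shown above.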