The tap-gnews extractor pulls data from GNews that can then be sent to a destination using a loader.
Airbyte Usage Notice
Container-based connectors can introduce deployment challenges, including the potential need to run Docker-in-Docker (not currently supported by services like AWS ECS, Meltano Cloud, etc.; see the FAQ and Airbyte's ECS deployment docs for more details). Before using this variant, we recommend considering if and how you will be able to deploy container-based connectors to production.
For more context on how this Airbyte integration works, please check out the FAQ in the Meltano Docs.
Getting Started
Prerequisites
If you haven't already, follow the initial steps of the Getting Started guide.
Installation and configuration
- Add the tap-gnews extractor to your project using meltano add:
    meltano add extractor tap-gnews
- Configure the tap-gnews settings using meltano config:
    meltano config tap-gnews set --interactive
- Test that extractor settings are valid using meltano config:
    meltano config tap-gnews test
Next steps
Follow the remaining steps of the Getting Started guide.
If you run into any issues, learn how to get help.
Capabilities
The current capabilities for tap-gnews may have been automatically set when originally added to the Hub. Please review the capabilities when using this extractor. If you find they are out of date, please consider updating them by making a pull request to the YAML file that defines the capabilities for this extractor.
This plugin has the following capabilities:
- about
- catalog
- discover
- schema-flattening
- state
- stream-maps
You can override these capabilities or specify additional ones in your meltano.yml by adding the capabilities key.
Settings
The tap-gnews settings that are known to Meltano are documented below. To quickly find the setting you're looking for, click on any setting name from the list:
airbyte_config.api_key
airbyte_config.country
airbyte_config.end_date
airbyte_config.in
airbyte_config.language
airbyte_config.nullable
airbyte_config.query
airbyte_config.sortby
airbyte_config.start_date
airbyte_config.top_headlines_query
airbyte_config.top_headlines_topic
airbyte_spec.image
airbyte_spec.tag
docker_mounts
You can also list these settings using meltano config with the list subcommand:
meltano config tap-gnews list
You can override these settings or specify additional ones in your meltano.yml by adding the settings key.
Please consider adding any settings you have defined locally to this definition on MeltanoHub by making a pull request to the YAML file that defines the settings for this plugin.
Airbyte Config Api Key (airbyte_config.api_key)
- Environment variable: TAP_GNEWS_AIRBYTE_CONFIG_API_KEY
API Key
Configure this setting directly using the following Meltano command:
meltano config tap-gnews set airbyte_config api_key [value]
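A setting can also be supplied through the environment variable listed for it instead of being stored with meltano config. The following is a minimal sketch, assuming a POSIX shell. The key value is a hypothetical placeholder, and the guard only runs the Meltano command where the CLI is installed (it is expected to be run from inside a Meltano project).

```shell
# Hypothetical placeholder: replace with your real GNews API key.
export TAP_GNEWS_AIRBYTE_CONFIG_API_KEY="replace-with-your-api-key"

# List the resolved settings so you can check that the variable was picked up.
# The guard skips this step on machines without the Meltano CLI.
if command -v meltano >/dev/null 2>&1; then
  meltano config tap-gnews list
fi
```

Keeping the key in the environment can be convenient for CI systems and containers where secrets are injected at run time.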
Airbyte Config Country (airbyte_config.country)
- Environment variable: TAP_GNEWS_AIRBYTE_CONFIG_COUNTRY
This parameter allows you to specify the country where the news articles returned by the API were published. The contents of the articles are not necessarily related to the specified country. Set the value to the 2-letter code of the country you want to filter by.
Configure this setting directly using the following Meltano command:
meltano config tap-gnews set airbyte_config country [value]
Airbyte Config End Date (airbyte_config.end_date)
- Environment variable: TAP_GNEWS_AIRBYTE_CONFIG_END_DATE
This parameter allows you to filter the articles that have a publication date earlier than or equal to the specified value. The date must respect the following format: YYYY-MM-DD hh:mm:ss (in UTC).
Configure this setting directly using the following Meltano command:
meltano config tap-gnews set airbyte_config end_date [value]
Airbyte Config In (airbyte_config.in)
- Environment variable: TAP_GNEWS_AIRBYTE_CONFIG_IN
This parameter allows you to choose in which attributes the keywords are searched. The attributes that can be set are title, description and content. It is possible to combine several attributes.
Configure this setting directly using the following Meltano command:
meltano config tap-gnews set airbyte_config in [value]
Airbyte Config Language (airbyte_config.language)
- Environment variable: TAP_GNEWS_AIRBYTE_CONFIG_LANGUAGE
Configure this setting directly using the following Meltano command:
meltano config tap-gnews set airbyte_config language [value]
Airbyte Config Nullable (airbyte_config.nullable)
- Environment variable: TAP_GNEWS_AIRBYTE_CONFIG_NULLABLE
This parameter allows you to specify the attributes that are allowed to return null values. The attributes that can be set are title, description and content. It is possible to combine several attributes.
Configure this setting directly using the following Meltano command:
meltano config tap-gnews set airbyte_config nullable [value]
Airbyte Config Query (airbyte_config.query)
- Environment variable: TAP_GNEWS_AIRBYTE_CONFIG_QUERY
This parameter allows you to specify your search keywords to find the news articles you are looking for. The keywords will be used to return the most relevant articles. It is possible to use logical operators with keywords.
- Phrase Search Operator: This operator allows you to make an exact search. Keywords surrounded by quotation marks are used to search for articles with the exact same keyword sequence. For example, the query "Apple iPhone" will return articles matching this sequence of keywords at least once.
- Logical AND Operator: This operator allows you to make sure that several keywords are all used in the article search. By default, the space character acts as an AND operator; it is possible to replace the space character with AND to obtain the same result. For example, the query Apple Microsoft is equivalent to Apple AND Microsoft.
- Logical OR Operator: This operator allows you to retrieve articles matching the keyword a or the keyword b. It is important to note that this operator has a higher precedence than the AND operator. For example, the query Apple OR Microsoft will return all articles matching the keyword Apple as well as all articles matching the keyword Microsoft.
- Logical NOT Operator: This operator allows you to remove from the results the articles corresponding to the specified keywords. To use it, you need to add NOT in front of each word or phrase surrounded by quotes. For example, the query Apple NOT iPhone will return all articles matching the keyword Apple but not the keyword iPhone.
Configure this setting directly using the following Meltano command:
meltano config tap-gnews set airbyte_config query [value]
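Because the phrase search operator relies on literal quotation marks, the query has to be quoted carefully on the command line. The sketch below, assuming a POSIX shell, reuses the example queries from the description above. The variable names are illustrative only, and the guard only runs the Meltano command where the CLI is installed.

```shell
# Single quotes keep the inner double quotes intact, so the stored
# value is the phrase search "Apple iPhone" rather than two keywords.
phrase_query='"Apple iPhone"'

# Operators are plain uppercase words inside the query string.
and_query='Apple AND Microsoft'
not_query='Apple NOT iPhone'

# Pass the variable in double quotes so the shell hands it over as one argument.
if command -v meltano >/dev/null 2>&1; then
  meltano config tap-gnews set airbyte_config query "$phrase_query"
fi
```

Writing the argument as "$phrase_query" prevents the shell from splitting the query at the space, which would otherwise pass two separate arguments to the command.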
Airbyte Config Sortby (airbyte_config.sortby)
- Environment variable: TAP_GNEWS_AIRBYTE_CONFIG_SORTBY
This parameter allows you to choose with which type of sorting the articles should be returned. Two values are possible:
- publishedAt = sort by publication date, the articles with the most recent publication date are returned first
- relevance = sort by best match to keywords, the articles with the best match are returned first
Configure this setting directly using the following Meltano command:
meltano config tap-gnews set airbyte_config sortby [value]
Airbyte Config Start Date (airbyte_config.start_date)
- Environment variable: TAP_GNEWS_AIRBYTE_CONFIG_START_DATE
This parameter allows you to filter the articles that have a publication date later than or equal to the specified value. The date must respect the following format: YYYY-MM-DD hh:mm:ss (in UTC).
Configure this setting directly using the following Meltano command:
meltano config tap-gnews set airbyte_config start_date [value]
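The start_date and end_date settings both expect a UTC timestamp in the form YYYY-MM-DD hh:mm:ss. A minimal sketch of producing such values, assuming a POSIX shell with the standard date utility, follows. The fixed start date is a hypothetical example value, and the guard only runs the Meltano commands where the CLI is installed.

```shell
# Hypothetical lower bound, written directly in the expected format.
start_date="2024-01-01 00:00:00"

# Current time in UTC, formatted as YYYY-MM-DD hh:mm:ss.
end_date="$(date -u '+%Y-%m-%d %H:%M:%S')"

# Quote the values: each contains a space and must stay one argument.
if command -v meltano >/dev/null 2>&1; then
  meltano config tap-gnews set airbyte_config start_date "$start_date"
  meltano config tap-gnews set airbyte_config end_date "$end_date"
fi
```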
Airbyte Config Top Headlines Query (airbyte_config.top_headlines_query)
- Environment variable: TAP_GNEWS_AIRBYTE_CONFIG_TOP_HEADLINES_QUERY
This parameter allows you to specify your search keywords to find the news articles you are looking for. The keywords will be used to return the most relevant articles. It is possible to use logical operators with keywords.
- Phrase Search Operator: This operator allows you to make an exact search. Keywords surrounded by quotation marks are used to search for articles with the exact same keyword sequence. For example, the query "Apple iPhone" will return articles matching this sequence of keywords at least once.
- Logical AND Operator: This operator allows you to make sure that several keywords are all used in the article search. By default, the space character acts as an AND operator; it is possible to replace the space character with AND to obtain the same result. For example, the query Apple Microsoft is equivalent to Apple AND Microsoft.
- Logical OR Operator: This operator allows you to retrieve articles matching the keyword a or the keyword b. It is important to note that this operator has a higher precedence than the AND operator. For example, the query Apple OR Microsoft will return all articles matching the keyword Apple as well as all articles matching the keyword Microsoft.
- Logical NOT Operator: This operator allows you to remove from the results the articles corresponding to the specified keywords. To use it, you need to add NOT in front of each word or phrase surrounded by quotes. For example, the query Apple NOT iPhone will return all articles matching the keyword Apple but not the keyword iPhone.
Configure this setting directly using the following Meltano command:
meltano config tap-gnews set airbyte_config top_headlines_query [value]
Airbyte Config Top Headlines Topic (airbyte_config.top_headlines_topic)
- Environment variable: TAP_GNEWS_AIRBYTE_CONFIG_TOP_HEADLINES_TOPIC
This parameter allows you to change the category for the request.
Configure this setting directly using the following Meltano command:
meltano config tap-gnews set airbyte_config top_headlines_topic [value]
Airbyte Spec Image (airbyte_spec.image)
- Environment variable: TAP_GNEWS_AIRBYTE_SPEC_IMAGE
- Default Value: airbyte/source-gnews
Airbyte image to run
Configure this setting directly using the following Meltano command:
meltano config tap-gnews set airbyte_spec image [value]
Airbyte Spec Tag (airbyte_spec.tag)
- Environment variable: TAP_GNEWS_AIRBYTE_SPEC_TAG
- Default Value: latest
Airbyte image tag
Configure this setting directly using the following Meltano command:
meltano config tap-gnews set airbyte_spec tag [value]
Docker Mounts (docker_mounts)
- Environment variable: TAP_GNEWS_DOCKER_MOUNTS
Docker mounts to make available to the Airbyte container. Expects a list of maps containing source, target, and type, as documented in the docker --mount documentation.
Configure this setting directly using the following Meltano command:
meltano config tap-gnews set docker_mounts [value]
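As an illustration of the expected shape, the sketch below passes a single bind mount as a JSON list of maps. It assumes a POSIX shell and assumes that Meltano parses a JSON string given on the command line into a list, which is worth confirming with meltano config tap-gnews list afterwards. Both paths are hypothetical, and the guard only runs the Meltano command where the CLI is installed.

```shell
# One mount described by the three keys named above: source, target, type.
# Both paths are hypothetical examples.
docker_mounts='[{"source": "/tmp/gnews-data", "target": "/data", "type": "bind"}]'

# Single quotes above preserve the JSON double quotes; pass the value as one argument.
if command -v meltano >/dev/null 2>&1; then
  meltano config tap-gnews set docker_mounts "$docker_mounts"
fi
```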
SDK Settings
Flattening Enabled (flattening_enabled)
- Environment variable: TAP_GNEWS_FLATTENING_ENABLED
'True' to enable schema flattening and automatically expand nested properties.
Configure this setting directly using the following Meltano command:
meltano config tap-gnews set flattening_enabled [value]
Flattening Max Depth (flattening_max_depth)
- Environment variable: TAP_GNEWS_FLATTENING_MAX_DEPTH
The max depth to flatten schemas.
Configure this setting directly using the following Meltano command:
meltano config tap-gnews set flattening_max_depth [value]
Stream Map Config (stream_map_config)
- Environment variable: TAP_GNEWS_STREAM_MAP_CONFIG
User-defined config values to be used within map expressions.
Configure this setting directly using the following Meltano command:
meltano config tap-gnews set stream_map_config [value]
Stream Maps (stream_maps)
- Environment variable: TAP_GNEWS_STREAM_MAPS
Config object for stream maps capability. For more information check out Stream Maps.
Configure this setting directly using the following Meltano command:
meltano config tap-gnews set stream_maps [value]