diff --git a/docs/connector-development/config-based/authentication.md b/docs/connector-development/config-based/authentication.md
index d855a5aa7c557..ef81ed4f20b63 100644
--- a/docs/connector-development/config-based/authentication.md
+++ b/docs/connector-development/config-based/authentication.md
@@ -27,7 +27,7 @@ authenticator:
   token: "hello"
 ```
 
-More information on bearer authentication can be found [here](https://swagger.io/docs/specification/authentication/bearer-authentication/)
+More information on bearer authentication can be found [here](https://swagger.io/docs/specification/authentication/bearer-authentication/).
 
 ### BasicHttpAuthenticator
 
diff --git a/docs/connector-development/config-based/error-handling.md b/docs/connector-development/config-based/error-handling.md
index 921f39b872b51..dd8adc275aaea 100644
--- a/docs/connector-development/config-based/error-handling.md
+++ b/docs/connector-development/config-based/error-handling.md
@@ -36,7 +36,7 @@ requester:
 ### From error message
 
 Errors can also be defined by parsing the error message.
-For instance, this error handler will ignores responses if the error message contains the string "ignorethisresponse"
+For instance, this error handler will ignore responses if the error message contains the string "ignorethisresponse"
 
 ```yaml
 requester:
@@ -72,8 +72,8 @@ requester:
     response_filters:
       - http_codes: [ 404 ]
         action: IGNORE
-    - http_codes: [ 429 ]
-      action: RETRY
+      - http_codes: [ 429 ]
+        action: RETRY
 ```
 
 ## Backoff Strategies
@@ -148,8 +148,8 @@ requester:
         backoff_strategies:
           - type: "WaitTimeFromHeaderBackoffStrategy"
             header: "wait_time"
-        - type: "ConstantBackoffStrategy"
-          backoff_time_in_seconds: 5
+          - type: "ConstantBackoffStrategy"
+            backoff_time_in_seconds: 5
 ```
 
 
diff --git a/docs/connector-development/config-based/index.md b/docs/connector-development/config-based/index.md
index 02d4703ca11b9..03723e1758d8e 100644
--- a/docs/connector-development/config-based/index.md
+++ b/docs/connector-development/config-based/index.md
@@ -3,7 +3,7 @@
 ## From scratch
 
 - [Overview](overview.md)
-- [Yaml structure](overview.md)
+- [Yaml structure](yaml-structure.md)
 - [Reference docs](https://airbyte-cdk.readthedocs.io/en/latest/api/airbyte_cdk.sources.declarative.html)
 
 ## Concepts
diff --git a/docs/connector-development/config-based/overview.md b/docs/connector-development/config-based/overview.md
index bed17143e171f..56e1f78f73edc 100644
--- a/docs/connector-development/config-based/overview.md
+++ b/docs/connector-development/config-based/overview.md
@@ -51,12 +51,12 @@ A stream generally corresponds to a resource within the API. They are analogous
 A stream is defined by:
 
 1. A name
-2. Primary key (Optional): Used to uniquely identify records, enabling deduplication. Can be a string for single primary keys, a list of strings for composite primary keys, or a list of list of strings for composite primary keys consisting of nested fields.
+2. Primary key (Optional): Used to uniquely identify records, enabling deduplication. Can be a string for single primary keys, a list of strings for composite primary keys, or a list of list of strings for composite primary keys consisting of nested fields
 3. [Schema](../cdk-python/schemas.md): Describes the data to sync
 4. [Data retriever](overview.md#data-retriever): Describes how to retrieve the data from the API
 5. [Cursor field](../cdk-python/incremental-stream.md) (Optional): Field to use as stream cursor. Can either be a string, or a list of strings if the cursor is a nested field.
 6. [Transformations](./record-selector.md#transformations) (Optional): A set of transformations to be applied on the records read from the source before emitting them to the destination
-7. [Checkpoint interval](https://docs.airbyte.com/understanding-airbyte/airbyte-protocol/#state--checkpointing) (Optional): Defines the interval, in number of records, at which incremental syncs should be checkpointed.
+7. [Checkpoint interval](https://docs.airbyte.com/understanding-airbyte/airbyte-protocol/#state--checkpointing) (Optional): Defines the interval, in number of records, at which incremental syncs should be checkpointed
 
 More details on streams and sources can be found in the [basic concepts section](../cdk-python/basic-concepts.md).
 
@@ -87,9 +87,11 @@ The `SimpleRetriever`'s data flow can be described as follows:
 2. Select the records from the response
 3. Repeat for as long as the paginator points to a next page
 
-More details on the record selector can be found in the [record selector section](record-selector.md)
-More details on the stream slicers can be found in the [stream slicers section](stream-slicers.md)
-More details on the paginator can be found in the [pagination section](pagination.md)
+More details on the record selector can be found in the [record selector section](record-selector.md).
+
+More details on the stream slicers can be found in the [stream slicers section](stream-slicers.md).
+
+More details on the paginator can be found in the [pagination section](pagination.md).
 
 ## Requester
 
@@ -104,7 +106,8 @@ There is currently only one implementation, the `HttpRequester`, which is define
 6. An error handler: Defines how to handle errors
 
 More details on authentication can be found in the [authentication section](authentication.md).
-More details on error handling can be found in the [error handling section](error-handling.md)
+
+More details on error handling can be found in the [error handling section](error-handling.md).
 
 ## Connection Checker
 
diff --git a/docs/connector-development/config-based/pagination.md b/docs/connector-development/config-based/pagination.md
index b305fddaa9cce..d77769b7e2b95 100644
--- a/docs/connector-development/config-based/pagination.md
+++ b/docs/connector-development/config-based/pagination.md
@@ -100,9 +100,10 @@ paginator:
 ```
 
 Assuming the endpoint to fetch data from is `https://cloud.airbyte.com/api/get_data`,
-the first request will be sent as `https://cloud.airbyte.com/api/get_data`
+the first request will be sent as `https://cloud.airbyte.com/api/get_data`.
+
 Assuming the id of the last record fetched is 1000,
-the next request will be sent as `https://cloud.airbyte.com/api/get_data?from=1000`
+the next request will be sent as `https://cloud.airbyte.com/api/get_data?from=1000`.
 
 #### Cursor paginator in path
 
@@ -121,5 +122,6 @@ paginator:
 
 Assuming the endpoint to fetch data from is `https://cloud.airbyte.com/api/get_data`,
 the first request will be sent as `https://cloud.airbyte.com/api/get_data`
+
 Assuming the response's next url is `https://cloud.airbyte.com/api/get_data?page=1&page_size=100`,
 the next request will be sent as `https://cloud.airbyte.com/api/get_data?page=1&page_size=100`
\ No newline at end of file
diff --git a/docs/connector-development/config-based/request-options.md b/docs/connector-development/config-based/request-options.md
index daf6a5069a804..19f42dffece52 100644
--- a/docs/connector-development/config-based/request-options.md
+++ b/docs/connector-development/config-based/request-options.md
@@ -40,7 +40,7 @@ requester:
 It is also possible for authenticators to set request parameters or headers as needed.
 For instance, the `BearerAuthenticator` will always set the `Authorization` header.
 
-More details on the various authenticators can be found in the [authentication section](authentication.md)
+More details on the various authenticators can be found in the [authentication section](authentication.md).
 
 ## Paginators
 
@@ -63,7 +63,7 @@ paginator:
     field_name: "page"
 ```
 
-More details on paginators can be found in the [pagination section](pagination.md)
+More details on paginators can be found in the [pagination section](pagination.md).
 
 ## Stream slicers
 
@@ -85,4 +85,4 @@ stream_slicer:
     inject_into: "request_parameter"
 ```
 
-More details on the stream slicers can be found in the [stream-slicers section](stream-slicers.md)
+More details on the stream slicers can be found in the [stream-slicers section](stream-slicers.md).
diff --git a/docs/connector-development/config-based/stream-slicers.md b/docs/connector-development/config-based/stream-slicers.md
index 96f0a5bd0a7e9..5d9ab2f8fee05 100644
--- a/docs/connector-development/config-based/stream-slicers.md
+++ b/docs/connector-development/config-based/stream-slicers.md
@@ -9,7 +9,7 @@ When a stream is read incrementally, a state message will be output by the conne
 At the beginning of a `read` operation, the `StreamSlicer` will compute the slices to sync given the connection config and the stream's current state,
 As the `Retriever` reads data from the `Source`, the `StreamSlicer` keeps track of the `Stream`'s state, which will be emitted after reading each stream slice.
 
-More information of stream slicing can be found in the [stream-slices section](../cdk-python/stream-slices.md)
+More information of stream slicing can be found in the [stream-slices section](../cdk-python/stream-slices.md).
 
 ## Implementations
 
@@ -56,10 +56,9 @@ If the `cursor_field` is `created`, and the record is `{"id": 1234, "created": "
 
 When reading data from the source, the cursor value will be updated to the max datetime between
 
-- the last record's cursor field
-- the start of the stream slice
-- the current cursor value
-  This ensures that the cursor will be updated even if a stream slice does not contain any data.
+- The last record's cursor field
+- The start of the stream slice
+- The current cursor value. This ensures that the cursor will be updated even if a stream slice does not contain any data
 
 #### Stream slicer on dates
 
@@ -164,7 +163,7 @@ retriever:
     stream_slice_field: "repository"
 ```
 
-[^1] This is a slight oversimplification. See update cursor section for more details on how the cursor is updated
+[^1] This is a slight oversimplification. See [update cursor section](#cursor-update) for more details on how the cursor is updated.
 
 ## More readings
 
diff --git a/docs/connector-development/config-based/tutorial/0-getting-started.md b/docs/connector-development/config-based/tutorial/0-getting-started.md
index 5170b6ad5d26a..4b04d97ee247b 100644
--- a/docs/connector-development/config-based/tutorial/0-getting-started.md
+++ b/docs/connector-development/config-based/tutorial/0-getting-started.md
@@ -8,7 +8,7 @@ Throughout this tutorial, we'll walk you through the creation an Airbyte source
 
 We'll build a connector reading data from the Exchange Rates API, but the steps will apply to other HTTP APIs you might be interested in integrating with.
 
-The API documentations can be found [here](https://exchangeratesapi.io/documentation/).
+The API documentations can be found [here](https://apilayer.com/marketplace/exchangerates_data-api).
 
 In this tutorial, we will read data from the following endpoints:
 - `Latest Rates Endpoint`
diff --git a/docs/connector-development/config-based/tutorial/3-connecting-to-the-API-source.md b/docs/connector-development/config-based/tutorial/3-connecting-to-the-API-source.md
index ca6abec250be9..3527630a290da 100644
--- a/docs/connector-development/config-based/tutorial/3-connecting-to-the-API-source.md
+++ b/docs/connector-development/config-based/tutorial/3-connecting-to-the-API-source.md
@@ -11,13 +11,13 @@ Over the course of this tutorial, we'll be editing a few files that were generat
 
 We'll also be creating the following files:
 
-- `source-exchange-rates-tutorial/secrets/config.json`: This is the configuration file we'll be using to test the connector. It's schema should match the schema defined in the spec file.
-- `source-exchange-rates-tutorial/secrets/invalid_config.json`: This is an invalid configuration file we'll be using to test the connector. It's schema should match the schema defined in the spec file.
+- `source-exchange-rates-tutorial/secrets/config.json`: This is the configuration file we'll be using to test the connector. Its schema should match the schema defined in the spec file.
+- `source-exchange-rates-tutorial/secrets/invalid_config.json`: This is an invalid configuration file we'll be using to test the connector. Its schema should match the schema defined in the spec file.
 - `source_exchange_rates_tutorial/schemas/rates.json`: This is the [schema definition](../../cdk-python/schemas.md) for the stream we'll implement.
 
 ## Updating the connector spec and config
 
-Let's populate the specification (`spec.yaml`) the configuration (`secrets/config.json), so the connector can access the access key and base currency.
+Let's populate the specification (`spec.yaml`) and the configuration (`secrets/config.json`) so the connector can access the access key and base currency.
 
 1. We'll add these properties to the connector spec in `source-exchange-rates-tutorial/source_exchange_rates_tutorial/spec.yaml`
 
@@ -61,9 +61,9 @@ $ echo '{"access_key": "", "base": "USD"}' > secrets/config.js
 Next, we'll update the connector definition (`source-exchange-rates-tutorial/source_exchange_rates_tutorial/exchange_rates_tutorial.yaml`). It was generated by the code generation script.
 More details on the connector definition file can be found in the [overview](../overview.md) and [connection definition](../yaml-structure.md) sections.
 
-Let's fill this out these TODOs with the information found in the [Exchange Rates API docs](https://exchangeratesapi.io/documentation/)
+Let's fill this out these TODOs with the information found in the [Exchange Rates API docs](https://apilayer.com/marketplace/exchangerates_data-api).
 
-1. First, let's rename the stream from `customers` to `rates`, and update the primary key to `date`
+1. First, let's rename the stream from `customers` to `rates`, and update the primary key to `date`.
 
 ```yaml
 streams:
@@ -84,7 +84,7 @@ check:
 Adding the reference in the `check` tells the `check` operation to use that stream to test the connection.
 
 2. Next we'll set the base url.
-   According to the API documentation, the base url is `"https://api.exchangeratesapi.io/v1/"`.
+   According to the API documentation, the base url is `"https://api.apilayer.com"`.
 
 ```yaml
 definitions:
@@ -141,7 +141,7 @@ definitions:
         base: "{{ config['base'] }}"
 ```
 
-The full connection definition should now look like
+The full connector definition should now look like
 
 ```yaml
 version: "0.1.0"
diff --git a/docs/connector-development/config-based/tutorial/5-incremental-reads.md b/docs/connector-development/config-based/tutorial/5-incremental-reads.md
index 180e09443200c..4fff6c9b22a9b 100644
--- a/docs/connector-development/config-based/tutorial/5-incremental-reads.md
+++ b/docs/connector-development/config-based/tutorial/5-incremental-reads.md
@@ -87,10 +87,11 @@ For example:
 > "historical": true, "base": "USD", "date": "2022-07-18"
 
 The connector will now always read data for the start date, which is not exactly what we want.
-Instead, we would like to iterate over all the dates between the start_date and today and read data for each day.
+Instead, we would like to iterate over all the dates between the `start_date` and today and read data for each day.
 
 We can do this by adding a `DatetimeStreamSlicer` to the connector definition, and update the `path` to point to the stream_slice's `start_date`:
-More details on the stream slicers can be found [here](../stream-slicers.md)
+
+More details on the stream slicers can be found [here](../stream-slicers.md).
 
 Let's first define a stream slicer at the top level of the connector definition:
 
diff --git a/docs/connector-development/config-based/yaml-structure.md b/docs/connector-development/config-based/yaml-structure.md
index 623d06a908328..32c9843ca4d2f 100644
--- a/docs/connector-development/config-based/yaml-structure.md
+++ b/docs/connector-development/config-based/yaml-structure.md
@@ -135,7 +135,7 @@ In this example, outer.inner.k2 will evaluate to "MyKey is MyValue"
 ## References
 
 Strings can contain references to previously defined values.
-The parser will dereference these values to produce a complete ConnectionDefinition
+The parser will dereference these values to produce a complete object definition.
 
 References can be defined using a "*ref({arg})" string.
 
@@ -230,7 +230,7 @@ nested.path: "uh oh"
 value: "uh oh"
 ```
 
-To resolve the ambiguity, we try looking for the reference key at the top level, and then traverse the structs downward
+To resolve the ambiguity, we try looking for the reference key at the top-level, and then traverse the structs downward
 until we find a key with the given path, or until there is nothing to traverse.
 
 More details on referencing values can be found [here](https://airbyte-cdk.readthedocs.io/en/latest/api/airbyte_cdk.sources.declarative.parsers.html?highlight=yamlparser#airbyte_cdk.sources.declarative.parsers.yaml_parser.YamlParser).
@@ -264,7 +264,7 @@ This means that both these string templates will evaluate to the same string:
 1. `"{{ options.name }}"`
 2. `"{{ options['name'] }}"`
 
-In additional to passing additional values through the kwargs argument, macros can be called from within the string interpolation.
+In additional to passing additional values through the $options argument, macros can be called from within the string interpolation.
 For example, `"{{ max(2, 3) }}" -> 3`