destination-s3 use instanceprofile if credentials are not provided #9399

Merged · 18 commits · Jan 14, 2022
1 change: 1 addition & 0 deletions airbyte-integrations/connectors/destination-s3/README.md
@@ -8,6 +8,7 @@ As a community contributor, you will need access to AWS to run the integration t

- Create an S3 bucket for testing.
- Get your `access_key_id` and `secret_access_key` that can read and write to the above bucket.
- If you leave `access_key_id` and `secret_access_key` blank, authentication will fall back to the instance profile of the host.
- Paste the bucket and key information into the config files under [`./sample_secrets`](./sample_secrets).
- Rename the directory from `sample_secrets` to `secrets`.
- Feel free to modify the config files with different settings in the acceptance test file (e.g. `S3CsvDestinationAcceptanceTest.java`, method `getFormatConfig`), as long as they follow the schema defined in [spec.json](src/main/resources/spec.json).
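For instance-profile testing, the secrets config can simply leave both keys empty. A hypothetical `secrets/config.json` illustrating this (bucket name, path, region, and format settings are placeholders, not values from this PR):

```json
{
  "s3_bucket_name": "airbyte-integration-test-bucket",
  "s3_bucket_path": "test_path",
  "s3_bucket_region": "us-east-1",
  "access_key_id": "",
  "secret_access_key": "",
  "format": { "format_type": "CSV" }
}
```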
@@ -4,6 +4,7 @@

package io.airbyte.integrations.destination.s3;

import com.amazonaws.ClientConfiguration;
import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.InstanceProfileCredentialsProvider;
@@ -87,8 +88,8 @@ public static S3DestinationConfig getS3DestinationConfig(final JsonNode config)
config.get("s3_bucket_name").asText(),
bucketPath,
config.get("s3_bucket_region").asText(),
config.get("access_key_id").asText(),
config.get("secret_access_key").asText(),
config.get("access_key_id") == null ? "" : config.get("access_key_id").asText(),
config.get("secret_access_key") == null ? "" : config.get("secret_access_key").asText(),
partSize,
format);
}
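The null guard added above can be factored into a small helper. A minimal sketch, using a plain `Map` to stand in for Jackson's `JsonNode` (the class and helper name `textOrEmpty` are hypothetical, not part of the connector):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class ConfigExtraction {

    // Returns the value for `key`, or "" when the field is absent -- the same
    // behavior as `config.get("access_key_id") == null ? "" : ...asText()`.
    static String textOrEmpty(Map<String, String> config, String key) {
        return Optional.ofNullable(config.get(key)).orElse("");
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        config.put("s3_bucket_name", "my-bucket");
        System.out.println(textOrEmpty(config, "s3_bucket_name"));  // my-bucket
        System.out.println(textOrEmpty(config, "access_key_id"));   // (empty string)
    }
}
```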
@@ -128,7 +129,18 @@ public S3FormatConfig getFormatConfig() {
public AmazonS3 getS3Client() {
final AWSCredentials awsCreds = new BasicAWSCredentials(accessKeyId, secretAccessKey);

if (endpoint == null || endpoint.isEmpty()) {
// Providing exactly one of the two keys is a configuration error.
if (accessKeyId.isEmpty() != secretAccessKey.isEmpty()) {
throw new RuntimeException("Either both accessKeyId and secretAccessKey should be provided, or neither");
}

if (accessKeyId.isEmpty() && secretAccessKey.isEmpty()) {
return AmazonS3ClientBuilder.standard()
.withCredentials(new InstanceProfileCredentialsProvider(false))
.build();
}

if (endpoint == null || endpoint.isEmpty()) {
return AmazonS3ClientBuilder.standard()
.withCredentials(new AWSStaticCredentialsProvider(awsCreds))
.withRegion(bucketRegion)
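The credential-selection branches above can be sketched in isolation, with the AWS SDK calls stripped out (a minimal sketch; the class, enum, and method names here are hypothetical, not part of the connector):

```java
public class S3AuthModeResolver {

    enum AuthMode { INSTANCE_PROFILE, STATIC_KEYS }

    // Mirrors the logic added to getS3Client(): both keys empty -> instance
    // profile auth; exactly one empty -> error; both present -> static keys.
    static AuthMode resolve(String accessKeyId, String secretAccessKey) {
        if (accessKeyId.isEmpty() != secretAccessKey.isEmpty()) {
            throw new RuntimeException(
                "Either both accessKeyId and secretAccessKey should be provided, or neither");
        }
        return accessKeyId.isEmpty() ? AuthMode.INSTANCE_PROFILE : AuthMode.STATIC_KEYS;
    }

    public static void main(String[] args) {
        System.out.println(resolve("", ""));          // INSTANCE_PROFILE
        System.out.println(resolve("id", "secret"));  // STATIC_KEYS
    }
}
```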
@@ -12,8 +12,6 @@
"s3_bucket_name",
"s3_bucket_path",
"s3_bucket_region",
"access_key_id",
"secret_access_key",
Comment on lines -15 to -16

Member:
@sherifnada can you check this PR adding the option to use the S3 destination with an instance profile? There is no additional change in the UI, but users may experience errors from the front end. What do you think? Should we add an option to select the connection method (credentials or instance profile)?

Contributor Author:
I think the user experience will not suffer:

- If access_key_id AND secret_access_key are not provided -> instance profile auth.
- If only one of access_key_id OR secret_access_key is provided -> standard authentication error.

The rest will stay the same.

"format"
],
"additionalProperties": false,
@@ -72,14 +70,14 @@
},
"access_key_id": {
"type": "string",
"description": "The access key id to access the S3 bucket. Airbyte requires Read and Write permissions to the given bucket.",
"description": "The access key ID to access the S3 bucket. Airbyte requires Read and Write permissions to the given bucket. If not set, Airbyte will rely on the Instance Profile.",
"title": "S3 Key Id",
"airbyte_secret": true,
"examples": ["A012345678910EXAMPLE"]
},
"secret_access_key": {
"type": "string",
"description": "The corresponding secret to the access key id.",
"description": "The corresponding secret to the access key ID. If the S3 Key ID is set, then the S3 Access Key must also be provided.",
"title": "S3 Access Key",
"airbyte_secret": true,
"examples": ["a012345678910ABCDEFGH/AbCdEfGhEXAMPLEKEY"]
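One way to make the two auth methods explicit in spec.json, as the reviewer floats above, would be a `oneOf` selector. A hypothetical sketch only, not part of this PR — the `credential` object and `credential_type` field names are invented for illustration:

```json
{
  "credential": {
    "title": "Authentication Method",
    "type": "object",
    "oneOf": [
      {
        "title": "Instance Profile",
        "properties": {
          "credential_type": { "type": "string", "const": "INSTANCE_PROFILE" }
        }
      },
      {
        "title": "Access Key",
        "required": ["access_key_id", "secret_access_key"],
        "properties": {
          "credential_type": { "type": "string", "const": "ACCESS_KEY" },
          "access_key_id": { "type": "string", "airbyte_secret": true },
          "secret_access_key": { "type": "string", "airbyte_secret": true }
        }
      }
    ]
  }
}
```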
6 changes: 5 additions & 1 deletion docs/integrations/destinations/s3.md
@@ -199,7 +199,7 @@ Under the hood, an Airbyte data stream in Json schema is first converted to an A
#### Requirements

1. Allow connections from Airbyte server to your AWS S3/ Minio S3 cluster \(if they exist in separate VPCs\).
2. An S3 bucket with credentials.
2. An S3 bucket with credentials, or an instance profile with read/write permissions configured for the host (EC2, EKS).
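For the instance-profile path, the role attached to the host needs S3 read/write on the target bucket. A sketch of such a policy, assuming a placeholder bucket name `YOUR_BUCKET_NAME`:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::YOUR_BUCKET_NAME"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::YOUR_BUCKET_NAME/*"
    }
  ]
}
```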

#### Setup Guide

@@ -211,18 +211,22 @@ Under the hood, an Airbyte data stream in Json schema is first converted to an A
* **S3 Bucket Region**
* **Access Key Id**
* See [this](https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html#access-keys-and-secret-access-keys) on how to generate an access key.
* See [this](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-role-ec2_instance-profiles.html) on how to create an instance profile.
* We recommend creating an Airbyte-specific user. This user will require [read and write permissions](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_examples_s3_rw-bucket.html) to objects in the staging bucket.
* If the Access Key and Secret Access Key are not provided, authentication will rely on the instance profile.
* **Secret Access Key**
* Corresponding key to the above key id.
* Make sure your S3 bucket is accessible from the machine running Airbyte.
* This depends on your networking setup.
* You can check AWS S3 documentation with a tutorial on how to properly configure your S3's access [here](https://docs.aws.amazon.com/AmazonS3/latest/userguide/access-control-overview.html).
* If you use instance profile authentication, make sure the role has read/write permissions on the bucket.
* The easiest way to verify if Airbyte is able to connect to your S3 bucket is via the check connection tool in the UI.

## CHANGELOG

| Version | Date | Pull Request | Subject |
| :--- | :--- | :--- | :--- |
| 0.2.4 | 2022-01-14 | [\#9399](https://github.com/airbytehq/airbyte/pull/9399) | Use instance profile authentication if credentials are not provided |
| 0.2.3 | 2022-01-11 | [\#9367](https://github.com/airbytehq/airbyte/pull/9367) | Avro & Parquet: support array field with unknown item type; default any improperly typed field to string. |
| 0.2.2 | 2021-12-21 | [\#8574](https://github.com/airbytehq/airbyte/pull/8574) | Added namespace to Avro and Parquet record types |
| 0.2.1 | 2021-12-20 | [\#8974](https://github.com/airbytehq/airbyte/pull/8974) | Release a new version to ensure there is no excessive logging. |