Source S3: support IAM EC2 profile #17334
Conversation
I'll need to clean this up a bit. Will re-open.
Anyone willing to take a look?
Sorry the team missed this, @mc51. Do you mind resolving conflicts? I'll review this after.
@IAL32 thanks for the contribution. If you could please sign the CLA, hopefully we can continue.
@marcosmarxm: all ready
Any chance someone is going to look at this after all? (@marcosmarxm)
@airbytehq/community-assistance-dri can you help with this review?
It would be great if we could get this merged. Thanks for doing the work, @mc51!
I'm not willing to put in any more effort towards this, as the whole process has been painfully slow.
@mc51 sorry for the slow feedback on this amazing contribution. Our team is making it a priority again to review PR contributions. This is on my to-do list and I hope to resolve the conflicts and get it merged soon.
@marcosmarxm @mc51 here is my first pass at resolving the conflicts between this PR and the current master. I'm happy to open it as is, or you can adopt the changes, since I haven't done anything other than run a rebase. https://github.com/airbytehq/airbyte/compare/master...mitodl:airbyte:source-s3-support-iam-ec2-profile?expand=1
track
🙏 This functionality is necessary. When can we expect some progress on merging this?
@mc51 I have created a custom Docker image with the changes and deployed it in AWS. I am able to set up and test the S3 source, but when I try to create a connection between the custom S3 source and any destination I get the same Access Denied error: Internal message: unable to access bucket: 'xxxx--xxx' key: 'airbyte/corpository/live/crawler_output_rss/stg_entity_company_mapping/stg_entity_company_mapping_2023_06_08_1686250027886_0.parquet' version: None error: An error occurred (AccessDenied) when calling the GetObject operation: Access Denied
Any update on this? How are people using this software without such basic support for default authentication? How has this PR been stuck, unmerged, for almost a year now?
My guess is that people are either:
Both options feel hacky and insecure, and have stopped me from switching to Airbyte altogether. This is only one of the many Airbyte AWS connectors lacking this functionality.
Hi everyone, apologies for the holdup. We'll push this to the finish line soon. Thanks for the contribution, @mc51!
Closed because the Source S3 was moved to the File Based CDK.
Hello @mc51 and @sidartha, sorry for closing the contribution you both spent time working on. The basic structure of the connector has changed a lot since the contribution. I'd like to have a conversation with you about the contribution and your experience; if possible, please reach me in the Airbyte Slack.
What
In addition to the previously existing AWS Access Key ID / Secret Access Key and No Credentials support, this allows using boto3's default credentials provider chain (e.g. the instance profile on EC2).
This finishes the work of sidartha in #10676.
Refers to this issue.
It is useful because it allows accessing buckets via role only (i.e. in the case of an instance profile on EC2), which makes explicit credentials obsolete in that case. In some business environments (like ours) this is crucial, as role-only access is considered best practice.
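For illustration, here is a minimal boto3 sketch (not code from this PR) contrasting explicit keys with the default credential provider chain; the key values and region are placeholders.

```python
import boto3

# Explicit credentials, as the connector required before this change
# (placeholder values, not real keys):
s3_with_keys = boto3.client(
    "s3",
    aws_access_key_id="AKIA...",
    aws_secret_access_key="...",
    region_name="us-east-1",  # example region
)

# No explicit keys: botocore walks its default credential provider chain --
# environment variables, shared config/credentials files, and finally the
# EC2 instance metadata service, i.e. the IAM instance profile/role.
s3_from_chain = boto3.client("s3", region_name="us-east-1")

# Both clients expose the same API, e.g.:
# s3_from_chain.list_objects_v2(Bucket="my-bucket", Prefix="some/prefix/")
```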
How
Add an Authentication Method dropdown (a `oneOf` field) to the UI. By default, it is set to use AWS Access Key ID and Secret Access Key; those must still be specified further down, as before. However, if those values are empty, the selected Authentication Method can instead be the AWS Default Credential Provider Chain or No Credentials (unsigned requests, e.g. for public buckets or non-AWS buckets).

This shouldn't break previous versions, because `aws_access_key_id` and `aws_secret_access_key` are still "leading" in defining the Authentication Method. The downside is that the Authentication Method field might not be entirely straightforward to use: when `aws_access_key_id` and `aws_secret_access_key` are set, they always override the Authentication Method setting.

The implementation should also be extensible. For example, Role Switching could be added to it, which might be very helpful in many settings.
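As a rough illustration of the precedence described above, the sketch below shows how the three options could map onto boto3 client creation. The config field name `auth_method` and its values are assumptions made for the example, not necessarily the names used in this PR's spec.

```python
import boto3
from botocore import UNSIGNED
from botocore.client import Config


def make_s3_client(config: dict):
    """Build an S3 client from a source config dict (illustrative sketch only)."""
    key = config.get("aws_access_key_id")
    secret = config.get("aws_secret_access_key")

    # Keys stay "leading": if both are set they always win, regardless of the
    # Authentication Method selected in the UI.
    if key and secret:
        return boto3.client("s3", aws_access_key_id=key, aws_secret_access_key=secret)

    method = config.get("auth_method", "default_credential_chain")  # assumed field name
    if method == "no_credentials":
        # Unsigned requests, e.g. for public buckets or non-AWS, S3-compatible storage.
        return boto3.client("s3", config=Config(signature_version=UNSIGNED))

    # Otherwise fall back to boto3's default credential provider chain
    # (environment, shared config files, EC2/ECS instance profile, ...).
    return boto3.client("s3")
```

Extending this with Role Switching would amount to one more branch (e.g. obtaining temporary credentials via STS AssumeRole and building the client from those), without changing the existing behavior.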
Recommended reading order
🚨 User Impact 🚨
This should not be breaking because it replicates the previous behavior.
`Default Credential Provider Chain` is only used when selected by the user and `aws_access_key_id` and `aws_secret_access_key` are empty.

Pre-merge Checklist
Expand the relevant checklist and delete the others.

Updating a connector

Community member or Airbyter
- `airbyte_secret`
- `./gradlew :airbyte-integrations:connectors:<name>:integrationTest`
- `README.md`
- `bootstrap.md`. See description and examples
- `docs/integrations/<source or destination>/<name>.md` including changelog. See changelog example

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.
- `/test connector=connectors/<name>` command is passing
- `/publish` command described here

Tests
Unit
Results (64.85s):
170 passed
Integration
Some of the tests I can't get running, as I don't have some config files (or I misunderstood something), e.g.
`E FileNotFoundError: [Errno 2] No such file or directory: '/home/mc/dev/airbyte/airbyte-integrations/connectors/source-s3/secrets/parquet_config.json'`
Other than that, `TestDiscovery.test_backward_compatibility` fails. I guess this is to be expected, as the changed `schema.json` is not compatible with the old `FileStream` class.