-
Notifications
You must be signed in to change notification settings - Fork 290
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
✨Destination S3: Support IRSA for AWS Authentication #244
Conversation
…its post-refactor home ContainerOrchestratorFactory.java Co-authored by: Richard Gilmore <richard.gilmore@gmail.com>
This change introduces support for using [IAM Roles for Service Accounts (IRSA)](https://docs.aws.amazon.com/emr/latest/EMR-on-EKS-DevelopmentGuide/setting-up-enable-IAM.html) within S3 destinations in Airbyte. It is based on the changes in airbytehq#177, but does not attempt to add the same support for logging. The change should also work for S3 sources, but is not tested thoroughly with sources, only with destinations.
I'm waiting for this since last year thank you |
Hey @don-code ! We spoke face to face about this issue a couple weeks ago, sorry that I didn't see that you had opened a PR until now! I will make time to review this change this week, I'm excited to get it merged! |
/create-oss-pr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@don-code thanks again for the contribution! I left one small change request for favoring direct Micronaut injection over the Configs
interface approach. I've also requested review from members of an adjacent Platform team, who generally own this area of the codebase and should be looped in. I've also opened this PR against our internal repo where we can run a more thorough suite of acceptance tests. I'll check back in soon and I'm aiming to get this merged ideally by the end of this week
...tor/src/main/java/io/airbyte/container_orchestrator/config/ContainerOrchestratorFactory.java
Show resolved
Hide resolved
airbyte-config/config-models/src/main/java/io/airbyte/config/Configs.java
Show resolved
Hide resolved
Hey @don-code, wanted to give you a status update. I opened a PR off of your branch in our internal repo where we can perform more extensive testing. While doing this, I realized that we likely want to adjust our async orchestrator pod creation to read the same service account config that you've introduced here. Today, we hardcode the I can make that change on the internal repo either today or tomorrow, depending on bandwidth, and once merged internally, any adjustments I end up making will be mirrored back to this public repo. |
@don-code one more update here. In our internal repo, I made the change to the Should be close to merging! Will update again here soon, likely tomorrow. |
All tests on our Cloud environment were successful, and the internal PR is ready to merge. We tend to avoid merging on Fridays so I will get this change out on Monday morning. It will then be mirrored back to this public repo, at which point we can close out this PR. Thanks again @don-code for the contribution! |
@pmossman just want to say thank you for all of the extra work you guys did here! I just pulled in 0.50.4 and dropped our local fork - everything seems to be working great. |
Glad to hear it @don-code!! |
What
This change introduces support for using IAM Roles for Service Accounts (IRSA) within S3 destinations in Airbyte. It is based on the changes in #177 (formerly airbytehq/airbyte#19010), but does not attempt to add the same support for logging. The change should also work for S3 sources, but is not tested thoroughly with sources, only with destinations.
How
As it stands today, destinations require an IAM access key and secret key to be passed. Even though the fields are defined as optional in the UI, not setting them in a Kubernetes cluster using IRSA results in an unhelpful
null
error message being thrown. This is because, although Airbyte's ServiceAccount may be configured properly for IRSA, job pods are not associated with that ServiceAccount:This change makes it so that job pods are associated with the Airbyte install's ServiceAccount:
Recommended reading order
.gitignore
- I ran the unit tests locally, and this file was produced. Unrelated to this PR, but seemingly still helpful.charts/airbyte-worker/templates/deployment.yaml
- where the ServiceAccount name is brought into the application's scope, so that it is not assumingairbyte-admin
.airbyte-workers/src/main/resources/application.yml
- where the environment variable from 2 is brought into application config.airbyte-config/config-models/src/main/java/io/airbyte/config/EnvConfigs.java
- where the environment variable from 2 is brought into application code.airbyte-config/config-models/src/main/java/io/airbyte/config/Configs.java
- an interface where the service account name will be held.airbyte-config/config-models/src/main/java/io/airbyte/config/EnvConfigs.java
- where this interface is implemented.airbyte-container-orchestrator/src/main/java/io/airbyte/container_orchestrator/config/ContainerOrchestratorFactory.java
- where the value from the implementing class is used.airbyte-commons-worker/src/main/java/io/airbyte/workers/process/KubeProcessFactory.java
- where the implemented class itself is used.airbyte-workers/src/main/java/io/airbyte/workers/config/ProcessFactoryBeanFactory.java
- where the implemented class is instantiated.airbyte-config/config-models/src/main/java/io/airbyte/config/storage/DefaultS3ClientFactory.java
- where the actual check with S3 credentials is performed.airbyte-workers/src/test-integration/java/io/airbyte/workers/process/KubePodProcessIntegrationTest.java
- the integration tests for KubePodProcess, which do not strictly need to change, but have changed for consistency's sake.Can this PR be safely reverted / rolled back?
If unsure, leave it blank.
🚨 User Impact 🚨
I've tested that destinations still work as expected, both the IRSA case where the access key and secret key are left blank...

...and the non-IRSA case, with an IAM user - which I additionally tested with a user with access to a bucket which the IRSA role did not:
...so I believe that there should be no user impact, save for the added ability to use IRSA roles.