-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🎉 Core: Support for IRSA when logging on EKS and passing role to Dest/Source Pods #19010
Conversation
might be of interest @alafanechere as you reviewed #10682 |
Hello 👋, first thank you for this amazing contribution. We really appreciate the effort you've made to improve the project. If you have any questions feel free to send me a message in Slack! |
Any update on this? We are pretty blocked waiting on this. |
@@ -122,10 +126,12 @@ public ProcessFactory discoverDockerProcessFactory( | |||
public ProcessFactory discoverKubernetesProcessFactory( | |||
@Named("discoverWorkerConfigs") final WorkerConfigs workerConfigs, | |||
@Value("${airbyte.worker.job.kube.namespace}") final String kubernetesNamespace, | |||
@Value("${airbyte.worker.job.kube.service.account") final String kubernetesServiceAccount, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This value is missing from the file application.yaml
in the resource folder of the airbyte-worker
module. This will cause some issue while being created.
podBuilder = podBuilder.withServiceAccount("airbyte-admin").withAutomountServiceAccountToken(true); | ||
} | ||
.withNewSpec() | ||
.withServiceAccount(isOrchestrator ? "airbyte-admin" : serviceAccount) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add a dedicated variable for the orchestrator SA and well as the one added in order to have something like:
.withServiceAccount(isOrchestrator ? orchestratorServiceAccount : serviceAccount)
. If we go that way, we need to make sure that the default value is "airbyte-admin"
.
@@ -119,6 +119,8 @@ spec: | |||
valueFrom: | |||
fieldRef: | |||
fieldPath: metadata.namespace | |||
- name: JOB_KUBE_NAMESPACE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably a typo and should be JOB_KUBE_SERVICE_ACCOUNT right?
- name: JOB_KUBE_NAMESPACE | |
- name: JOB_KUBE_SERVICE_ACCOUNT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR has been migrated, @mjstel if you can please re-add your suggestion to the new PR so your authorship is preserved: https://github.com/airbytehq/airbyte-platform/pull/177/files#r1120705574
Hello 👋:skin-tone-2: and thank you for your contribution! Airbyte has instituted a code freeze between 19 and 30 December, to make sure there are no disruptions during the holidays. If you have any questions or need further clarification, please don't hesitate to ping via Slack. |
Hey, any update on this? |
will take a look at the review comments |
+1 running into this issue with helm chart! |
Hello @gilandose, as you may have noticed we've just finished a pretty big overhaul of our code repositories. Most of this pull request is now located in a new repo airbytehq/airbyte-platform. I've gone ahead and cherry picked your changes into a new PR: airbytehq/airbyte-platform#177 I was not able to base the PR from your fork as I don't have write permissions, so I created a new branch I've also copied over the comments/suggestions on this PR, so hopefully we can continue and merge this exciting new feature. |
I've added changes that are included in this pr here: #23594 Alternatively, we can merge from airbytehq/airbyte master and use this PR -- however I don't have write access so I can't push changes. So, we have a new PR in the meantime 😄 |
What
DefaultAWSCredentialsProviderChain into the logging route for S3 this allows logs on EKS or EC2 to use standard more secure role assumption methods.
Specifically this unlocks IRSA by removing passing of access keys, and adding STS dependency to allow
WebIdentityTokenProvider
to be used. Note this would all allow a GKE cluster to interface with AWS using IRSAAdditionally allow the
airbyte-admin
ServiceAccount to be passed to connectors to allow the IRSA role to be assumed here. Note it may be preferable to assign a different service account for the dest/source pod eventually this could be a connector specific SA with restrictive permissionsresolves #10570
resolves #5942
extends excellent work done in: #10682 credit: @adamschmidt
How
For AWS access keys can automatically be read from the DefaultAWSCredentialsProviderChain, these were being passed around too eagerly and also set into system properties for logging. To change this I've removed some of the explicit passing, and allow the access keys to be passed around transparently when set
Recommended reading order
deps.toml
airbyte-config/config-models/src/main/java/io/airbyte/config/storage/DefaultS3ClientFactory.java
airbyte-commons/src/main/resources/log4j2.xml
airbyte-commons-worker/src/main/java/io/airbyte/workers/process/KubePodProcess.java
build.gradle
airbyte-integrations/connectors/destination-s3/build.gradle
🚨 User Impact 🚨
No user impact using ACCESS_KEYS works as before
Pre-merge Checklist
Expand the relevant checklist and delete the others.
Updating a connector
Community member or Airbyter
airbyte_secret
./gradlew :airbyte-integrations:connectors:<name>:integrationTest
.README.md
bootstrap.md
. See description and examplesdocs/integrations/<source or destination>/<name>.md
including changelog. See changelog exampleAirbyter
If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.
/test connector=connectors/<name>
command is passing/publish
command described hereTests
Unit
Task :airbyte-config:config-models:test
ConfigSchemaTest > testFile() PASSED
ConfigSchemaTest > testPrepareKnownSchemas() PASSED
DataTypeEnumTest > testConversionFromJsonSchemaPrimitiveToDataType() PASSED
EnvConfigsTest > testAirbyteVersion() PASSED
EnvConfigsTest > testGetDatabaseUrl() PASSED
EnvConfigsTest > testSharedJobEnvMapRetrieval() PASSED
EnvConfigsTest > testJobKubeNodeSelectors() PASSED
EnvConfigsTest > testWorkspaceRoot() PASSED
EnvConfigsTest > ensureGetEnvBehavior() PASSED
EnvConfigsTest > testLocalRoot() PASSED
EnvConfigsTest > testDiscoverKubeNodeSelectors() PASSED
EnvConfigsTest > testPublishMetrics() PASSED
EnvConfigsTest > testConfigRoot() PASSED
EnvConfigsTest > testGetDatabasePassword() PASSED
EnvConfigsTest > testSpecKubeNodeSelectors() PASSED
EnvConfigsTest > testAirbyteRole() PASSED
EnvConfigsTest > testCheckKubeNodeSelectors() PASSED
EnvConfigsTest > testDeploymentMode() PASSED
EnvConfigsTest > testTrackingStrategy() PASSED
EnvConfigsTest > testGetWorkspaceDockerMount() PASSED
EnvConfigsTest > testDockerNetwork() PASSED
EnvConfigsTest > testGetLocalDockerMount() PASSED
EnvConfigsTest > testRemoteConnectorCatalogUrl() PASSED
EnvConfigsTest > testGetDatabaseUser() PASSED
EnvConfigsTest > Should parse constant tags PASSED
EnvConfigsTest > testSplitKVPairsFromEnvString() PASSED
EnvConfigsTest > testworkerKubeTolerations() PASSED
EnvConfigsTest > testErrorReportingStrategy() PASSED
EnvConfigsTest > testAllJobEnvMapRetrieval() PASSED
EnvConfigsTest > CheckJobResourceSettings > checkJobCpuRequest should take precedent if set PASSED
EnvConfigsTest > CheckJobResourceSettings > checkJobMemoryLimit should take precedent if set PASSED
EnvConfigsTest > CheckJobResourceSettings > checkJobMemoryRequest should take precedent if set PASSED
EnvConfigsTest > CheckJobResourceSettings > should default to JobMainMemoryRequest if not set PASSED
EnvConfigsTest > CheckJobResourceSettings > should default to JobMainCpuRequest if not set PASSED
EnvConfigsTest > CheckJobResourceSettings > checkJobCpuLimit should take precedent if set PASSED
EnvConfigsTest > CheckJobResourceSettings > should default to JobMainCpuLimit if not set PASSED
EnvConfigsTest > CheckJobResourceSettings > should default to JobMainMemoryLimit if not set PASSED
CloudLogsClientTest > createCloudLogClientTestAws() PASSED
CloudLogsClientTest > createCloudLogClientTestGcs() PASSED
CloudLogsClientTest > createCloudLogClientTestMinio() PASSED
LogClientSingletonTest > testGetJobLogFileNullPath() PASSED
LogClientSingletonTest > testGetJobLogFileEmptyPath() PASSED
LogClientSingletonTest > testGetJobLogFileK8s() PASSED
ScheduleHelpersTest > testGetSecondsInUnit() PASSED
ScheduleHelpersTest > testAllOfTimeUnitEnumValues() PASSED
StateMessageHelperTest > testEmpty() PASSED
StateMessageHelperTest > testDuplicatedGlobalState() PASSED
StateMessageHelperTest > testLegacyInNewFormat() PASSED
StateMessageHelperTest > testLegacyStateConversion() PASSED
StateMessageHelperTest > testEmptyList() PASSED
StateMessageHelperTest > testStreamForceLegacy() PASSED
StateMessageHelperTest > testInvalidMixedState() PASSED
StateMessageHelperTest > testGlobalStateConversion() PASSED
StateMessageHelperTest > testGlobal() PASSED
StateMessageHelperTest > testLegacy() PASSED
StateMessageHelperTest > testGlobalForceLegacy() PASSED
StateMessageHelperTest > testLegacyInList() PASSED
StateMessageHelperTest > testStream() PASSED
StateMessageHelperTest > testStreamStateConversion() PASSED
YamlListToStandardDefinitionsTest > verifyAndConvertToModelList > should error out on bad data PASSED
YamlListToStandardDefinitionsTest > verifyAndConvertToModelList > should error out on duplicate name PASSED
YamlListToStandardDefinitionsTest > verifyAndConvertToModelList > should correctly read yaml file PASSED
YamlListToStandardDefinitionsTest > verifyAndConvertToModelList > should error out on duplicate id PASSED
YamlListToStandardDefinitionsTest > verifyAndConvertToModelList > should error out on empty file PASSED
YamlListToStandardDefinitionsTest > vertifyAndConvertToJsonNode > should error out on bad data PASSED
YamlListToStandardDefinitionsTest > vertifyAndConvertToJsonNode > should error out on duplicate name PASSED
YamlListToStandardDefinitionsTest > vertifyAndConvertToJsonNode > should correctly read yaml file PASSED
YamlListToStandardDefinitionsTest > vertifyAndConvertToJsonNode > should error out on duplicate id PASSED
YamlListToStandardDefinitionsTest > vertifyAndConvertToJsonNode > should error out on empty file PASSED
CloudLogsClientTest > testGcs() PASSED
CloudLogsClientTest > testGcsMissingBucket() PASSED
DefaultS3ClientFactoryTest > testS3() PASSED
DefaultS3ClientFactoryTest > testS3RegionNotSet() PASSED
MinioS3ClientFactoryTest > testMinio() PASSED
MinioS3ClientFactoryTest > testMinioEndpointMissing() PASSED
Integration
Put your integration tests output here.
Acceptance
Put your acceptance tests output here.