Helm Chart: Running the launcher replication-orchestrator failed after upgrade #32203
Comments
Confirmed that I experienced the same, and downgrading to 0.49.6 resolved my issue. |
Same as above! Worked for me as a fix as well |
Confirming that we were unable to deploy Airbyte when upgrading beyond Also came across this other issue which may be related: |
still bad: 0.49.21, 0.49.19 |
Same here, I can't upgrade to the latest version |
Same issue here. Had to roll the chart back to 0.49.6. |
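For anyone who needs the same rollback, here is a minimal sketch of pinning the chart back to 0.49.6; the release name airbyte, the airbyte namespace, and the repo alias airbyte are assumptions that may differ in your setup:
helm repo add airbyte https://airbytehq.github.io/helm-charts   # register the Airbyte chart repository
helm upgrade airbyte airbyte/airbyte -n airbyte --version 0.49.6 --reuse-values   # roll the chart back while keeping existing values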
Two weeks ago I also reported a bug here: #32544. It looks like the issue came with Airbyte 0.50.33 and is still present in 0.50.34. |
Hello all 👋 the team did some investigation and found a workaround. For now, these are the steps to fix the issue: after connecting to the Minio pod (so it is reachable on localhost:9000), run the following commands:
mc alias set myminio http://localhost:9000 minio minio123
mc mb myminio/state-storage
This will create the missing bucket. The team is working to release a fix for future upgrades. Thanks for your patience. |
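For anyone who prefers the full sequence, here is a sketch of that workaround end to end; the service name airbyte-minio-svc, the airbyte namespace, and the default minio/minio123 credentials are assumptions and may differ in your deployment:
kubectl -n airbyte port-forward svc/airbyte-minio-svc 9000:9000 &   # expose Minio on localhost:9000
mc alias set myminio http://localhost:9000 minio minio123           # register the forwarded endpoint with the Minio client
mc mb myminio/state-storage                                         # create the missing state bucket
mc ls myminio                                                       # confirm the bucket now exists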
Running the launcher replication-orchestrator failed after upgrade
This closes airbytehq/airbyte#32203. #9469 turned on the orchestrator by default for OSS Kube deployments. Before this, OSS Kube jobs would fail whenever Airbyte is deployed. When we turned this on, it did not occur to me to test the upgrade path. What our Helm charts do is recreate the airbyte-db and minio pods each time. This would wipe the state bucket, so jobs would not be able to run after an upgrade. This PR cleans up the airbyte-db and airbyte-minio behaviour, with the side effect of fixing this bug.
- Instead of recreating the Airbyte db and the minio pod each time, we only create these critical resources on install. Once Airbyte is running, there is no situation where recreating these resources on upgrade is needed. In fact, this is harmful, since all jobs running at that time will fail. It also slows down the upgrade, since these resources are required before the actual Airbyte application can start up.
- Pin minio to a specific version instead of always pulling the latest version. Although we haven't yet seen minio version bugs, pinning to a specific version provides more stability.
- Do the same for kubectl.
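A rough way to confirm the new behaviour after this change is to check that the Minio pod, and therefore the state bucket, survives an upgrade. A sketch, assuming the release and namespace are both named airbyte, the chart repo alias is airbyte, and Minio pods carry the app.kubernetes.io/name=minio label:
kubectl -n airbyte get pods -l app.kubernetes.io/name=minio    # note the pod AGE before upgrading
helm upgrade airbyte airbyte/airbyte -n airbyte --reuse-values # upgrade to a chart version containing this fix
kubectl -n airbyte get pods -l app.kubernetes.io/name=minio    # the same pod, with its AGE intact, should still be there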
Hi guys, we figured out what was happening:
Thank you for your patience! |
Hi guys, thank you so much for your effort! Unfortunately, I just tried to deploy the newest version, but I still have the same issue. After starting any new job (deployed with chart version 0.50.3), I still get:
I just rolled back again to version 0.49.6, and everything works as expected. |
@DSamuylov interesting! I just tested upgrading from 0.49.6 to 0.50.3 and was able to run a job before and after. Can you show me how you are deploying 0.50.3? |
@davinchia, yes, sure. I do the deployment with Terraform, and here is the file defining all the configuration:
resource "helm_release" "airbyte" {
name = "airbyte"
repository = "https://airbytehq.github.io/helm-charts"
chart = "airbyte"
version = var.chart_version
namespace = var.k8s_namespace
# Global environment variables:
set {
name = "global.env_vars.DATABASE_URL"
value = "jdbc:postgresql://${var.external_database_host}:${var.external_database_port}/${var.external_database_database}?ssl=true&sslmode=require"
}
# Global database settings:
set {
name = "global.database.secretName"
value = var.postgres_secrets_name
}
set {
name = "global.database.secretValue"
value = "password"
}
set {
name = "global.database.host"
value = var.external_database_host
}
set {
name = "global.database.port"
value = var.external_database_port
}
# Global logs settings:
# - storage type config:
set {
name = "global.state.storage.type"
value = "S3"
}
set {
name = "global.logs.storage.type"
value = "S3"
}
set {
name = "minio.enabled"
value = false
}
set {
name = "global.logs.minio.enabled"
value = false
}
# - access key config:
# Some pods in the deployment use a password variable, and some read the value from a k8s secret; that is why we are forced to set both:
set {
name = "global.logs.accessKey.password"
value = var.aws_access_key_id
}
set {
name = "global.logs.accessKey.existingSecret"
value = var.aws_secrets_name
}
set {
name = "global.logs.accessKey.existingSecretKey"
value = "AWS_ACCESS_KEY_ID"
}
# - secret key config:
# Some pods in the deployment use a password variable, and some read the value from a k8s secret; that is why we are forced to set both:
set {
name = "global.logs.secretKey.password"
value = var.aws_secret_access_key
}
set {
name = "global.logs.secretKey.existingSecret"
value = var.aws_secrets_name
}
set {
name = "global.logs.secretKey.existingSecretKey"
value = "AWS_SECRET_ACCESS_KEY"
}
# - bucket config:
set {
name = "global.logs.s3.enabled"
value = true
}
set {
name = "global.logs.s3.bucket"
value = var.aws_bucket
}
set {
name = "global.logs.s3.bucketRegion"
value = var.aws_region
}
# Temporal:
set {
name = "temporal.env_vars.SQL_TLS"
value = "true"
}
set {
name = "temporal.env_vars.SQL_TLS_DISABLE_HOST_VERIFICATION"
value = "true"
}
set {
name = "temporal.env_vars.SQL_TLS_ENABLED"
value = "true"
}
set {
name = "temporal.env_vars.SQL_TLS_ENABLE"
value = "true"
}
set {
name = "temporal.env_vars.SSL"
value = "true"
}
# External database settings:
set {
name = "postgresql.enabled"
value = false
}
set {
name = "externalDatabase.host"
value = var.external_database_host
}
set {
name = "externalDatabase.user"
value = var.external_database_user
}
set {
name = "externalDatabase.password"
value = var.external_database_password
}
set {
name = "externalDatabase.existingSecret"
value = var.postgres_secrets_name
}
set {
name = "externalDatabase.existingSecretPasswordKey"
value = "password"
}
set {
name = "externalDatabase.database"
value = var.external_database_database
}
set {
name = "externalDatabase.port"
value = var.external_database_port
}
set {
# When using SSL, it is mandatory to specify the URL parameters `?ssl=true&sslmode=require`!
name = "externalDatabase.jdbcUrl"
value = "jdbc:postgresql://${var.external_database_host}:${var.external_database_port}/${var.external_database_database}?ssl=true&sslmode=require"
}
# Worker:
set {
name = "worker.extraEnv[0].name"
value = "STATE_STORAGE_S3_ACCESS_KEY"
}
set {
name = "worker.extraEnv[0].value"
value = var.aws_access_key_id
}
set {
name = "worker.extraEnv[1].name"
value = "STATE_STORAGE_S3_SECRET_ACCESS_KEY"
}
set {
name = "worker.extraEnv[1].value"
value = var.aws_secret_access_key
}
set {
name = "worker.extraEnv[2].name"
value = "STATE_STORAGE_S3_BUCKET_NAME"
}
set {
name = "worker.extraEnv[2].value"
value = var.aws_bucket
}
set {
name = "worker.extraEnv[3].name"
value = "STATE_STORAGE_S3_REGION"
}
set {
name = "worker.extraEnv[3].value"
value = var.aws_secrets_name
}
}
Please let me know if I could further support you with the investigation. |
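For completeness, applying a configuration like the one above would look roughly like this (a sketch, assuming the remaining variables are supplied through a terraform.tfvars file or the environment):
terraform init                                 # install the required providers (here, the helm provider)
terraform plan -var='chart_version=0.50.3'     # preview the changes to the helm_release
terraform apply -var='chart_version=0.50.3'    # deploy or upgrade the Airbyte release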
@DSamuylov what is your state storage bucket name variable set to? |
@davinchia, do you mean this environment variable?
set {
name = "worker.extraEnv[2].name"
value = "STATE_STORAGE_S3_BUCKET_NAME"
}
set {
name = "worker.extraEnv[2].value"
value = var.aws_bucket
}
The variable |
Yes, that was what I was referring to. Follow-up questions:
|
Sorry for the delay in my reply, the last few days were extremely busy.
set {
name = "minio.enabled"
value = false
}
set {
name = "global.logs.minio.enabled"
value = false
}
So if some pods require access to Minio, they will fail.
|
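Since Minio is disabled in this setup and state is kept in S3, the equivalent of the Minio workaround would be to make sure the external state bucket actually exists. A sketch using the AWS CLI, assuming it is configured with the same credentials and that $STATE_BUCKET and $AWS_REGION hold the values passed to the chart:
aws s3api head-bucket --bucket "$STATE_BUCKET" || \
  aws s3 mb "s3://$STATE_BUCKET" --region "$AWS_REGION"   # create the state bucket if it is missing
aws s3 ls "s3://$STATE_BUCKET"                            # confirm it is reachable with these credentials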
We are also facing this issue after upgrading to |
We were facing the same issue. We updated to Airbyte 0.50.43 with chart 0.50.21 and the error came up again; this time the error log was more detailed:
We checked the Minio version and updated to the latest, and then executed the workaround mentioned by @marcosmarxm, and it worked. Thanks @marcosmarxm
|
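If you want to check which Minio image your deployment is actually running before updating it, something like the following should work (the airbyte namespace and the pod label are assumptions and may need adjusting):
kubectl -n airbyte get pods -l app.kubernetes.io/name=minio \
  -o jsonpath='{.items[*].spec.containers[*].image}'   # prints the Minio image tag in use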
I've been pinning version 0.49.6 to get around this for the past month and a half. The fix suggested by @marcosmarxm doesn't work for me. I have been attempting to upgrade from 0.49.6 -> latest since mid-January (so 0.50.22+) and it has never fixed the issue. Running the minio config in bash returns:
Not an expert in any of this at all, but it looks like the creation of the bucket isn't entirely the issue. Just wanted to provide additional info as this has been a long-open issue! Edited to add:
|
Folks, everyone still having this issue: please open a new issue and report what values and chart version you're using. |
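When opening that new issue, the chart version and the user-supplied values can be pulled straight from Helm (a sketch; the release name and namespace are assumed to be airbyte, and remember to redact credentials before posting):
helm list -n airbyte                 # shows the installed chart and app versions
helm get values airbyte -n airbyte   # shows only the values you supplied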
What method are you using to run Airbyte?
Kubernetes
Platform Version or Helm Chart Version
All Helm chart versions later than 0.49.6
What step the error happened?
Upgrading the Platform or Helm Chart
Relevant information
Others in Slack and I are reporting a variety of issues on Helm charts newer than 0.49.6. For each person, downgrading to 0.49.6 resolved the issues and Airbyte is stable. On versions later than 0.49.6, Airbyte is not stable; connectors, tests, etc. fail with various errors. Slack thread: https://airbytehq.slack.com/archives/C021JANJ6TY/p1698930804469959
Relevant log output