Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a backup implementation in AWS MwaaHook for calling the MWAA API #47035

Merged
merged 6 commits into from
Mar 6, 2025

Conversation

ramitkataria
Copy link
Contributor

The existing implementation doesn't work when the user doesn't have airflow:InvokeRestApi permission in their IAM policy or when they make more than 10 transactions per second.

This implementation mitigates those issues by using a session token approach. However, my existing implementation is still used by default because it is simpler.

Some context here:
https://docs.aws.amazon.com/mwaa/latest/userguide/access-mwaa-apache-airflow-rest-api.html


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@boring-cyborg boring-cyborg bot added the provider:amazon AWS/Amazon - related issues label Feb 24, 2025
@ramitkataria
Copy link
Contributor Author

After discussion with @o-nikolas, I ended up removing the storage of sessions in hook instances because they would get re-instantiated again anyway. The only feasible way seems to be to serialize, encrypt and store them in the file system, which can be an improvement implemented in the future.

I was going to implement retries using tenacity so that the hook would wait instead of failing in case of throttling but I did some performance testing to see the how it would handle a throttling scenario and it looks like boto waits instead of returning an error response so that shouldn't happen. Either way, if someone does manage to get errors related to throttling and wants the hook to retry instead of fail, boto has this guide about setting up retries which seems like the best way to go about this.

@ramitkataria ramitkataria marked this pull request as ready for review March 1, 2025 03:57
@ramitkataria ramitkataria force-pushed the ramitkataria/mwaa-web-token branch 2 times, most recently from 1b804cf to 3502f23 Compare March 4, 2025 03:19
The existing implementation doesn't work when the user doesn't have
`airflow:InvokeRestApi` permission in their IAM policy or when they make
more than 10 transactions per second.

This implementation mitigates those issues by using a session token
approach. However, my existing implementation is still used by default
because it is simpler.

Some context here:
https://docs.aws.amazon.com/mwaa/latest/userguide/access-mwaa-apache-airflow-rest-api.html
@ramitkataria ramitkataria force-pushed the ramitkataria/mwaa-web-token branch from 7a5d6d8 to 62a5eef Compare March 6, 2025 19:31
@o-nikolas o-nikolas merged commit 6c6a4a6 into apache:main Mar 6, 2025
60 checks passed
@o-nikolas o-nikolas deleted the ramitkataria/mwaa-web-token branch March 6, 2025 20:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
provider:amazon AWS/Amazon - related issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants