make operator's execution_timeout configurable #22389

Merged (9 commits) on Apr 11, 2022


sagmansercan
Contributor

@sagmansercan sagmansercan commented Mar 21, 2022

This PR makes the execution_timeout attribute configurable globally via airflow.cfg.

  • The default value is still None. To set a timeout, users define
    an integer value in the configuration; it is passed into a timedelta
    object and interpreted as seconds.
  • Added a gettimedelta method to the configuration, used in abstractoperator
    to get either a timedelta or None. The method raises an exception
    for values that are:
    • not convertible to an integer
    • too large to be converted to a C int.
  • Sample config cases are added to the unit tests.
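
The behaviour described in the list above can be sketched roughly as follows. This is a minimal illustration built on the standard library's configparser, not the code from this PR: the function name `get_timedelta`, the exception class `ConfigError`, and the `[core]` section are assumptions made for the example, and the key name is borrowed from a later commit message in this thread.

```python
import configparser
import datetime
from typing import Optional


class ConfigError(Exception):
    """Hypothetical stand-in for the project's own configuration exception."""


def get_timedelta(
    parser: configparser.ConfigParser,
    section: str,
    key: str,
    fallback: Optional[datetime.timedelta] = None,
) -> Optional[datetime.timedelta]:
    """Return the option as a timedelta in seconds, or `fallback` if unset or empty."""
    raw = parser.get(section, key, fallback=None)
    if not raw:
        # Missing key or empty value: keep the default (None unless overridden).
        return fallback
    try:
        seconds = int(raw)
    except ValueError as err:
        raise ConfigError(f"[{section}] {key}={raw!r} is not an integer") from err
    try:
        return datetime.timedelta(seconds=seconds)
    except OverflowError as err:
        # timedelta rejects magnitudes it cannot represent.
        raise ConfigError(f"[{section}] {key}={raw!r} is too large") from err


# Usage: the section name "core" is an assumption for this sketch.
parser = configparser.ConfigParser()
parser.read_dict({"core": {"default_task_execution_timeout": "300"}})
timeout = get_timedelta(parser, "core", "default_task_execution_timeout")
```

Wrapping both failure modes in one exception type keeps the caller's error handling in a single place, while the message still says which of the two checks failed.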

Closes #18578

@boring-cyborg

boring-cyborg bot commented Mar 21, 2022

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
Here are some useful points:

  • Pay attention to the quality of your code (flake8, mypy and type annotations). Our pre-commits will help you with that.
  • In case of a new feature, add useful documentation (in docstrings or in the docs/ directory). Adding a new operator? Check this short guide. Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it’s a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
    Apache Airflow is a community-driven project and together we are making it better 🚀.
    In case of doubts contact the developers at:
    Mailing List: dev@airflow.apache.org
    Slack: https://s.apache.org/airflow-slack

@uranusjr
Member

Since it is possible to provide a None execution_timeout, the config format also needs to support that.

@sagmansercan
Contributor Author

Since it is possible to provide a None execution_timeout, the config format also needs to support that.

Thanks for the comment, could you provide more details on your suggestion with an example use case?

The default behavior is already None. One can leave this key empty, or not define it in the cfg file at all. This approach covers the "providing None" case with an empty value.

Are you suggesting that the string "None" should also be handled? If so, it would be great to hear what the difference between an "empty string" and "None" is, based on use cases. Additionally, how can we achieve that without explicitly checking whether the given string is equal to "None"? eval would also do this, but neither option sounds right to me.
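
To make the three cases in this comment concrete, here is a small sketch using the standard library's configparser. It is illustrative only: the `[core]` section, the `undefined_key` name, and the `to_optional_seconds` helper are assumptions made for the example, not names from this PR.

```python
import configparser
from typing import Optional

# A config where the key is present but left empty.
SAMPLE_CFG = """
[core]
default_task_execution_timeout =
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE_CFG)


def to_optional_seconds(raw: Optional[str]) -> Optional[int]:
    """Treat a missing key (None) and an empty value ('') the same way."""
    return int(raw) if raw else None


empty = parser.get("core", "default_task_execution_timeout", fallback=None)
missing = parser.get("core", "undefined_key", fallback=None)

print(repr(empty), to_optional_seconds(empty))
print(repr(missing), to_optional_seconds(missing))

# The literal string "None" is not an integer, so supporting it would need
# an explicit comparison before the int() call.
try:
    to_optional_seconds("None")
except ValueError as err:
    print("rejected:", err)
```

In this sketch an empty value and an undefined key both resolve to None without any special casing, whereas the string "None" only works if it is checked for by name.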

@sagmansercan sagmansercan requested review from eladkal and ashb March 21, 2022 18:17
@sagmansercan sagmansercan force-pushed the configurable-execution-timeout branch from 07da0fc to b28d49c Compare March 21, 2022 18:48
@eladkal eladkal added this to the Airflow 2.3.0 milestone Mar 21, 2022
@sagmansercan sagmansercan force-pushed the configurable-execution-timeout branch from cf91704 to 59b82c3 Compare March 25, 2022 14:44
@sagmansercan
Contributor Author

Hi @ashb @eladkal @uranusjr @SamWheating

I've addressed all the comments above, could you please review them again?

@sagmansercan sagmansercan force-pushed the configurable-execution-timeout branch from 59b82c3 to 2fcfcd7 Compare March 29, 2022 10:09
@Bowrna
Contributor

Bowrna commented Mar 30, 2022

@sagmansercan you have to fix the failing static check in the black formatting. Please check.

@sagmansercan sagmansercan force-pushed the configurable-execution-timeout branch from 2fcfcd7 to be3b7fd Compare March 30, 2022 10:55
@sagmansercan
Contributor Author

sagmansercan commented Mar 30, 2022

@sagmansercan you have to fix the failing static check in the black formatting. Please check.

Thank you for the warning, @Bowrna. It seems to be related to the black version, which was fixed in #22598.

I will rebase and check the result, since the failure was not caused by my changes.

With this commit, the execution_timeout attribute of the
operators is now configurable globally via airflow.cfg.

* The default value is still `None`. Users are expected to
define a positive integer value in the configuration; it is passed
into a timedelta object and sets the timeout in seconds.
* If the key is missing or is set to a non-positive value, it is
treated as `None`.
* Added a `gettimedelta` method to be used in abstractoperator
to get a timedelta or None. The method raises an exception
for values that are not convertible to an integer and/or
too large to be converted to a C int.
* Sample config cases are added to the unit tests.

Closes apache#18578

* With this commit, an error is raised for values <= 0
instead of using the fallback value
* Updated unit tests

To be clearer to the user, added a relevant error message
to AirflowConfigException.

This parameter specifies the tasks' execution timeout,
so all configuration and variable names now contain
`task`.

Fixed the description of default_task_execution_timeout
based on the recent changes.

Before this commit, the gettimedelta method prevented
users from providing non-positive values. Now it is entirely up to
users to provide a sensible value for this configuration.
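
Since the last message leaves non-positive values to the user, the sketch below shows why such values pass a plain timedelta conversion and what a user-side sanity check might look like. `is_sensible_timeout` is a hypothetical helper written for this illustration; it is not part of Airflow or of this PR.

```python
import datetime
from typing import Optional


def is_sensible_timeout(timeout: Optional[datetime.timedelta]) -> bool:
    """Hypothetical check: None (no timeout) or a strictly positive duration."""
    return timeout is None or timeout > datetime.timedelta(0)


zero = datetime.timedelta(seconds=0)
negative = datetime.timedelta(seconds=-5)

# Both are valid timedelta objects, so a parser that only converts the
# configured integer to a timedelta will not reject them.
print(negative.total_seconds())
print(is_sensible_timeout(zero), is_sensible_timeout(negative))
print(is_sensible_timeout(datetime.timedelta(minutes=10)), is_sensible_timeout(None))
```
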
@sagmansercan sagmansercan force-pushed the configurable-execution-timeout branch from be3b7fd to 7f30f25 Compare April 10, 2022 12:46
@github-actions

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

@github-actions github-actions bot added the full tests needed We need to run full set of tests for this PR to merge label Apr 11, 2022
@potiuk potiuk merged commit a111a79 into apache:main Apr 11, 2022
@boring-cyborg

boring-cyborg bot commented Apr 11, 2022

Awesome work, congrats on your first merged pull request!

@ephraimbuddy ephraimbuddy added the type:new-feature Changelog: New Features label Apr 14, 2022
Labels
full tests needed We need to run full set of tests for this PR to merge type:new-feature Changelog: New Features
Development

Successfully merging this pull request may close these issues.

Allow to set execution_timeout default value in airflow.cfg
10 participants