Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Clock.sleep_for using sleep_until #864

Merged
merged 4 commits into from
Dec 17, 2021
Merged

Add Clock.sleep_for using sleep_until #864

merged 4 commits into from
Dec 17, 2021

Conversation

sloretz
Copy link
Contributor

@sloretz sloretz commented Dec 13, 2021

Resolves #617
Requires #858

@sloretz sloretz added the enhancement New feature or request label Dec 13, 2021
@sloretz sloretz self-assigned this Dec 13, 2021
Comment on lines +347 to +352
# wait for thread to get inside sleep_for call
time.sleep(0.2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like much a test depending on a race condition.
I would rather not have a test if this is the only way.

By the way, I don't get what's the intention of the test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's not my favorite either. Another way may be to inspect the current frame of the other thread to see what code it's exiting, but I don't think I can implement that robustly in a reasonable time.

#858 (comment)


The ros_time_toggled tests enable or disable ROS time while the method is sleeping. Since that switches from system time to /clock time, it's not clear what amount of time should be slept when that happens. This test makes sure the method wakes up and returns False.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A reasonable alternative is to toggle the clock type in a loop, wait until retval is not None and then assert that retval is False.
The loop could have a certain maximum duration, and we could make it "less busy" be sleeping for a bit between each time we toggle the clock type.

e.g.:

    retval = None

    def run():
        nonlocal retval
        retval = clock.sleep_for(Duration(seconds=999999))

    t = threading.Thread(target=run)
    t.start()

    start = time.now()
    # wait for thread to get inside sleep_for call
    while retval is None and time.now() < start + 0.5:
            clock._set_ros_time_is_active(ros_time_enabled)
            ros_time_enabled = not ros_time_enabled

    assert retval is False
    t.join()

Not really amazing, and it doens't make much sense to have the test parameterized with ros_time_enabled (as we really don't now which of the toggles made the clock stop waiting), but probably a bit more robust.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

anyway, I'm okay with the current approach, but we should track CI the first few days to see if the tests are reliable.

t = threading.Thread(target=run, daemon=True)
t.start()

# wait for thread to get inside sleep_for call
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does is matter if shutdown was called before or after the thread got inside sleep_for?
It seems that it doesn't, in that case I would remove the sleep

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does. This test checks that the sleep method will sleep on an initially valid context and then wake if it was shutdown. If shutdown was called before the thread got inside sleep_for then the method would raise NotInitializedException instead.

Copy link
Member

@ivanpauno ivanpauno Dec 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, I don't have a suggestion of how to improve this one.
I'm not sure if the inconsistency makes sense, as while trying to shutdown asynchronously it would trigger different behaviors depending if the wait_until()/wait_for() call already happened or not.

I think this approach is acceptable for the moment.

@delete-merged-branch delete-merged-branch bot deleted the branch master December 15, 2021 21:00
@sloretz sloretz changed the base branch from rclpy__sleep_until to master December 15, 2021 21:01
Signed-off-by: Shane Loretz <sloretz@openrobotics.org>
Signed-off-by: Shane Loretz <sloretz@openrobotics.org>
Signed-off-by: Shane Loretz <sloretz@openrobotics.org>
Signed-off-by: Shane Loretz <sloretz@openrobotics.org>
@sloretz
Copy link
Contributor Author

sloretz commented Dec 15, 2021

Rebased and incorporated feedback in tests from #858 that also apply here

@sloretz sloretz marked this pull request as ready for review December 15, 2021 22:09
@sloretz
Copy link
Contributor Author

sloretz commented Dec 15, 2021

CI (build: --packages-up-to rclpy test: --packages-select rclpy)

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

Copy link

@j-rivero j-rivero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A minor Nit on the documentation. Looks good to me.

There is value in the comments about how to make the test more robust and I would like not to lose them after merging. Not sure if adding the URLs as comments in the code or opening an issue would help in the future to reach these comments if the test goes flaky at some point. I leave up to you @sloretz.

Thanks for the work!.

@sloretz
Copy link
Contributor Author

sloretz commented Dec 17, 2021

There is value in the comments about how to make the test more robust and I would like not to lose them after merging. Not sure if adding the URLs as comments in the code or opening an issue would help in the future to reach these comments if the test goes flaky at some point. I leave up to you @sloretz.

They're valid comments about the test being potentially flaky. I think the right thing to do is to watch the repeated jobs for failures, especially Windows where IIUC the OS scheduler waits a while between context switches. I looked through the last 3 repeated jobs and I don't see any sleep_until failures, so it's looking good that these won't be flaky either. I linked the very latest if you'd like to click back through them too.

If they become flaky in the future, links to the discussion are always available by doing git blame, looking at the PR number in the title of the commit (github adds it during Squash and merge), and then opening it. An issue doesn't seem to be necessary because the tests aren't showing flaky behavior on CI. I think this is fine as is.

@sloretz
Copy link
Contributor Author

sloretz commented Dec 17, 2021

Resolved PR comments with Jose out-of-band

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add sleep function that respects ROS time
3 participants