
Switch to CLOCK_BOOTTIME and friends to improve accuracy. #58

Closed
tobz opened this issue Sep 23, 2021 · 2 comments


tobz commented Sep 23, 2021

Per the discussion happening on rust-lang/rust#88714, there's a meaningful difference between CLOCK_BOOTTIME and CLOCK_MONOTONIC in how they track time across system suspends on Linux: CLOCK_MONOTONIC stops ticking during suspend, while CLOCK_BOOTTIME does not. This raises two problems for quanta:

Monotonic mode

When invariant TSC support is not detected, we fall back to the "monotonic" mode, where we query the OS clock directly. That's fine on its own, but we're querying with CLOCK_MONOTONIC (and its equivalents on other platforms), which leaves us open to the exact problem described in the issue above.
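For illustration, here's a minimal sketch (Linux-only, using the libc crate; not quanta's actual code) of reading both clocks back to back. Across a suspend/resume cycle, the CLOCK_BOOTTIME reading pulls ahead of CLOCK_MONOTONIC by however long the machine was suspended:

```rust
use std::mem::MaybeUninit;

fn read_clock_ns(clock_id: libc::clockid_t) -> u64 {
    let mut ts = MaybeUninit::<libc::timespec>::uninit();
    // SAFETY: `clock_gettime` fully initializes `ts` on success.
    let ret = unsafe { libc::clock_gettime(clock_id, ts.as_mut_ptr()) };
    assert_eq!(ret, 0, "clock_gettime failed");
    let ts = unsafe { ts.assume_init() };
    (ts.tv_sec as u64) * 1_000_000_000 + (ts.tv_nsec as u64)
}

fn main() {
    // On a freshly booted machine these are nearly equal; after a suspend,
    // the boottime reading is ahead by the time spent suspended.
    let mono = read_clock_ns(libc::CLOCK_MONOTONIC);
    let boot = read_clock_ns(libc::CLOCK_BOOTTIME);
    println!("monotonic: {mono} ns, boottime: {boot} ns");
}
```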

Counter (TSC) mode

While I have not fully traced whether this matters, there's a potential scenario where CLOCK_MONOTONIC stops ticking during lower CPU power states, such that our reference drifts on every pass through the calibration loop. While the invariant TSC should be guaranteed to tick at a constant rate -- recent Intel manuals specifically state that "the invariant TSC will run at a constant rate in all ACPI P-, C-, and T-states" -- this is moot if our initial reference/source calibration is off, since we need that calibration to convert TSC cycles into real time units.
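To make the drift concern concrete, here's a rough sketch of the calibration idea -- not quanta's actual loop -- that samples the reference clock and the TSC across a fixed window to derive a cycles-to-time ratio. If the reference stalls partway through the window, the ratio comes out wrong, and every later TSC-to-time conversion inherits that error:

```rust
#[cfg(target_arch = "x86_64")]
fn calibrate_tsc_ratio() -> f64 {
    use std::arch::x86_64::_rdtsc;
    use std::time::{Duration, Instant};

    let ref_start = Instant::now();
    // SAFETY: RDTSC has no memory-safety preconditions.
    let tsc_start = unsafe { _rdtsc() };

    // Spin for a fixed window; a longer window reduces measurement noise.
    while ref_start.elapsed() < Duration::from_millis(10) {}

    let ref_ns = ref_start.elapsed().as_nanos() as f64;
    let tsc_cycles = (unsafe { _rdtsc() } - tsc_start) as f64;

    // Nanoseconds per TSC cycle, used later to turn raw cycle deltas into
    // time deltas. If the reference stopped ticking mid-window, this ratio
    // is silently wrong.
    ref_ns / tsc_cycles
}
```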

At any rate, switching shouldn't do anything but improve accuracy. That said, the issue also raises concerns about when support for CLOCK_BOOTTIME was introduced and on which platforms it matters, so we likely need to wait for that PR to shake out to make sure we have a good example of where we'll need to make our changes.


tobz commented Sep 24, 2021

Thinking about this more...

The issue is specific to when the system is suspended, which means being in a "sleep state" -- in ACPI parlance, S1-S5. The aforementioned snippet from the Intel developer's manual states that the TSC runs at a constant rate in P-, C-, and T-states, but those are mutually exclusive with sleep states.

Thus, if quanta is in counter/TSC mode, it cannot correctly handle the transition from S0 (essentially, not suspended) to S1-S5 and back to S0. Regardless of whether our calibration is correct, the TSC simply stops, so we're going to lose time.

Given the performance goals of quanta, there are likely two things we should do here:

  • implement the CLOCK_BOOTTIME/CLOCK_MONOTONIC cascaded logic mentioned in the stdlib PR (sketched below)
  • document that quanta is not guaranteed to maintain wall-time through system suspend

The first just makes sense: if we can, we might as well make the monotonic reference clock track wall time as closely as possible. The second also just makes sense: quanta is about performance, but time is important, so we need to call out this limitation.
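As a sketch of what the cascade might look like (assuming, per the stdlib PR, that kernels without CLOCK_BOOTTIME support reject it with an error), we'd try CLOCK_BOOTTIME first and fall back to CLOCK_MONOTONIC, again using the libc crate:

```rust
fn reference_now_ns() -> u64 {
    let mut ts = libc::timespec { tv_sec: 0, tv_nsec: 0 };
    // CLOCK_BOOTTIME (Linux 2.6.39+) keeps counting across suspend.
    let mut ret = unsafe { libc::clock_gettime(libc::CLOCK_BOOTTIME, &mut ts) };
    if ret != 0 {
        // Older kernels: fall back to CLOCK_MONOTONIC, which any kernel we
        // could plausibly run on provides.
        ret = unsafe { libc::clock_gettime(libc::CLOCK_MONOTONIC, &mut ts) };
    }
    assert_eq!(ret, 0, "clock_gettime failed for both clock sources");
    (ts.tv_sec as u64) * 1_000_000_000 + (ts.tv_nsec as u64)
}
```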

There's likely a future change we could make to allow using only the monotonic reference clock, for users who want quanta for its ability to mock time but also need that time to advance with wall time.


tobz commented Dec 25, 2021

Thinking more about this still, I don't think switching to CLOCK_BOOTTIME (or equivalents for other platforms) is the correct answer.

Consider our thoughts from the last comment: while CLOCK_MONOTONIC does not advance during suspend, neither should the TSC. Further still, programs should not be calibrating at the very instant a machine is going to sleep or waking up... and even if one were, the loss of TSC or CLOCK_MONOTONIC advancement would skew that pass of the calibration loop and produce an obviously bad sample that would get averaged out.

The earlier thought -- documenting that quanta does not track time across suspends, and that callers should exercise caution when taking deltas between Instants -- is still the correct decision, I believe. I would also add another item to that list, though: maybe we should make Clock avoid using the TSC if a proper calibration cannot be achieved. Using the OS-provided timing facilities is still very fast in most cases, and trading accuracy for speed is not a good trade-off unless it's made explicitly, i.e. by using Clock::recent.
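As a hypothetical sketch of that "avoid the TSC" idea, reusing the calibrate_tsc_ratio sketch from the first comment: run several calibration passes and only trust the TSC if the measured ratios agree within a tight tolerance. The 0.01% bound here is illustrative, not a quanta constant:

```rust
enum ClockSource {
    Tsc { ns_per_cycle: f64 },
    Os,
}

fn choose_source() -> ClockSource {
    // `calibrate_tsc_ratio` is the earlier sketch: one pass of deriving
    // nanoseconds-per-cycle from the reference clock and the TSC.
    let ratios: Vec<f64> = (0..5).map(|_| calibrate_tsc_ratio()).collect();
    let min = ratios.iter().copied().fold(f64::INFINITY, f64::min);
    let max = ratios.iter().copied().fold(f64::NEG_INFINITY, f64::max);

    // If repeated passes disagree by more than 0.01%, the calibration is
    // suspect; prefer correctness and stay on the OS clock.
    if (max - min) / min < 1e-4 {
        ClockSource::Tsc { ns_per_cycle: min }
    } else {
        ClockSource::Os
    }
}
```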
