Fix timing issues in subscription tests #76

rjwills28 · 2023-06-21T14:37:33Z

This PR fixes the issue described in #64.

There is a timing issue relating to when you start a subscription to Coniql with respect to where the PV is in its update cycle (updating at 2Hz). Occasionally this can mean that you don't collect as many results in the time period as you expect, which causes the tests to fail.

Instead, I have changed these types of tests so that they collect 3 samples and check that no updates were missed (i.e. the results increment by 1 each time). I have also removed the need to ensure that the updating PV is reset to 0 for each test, which can also lead to some timing problems.
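As a rough illustration of the check described above (not the code from this PR), the "no updates were missed" condition only needs the step between neighbouring samples, so the PV's starting value is irrelevant. The helper name `updates_are_consecutive` is hypothetical:

```python
from typing import Sequence


def updates_are_consecutive(samples: Sequence[int]) -> bool:
    """Return True if each sample is exactly one more than the previous one."""
    # Compare every sample with its successor; the first value is never inspected
    # on its own, so the PV does not have to be reset to 0 before the test.
    return all(later - earlier == 1 for earlier, later in zip(samples, samples[1:]))


assert updates_are_consecutive([7, 8, 9])
assert not updates_are_consecutive([7, 9, 10])  # an update was missed
```

Checking the step rather than the count of results in a fixed time window is what removes the dependence on where the PV is in its 2Hz cycle when the subscription starts.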

I've rerun the tests multiple times and no longer see failures, but as this was always an intermittent problem we should still be aware of it.

AlexanderWells-diamond

Thanks for investigating this, good spot that the failure could happen with either protocol. This change is definitely an improvement on the current intermittent failures we see.

I do see one problem: the two modified tests will never terminate if something goes wrong with either the IOC or the subscriptions such that we never receive data. I'd be inclined to add back in the timer, with a large timeout value (20 seconds or so), and a pytest.fail() if it ever exceeds the timeout.

tests/conftest.py

tests/test_aiohttp.py

tests/test_caplugin.py

Also add timeout on blocking subscriptions.

codecov · 2023-06-23T09:23:58Z

Codecov Report

Merging #76 (a14e0fa) into main (8f9c7de) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main      #76   +/-   ##
=======================================
  Coverage   93.30%   93.30%           
=======================================
  Files          10       10           
  Lines         807      807           
=======================================
  Hits          753      753           
  Misses         54       54

rjwills28 · 2023-06-23T09:27:59Z

Adding a timeout for the subscription responses is a good idea. While doing this I discovered that we were not previously accounting for this case: the awaits for the responses are blocking, so we would sit here forever waiting if no data is sent from the websocket. I have added an await_for to catch this case. However, the old ws protocol constantly sends 'keep alive' messages, so we also need to implement a timeout ourselves as suggested. Comments have been added to the code to describe why both are necessary.

AlexanderWells-diamond · 2023-06-27T11:23:25Z

I was running test_subscribe_pv on my PC and noticed some odd behaviour. I deleted the ioc parameter, so there are no PVs at all, and yet somehow test_subscribe_pv[graphql_transport_ws_protocol] PASSED was reported! The other test did fail as expected: test_subscribe_pv[graphql_ws_protocol] FAILED

Strangely, if I just run this one test in isolation (i.e. pytest tests/test_aiohttp.py::test_subscribe_pv) it correctly fails both, with different errors.

I'm concerned that we have some leakage of test state somewhere - perhaps the IOCs aren't shutting down fast enough, and future tests are catching the IOC meant for the previous test? I note that our PV prefix is currently global across all tests in a given run.

Sorry for finding more issues. If you feel this problem deserves its own PR I can shift this to an Issue.

rjwills28 · 2023-06-27T15:57:28Z

The problem is that we run the ioc as a 'module' fixture, which means that the ioc clean up does not get called until the module has completed. So even though the ioc is not included as part of test_subscribe_pv, the ioc is still running and so the tests can pass. Interestingly, I find that both protocols are passing in that test. The only way I can think of to fix this would be to switch the ioc to 'function', but that means that we would have to start the ioc for each test, which we were previously trying to avoid as it slows down the tests.

AlexanderWells-diamond · 2023-06-28T07:25:45Z

Ah, I hadn't spotted that. In that case, I don't entirely understand why my tests fail as the IOC should still be running!

Either way, this is better than it was so I think we can merge at this point.

AlexanderWells-diamond

Looks good!

Fix timing issues in subscription tests

f29ce24

rjwills28 requested review from aawdls and AlexanderWells-diamond June 21, 2023 14:37

AlexanderWells-diamond requested changes Jun 22, 2023

Improve reliability of timing sensitive tests

a14e0fa

Also add timeout on blocking subscriptions.

rjwills28 requested a review from AlexanderWells-diamond June 26, 2023 15:43

AlexanderWells-diamond approved these changes Jun 28, 2023

rjwills28 merged commit 91bf0c9 into main Jun 28, 2023

rjwills28 deleted the fix_intermittent_test_failures branch June 28, 2023 08:09

rjwills28 mentioned this pull request Jun 30, 2023

Intermittent CI test failures #64

Closed
