[Cloud Security] fix rules flaky test suite #212198
Conversation
Flaky Test Runner Stats: 🟠 Some tests failed. (kibana-flaky-test-suite-runner #7928) [❌] x-pack/test/cloud_security_posture_functional/config.ts: 0/25 tests passed.
Force-pushed from a711770 to 770900c
Flaky Test Runner Stats: 🟠 Some tests failed. (kibana-flaky-test-suite-runner #7930) [❌] x-pack/test/cloud_security_posture_functional/config.ts: 0/25 tests passed.
Force-pushed from 770900c to 97208cc
Flaky Test Runner Stats: 🟠 Some tests failed. (kibana-flaky-test-suite-runner #7931) [❌] x-pack/test/cloud_security_posture_functional/config.ts: 0/25 tests passed.
Force-pushed from 97208cc to d8021a3
Flaky Test Runner Stats: 🟠 Some tests failed. (kibana-flaky-test-suite-runner #7932) [❌] x-pack/test/cloud_security_posture_functional/config.ts: 0/25 tests passed.
Force-pushed from d8021a3 to 22a032c
Flaky Test Runner Stats: 🟠 Some tests failed. (kibana-flaky-test-suite-runner #7933) [❌] x-pack/test/cloud_security_posture_functional/config.ts: 0/25 tests passed.
Flaky Test Runner Stats: 🟠 Some tests failed. (kibana-flaky-test-suite-runner #7934) [❌] x-pack/test/cloud_security_posture_functional/config.ts: 0/25 tests passed.
Flaky Test Runner Stats: 🎉 All tests passed! (kibana-flaky-test-suite-runner #7936) [✅] x-pack/test/cloud_security_posture_functional/config.ts: 25/25 tests passed.
Force-pushed from e485fb0 to 45b7c37
Flaky Test Runner Stats: 🎉 All tests passed! (kibana-flaky-test-suite-runner #7937) [✅] x-pack/test/cloud_security_posture_functional/config.ts: 25/25 tests passed.
Pinging @elastic/kibana-cloud-security-posture (Team:Cloud Security)
The Rules page will be tough to keep from being flaky. There are quite a few places on the Rules page where we can improve performance; for example, fetching all the rules to show the updated count could be changed to use an API that returns the counts, but that would require significant changes. This is not part of your task.
🤞 this works for a while.
Interesting, maybe it can be a separate task to optimize performance.
Hopefully we can keep it stable so we have coverage for that part.
Yes, there are some tests that should be migrated to unit / integration; here are some examples:
it('Clicking the posture score button leads to the dashboard', async () => {
it('Shows integrations count when there are findings', async () => {
it('Clicking the integrations counter button leads to the integration page', async () => {
it('Shows the failed findings counter when there are findings', async () => {
I think we could open a follow-up ticket for that?
Also, judging by the number of tests, there will still be several tests even after migrating some to unit tests. I think we should start considering splitting tests into subfolders, i.e. instead of having everything related to the rules page in a single file, we could have one rules folder with files for the flyout, counters, and bulk actions. This could help ensure that we don't lose coverage of an entire functionality because of a single flaky test.
await retry.try(async () => {
  const integrationsEvaluatedButton = await testSubjects.find(
    'rules-counters-integrations-evaluated-button'
  );
  // Check that href exists and is not empty
  const href = await integrationsEvaluatedButton.getAttribute('href');
  if (!href) {
    throw new Error('Integration link is not ready yet - href is empty');
  }
  await integrationsEvaluatedButton.click();
});
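As an aside, the retry pattern used here can be sketched generically. The following is a hypothetical stand-alone helper for illustration only, not the real FTR retry service: it re-runs an async action until it stops throwing or the attempt budget is exhausted.

```typescript
// Hypothetical generic sketch of the retry.try pattern (illustrative only).
async function retryTry<T>(
  fn: () => Promise<T>,
  attempts = 5,
  delayMs = 100
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err; // remember the failure and retry after a short pause
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}

// Example: the action succeeds only on the third attempt, mimicking an
// href that stays empty until the page finishes rendering.
let calls = 0;
retryTry(async () => {
  calls += 1;
  if (calls < 3) throw new Error('Integration link is not ready yet - href is empty');
  return 'clicked';
}, 5, 10).then((result) => console.log(result, calls)); // prints "clicked 3"
```

The key property is that a transient failure (empty href, element not yet attached) is swallowed and retried instead of failing the test immediately.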
👍
this.tags(['cloud_security_posture_rules_page']);
let rule: typeof pageObjects.rule;
let findings: typeof pageObjects.findings;
let agentPolicyId: string;

- beforeEach(async () => {
+ before(async () => {
While our general approach is to ensure a clean state for each test, specifically in the case of FTRs this can sometimes lead to flakiness. By maintaining a clean state for each test suite, we ensure that each test runs independently, and if a failure occurs, it’s easier to pinpoint the root cause.
After this change, if one or more tests fail, it could be due to a lack of proper cleanup from a previous test. This creates additional complexity in identifying the true source of failure and can lead to flakiness as well.
Not sure if this aligns with the overall strategy, even though I'm personally fine with it for FTR in particular.
It would be beneficial to get input from others who have a stronger perspective on the matter. @maxcold @kfirpeled @seanrathier @opauloh
@JordanSh From what I saw, it didn't affect the tests when changing it to a single initialization.
Yeah, but that depends on the tests. In this particular case we probably didn't have a test suite that actually relies on a clean state in order to function, but if we had, it would fail now.
I was mentioning this because we previously had a discussion on these two approaches, and I'm not sure what the final decision was. I don't have a strong opinion on this subject, I just wanted to bring some more attention to it.
While our general approach is to ensure a clean state for each test
IMHO, this raises an important question about E2E tests. While we reset mocks for unit tests, should we also clean up the environment for E2E tests?
On one hand, cleaning up data ensures we're always testing with a clean slate, which is beneficial for the test's integrity. On the other hand, the most common user experience involves interacting with pre-existing data, so we need to consider how this affects the realism of our tests.
All of that said, none of this would hold if FTRs were running in parallel, which they are not AFAIK.
Agree that in some cases you need to reset some conditions/data before each test (I added closing of the flyout after some tests and resetting the enable/disable status), but in this case the data is shared across test cases and, from my perspective, should be initialized once.
IMO, having initializers in beforeEach and cleanups in afterEach is the safest approach. Some exceptions can be made for expensive initializers when applicable; however, I don't think that exception applies to this entire test file. Some of the tests actually mutate the saved objects (enabling/disabling rules, creating detection rules), so it violates the test automation principle that tests should not pass or fail depending on their execution order.
I'm attaching here a video of a portion of the tests running:
Screen.Recording.2025-02-24.at.4.03.50.PM.mov
We can see the initial state is at a 33% posture score, as the first test asserts: expect((await postureScoreCounter.getVisibleText()).includes('33%')).to.be(true);. At some point, in the "Shows the disabled rules count" test, some rules are disabled, bringing the posture score down to 0%.
If those tests were swapped, they would already start to fail. Then, as we can see in the recording, once the 'Shows empty state when there are no findings' test runs, it removes all the findings at a certain point by calling await findings.index.remove();. That means from that test onward there is no posture score any more, as the counters rely on findings (we can see the "get started with KSPM" component start to appear, changing the desired initial state).
So the entire test suite is passing now, but if the order of some tests changes it breaks, and if someone adds a new test without knowing what state the application is in, it can break. That doesn't seem very reliable IMO.
I think we can start considering some actions that can help with the maintenance of the test automation for the rules feature:
- Splitting those tests into separate files (the more tests we have, the greater the potential for flakiness, and splitting helps prevent losing the entire feature's coverage if one test fails)
- We can keep the integration installation in the before hook and its cleanup in after (as it's not modified by the tests). However, in beforeEach and afterEach we should ensure the findings are added and cleaned up (as they are modified by parts of the tests). Also, instead of calling kibanaServer.savedObjects.cleanStandardList(), which is very expensive, we can add a utility that only deletes the saved objects that store the rules (csp-internal-settings), as that would ensure a clean rules-related state on every run
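A minimal sketch of that suggested hook layout, using synchronous stand-ins purely for illustration (the step names such as 'install integration' and the csp-internal-settings cleanup step are hypothetical, not real FTR helpers):

```typescript
// Sketch of the suggested split: expensive, test-immutable setup runs once
// per suite; mutable state (findings, rule saved objects) is reset around
// every single test.
type Hook = () => void;

const log: string[] = [];
const step = (name: string): Hook => () => log.push(name);

// Once per suite: the integration install is never modified by the tests.
const before: Hook = step('install integration');
const after: Hook = step('remove integration');

// Around every test: findings and rule saved objects are mutated by tests,
// so re-seed findings and delete only the rule-related saved objects
// (cheaper than a full cleanStandardList()).
const beforeEach: Hook = step('add findings');
const afterEach: Hook = () => {
  log.push('remove findings');
  log.push('clean csp-internal-settings saved objects');
};

function runSuite(tests: Hook[]): void {
  before();
  for (const test of tests) {
    beforeEach();
    test();
    afterEach();
  }
  after();
}

runSuite([step('test: disabled rules count'), step('test: empty state')]);
console.log(log.join('\n'));
```

With this shape, every test starts from the same rules/findings state regardless of execution order, while the expensive integration install is still paid only once per suite.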
Agree with Paulo overall. What has also happened to us quite often is that one test case starts being flaky, the appex team skips it for us because it is blocking other teams, other tests that depend on the results of the skipped test start to fail, and the platform team ends up skipping the whole suite in the end.
The dependency between tests in e2e is a tricky topic; it's also fine to have fewer atomic tests and test whole flows in one test case instead, to avoid dependency between test cases.
Flaky Test Runner Stats: 🎉 All tests passed! (kibana-flaky-test-suite-runner #7944) [✅] x-pack/test/cloud_security_posture_functional/config.ts: 25/25 tests passed.
Flaky Test Runner Stats: 🎉 All tests passed! (kibana-flaky-test-suite-runner #7945) [✅] x-pack/test/cloud_security_posture_functional/config.ts: 25/25 tests passed.
@maxcold @JordanSh @opauloh @seanrathier thanks everyone, I made some changes to this test suite:
Thanks for addressing the changes and splitting it, I just have one more question:
await kibanaServer.savedObjects.clean({
  types: [
    'ingest-agent-policies',
    'fleet-agent-policies',
    'ingest-package-policies',
    'fleet-package-policies',
  ],
});
Nice! Can we also include the csp-internal-settings saved object?
Force-pushed from a6a1ddd to aab6485
Flaky Test Runner Stats: 🎉 All tests passed! (kibana-flaky-test-suite-runner #7953) [✅] x-pack/test/cloud_security_posture_functional/config.ts: 25/25 tests passed.
Force-pushed from 2c6de46 to 235d386
Flaky Test Runner Stats: 🟠 Some tests failed. (kibana-flaky-test-suite-runner #7955) [❌] x-pack/test/cloud_security_posture_functional/config.ts: 0/25 tests passed.
Flaky Test Runner Stats: 🎉 All tests passed! (kibana-flaky-test-suite-runner #7954) [✅] x-pack/test/cloud_security_posture_functional/config.ts: 25/25 tests passed.
Flaky Test Runner Stats: 🎉 All tests passed! (kibana-flaky-test-suite-runner #7956) [✅] x-pack/test/cloud_security_posture_functional/config.ts: 25/25 tests passed.
…d table sections updated before and after hooks
Force-pushed from 235d386 to a0a60fa
💚 Build Succeeded
Starting backport for target branches: 9.0 https://github.com/elastic/kibana/actions/runs/13541347379
## Summary This PR fixes the flakiness of the rules page test suite - elastic#178413 (cherry picked from commit 14e7f60)
💚 All backports created successfully
Note: Successful backport PRs will be merged automatically after passing CI. Questions ?Please refer to the Backport tool documentation |
# Backport
This will backport the following commits from `main` to `9.0`:
- [Cloud Security] fix rules flaky test suite (#212198)

### Questions ?
Please refer to the [Backport tool documentation](https://github.com/sorenlouv/backport)

Co-authored-by: Alex Prozorov <alex.prozorov@elastic.co>
Summary
This PR fixes the flakiness of the rules page test suite - #178413
Issues that were handled:
1. Removing the setup of the Fleet Server before every test, which caused flakiness.
2. Fixing race conditions.
3. Using the retry.try function to re-execute actions in places where flakiness was observed due to not waiting long enough before performing the action.
Checklist
Reviewers should verify this PR satisfies this list as well.
The correct release_note:* label is applied per the guidelines