[Cloud Security] fix rules flaky test suite #212198
Conversation
Flaky Test Runner Stats: 🟠 Some tests failed. (kibana-flaky-test-suite-runner #7928) [❌] x-pack/test/cloud_security_posture_functional/config.ts: 0/25 tests passed.
Force-pushed from a711770 to 770900c
Flaky Test Runner Stats: 🟠 Some tests failed. (kibana-flaky-test-suite-runner #7930) [❌] x-pack/test/cloud_security_posture_functional/config.ts: 0/25 tests passed.
Force-pushed from 770900c to 97208cc
Flaky Test Runner Stats: 🟠 Some tests failed. (kibana-flaky-test-suite-runner #7931) [❌] x-pack/test/cloud_security_posture_functional/config.ts: 0/25 tests passed.
Force-pushed from 97208cc to d8021a3
Flaky Test Runner Stats: 🟠 Some tests failed. (kibana-flaky-test-suite-runner #7932) [❌] x-pack/test/cloud_security_posture_functional/config.ts: 0/25 tests passed.
Force-pushed from d8021a3 to 22a032c
Flaky Test Runner Stats: 🟠 Some tests failed. (kibana-flaky-test-suite-runner #7933) [❌] x-pack/test/cloud_security_posture_functional/config.ts: 0/25 tests passed.
Flaky Test Runner Stats: 🟠 Some tests failed. (kibana-flaky-test-suite-runner #7934) [❌] x-pack/test/cloud_security_posture_functional/config.ts: 0/25 tests passed.
Flaky Test Runner Stats: 🎉 All tests passed! (kibana-flaky-test-suite-runner #7936) [✅] x-pack/test/cloud_security_posture_functional/config.ts: 25/25 tests passed.
Force-pushed from e485fb0 to 45b7c37
Flaky Test Runner Stats: 🎉 All tests passed! (kibana-flaky-test-suite-runner #7937) [✅] x-pack/test/cloud_security_posture_functional/config.ts: 25/25 tests passed.
Pinging @elastic/kibana-cloud-security-posture (Team:Cloud Security)
The Rules page will be tough to keep from being flaky. There are quite a few places on the Rules page where we can improve performance; for example, fetching all the rules to show the updated count could be changed to use an API that returns the counts, but that would require significant changes. This is not part of your task.
🤞 this works for a while.
Interesting, maybe it can be a separate task to optimize performance.
Hopefully we can keep it stable so we have coverage for that part.
Yes, there are some tests that should be migrated to unit / integration; here are some examples:
it('Clicking the posture score button leads to the dashboard', async () => {
it('Shows integrations count when there are findings', async () => {
it('Clicking the integrations counter button leads to the integration page', async () => {
it('Shows the failed findings counter when there are findings', async () => {
I think we could open a follow-up ticket for that?
Also, judging by the number of tests, there will still be several tests even after migrating some to unit tests. I think we should start considering splitting tests into subfolders, i.e. instead of having everything related to the rules page in a single file, we could have one rules folder with files for the flyout, counters, and bulk actions. This could help ensure that we don't lose coverage of an entire functionality because of a single flaky test.
await retry.try(async () => {
  const integrationsEvaluatedButton = await testSubjects.find(
    'rules-counters-integrations-evaluated-button'
  );
  // Check that href exists and is not empty
  const href = await integrationsEvaluatedButton.getAttribute('href');
  if (!href) {
    throw new Error('Integration link is not ready yet - href is empty');
  }
  await integrationsEvaluatedButton.click();
});
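As an aside, the retry pattern used here can be sketched generically. The following is a hypothetical stand-alone helper for illustration only, not the real FTR retry service: it re-runs an async action until it stops throwing or the attempt budget is exhausted.

```typescript
// Hypothetical generic sketch of the retry.try pattern (illustrative only).
async function retryTry<T>(
  fn: () => Promise<T>,
  attempts = 5,
  delayMs = 100
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err; // remember the failure and retry after a short pause
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}

// Example: the action succeeds only on the third attempt, mimicking an
// href that stays empty until the page finishes rendering.
let calls = 0;
retryTry(async () => {
  calls += 1;
  if (calls < 3) throw new Error('Integration link is not ready yet - href is empty');
  return 'clicked';
}, 5, 10).then((result) => console.log(result, calls)); // prints "clicked 3"
```

The key property is that a transient failure (empty href, element not yet attached) is swallowed and retried instead of failing the test immediately.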
👍
this.tags(['cloud_security_posture_rules_page']);
let rule: typeof pageObjects.rule;
let findings: typeof pageObjects.findings;
let agentPolicyId: string;

- beforeEach(async () => {
+ before(async () => {
While our general approach is to ensure a clean state for each test, specifically in the case of FTRs this can sometimes lead to flakiness. By maintaining a clean state for each test suite, we ensure that each test runs independently, and if a failure occurs, it’s easier to pinpoint the root cause.
After this change, if one or more tests fail, it could be due to a lack of proper cleanup from a previous test. This creates additional complexity in identifying the true source of failure and can lead to flakiness as well.
Not sure if this aligns with the overall strategy, even though I'm personally fine with it for FTR in particular.
It would be beneficial to get input from others who have a stronger perspective on the matter. @maxcold @kfirpeled @seanrathier @opauloh
@JordanSh From what I saw, it didn't affect the tests when changing it to a single initialization.
Yeah, but that depends on the tests. In this particular case we probably didn't have a test suite that actually relies on a clean state in order to function, but if we had, it would fail now.
I was mentioning this because we previously had a discussion on these two approaches, and I'm not sure what the final decision was. I don't have a strong opinion on this subject, I just wanted to bring some more attention to it.
While our general approach is to ensure a clean state for each test
IMHO, this raises an important question about E2E tests. While we reset mocks for unit tests, should we also clean up the environment for E2E tests?
On one hand, cleaning up data ensures we're always testing with a clean slate, which is beneficial for the test's integrity. On the other hand, the most common user experience involves interacting with pre-existing data, so we need to consider how this affects the realism of our tests.
All of that said, none of this would hold if FTRs were running in parallel, which they are not AFAIK.
Agree that in some cases you need to reset some conditions/data before each test (I added closing of the flyout after some tests and resetting the enable/disable status), but in this case the data is shared across test cases and, from my perspective, should be initialized once.
IMO, having initializers in beforeEach and cleanups in afterEach is the safest approach. Some exceptions can be made for expensive initializers when applicable; however, I don't think that exception applies to this entire test file. Some of the tests actually mutate the saved objects (enabling/disabling rules, creating detection rules), so it violates the test automation principle that tests should not pass or fail depending on their execution order.
I'm attaching here a video of a portion of the tests running:
Screen.Recording.2025-02-24.at.4.03.50.PM.mov
We can see the initial state is at a 33% posture score, as the first test asserts: expect((await postureScoreCounter.getVisibleText()).includes('33%')).to.be(true);. At some point, in the "Shows the disabled rules count" test, some rules are disabled, bringing the posture score down to 0%.
If those tests were swapped, they would already start to fail. Then, as we can see in the recording, once the 'Shows empty state when there are no findings' test runs, it removes all the findings at a certain point by calling await findings.index.remove();. That means from that test onward there is no posture score any more, as the counters rely on findings (we can see the "get started with KSPM" component start to appear, changing the desired initial state).
So the entire test suite is passing now, but if the order of some tests changes it breaks, and if someone adds a new test without knowing what state the application is in, it can break. That doesn't seem very reliable IMO.
I think we can start considering some actions that can help with the maintenance of the test automation for the rules feature:
- Splitting those tests into separate files (the more tests we have, the greater the potential for flakiness, and splitting helps prevent losing the entire feature's coverage if one test fails)
- We can keep the integration installation in the before hook and its cleanup in after (as it's not modified by the tests). However, in beforeEach and afterEach we should ensure the findings are added and cleaned up (as they are modified by parts of the tests). Also, instead of calling kibanaServer.savedObjects.cleanStandardList(), which is very expensive, we can add a utility that only deletes the saved objects that store the rules (csp-internal-settings), as that would ensure a clean rules-related state on every run
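A minimal sketch of that suggested hook layout, using synchronous stand-ins purely for illustration (the step names such as 'install integration' and the csp-internal-settings cleanup step are hypothetical, not real FTR helpers):

```typescript
// Sketch of the suggested split: expensive, test-immutable setup runs once
// per suite; mutable state (findings, rule saved objects) is reset around
// every single test.
type Hook = () => void;

const log: string[] = [];
const step = (name: string): Hook => () => log.push(name);

// Once per suite: the integration install is never modified by the tests.
const before: Hook = step('install integration');
const after: Hook = step('remove integration');

// Around every test: findings and rule saved objects are mutated by tests,
// so re-seed findings and delete only the rule-related saved objects
// (cheaper than a full cleanStandardList()).
const beforeEach: Hook = step('add findings');
const afterEach: Hook = () => {
  log.push('remove findings');
  log.push('clean csp-internal-settings saved objects');
};

function runSuite(tests: Hook[]): void {
  before();
  for (const test of tests) {
    beforeEach();
    test();
    afterEach();
  }
  after();
}

runSuite([step('test: disabled rules count'), step('test: empty state')]);
console.log(log.join('\n'));
```

With this shape, every test starts from the same rules/findings state regardless of execution order, while the expensive integration install is still paid only once per suite.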
Agree with Paulo overall. What has also happened to us quite often is that one test case starts being flaky, the appex team skips it for us because it is blocking other teams, other tests that depend on the results of the skipped test start to fail, and the platform team ends up skipping the whole suite in the end.
The dependency between tests in e2e is a tricky topic; it's also fine to have fewer atomic tests and test whole flows in one test case instead, to avoid dependency between test cases.
Flaky Test Runner Stats: 🎉 All tests passed! (kibana-flaky-test-suite-runner #7944) [✅] x-pack/test/cloud_security_posture_functional/config.ts: 25/25 tests passed.
Flaky Test Runner Stats: 🎉 All tests passed! (kibana-flaky-test-suite-runner #7945) [✅] x-pack/test/cloud_security_posture_functional/config.ts: 25/25 tests passed.
@maxcold @JordanSh @opauloh @seanrathier thanks everyone, I made some changes to this test suite:
Thanks for addressing the changes and splitting it, I just have one more question:
await kibanaServer.savedObjects.clean({
  types: [
    'ingest-agent-policies',
    'fleet-agent-policies',
    'ingest-package-policies',
    'fleet-package-policies',
  ],
});
Nice! Can we also include the csp-internal-settings saved object?
Force-pushed from a6a1ddd to aab6485
Flaky Test Runner Stats: 🎉 All tests passed! (kibana-flaky-test-suite-runner #7953) [✅] x-pack/test/cloud_security_posture_functional/config.ts: 25/25 tests passed.
Force-pushed from 2c6de46 to 235d386
Flaky Test Runner Stats: 🟠 Some tests failed. (kibana-flaky-test-suite-runner #7955) [❌] x-pack/test/cloud_security_posture_functional/config.ts: 0/25 tests passed.
Flaky Test Runner Stats: 🎉 All tests passed! (kibana-flaky-test-suite-runner #7954) [✅] x-pack/test/cloud_security_posture_functional/config.ts: 25/25 tests passed.
Flaky Test Runner Stats: 🎉 All tests passed! (kibana-flaky-test-suite-runner #7956) [✅] x-pack/test/cloud_security_posture_functional/config.ts: 25/25 tests passed.
…d table sections updated before and after hooks
Force-pushed from 235d386 to a0a60fa
💚 Build Succeeded
Starting backport for target branches: 9.0 https://github.com/elastic/kibana/actions/runs/13541347379
## Summary This PR fixes the flakiness of the rules page test suite - elastic#178413 (cherry picked from commit 14e7f60)
💚 All backports created successfully
Note: Successful backport PRs will be merged automatically after passing CI. Questions ?Please refer to the Backport tool documentation |
# Backport
This will backport the following commits from `main` to `9.0`:
- [Cloud Security] fix rules flaky test suite (#212198)

### Questions ?
Please refer to the [Backport tool documentation](https://github.com/sorenlouv/backport)

Co-authored-by: Alex Prozorov <alex.prozorov@elastic.co>
Summary
This PR fixes the flakiness of the rules page test suite - #178413
Issues that were handled:
1. Removing the setup of the Fleet Server before every test, which caused flakiness.
2. Fixing race conditions.
3. Using the retry.try function to re-execute actions in places where flakiness was observed due to not waiting long enough before performing the action.
Checklist
Reviewers should verify this PR satisfies this list as well.
The correct release_note:* label is applied per the guidelines