[Security Solution][Detections] Ignore interim results in ML Rule anomalies query #90316
We were incorrectly including records with is_interim: true in our query, which led to false positive signals if the rule executed while an anomaly's score was (temporarily) above the specified threshold but then dipped below it once the result was finalized.
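For readers unfamiliar with the anomalies query, here is a minimal sketch of the kind of filter list this change affects. The helper name `buildAnomalyFilters`, its parameters, and the `timestamp` range clause are assumptions made for illustration and are not the actual Kibana code; only `is_interim` and `record_score` are named in this PR.

```typescript
// Hypothetical helper: builds the `filter` clauses of a bool query over ML anomaly records.
type Filter = Record<string, unknown>;

function buildAnomalyFilters(threshold: number, earliestMs: number, latestMs: number): Filter[] {
  return [
    // Only anomaly records inside the rule's time window (field name assumed).
    { range: { timestamp: { gte: earliestMs, lte: latestMs, format: 'epoch_millis' } } },
    // Only records at or above the rule's score threshold.
    { range: { record_score: { gte: threshold } } },
    // The fix: skip interim results, whose scores may still change.
    { term: { is_interim: false } },
  ];
}

const filters = buildAnomalyFilters(50, 0, 1000);
console.log(JSON.stringify(filters, null, 2));
```

Because the `term` clause sits in the filter list alongside the score threshold, an interim record is excluded even when its provisional score is above the threshold.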
Pinging @elastic/security-detections-response (Team:Detections and Resp)
LGTM, The spice must flow!
LGTM as well! Thanks for seeing this one through @rylnd! And all the debugging, and even squeezing some type fixes in here as well too! 😉 🌶️ ⏳
Gonna sneak this blog post about bucket spans in here for anyone passing by -- learned a ton from it about the inner workings of ML, so definitely worth giving a gander!
https://www.elastic.co/blog/explaining-the-bucket-span-in-machine-learning-for-elasticsearch
expect(filters).toEqual(
  expect.arrayContaining([
    {
      term: {
        is_interim: false,
      },
    },
  ])
);
FYI, this means that alerts won't fire for anomalies until after the bucket has been completed.
For short buckets (like 15m), this is fine. But if there are jobs with buckets of 1hr or longer, an anomaly won't trigger this alert until the whole hour-long bucket has closed.
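To make the delay concrete, here is a rough sketch of the arithmetic. It assumes buckets are aligned to multiples of the bucket span since the epoch and that a result is finalized as soon as its bucket closes, ignoring any datafeed or ingest latency. Both function names are hypothetical.

```typescript
// Assumption: buckets are aligned to multiples of the bucket span since the epoch,
// and a result stops being interim as soon as its bucket closes.
function bucketEndMs(eventMs: number, bucketSpanMs: number): number {
  return (Math.floor(eventMs / bucketSpanMs) + 1) * bucketSpanMs;
}

// Time between an event and the close of the bucket that contains it.
function finalizationDelayMs(eventMs: number, bucketSpanMs: number): number {
  return bucketEndMs(eventMs, bucketSpanMs) - eventMs;
}

const MINUTE = 60 * 1000;
// An event one minute into its bucket waits 14 minutes with a 15m span
// and 59 minutes with a 1h span.
console.log(finalizationDelayMs(1 * MINUTE, 15 * MINUTE) / MINUTE); // 14
console.log(finalizationDelayMs(1 * MINUTE, 60 * MINUTE) / MINUTE); // 59
```

Under these assumptions the worst-case wait approaches one full bucket span, which is why the bucket length matters more than the rule's schedule here.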
We're going to meet with @randomuserid and the @elastic/protections folks to review the existing ML Job bucket times and each corresponding Rule's interval + lookback to ensure we don't have any gaps here. I believe @randomuserid took this all into account when initially developing the jobs/rules, but we'll be sure to do an audit here to make sure everything is 👍.
Thanks again for all your help and feedback around debugging this issue, really grateful for your feedback 🙂
Most of the jobs have 15m buckets. I set the lookbacks with the assumption that an anomaly may take 15-30 minutes to be picked up by the rules, depending on where the bucket is when the rule fires. Some, like CloudTrail, have longer lookbacks because those pipelines have much more variable latency.
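The audit described above could be sketched as a simple check. This is a hypothetical helper, not part of Kibana, and it rests on three assumptions: each rule run queries the window [now - interval - lookback, now], an anomaly record is timestamped at the start of its bucket, and the record becomes final one bucket span plus any pipeline latency after that timestamp.

```typescript
// Hypothetical audit check. Under the assumptions above, the first run after a record
// is finalized can happen up to one interval later, and its window reaches back
// interval + lookback, so the interval cancels out and only the lookback has to
// cover the bucket span plus pipeline latency.
function lookbackCoversBucket(lookbackMs: number, bucketSpanMs: number, latencyMs = 0): boolean {
  return lookbackMs >= bucketSpanMs + latencyMs;
}

const MIN = 60 * 1000;
console.log(lookbackCoversBucket(30 * MIN, 15 * MIN)); // true
console.log(lookbackCoversBucket(30 * MIN, 60 * MIN)); // false
console.log(lookbackCoversBucket(30 * MIN, 15 * MIN, 20 * MIN)); // false
```

The third call illustrates why a pipeline with variable latency, such as the CloudTrail case mentioned above, may need a longer lookback than the bucket span alone suggests.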
@elasticmachine merge upstream
💛 Build succeeded, but was flaky
Test Failures: Kibana Pipeline / general / Firefox XPack UI Functional Tests.x-pack/test/functional/apps/status_page/status_page·ts.Status page Status Page allows user to navigate without authentication
Summary
We were incorrectly including records with is_interim: true in our anomalies query, which led to false positive signals if the rule executed while an anomaly's score was (temporarily) above the specified threshold but then dropped below it after re-evaluation. While record_score may continue to be re-evaluated after is_interim: false is set, any changes at that point would be in reaction to new anomalous data, which would have its own alert.