Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(gcp_stackdriver_logs sink): Support additional top level labels for gcp_stackdriver_logs sink #22473

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

stackempty
Copy link

@stackempty stackempty commented Feb 19, 2025

Summary

This adds top level label field according to LogEntry docs https://cloud.google.com/logging/docs/reference/v2/rest/v2/LogEntry#FIELDS.labels!
This will allow users add additional labels to the label field in log entries not just the ones in resource object!

Change Type

  • Bug fix
  • New feature
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

How did you test this PR?

  • Ran the unit tests
    cargo test --package vector --lib -- sinks::gcp::stackdriver::logs::tests --show-output```
  • cargo test --all
  • Tested against an existing google account and viewed the logs in GCL.

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the "no-changelog" label to this PR.

Checklist

  • Please read our Vector contributor resources.
    • make check-all is a good command to run locally. This check is
      defined here. Some of these
      checks might not be relevant to your PR. For Rust changes, at the very least you should run:
      • cargo fmt --all
      • cargo clippy --workspace --all-targets -- -D warnings
      • cargo nextest run --workspace (alternatively, you can run cargo test --all)
  • If this PR introduces changes Vector dependencies (modifies Cargo.lock), please
    run dd-rust-license-tool write to regenerate the license inventory and commit the changes (if any). More details here.

References

Closes: #8516

@stackempty stackempty requested review from a team as code owners February 19, 2025 13:35
@bits-bot
Copy link

bits-bot commented Feb 19, 2025

CLA assistant check
All committers have signed the CLA.

@github-actions github-actions bot added domain: sinks Anything related to the Vector's sinks domain: external docs Anything related to Vector's external, public documentation labels Feb 19, 2025
@stackempty stackempty changed the title feat!(gcp_stackdriver_logs sink): Support additional top level labels feat!(gcp_stackdriver_logs sink): Support additional top level labels for gcp_stackdriver_logs sink Feb 19, 2025
Copy link
Member

@pront pront left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @stackempty!

@pront pront changed the title feat!(gcp_stackdriver_logs sink): Support additional top level labels for gcp_stackdriver_logs sink feat(gcp_stackdriver_logs sink): Support additional top level labels for gcp_stackdriver_logs sink Feb 20, 2025
@fbs
Copy link

fbs commented Feb 20, 2025

I noticed your MR. I just opened a similar one the other day (#22470) with a different approach.

If I understand this change correctly, it means you have to put the labels you want to send into the config file, right? My concern with this is that it removes too much of the flexibility labels should give you. You cannot use arbitrary labels anymore but instead have to define everything you want iin the config. This is probably fine if you ship directly to GCP, but if you want to go through an aggregator that becomes impossible to do.
The 'resource.labels' setup also has that issue, but its not that bad as theres only a few types that make sense, and types dictate the resource.labels options

@stackempty stackempty force-pushed the dami/feat/add-stackdriver-labels branch from 2a036ab to 6b76d49 Compare February 20, 2025 17:43
Copy link
Member

@pront pront left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @stackempty!

@pront
Copy link
Member

pront commented Feb 20, 2025

I noticed your MR. I just opened a similar one the other day (#22470) with a different approach.

If I understand this change correctly, it means you have to put the labels you want to send into the config file, right? My concern with this is that it removes too much of the flexibility labels should give you. You cannot use arbitrary labels anymore but instead have to define everything you want iin the config. This is probably fine if you ship directly to GCP, but if you want to go through an aggregator that becomes impossible to do. The 'resource.labels' setup also has that issue, but its not that bad as theres only a few types that make sense, and types dictate the resource.labels options

Hi @fbs, this PR introduces template-able labels, which means they can be populated from events dynamically (or statically in the config). Does this solve the issues with your use case?

@fbs
Copy link

fbs commented Feb 20, 2025

I noticed your MR. I just opened a similar one the other day (#22470) with a different approach.
If I understand this change correctly, it means you have to put the labels you want to send into the config file, right? My concern with this is that it removes too much of the flexibility labels should give you. You cannot use arbitrary labels anymore but instead have to define everything you want iin the config. This is probably fine if you ship directly to GCP, but if you want to go through an aggregator that becomes impossible to do. The 'resource.labels' setup also has that issue, but its not that bad as theres only a few types that make sense, and types dictate the resource.labels options

Hi @fbs, this PR introduces template-able labels, which means they can be populated from events dynamically (or statically in the config). Does this solve the issues with your use case?

In this case only label values are templates, right? The key is still specified by the config. And any key must resolve to a value, even if its just an empty string.

My concern is that there are different label keys that make sense for different log entries. E.g. if the log is processed from a kubernetes pod you'd have labels.pod_name, but if its a vm you could have labels.vm_name.
For a central aggregator that processes both types neither is great. Either you have labels with empty values (but with key), or have N sinks

@stackempty stackempty force-pushed the dami/feat/add-stackdriver-labels branch from 49b2799 to 6f4e6ee Compare February 25, 2025 12:45
@stackempty
Copy link
Author

stackempty commented Feb 25, 2025

I noticed your MR. I just opened a similar one the other day (#22470) with a different approach.
If I understand this change correctly, it means you have to put the labels you want to send into the config file, right? My concern with this is that it removes too much of the flexibility labels should give you. You cannot use arbitrary labels anymore but instead have to define everything you want iin the config. This is probably fine if you ship directly to GCP, but if you want to go through an aggregator that becomes impossible to do. The 'resource.labels' setup also has that issue, but its not that bad as theres only a few types that make sense, and types dictate the resource.labels options

Hi @fbs, this PR introduces template-able labels, which means they can be populated from events dynamically (or statically in the config). Does this solve the issues with your use case?

In this case only label values are templates, right? The key is still specified by the config. And any key must resolve to a value, even if its just an empty string.

My concern is that there are different label keys that make sense for different log entries. E.g. if the log is processed from a kubernetes pod you'd have labels.pod_name, but if its a vm you could have labels.vm_name. For a central aggregator that processes both types neither is great. Either you have labels with empty values (but with key), or have N sinks

Hi @fbs,

Thanks for sharing your concern! I have made updates to the PR I believe should address them.

The labels configuration item would remain. This is the standard even with google's own logging agent documented here. Labels specified here will go into the labels field of the LogEntry object.

labels_key option has been added - This will allow folks specify a key from the structured log from which they will be merged into the labels field of the LogEntry object. The default is logging.googleapis.com/labels. This behavior is similar to the google logging agent and fluentbit's stackdriver output plugin.

@stackempty stackempty force-pushed the dami/feat/add-stackdriver-labels branch 2 times, most recently from 0145c14 to aa2e395 Compare February 25, 2025 13:45
@stackempty stackempty requested a review from pront February 25, 2025 13:53
@stackempty
Copy link
Author

@pront would you please give this another look! Thanks!

@stackempty stackempty force-pushed the dami/feat/add-stackdriver-labels branch from aa2e395 to b8f469a Compare February 26, 2025 10:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain: external docs Anything related to Vector's external, public documentation domain: sinks Anything related to the Vector's sinks
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support user-defined labels in gcp_stackdriver_logs sink
5 participants