Trims timestamp from log message if customer enables on logs plugin #1568

Merged 5 commits on Feb 26, 2025

Conversation

nathalapooja (Contributor) commented on Feb 25, 2025

Description of the issue

Describe the problem or feature in addition to a link to the issues.
The CloudWatch Agent (CWA) should be configurable to trim the timestamp from a log message before publishing it.

Description of changes

How does this change address the problem?
Adds a configuration option to the agent that optionally trims the timestamp from the log message before publishing to CloudWatch Logs (CWL).

License

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Tests

Describe what tests you have done.
Unit Tests
Manual Test
Config.json

{
  "agent": {
    "run_as_user": "root",
    "debug": true
  },
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          {
            "file_path": "/opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log",
            "log_group_name": "agent_log_group",
            "log_stream_name": "agent_log_stream",
            "timezone": "UTC",
            "retention_in_days": 5
          },
          {
            "file_path": "/tmp/log_2025-02-26.log",
            "log_group_name": "trim_log_group",
            "log_stream_name": "trim_log_stream",
            "timezone": "UTC",
            "retention_in_days": 5,
            "timestamp_format": "%Y-%m-%dT%H:%M:%SZ",
            "trim_timestamp": true
          }
        ]
      }
    },
    "force_flush_interval": 60
  }
}

Toml

[agent]
  collection_jitter = "0s"
  debug = true
  flush_interval = "1s"
  flush_jitter = "0s"
  hostname = ""
  interval = "60s"
  logfile = "/opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log"
  logtarget = "lumberjack"
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  omit_hostname = false
  precision = ""
  quiet = false
  round_interval = false
  run_as_user = "root"

[inputs]

  [[inputs.logfile]]
    destination = "cloudwatchlogs"
    file_state_folder = "/opt/aws/amazon-cloudwatch-agent/logs/state"

    [[inputs.logfile.file_config]]
      deployment_environment = ""
      file_path = "/opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log"
      from_beginning = true
      log_group_class = ""
      log_group_name = "agent_log_group"
      log_stream_name = "agent_log_stream"
      pipe = false
      retention_in_days = 5
      service_name = ""
      timezone = "UTC"

    [[inputs.logfile.file_config]]
      deployment_environment = ""
      file_path = "/tmp/log_2025-02-26.log"
      from_beginning = true
      log_group_class = ""
      log_group_name = "trim_log_group"
      log_stream_name = "trim_log_stream"
      pipe = false
      retention_in_days = 5
      service_name = ""
      timestamp_layout = ["2006-01-_2T15:04:05Z", "2006-1-_2T15:04:05Z"]
      timestamp_regex = "(\\d{4}-\\s{0,1}\\d{1,2}-\\s{0,1}\\d{1,2}T\\d{2}:\\d{2}:\\d{2}Z)"
      timezone = "UTC"
      trim_timestamp = true

[outputs]

  [[outputs.cloudwatchlogs]]
    force_flush_interval = "60s"
    log_stream_name = "i-xxxxxxxxxx"
    mode = "EC2"
    region = "us-east-1"
    region_type = "EC2M"

CloudWatch logs for log group trim_log_group and log stream trim_log_stream

[Screenshot 2025-02-26 at 8:58:04 AM]

Requirements

Before committing the code, please complete the following steps.

  1. Run make fmt and make fmt-sh
  2. Run make lint

nathalapooja requested a review from a team as a code owner on February 25, 2025, 19:39
@@ -130,15 +130,34 @@ func TestTimestampParser(t *testing.T) {
 	expectedTimestamp := time.Unix(1497882318, 0)
 	timestampString := "19 Jun 2017 14:25:18"
 	logEntry := fmt.Sprintf("%s [INFO] This is a test message.", timestampString)
-	timestamp := fileConfig.timestampFromLogLine(logEntry)
+	timestamp, modifiedLogEntry := fileConfig.timestampFromLogLine(logEntry)
Contributor

nit: logEntry is a better name since we don't actually modify the content here

Contributor Author

It depends on the TrimTimestamp flag, right? So I just kept it generic.

-	return timestamp
+	if config.TrimTimestamp {
+		// Trim the entire timestamp portion (from start to end of the match)
+		return timestamp, logValue[:index[0]] + logValue[index[1]:]
Contributor

Can you explain the values of index[0] and index[1]? What do they correspond to?

Reading the if cases above, it seems the length can vary depending on the regex, so I want to make sure we are fine with always using indexes 0 and 1.

Contributor Author

The regex match always produces an index array in which index[0] is the start position of the timestamp match and index[1] is the end position of the timestamp match.

Contributor

Should we also trim any leading whitespace after the timestamp trim?

Contributor

As a point against doing this by default, the OTEL filelog receiver has preserve_(leading|trailing)_whitespaces options in its config.

type TrimTimestamp struct {
}

func (r *TrimTimestamp) ApplyRule(input interface{}) (returnKey string, returnVal interface{}) {
Contributor

Please address lint issues.

