Trims timestamp from log message if customer enables on logs plugin #1568

Merged 5 commits on Feb 26, 2025
Changes from 4 commits
2 changes: 2 additions & 0 deletions plugins/inputs/logfile/README.md
@@ -47,6 +47,7 @@ The plugin expects messages in one of the
timestamp_regex = "^(\\d{2} \\w{3} \\d{4} \\d{2}:\\d{2}:\\d{2}).*$"
timestamp_layout = ["_2 Jan 2006 15:04:05"]
timezone = "UTC"
trim_timestamp = false
multi_line_start_pattern = "{timestamp_regex}"
## Read file from beginning.
from_beginning = false
@@ -65,6 +66,7 @@ The plugin expects messages in one of the
timestamp_regex = "^(\\d{2} \\w{3} \\d{4} \\d{2}:\\d{2}:\\d{2}).*$"
timestamp_layout = ["_2 Jan 2006 15:04:05"]
timezone = "UTC"
trim_timestamp = true
multi_line_start_pattern = "{timestamp_regex}"
## Read file from beginning.
from_beginning = false
16 changes: 11 additions & 5 deletions plugins/inputs/logfile/fileconfig.go
@@ -48,6 +48,8 @@ type FileConfig struct {
TimestampLayout []string `toml:"timestamp_layout"`
//The time zone used to parse the timestampFromLogLine in the log entry.
Timezone string `toml:"timezone"`
//Trim timestamp from log line
TrimTimestamp bool `toml:"trim_timestamp"`

//Indicate whether it is a start of multiline.
//If this config is not present, it means the multiline mode is disabled.
@@ -171,9 +173,9 @@ func (config *FileConfig) init() error {
// Try to parse the timestampFromLogLine value from the log entry line.
// The parser logic will be based on the timestampFromLogLine regex, and time zone info.
// If the parsing operation encounters any issue, int64(0) is returned.
func (config *FileConfig) timestampFromLogLine(logValue string) time.Time {
func (config *FileConfig) timestampFromLogLine(logValue string) (time.Time, string) {
if config.TimestampRegexP == nil {
return time.Time{}
return time.Time{}, logValue
}
index := config.TimestampRegexP.FindStringSubmatchIndex(logValue)
if len(index) > 3 {
@@ -196,7 +198,7 @@ func (config *FileConfig) timestampFromLogLine(logValue string) time.Time {
}
if err != nil {
log.Printf("E! Error parsing timestampFromLogLine: %s", err)
return time.Time{}
return time.Time{}, logValue
}
if timestamp.Year() == 0 {
now := time.Now()
@@ -208,9 +210,13 @@ func (config *FileConfig) timestampFromLogLine(logValue string) time.Time {
timestamp = timestamp.AddDate(-1, 0, 0)
}
}
return timestamp
if config.TrimTimestamp {
// Trim the entire timestamp portion (from start to end of the match)
return timestamp, logValue[:index[0]] + logValue[index[1]:]
Contributor

Can you explain the values of index[0] and index[1]? What do they correspond to?

Reading the if cases above, it seems the length of index can vary depending on the regex, so I want to make sure we are fine with always using indices 0 and 1.

Contributor Author

The regex match always returns the index array such that index[0] is the start position of the timestamp match and index[1] is the end position of the timestamp match.

Contributor

Should we also trim any leading whitespace left behind after the timestamp is trimmed?

Contributor

As a point against doing this by default, the OTEL filelog receiver has preserve_(leading|trailing)_whitespace options in its config.

}
return timestamp, logValue
}
return time.Time{}
return time.Time{}, logValue
}
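
A note on the index question raised in the review: in Go's regexp package, FindStringSubmatchIndex returns pairs of byte offsets, where index[0] and index[1] bound the whole match and index[2] and index[3] bound the first capture group. The sketch below is illustrative only and is not the PR's code. The helper names (trimWholeMatch, trimFirstGroup) and the two patterns are assumptions, chosen to mirror the README sample. It suggests that the two pairs coincide when the pattern consists only of the capture group, but diverge when the pattern continues past the group, as with a trailing .*$.

```go
package main

import (
	"fmt"
	"regexp"
)

// Hypothetical patterns for illustration; both capture the same timestamp.
var (
	// Only the capture group: the whole match and group 1 are identical.
	groupOnly = regexp.MustCompile(`(\d{2} \w{3} \d{4} \d{2}:\d{2}:\d{2})`)
	// Anchored, with a trailing .*$ as in the README sample: the whole
	// match extends to the end of the line.
	anchored = regexp.MustCompile(`^(\d{2} \w{3} \d{4} \d{2}:\d{2}:\d{2}).*$`)
)

const sampleLine = "19 Jun 2017 14:25:18 [INFO] This is a test message."

// trimWholeMatch removes the span of the entire match (idx[0]:idx[1]).
func trimWholeMatch(re *regexp.Regexp, line string) string {
	idx := re.FindStringSubmatchIndex(line)
	if idx == nil {
		return line // no match: leave the line untouched
	}
	return line[:idx[0]] + line[idx[1]:]
}

// trimFirstGroup removes only the first capture group (idx[2]:idx[3]).
func trimFirstGroup(re *regexp.Regexp, line string) string {
	idx := re.FindStringSubmatchIndex(line)
	if len(idx) < 4 || idx[2] < 0 {
		return line // no match, or the group did not participate
	}
	return line[:idx[2]] + line[idx[3]:]
}

func main() {
	fmt.Printf("%q\n", trimWholeMatch(groupOnly, sampleLine))
	fmt.Printf("%q\n", trimWholeMatch(anchored, sampleLine))
	fmt.Printf("%q\n", trimFirstGroup(anchored, sampleLine))
}
```

Whether the agent ever compiles a pattern with a trailing .*$ as-is is not shown in this diff, so treat the second case as something to check rather than a confirmed bug.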

// This method determine whether the line is a start line for multiline log entry.
118 changes: 102 additions & 16 deletions plugins/inputs/logfile/fileconfig_test.go
@@ -130,15 +130,34 @@ func TestTimestampParser(t *testing.T) {
expectedTimestamp := time.Unix(1497882318, 0)
timestampString := "19 Jun 2017 14:25:18"
logEntry := fmt.Sprintf("%s [INFO] This is a test message.", timestampString)
timestamp := fileConfig.timestampFromLogLine(logEntry)
timestamp, modifiedLogEntry := fileConfig.timestampFromLogLine(logEntry)
Contributor

nit: logEntry is a better name, since we don't actually modify the content here.

Contributor Author

Whether the content is modified depends on the TrimTimestamp flag, so I kept the name generic.

assert.Equal(t, expectedTimestamp.UnixNano(), timestamp.UnixNano(),
fmt.Sprintf("The timestampFromLogLine value %v is not the same as expected %v.", timestamp, expectedTimestamp))
assert.Equal(t, logEntry, modifiedLogEntry)

// Test regex match for multiline, the first timestamp in multiline should be matched
logEntry = fmt.Sprintf("%s [INFO] This is the first line.\n19 Jun 2017 14:25:19 [INFO] This is the second line.\n", timestampString)
timestamp = fileConfig.timestampFromLogLine(logEntry)
timestamp, modifiedLogEntry = fileConfig.timestampFromLogLine(logEntry)
assert.Equal(t, expectedTimestamp.UnixNano(), timestamp.UnixNano(),
fmt.Sprintf("The timestampFromLogLine value %v is not the same as expected %v.", timestamp, expectedTimestamp))
assert.Equal(t, logEntry, modifiedLogEntry)

// Test TrimTimeStamp for single line
fileConfig.TrimTimestamp = true
logEntry = fmt.Sprintf("%s [INFO] This is a test message.", timestampString)
trimmedTimestampString := " [INFO] This is a test message."
timestamp, modifiedLogEntry = fileConfig.timestampFromLogLine(logEntry)
assert.Equal(t, expectedTimestamp.UnixNano(), timestamp.UnixNano(),
fmt.Sprintf("The timestampFromLogLine value %v is not the same as expected %v.", timestamp, expectedTimestamp))
assert.Equal(t, trimmedTimestampString, modifiedLogEntry)

// Test TrimTimeStamp for multiline, the first timestamp in multiline should be matched
logEntry = fmt.Sprintf("%s [INFO] This is the first line.\n19 Jun 2017 14:25:19 [INFO] This is the second line.\n", timestampString)
trimmedTimestampString = " [INFO] This is the first line.\n19 Jun 2017 14:25:19 [INFO] This is the second line.\n"
timestamp, modifiedLogEntry = fileConfig.timestampFromLogLine(logEntry)
assert.Equal(t, expectedTimestamp.UnixNano(), timestamp.UnixNano(),
fmt.Sprintf("The timestampFromLogLine value %v is not the same as expected %v.", timestamp, expectedTimestamp))
assert.Equal(t, trimmedTimestampString, modifiedLogEntry)
}

func TestTimestampParserWithPadding(t *testing.T) {
@@ -155,15 +174,33 @@ func TestTimestampParserWithPadding(t *testing.T) {
Timezone: timezone,
TimezoneLoc: timezoneLoc}

logEntry := fmt.Sprintf(" 2 1 07:10:06 instance-id: i-02fce21a425a2efb3")
timestamp := fileConfig.timestampFromLogLine(logEntry)
logEntry := " 2 1 07:10:06 instance-id: i-02fce21a425a2efb3"
timestamp, modifiedLogEntry := fileConfig.timestampFromLogLine(logEntry)
assert.Equal(t, 7, timestamp.Hour(), fmt.Sprintf("Timestamp does not match: %v, act: %v", "7", timestamp.Hour()))
assert.Equal(t, 10, timestamp.Minute(), fmt.Sprintf("Timestamp does not match: %v, act: %v", "10", timestamp.Minute()))
assert.Equal(t, logEntry, modifiedLogEntry)

logEntry = "2 1 07:10:06 instance-id: i-02fce21a425a2efb3"
timestamp, modifiedLogEntry = fileConfig.timestampFromLogLine(logEntry)
assert.Equal(t, 7, timestamp.Hour(), fmt.Sprintf("Timestamp does not match: %v, act: %v", "7", timestamp.Hour()))
assert.Equal(t, 10, timestamp.Minute(), fmt.Sprintf("Timestamp does not match: %v, act: %v", "10", timestamp.Minute()))
assert.Equal(t, logEntry, modifiedLogEntry)

//Test when TrimTimeStamp is enabled
fileConfig.TrimTimestamp = true
logEntry = " 2 1 07:10:06 instance-id: i-02fce21a425a2efb3"
trimmedTimestampString := " instance-id: i-02fce21a425a2efb3"
timestamp, modifiedLogEntry = fileConfig.timestampFromLogLine(logEntry)
assert.Equal(t, 7, timestamp.Hour(), fmt.Sprintf("Timestamp does not match: %v, act: %v", "7", timestamp.Hour()))
assert.Equal(t, 10, timestamp.Minute(), fmt.Sprintf("Timestamp does not match: %v, act: %v", "10", timestamp.Minute()))
assert.Equal(t, trimmedTimestampString, modifiedLogEntry)

logEntry = fmt.Sprintf("2 1 07:10:06 instance-id: i-02fce21a425a2efb3")
timestamp = fileConfig.timestampFromLogLine(logEntry)
logEntry = "2 1 07:10:06 instance-id: i-02fce21a425a2efb3"
trimmedTimestampString = " instance-id: i-02fce21a425a2efb3"
timestamp, modifiedLogEntry = fileConfig.timestampFromLogLine(logEntry)
assert.Equal(t, 7, timestamp.Hour(), fmt.Sprintf("Timestamp does not match: %v, act: %v", "7", timestamp.Hour()))
assert.Equal(t, 10, timestamp.Minute(), fmt.Sprintf("Timestamp does not match: %v, act: %v", "10", timestamp.Minute()))
assert.Equal(t, trimmedTimestampString, modifiedLogEntry)
}

func TestTimestampParserDefault(t *testing.T) {
@@ -183,26 +220,56 @@ func TestTimestampParserDefault(t *testing.T) {
TimezoneLoc: timezoneLoc}

// make sure layout is compatible for "Sep 9", "Sep 9" , "Sep 09", "Sep 09" options
logEntry := fmt.Sprintf("Sep 9 02:00:43 ip-10-4-213-132 \n")
timestamp := fileConfig.timestampFromLogLine(logEntry)
logEntry := "Sep 9 02:00:43 ip-10-4-213-132 \n"
timestamp, modifiedLogEntry := fileConfig.timestampFromLogLine(logEntry)
assert.Equal(t, 02, timestamp.Hour())
assert.Equal(t, 00, timestamp.Minute())
assert.Equal(t, logEntry, modifiedLogEntry)

logEntry = "Sep 9 02:00:43 ip-10-4-213-132 \n"
timestamp, modifiedLogEntry = fileConfig.timestampFromLogLine(logEntry)
assert.Equal(t, 02, timestamp.Hour())
assert.Equal(t, 00, timestamp.Minute())
assert.Equal(t, logEntry, modifiedLogEntry)

logEntry = fmt.Sprintf("Sep 9 02:00:43 ip-10-4-213-132 \n")
timestamp = fileConfig.timestampFromLogLine(logEntry)
logEntry = "Sep 09 02:00:43 ip-10-4-213-132 \n"
timestamp, modifiedLogEntry = fileConfig.timestampFromLogLine(logEntry)
assert.Equal(t, 02, timestamp.Hour())
assert.Equal(t, 00, timestamp.Minute())
assert.Equal(t, logEntry, modifiedLogEntry)

logEntry = fmt.Sprintf("Sep 09 02:00:43 ip-10-4-213-132 \n")
timestamp = fileConfig.timestampFromLogLine(logEntry)
logEntry = "Sep 09 02:00:43 ip-10-4-213-132 \n"
timestamp, modifiedLogEntry = fileConfig.timestampFromLogLine(logEntry)
assert.Equal(t, 02, timestamp.Hour())
assert.Equal(t, 00, timestamp.Minute())
assert.Equal(t, logEntry, modifiedLogEntry)

logEntry = fmt.Sprintf("Sep 09 02:00:43 ip-10-4-213-132 \n")
timestamp = fileConfig.timestampFromLogLine(logEntry)
// When TrimTimestamp is enabled, make sure layout is compatible for "Sep 9", "Sep 9" , "Sep 09", "Sep 09" options and log value is trimmed correctly
fileConfig.TrimTimestamp = true
logEntry = "Sep 9 02:00:43 ip-10-4-213-132 \n"
trimmedTimestampString := " ip-10-4-213-132 \n"
timestamp, modifiedLogEntry = fileConfig.timestampFromLogLine(logEntry)
assert.Equal(t, 02, timestamp.Hour())
assert.Equal(t, 00, timestamp.Minute())
assert.Equal(t, trimmedTimestampString, modifiedLogEntry)

logEntry = "Sep 9 02:00:43 ip-10-4-213-132 \n"
timestamp, modifiedLogEntry = fileConfig.timestampFromLogLine(logEntry)
assert.Equal(t, 02, timestamp.Hour())
assert.Equal(t, 00, timestamp.Minute())
assert.Equal(t, trimmedTimestampString, modifiedLogEntry)

logEntry = "Sep 09 02:00:43 ip-10-4-213-132 \n"
timestamp, modifiedLogEntry = fileConfig.timestampFromLogLine(logEntry)
assert.Equal(t, 02, timestamp.Hour())
assert.Equal(t, 00, timestamp.Minute())
assert.Equal(t, trimmedTimestampString, modifiedLogEntry)

logEntry = "Sep 09 02:00:43 ip-10-4-213-132 \n"
timestamp, modifiedLogEntry = fileConfig.timestampFromLogLine(logEntry)
assert.Equal(t, 02, timestamp.Hour())
assert.Equal(t, 00, timestamp.Minute())
assert.Equal(t, trimmedTimestampString, modifiedLogEntry)
}

func TestTimestampParserWithFracSeconds(t *testing.T) {
@@ -222,15 +289,34 @@ func TestTimestampParserWithFracSeconds(t *testing.T) {
expectedTimestamp := time.Unix(1497882318, 234000000)
timestampString := "19 Jun 2017 14:25:18,234088 UTC"
logEntry := fmt.Sprintf("%s [INFO] This is a test message.", timestampString)
timestamp := fileConfig.timestampFromLogLine(logEntry)
timestamp, modifiedLogEntry := fileConfig.timestampFromLogLine(logEntry)
assert.Equal(t, expectedTimestamp.UnixNano(), timestamp.UnixNano(),
fmt.Sprintf("The timestampFromLogLine value %v is not the same as expected %v.", timestamp, expectedTimestamp))
assert.Equal(t, logEntry, modifiedLogEntry)

// Test regex match for multiline, the first timestamp in multiline should be matched
logEntry = fmt.Sprintf("%s [INFO] This is the first line.\n19 Jun 2017 14:25:19,123456 UTC [INFO] This is the second line.\n", timestampString)
timestamp = fileConfig.timestampFromLogLine(logEntry)
timestamp, modifiedLogEntry = fileConfig.timestampFromLogLine(logEntry)
assert.Equal(t, expectedTimestamp.UnixNano(), timestamp.UnixNano(),
fmt.Sprintf("The timestampFromLogLine value %v is not the same as expected %v.", timestamp, expectedTimestamp))
assert.Equal(t, logEntry, modifiedLogEntry)

// Test TrimTimeStamp for single line
fileConfig.TrimTimestamp = true
logEntry = fmt.Sprintf("%s [INFO] This is a test message.", timestampString)
trimmedTimestampString := " [INFO] This is a test message."
timestamp, modifiedLogEntry = fileConfig.timestampFromLogLine(logEntry)
assert.Equal(t, expectedTimestamp.UnixNano(), timestamp.UnixNano(),
fmt.Sprintf("The timestampFromLogLine value %v is not the same as expected %v.", timestamp, expectedTimestamp))
assert.Equal(t, trimmedTimestampString, modifiedLogEntry)

// Test TrimTimeStamp for multiline, the first timestamp in multiline should be matched
logEntry = fmt.Sprintf("%s [INFO] This is the first line.\n19 Jun 2017 14:25:19,123456 UTC [INFO] This is the second line.\n", timestampString)
trimmedTimestampString = " [INFO] This is the first line.\n19 Jun 2017 14:25:19,123456 UTC [INFO] This is the second line.\n"
timestamp, modifiedLogEntry = fileConfig.timestampFromLogLine(logEntry)
assert.Equal(t, expectedTimestamp.UnixNano(), timestamp.UnixNano(),
fmt.Sprintf("The timestampFromLogLine value %v is not the same as expected %v.", timestamp, expectedTimestamp))
assert.Equal(t, trimmedTimestampString, modifiedLogEntry)
}
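
The expected strings in these tests keep the space that followed the timestamp (for example, the trimmed single-line entry starts with a space before [INFO]), which is the behavior the review thread on leading whitespace discusses. As a sketch only, assuming a hypothetical preserveLeadingWhitespace switch that is not part of this PR, the whitespace handling could be made optional roughly like this:

```go
package main

import (
	"fmt"
	"strings"
)

// cutSpan removes line[start:end]. When preserveLeadingWhitespace is false,
// spaces and tabs left at the front of the result are dropped as well.
// The flag is hypothetical and only loosely modeled on the OTEL option
// mentioned in the review.
func cutSpan(line string, start, end int, preserveLeadingWhitespace bool) string {
	rest := line[:start] + line[end:]
	if preserveLeadingWhitespace {
		return rest
	}
	return strings.TrimLeft(rest, " \t")
}

func main() {
	line := "19 Jun 2017 14:25:18 [INFO] This is a test message."
	// The timestamp occupies bytes 0 through 19 of this line.
	fmt.Printf("%q\n", cutSpan(line, 0, 20, true))
	fmt.Printf("%q\n", cutSpan(line, 0, 20, false))
}
```

Keeping the whitespace by default matches what the tests in this PR assert; the stripped variant would require changing those expected strings.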

func TestNonAllowlistedTimezone(t *testing.T) {
1 change: 1 addition & 0 deletions plugins/inputs/logfile/logfile.go
@@ -76,6 +76,7 @@ const sampleConfig = `
timestamp_regex = "^(\\d{2} \\w{3} \\d{4} \\d{2}:\\d{2}:\\d{2}).*$"
timestamp_layout = ["_2 Jan 2006 15:04:05"]
timezone = "UTC"
trim_timestamp = false
multi_line_start_pattern = "{timestamp_regex}"
## Read file from beginning.
from_beginning = false
19 changes: 11 additions & 8 deletions plugins/inputs/logfile/tailersrc.go
@@ -67,7 +67,7 @@ type tailerSrc struct {
stateFilePath string
tailer *tail.Tail
autoRemoval bool
timestampFn func(string) time.Time
timestampFn func(string) (time.Time, string)
enc encoding.Encoding
maxEventSize int
truncateSuffix string
@@ -91,7 +91,7 @@ func NewTailerSrc(
autoRemoval bool,
isMultilineStartFn func(string) bool,
filters []*LogFilter,
timestampFn func(string) time.Time,
timestampFn func(string) (time.Time, string),
enc encoding.Encoding,
maxEventSize int,
truncateSuffix string,
@@ -195,9 +195,10 @@ func (ts *tailerSrc) runTail() {
if !ok {
if msgBuf.Len() > 0 {
msg := msgBuf.String()
timestamp, modifiedMsg := ts.timestampFn(msg)
e := &LogEvent{
msg: msg,
t: ts.timestampFn(msg),
msg: modifiedMsg,
t: timestamp,
offset: *fo,
src: ts,
}
@@ -249,9 +250,10 @@ func (ts *tailerSrc) runTail() {

if msgBuf.Len() > 0 {
msg := msgBuf.String()
timestamp, modifiedMsg := ts.timestampFn(msg)
e := &LogEvent{
msg: msg,
t: ts.timestampFn(msg),
msg: modifiedMsg,
t: timestamp,
offset: *fo,
src: ts,
}
@@ -276,9 +278,10 @@ func (ts *tailerSrc) runTail() {
}

msg := msgBuf.String()
timestamp, modifiedMsg := ts.timestampFn(msg)
e := &LogEvent{
msg: msg,
t: ts.timestampFn(msg),
msg: modifiedMsg,
t: timestamp,
offset: *fo,
src: ts,
}
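
The call sites above now take both return values from timestampFn and store the returned string as the event message. A minimal sketch of that contract follows, using stand-in types: logEvent, timestampFn, buildEvent, and rfc3339Prefix are illustrative names and are not the plugin's own, and the RFC3339 prefix parser is an assumption made to keep the example short.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// logEvent is a stand-in for the plugin's event type (illustrative only).
type logEvent struct {
	msg string
	t   time.Time
}

// timestampFn mirrors the new signature: it returns the parsed time and
// the message to publish, which may or may not have been trimmed.
type timestampFn func(string) (time.Time, string)

// buildEvent calls fn once and uses both results, as the updated call
// sites in runTail do.
func buildEvent(raw string, fn timestampFn) logEvent {
	t, msg := fn(raw)
	return logEvent{msg: msg, t: t}
}

// rfc3339Prefix returns a hypothetical parser that expects an RFC3339
// timestamp before the first space. On failure it returns the zero time
// and the line unchanged.
func rfc3339Prefix(trim bool) timestampFn {
	return func(line string) (time.Time, string) {
		prefix, _, _ := strings.Cut(line, " ")
		t, err := time.Parse(time.RFC3339, prefix)
		if err != nil {
			return time.Time{}, line
		}
		if trim {
			return t, line[len(prefix):]
		}
		return t, line
	}
}

func main() {
	raw := "2025-02-26T10:00:00Z hello"
	fmt.Printf("%q\n", buildEvent(raw, rfc3339Prefix(true)).msg)
	fmt.Printf("%q\n", buildEvent(raw, rfc3339Prefix(false)).msg)
}
```

Returning the message alongside the time keeps parsing and trimming in one pass over the line, at the cost of changing every function that satisfies the signature, including test helpers.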
4 changes: 2 additions & 2 deletions plugins/inputs/logfile/tailersrc_test.go
@@ -324,15 +324,15 @@ func TestTailerSrcFiltersMultiLineLogs(t *testing.T) {
assertExpectedLogsPublished(t, n, int(*resources.consumed))
}

func parseRFC3339Timestamp(line string) time.Time {
func parseRFC3339Timestamp(line string) (time.Time, string) {
// Use RFC3339 for testing `2006-01-02T15:04:05Z07:00`
re := regexp.MustCompile(`\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}[Z+\-]\d{2}:\d{2}`)
tstr := re.FindString(line)
var t time.Time
if tstr != "" {
t, _ = time.Parse(time.RFC3339, tstr)
}
return t
return t, line
}

func logLine(s string, l int, t time.Time) string {
4 changes: 4 additions & 0 deletions translator/config/schema.json
@@ -959,6 +959,10 @@
"UTC"
]
},
"trim_timestamp" : {
"type": "boolean",
"description": "Whether to trim the timestamp in the log message"
},
"encoding": {
"type": "string",
"minLength": 1,
@@ -54,6 +54,7 @@
pipe = false
retention_in_days = 5
timezone = "UTC"
trim_timestamp = true

[[inputs.logfile.file_config]]
auto_removal = true
@@ -258,6 +258,7 @@
"log_group_name": "amazon-cloudwatch-agent.log",
"log_stream_name": "amazon-cloudwatch-agent.log",
"timezone": "UTC",
"trim_timestamp": true,
"retention_in_days": 5
},
{
@@ -54,6 +54,7 @@
pipe = false
retention_in_days = 5
timezone = "UTC"
trim_timestamp = true

[[inputs.logfile.file_config]]
auto_removal = true
@@ -182,7 +182,8 @@
"log_group_name": "amazon-cloudwatch-agent.log",
"log_stream_name": "amazon-cloudwatch-agent.log",
"timezone": "UTC",
"retention_in_days": 5
"retention_in_days": 5,
"trim_timestamp": true
},
{
"file_path": "/opt/aws/amazon-cloudwatch-agent/logs/test.log",
@@ -126,6 +126,8 @@ type (
Pipe bool
RetentionInDays int `toml:"retention_in_days"`
Timezone string
//Customer specifies if the timestamp from the log message should be trimmed
TrimTimestamp bool `toml:"trim_timestamp"`
//Customer specified service.name
ServiceName string `toml:"service_name"`
//Customer specified deployment.environment