Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

metrics: replace gosigar with gopsutil #21041

Merged
merged 13 commits into from
Jun 2, 2020
27 changes: 11 additions & 16 deletions cmd/geth/main.go
Original file line number Diff line number Diff line change
Expand Up @@ -21,14 +21,12 @@ import (
"fmt"
"math"
"os"
"runtime"
godebug "runtime/debug"
"sort"
"strconv"
"strings"
"time"

"github.com/elastic/gosigar"
"github.com/ethereum/go-ethereum/accounts"
"github.com/ethereum/go-ethereum/accounts/keystore"
"github.com/ethereum/go-ethereum/cmd/utils"
Expand All @@ -42,6 +40,7 @@ import (
"github.com/ethereum/go-ethereum/log"
"github.com/ethereum/go-ethereum/metrics"
"github.com/ethereum/go-ethereum/node"
gopsutil "github.com/shirou/gopsutil/mem"
cli "gopkg.in/urfave/cli.v1"
)

Expand Down Expand Up @@ -310,20 +309,16 @@ func prepare(ctx *cli.Context) {
ctx.GlobalSet(utils.CacheFlag.Name, strconv.Itoa(128))
}
// Cap the cache allowance and tune the garbage collector
var mem gosigar.Mem
// Workaround until OpenBSD support lands into gosigar
// Check https://github.com/elastic/gosigar#supported-platforms
if runtime.GOOS != "openbsd" {
if err := mem.Get(); err == nil {
if 32<<(^uintptr(0)>>63) == 32 && mem.Total > 2*1024*1024*1024 {
log.Warn("Lowering memory allowance on 32bit arch", "available", mem.Total/1024/1024, "addressable", 2*1024)
mem.Total = 2 * 1024 * 1024 * 1024
}
allowance := int(mem.Total / 1024 / 1024 / 3)
if cache := ctx.GlobalInt(utils.CacheFlag.Name); cache > allowance {
log.Warn("Sanitizing cache to Go's GC limits", "provided", cache, "updated", allowance)
ctx.GlobalSet(utils.CacheFlag.Name, strconv.Itoa(allowance))
}
mem, err := gopsutil.VirtualMemory()
if err == nil {
if 32<<(^uintptr(0)>>63) == 32 && mem.Total > 2*1024*1024*1024 {
log.Warn("Lowering memory allowance on 32bit arch", "available", mem.Total/1024/1024, "addressable", 2*1024)
mem.Total = 2 * 1024 * 1024 * 1024
}
allowance := int(mem.Total / 1024 / 1024 / 3)
if cache := ctx.GlobalInt(utils.CacheFlag.Name); cache > allowance {
log.Warn("Sanitizing cache to Go's GC limits", "provided", cache, "updated", allowance)
ctx.GlobalSet(utils.CacheFlag.Name, strconv.Itoa(allowance))
}
}
// Ensure Go's GC ignores the database cache for trigger percentage
Expand Down
2 changes: 1 addition & 1 deletion go.mod
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,6 @@ require (
github.com/docker/docker v1.4.2-0.20180625184442-8e610b2b55bf
github.com/dop251/goja v0.0.0-20200219165308-d1232e640a87
github.com/edsrzf/mmap-go v0.0.0-20160512033002-935e0e8a636c
github.com/elastic/gosigar v0.8.1-0.20180330100440-37f05ff46ffa
github.com/fatih/color v1.3.0
github.com/fjl/memsize v0.0.0-20180418122429-ca190fb6ffbc
github.com/gballet/go-libpcsclite v0.0.0-20190607065134-2772fd86a8ff
Expand Down Expand Up @@ -50,6 +49,7 @@ require (
github.com/rjeczalik/notify v0.9.1
github.com/rs/cors v0.0.0-20160617231935-a62a804a8a00
github.com/rs/xhandler v0.0.0-20160618193221-ed27b6fd6521 // indirect
github.com/shirou/gopsutil v2.20.5-0.20200531151128-663af789c085+incompatible
github.com/status-im/keycard-go v0.0.0-20190316090335-8537d3370df4
github.com/steakknife/bloomfilter v0.0.0-20180922174646-6819c0d2a570
github.com/steakknife/hamming v0.0.0-20180906055917-c99c65617cd3 // indirect
Expand Down
6 changes: 4 additions & 2 deletions go.sum
Original file line number Diff line number Diff line change
Expand Up @@ -61,8 +61,6 @@ github.com/dop251/goja v0.0.0-20200219165308-d1232e640a87 h1:OMbqMXf9OAXzH1dDH82
github.com/dop251/goja v0.0.0-20200219165308-d1232e640a87/go.mod h1:Mw6PkjjMXWbTj+nnj4s3QPXq1jaT0s5pC0iFD4+BOAA=
github.com/edsrzf/mmap-go v0.0.0-20160512033002-935e0e8a636c h1:JHHhtb9XWJrGNMcrVP6vyzO4dusgi/HnceHTgxSejUM=
github.com/edsrzf/mmap-go v0.0.0-20160512033002-935e0e8a636c/go.mod h1:YO35OhQPt3KJa3ryjFM5Bs14WD66h8eGKpfaBNrHW5M=
github.com/elastic/gosigar v0.8.1-0.20180330100440-37f05ff46ffa h1:XKAhUk/dtp+CV0VO6mhG2V7jA9vbcGcnYF/Ay9NjZrY=
github.com/elastic/gosigar v0.8.1-0.20180330100440-37f05ff46ffa/go.mod h1:cdorVVzy1fhmEqmtgqkoE3bYtCfSCkVyjTyCIo22xvs=
github.com/fatih/color v1.3.0 h1:YehCCcyeQ6Km0D6+IapqPinWBK6y+0eB5umvZXK9WPs=
github.com/fatih/color v1.3.0/go.mod h1:Zm6kSWBoL9eyXnKyktHP6abPY2pDugNf5KwzbycvMj4=
github.com/fjl/memsize v0.0.0-20180418122429-ca190fb6ffbc h1:jtW8jbpkO4YirRSyepBOH8E+2HEw6/hKkBvFPwhUN8c=
Expand Down Expand Up @@ -167,6 +165,10 @@ github.com/rs/cors v0.0.0-20160617231935-a62a804a8a00/go.mod h1:gFx+x8UowdsKA9Ac
github.com/rs/xhandler v0.0.0-20160618193221-ed27b6fd6521 h1:3hxavr+IHMsQBrYUPQM5v0CgENFktkkbg1sfpgM3h20=
github.com/rs/xhandler v0.0.0-20160618193221-ed27b6fd6521/go.mod h1:RvLn4FgxWubrpZHtQLnOf6EwhN2hEMusxZOhcW9H3UQ=
github.com/russross/blackfriday/v2 v2.0.1/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM=
github.com/shirou/gopsutil v2.20.4+incompatible h1:cMT4rxS55zx9NVUnCkrmXCsEB/RNfG9SwHY9evtX8Ng=
github.com/shirou/gopsutil v2.20.4+incompatible/go.mod h1:5b4v6he4MtMOwMlS0TUMTu2PcXUg8+E1lC7eC3UO/RA=
github.com/shirou/gopsutil v2.20.5-0.20200531151128-663af789c085+incompatible h1:+gAR1bMhuoQnZMTWFIvp7ukynULPsteLzG+siZKLtD8=
github.com/shirou/gopsutil v2.20.5-0.20200531151128-663af789c085+incompatible/go.mod h1:5b4v6he4MtMOwMlS0TUMTu2PcXUg8+E1lC7eC3UO/RA=
github.com/shurcooL/sanitized_anchor_name v1.0.0/go.mod h1:1NzhyTcUVG4SuEtjjoZeVRXNmyL/1OwPU0+IJeTBvfc=
github.com/spaolacci/murmur3 v0.0.0-20180118202830-f09979ecbc72/go.mod h1:JwIasOWyU6f++ZhiEuf87xNszmSA2myDM2Kzu9HwQUA=
github.com/status-im/keycard-go v0.0.0-20190316090335-8537d3370df4 h1:Gb2Tyox57NRNuZ2d3rmvB3pcmbu7O1RS3m8WRx7ilrg=
Expand Down
20 changes: 14 additions & 6 deletions metrics/cpu_enabled.go
Original file line number Diff line number Diff line change
Expand Up @@ -18,14 +18,22 @@

package metrics

import "github.com/elastic/gosigar"
import (
"github.com/ethereum/go-ethereum/log"
"github.com/shirou/gopsutil/cpu"
)

// ReadCPUStats retrieves the current CPU stats.
func ReadCPUStats(stats *CPUStats) {
global := gosigar.Cpu{}
global.Get()

stats.GlobalTime = int64(global.User + global.Nice + global.Sys)
stats.GlobalWait = int64(global.Wait)
// passing false to request all cpu times
timeStats, err := cpu.Times(false)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the documentation,

TimesStat contains the amounts of time the CPU has spent performing different kinds of work. Time units are in USER_HZ or Jiffies (typically hundredths of a second). It is based on linux /proc/stat file.

So you need to make sure that the units are the same as those returned by gosigar. I wanted to check the documentation and their website seems to be down. I suggest you compare both outputs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No they are not the same units.

Gosigar uses a unit that is 128*(1 jiffy)

I just pushed a potential fix here: 397b150

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does the 128 come from? Could yo upoint me to the gosigar code? I don't see any processing made by gosigar

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code seems to contradict the documentation and indeed return times in seconds (in float, though 😞 )

In any case, it is unlikely that the 128 * jiffies is the same on all platforms. That's not looking too good if we can't get this to work with seconds ☹️

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would find it super odd if they returned anything that percentage or millisecond. That's the expectation .If they returned different on different platforms, nobody would use them. I think something's wrong with our assessment above

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@renaynay I created this test program and ran it on Linux like this:

> time ./test-gopsutil 
124820.68 28481.84 126.54 397.03 1.03505072e+06
12482068 2848184 12654 39703
./test-gopsutil  0.01s user 0.01s system 0% cpu 5.015 total

I do not find a difference of 128 but of 100 between the two. So if anything else, it proves that the 128 is platform-dependent. I'd be curious to find out what the output is when you run the same command on darwin, though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@karalabe They do return "CPU ticks" which is neither a percentage nor a millisecond, and they are platform dependent:

parseCPUStats, defined in https://github.com/elastic/gosigar/blob/master/sigar_linux.go#L95, parses a line taken from /proc/stat which returns values in "cpu ticks" and this is platform-dependent.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which makes me think: we can calculate the percentages. This would be accurate, whatever the unit is that is returned as they are divided by one another.

if err != nil {
log.Error("Could not read cpu stats", "err", err)
return
}
// requesting all cpu times will always return an array with only one time stats entry
timeStat := timeStats[0]
stats.GlobalTime = int64((timeStat.User + timeStat.Nice + timeStat.System) * cpu.ClocksPerSec)
stats.GlobalWait = int64((timeStat.Iowait) * cpu.ClocksPerSec)
stats.LocalTime = getProcessCPUTime()
}