Nimbus reports el_offline=true even when synced #4987

Closed
torfbolt opened this issue May 23, 2023 · 7 comments

Comments

@torfbolt

Describe the bug
With my Nethermind/Nimbus combination, the sync status API always reports el_offline=true, even if all clients are in sync. This disrupts the logic implemented in sigp/lighthouse#4295, so my multi-BN setup is now essentially running only on the non-Nimbus BN.

To Reproduce
Steps to reproduce the behavior:

  1. Ubuntu 22.04, Nethermind 1.18.1, Nimbus 23.5.1
  2. Wait until EL and BN are synced
  3. curl localhost:5052/eth/v1/node/syncing
  4. Result is {"data":{"head_slot":"6504090","sync_distance":"0","is_syncing":false,"is_optimistic":false,"el_offline":true}}
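The inconsistency in step 4 can be checked mechanically. A minimal Python sketch, assuming the JSON shape shown above; the helper name `el_offline_while_synced` is hypothetical and not part of Nimbus or the Beacon API:

```python
import json


def el_offline_while_synced(body: str) -> bool:
    """Return True when a node claims to be fully synced yet reports el_offline."""
    data = json.loads(body)["data"]
    synced = (
        not data["is_syncing"]
        and not data["is_optimistic"]
        and data["sync_distance"] == "0"
    )
    # Assumption: a response without the el_offline field is treated as "EL online".
    return synced and bool(data.get("el_offline", False))


# Response copied from step 4 above.
reported = (
    '{"data":{"head_slot":"6504090","sync_distance":"0","is_syncing":false,'
    '"is_optimistic":false,"el_offline":true}}'
)
print(el_offline_while_synced(reported))
```

Feeding the function the body returned by the curl call in step 3 gives a quick yes/no answer on whether a node shows the behavior described here.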
@michaelsproul
Contributor

I'm noticing this too, testing with Nimbus v23.5.1

@torfbolt
Author

Based on this comment by @arnetheduck: "if the node reports a non-optimistic head, it's synced and usable already by definition"
Would it make sense to extend the current code with el_offline &= is_optimistic? I.e., the EL node can only be reported as offline while we are in optimistic sync mode. Or is this too simplistic, failing to cover some edge cases?
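To make the proposal concrete, here is a small Python sketch of the suggested masking. The function name `proposed_el_offline` is hypothetical; this is only an illustration of the `el_offline &= is_optimistic` idea, not code from Nimbus:

```python
def proposed_el_offline(el_offline: bool, is_optimistic: bool) -> bool:
    """Apply the suggested `el_offline &= is_optimistic` masking."""
    return el_offline and is_optimistic


# Print the full truth table: (el_offline, is_optimistic) -> reported value.
for el_offline in (False, True):
    for is_optimistic in (False, True):
        print(el_offline, is_optimistic, proposed_el_offline(el_offline, is_optimistic))
```

With this masking, a node with a non-optimistic head never reports el_offline=true, whatever the EL connection state actually is.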

@etan-status
Contributor

etan-status commented May 25, 2023

That doesn't cover the case where slots are missed: the BN remains synced (not optimistic), but the EL may still go offline. We still want to tell the VC to prefer another BN in that situation (for the purpose of engine_getPayload, which requires a working EL). For all calls besides engine_getPayload, the EL being offline doesn't matter, and even for engine_getPayload, one can just try regardless of sync status and hope that the issue has resolved itself by the time the block is produced.

As for the underlying issue of why we think the EL is offline, still investigating.
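One way a validator client could act on this flag, sketched in Python under stated assumptions: nodes reporting `el_offline` are tried last for block production rather than excluded, so the request is still attempted if nothing better is available. The `BeaconNode` type and the `order_for_block_production` helper are hypothetical and do not reflect the actual Lighthouse implementation in sigp/lighthouse#4295.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BeaconNode:
    name: str
    is_syncing: bool
    is_optimistic: bool
    el_offline: bool


def order_for_block_production(nodes: list[BeaconNode]) -> list[BeaconNode]:
    """Sort nodes so healthy ones come first; unhealthy ones remain as fallbacks."""
    # False sorts before True, so each raised flag pushes a node later in the list.
    # sorted() is stable, so nodes with equal flags keep their configured order.
    return sorted(nodes, key=lambda n: (n.is_syncing, n.is_optimistic, n.el_offline))


nodes = [
    BeaconNode("nimbus", is_syncing=False, is_optimistic=False, el_offline=True),
    BeaconNode("lighthouse", is_syncing=False, is_optimistic=False, el_offline=False),
]
print([n.name for n in order_for_block_production(nodes)])  # ['lighthouse', 'nimbus']
```

Keeping the flagged node in the list, instead of filtering it out, matches the idea that a block request can still be tried in case the EL has recovered in the meantime.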

@julian-st
Contributor

This bug also happens with geth v1.12.0 on Ubuntu 22.04

@etan-status
Contributor

Update: Should be fixed as of #4991 (currently part of Nimbus unstable)

@aliask

aliask commented Nov 28, 2023

Looks fixed to me with geth/Nimbus combo:

$ curl 127.0.0.1:5052/eth/v1/node/syncing -v
* processing: 127.0.0.1:5052/eth/v1/node/syncing
*   Trying 127.0.0.1:5052...
* Connected to 127.0.0.1 (127.0.0.1) port 5052
> GET /eth/v1/node/syncing HTTP/1.1
> Host: 127.0.0.1:5052
> User-Agent: curl/8.2.1
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nim-presto/0.0.3 (amd64/linux)
< Content-Length: 112
< Content-Type: application/json
< Date: Tue, 28 Nov 2023 10:48:39 GMT
< Connection: close
<
* Closing connection
{"data":{"head_slot":"7862041","sync_distance":"0","is_syncing":false,"is_optimistic":false,"el_offline":false}}

$ /usr/local/bin/nimbus_beacon_node --version
Nimbus beacon node v23.10.1-d19ffc-stateofus
Copyright (c) 2019-2023 Status Research & Development GmbH

eth2 specification v1.4.0-beta.2-hotfix

Nim Compiler Version 1.6.14 [Linux: amd64]

Time to close the issue?

@tersec
Contributor

tersec commented Nov 28, 2023

Yes, given that #4991 should have fixed it, and there's some evidence it is indeed fixed, closing this issue. If it's still extant it can be re-opened, or if it's different, another issue can be opened.

@tersec closed this as completed Nov 28, 2023