Skip to content
This repository has been archived by the owner on Nov 25, 2024. It is now read-only.

Federation indirectly broken due to failure in device list updates #1864

Closed
git-bruh opened this issue May 22, 2021 · 3 comments
Closed

Federation indirectly broken due to failure in device list updates #1864

git-bruh opened this issue May 22, 2021 · 3 comments

Comments

@git-bruh
Copy link

Background information

  • Dendrite version or git SHA: >= e9e03090896421512c88dc6c663a2049b8a11755
  • Monolith or Polylith?: Monolith
  • SQLite3 or Postgres?: sqlite
  • Running in Docker?: no
  • go version: 1.16.4

Description

Dendrite seems to be having indirect federation issues (continuous backoffs) due to failures in sending device list updates to 2 homeservers. This is a fresh dendrite set up 2-3 days ago, had the same issue on an older one which had been running for 2-3 months, had to nuke it.

Caddyfile:

dendrite-testuser.duckdns.org, dendrite-testuser.duckdns.org:8448 {
        handle /.well-known/matrix/server {
                respond {"m.server":"dendrite-testuser.duckdns.org:443"}
        }

        handle /.well-known/matrix/client {
                respond {"m.homeserver":{"base_url":"https://dendrite-testuser.duckdns.org"}}
        }

        handle /_matrix/* {
                reverse_proxy http://127.0.0.1:8008
        }

        respond "Hello"
}

It randomly works at times (maybe every couple of hours), dendrite then sends a bunch of old events to the homeservers it was backing off from.
Incoming federation works fine.

It was suggested in the matrix room that caddy might be messing with the url encoding causing the request to fail (#1601), but removing the reverse proxy and exposing dendrite on port 8448 directly didn't seem to make any difference.

Log (without reverse proxy, output is almost the same. Irrelevant parts like dead homeservers ommited.), dendrite doesn't seem to log the exact reason for the request's failure:

time="2021-05-22T13:34:42Z" level=warning msg="WARNING: Application service option rate_limited is currently unimplemented"
time="2021-05-22T13:34:42.868422486Z" level=info msg="Dendrite version 0.3.11" func="NewBaseDendrite\n\t" file=" [github.com/matrix-org/dendrite/setup/base.go:112]"
time="2021-05-22T13:34:42.870587544Z" level=info msg="Enabled perspective key fetcher" func="NewInternalAPI\n\t" file=" [github.com/matrix-org/dendrite/signingkeyserver/signingkeyserver.go:103]" num_public_keys=2 server_name=matrix.org
time="2021-05-22T13:34:43.054100894Z" level=error msg="Failed to get server ACLs for room \"!YoGoKMxyeHmsYnGehS:kde.org\"" func="NewServerACLs\n\t" file=" [github.com/matrix-org/dendrite/roomserver/acls/acls.go:60]" error="storage: state NIDs missing from the database (0 != 1)"
time="2021-05-22T13:34:43.102681984Z" level=info msg=PerformDeviceCreation func="PerformDeviceCreation\n\t" file=" [github.com/matrix-org/dendrite/userapi/internal/api.go:115]" context=missing device_id=0xc000108740 display_name=0xc000108740 localpart=appservice-discord
time="2021-05-22T13:34:43.103771730Z" level=error msg="failed to query device keys for some users" func="processServer\n\t" file=" [github.com/matrix-org/dendrite/keyserver/internal/device_list_update.go:377]" context=missing failed=1 server_name=matrixorg total=1 wait=8h0m0s
time="2021-05-22T13:34:43.105960299Z" level=info msg="Produced to key change topic 'DendriteOutputKeyChangeEvent'" func="ProduceKeyChanges\n\t" file=" [github.com/matrix-org/dendrite/keyserver/producers/keychange.go:72]" num_key_changes=1 user_id="@appservice-discord:dendrite-testuser.duckdns.org"
time="2021-05-22T13:34:43.129926192Z" level=info msg="Sending device list update message to [\"dendrite-testuser.duckdns.org\" \"matrix.org\" \"kde.org\"]" func="onMessage\n\t" file=" [github.com/matrix-org/dendrite/federationsender/consumers/keychange.go:132]"
time="2021-05-22T13:34:43.130011722Z" level=info msg="Sending EDU event" func="SendEDU\n\t" file=" [github.com/matrix-org/dendrite/federationsender/queue/queue.go:310]" destinations=2 edu_type=m.device_list_update
time="2021-05-22T13:34:43.131075031Z" level=info msg="Starting application service" func="worker\n\t" file=" [github.com/matrix-org/dendrite/appservice/workers/transaction_scheduler.go:64]" appservice=discord
time="2021-05-22T13:34:43.136085676Z" level=info msg="Starting external Monolith listener on :8008" func="func2\n\t" file=" [github.com/matrix-org/dendrite/setup/base.go:427]"
time="2021-05-22T13:34:43.136519500Z" level=info msg="Starting external Monolith listener on :8448" func="func2\n\t" file=" [github.com/matrix-org/dendrite/setup/base.go:427]"
time="2021-05-22T13:34:45.459811583Z" level=info msg="Received transaction \"1621526293436\" from \"thomcat.rocks\" containing 0 PDUs, 1 EDUs" func="Send\n\t" file=" [github.com/matrix-org/dendrite/federationapi/routing/send.go:142]" req.id=G6zEHnPU0SR1 req.method=PUT req.path=/_matrix/federation/v1/send/1621526293436
time="2021-05-22T13:34:46.321286766Z" level=info msg="Producing to topic 'DendriteOutputClientData'" func="SendData\n\t" file=" [github.com/matrix-org/dendrite/clientapi/producers/syncapi.go:51]" data_type=im.vector.setting.breadcrumbs room_id="" user_id="@testuser:dendrite-testuser.duckdns.org"
time="2021-05-22T13:34:46.322333105Z" level=info msg="received data from client API server" func="onMessage\n\t" file=" [github.com/matrix-org/dendrite/syncapi/consumers/clientapi.go:89]" room_id="" type=im.vector.setting.breadcrumbs
time="2021-05-22T13:34:47.874257245Z" level=info msg="Producing to topic 'DendriteOutputTypingEvent'" func="sendTypingEvent\n\t" file=" [github.com/matrix-org/dendrite/eduserver/input/input.go:106]" room_id="!5bg3zjFEMuDOrLo6:dendrite-testuser.duckdns.org" typing=true user_id="@testuser:dendrite-testuser.duckdns.org"
time="2021-05-22T13:34:47.876697511Z" level=info msg="Sending EDU event" func="SendEDU\n\t" file=" [github.com/matrix-org/dendrite/federationsender/queue/queue.go:310]" destinations=2 edu_type=m.typing
time="2021-05-22T13:34:48.780331707Z" level=warning msg="Backing off \"kde.org\" for 3.254043233s" func="backgroundSend\n\t" file=" [github.com/matrix-org/dendrite/federationsender/queue/destinationqueue.go:301]"
time="2021-05-22T13:34:49.090100882Z" level=warning msg="Backing off \"matrix.org\" for 2.973380349s" func="backgroundSend\n\t" file=" [github.com/matrix-org/dendrite/federationsender/queue/destinationqueue.go:301]"
time="2021-05-22T13:34:49.664938557Z" level=info msg="Received transaction \"1621460660046\" from \"joeth.uk\" containing 0 PDUs, 1 EDUs" func="Send\n\t" file=" [github.com/matrix-org/dendrite/federationapi/routing/send.go:142]" req.id=5K5ZALoMhq7d req.method=PUT req.path=/_matrix/federation/v1/send/1621460660046
time="2021-05-22T13:34:49.684387981Z" level=info msg="Producing to topic 'DendriteOutputRoomEvent'" func="WriteOutputEvents\n\t" file=" [github.com/matrix-org/dendrite/roomserver/internal/input/input.go:104]" adds_state=0 event_id="$dome_qpygGF1CJNQ3BD4LaqFHDQ1veXJ-keAXmz0sZc" event_type=m.room.message removes_state=0 room_id="!5bg3zjFEMuDOrLo6:dendrite-testuser.duckdns.org" send_as_server=dendrite-testuser.duckdns.org sender="@testuser:dendrite-testuser.duckdns.org" type=new_room_event
time="2021-05-22T13:34:49.688015156Z" level=info msg="Sent event to roomserver" func="SendEvent\n\t" file=" [github.com/matrix-org/dendrite/clientapi/routing/sendevent.go:134]" event_id="$dome_qpygGF1CJNQ3BD4LaqFHDQ1veXJ-keAXmz0sZc" req.id=BtEN2XGlLCBJ req.method=PUT req.path="/_matrix/client/r0/rooms/!5bg3zjFEMuDOrLo6:dendrite-testuser.duckdns.org/send/m.room.message/$local.233b85ef-8d58-4d04-8085-7d528c6f3909" room_id="!5bg3zjFEMuDOrLo6:dendrite-testuser.duckdns.org" room_version=6 user_id="@testuser:dendrite-testuser.duckdns.org"
time="2021-05-22T13:34:49.698216484Z" level=info msg="Sending event" func="SendEvent\n\t" file=" [github.com/matrix-org/dendrite/federationsender/queue/queue.go:238]" destinations=2 event="$dome_qpygGF1CJNQ3BD4LaqFHDQ1veXJ-keAXmz0sZc"
time="2021-05-22T13:34:49.708944485Z" level=warning msg="Backing off \"matrix.org\" for 2.354545809s" func="backgroundSend\n\t" file=" [github.com/matrix-org/dendrite/federationsender/queue/destinationqueue.go:301]"
time="2021-05-22T13:34:49.713902712Z" level=warning msg="Backing off \"kde.org\" for 2.320476922s" func="backgroundSend\n\t" file=" [github.com/matrix-org/dendrite/federationsender/queue/destinationqueue.go:301]"
time="2021-05-22T13:34:49.809930850Z" level=info msg="Producing to topic 'DendriteOutputTypingEvent'" func="sendTypingEvent\n\t" file=" [github.com/matrix-org/dendrite/eduserver/input/input.go:106]" room_id="!5bg3zjFEMuDOrLo6:dendrite-testuser.duckdns.org" typing=false user_id="@testuser:dendrite-testuser.duckdns.org"
time="2021-05-22T13:34:49.812827062Z" level=info msg="Sending EDU event" func="SendEDU\n\t" file=" [github.com/matrix-org/dendrite/federationsender/queue/queue.go:310]" destinations=2 edu_type=m.typing
time="2021-05-22T13:34:49.819864359Z" level=warning msg="Backing off \"matrix.org\" for 2.243616894s" func="backgroundSend\n\t" file=" [github.com/matrix-org/dendrite/federationsender/queue/destinationqueue.go:301]"
time="2021-05-22T13:34:49.824151161Z" level=warning msg="Backing off \"kde.org\" for 2.210222741s" func="backgroundSend\n\t" file=" [github.com/matrix-org/dendrite/federationsender/queue/destinationqueue.go:301]"
time="2021-05-22T13:34:50.222509377Z" level=info msg="Producing to topic 'DendriteOutputRoomEvent'" func="WriteOutputEvents\n\t" file=" [github.com/matrix-org/dendrite/roomserver/internal/input/input.go:104]" adds_state=0 event_id="$KIHgjQGgd8XBX7zXBPLPH-B2ly4enB6K4E1Dzh1hfl4" event_type=m.room.message removes_state=0 room_id="!5bg3zjFEMuDOrLo6:dendrite-testuser.duckdns.org" send_as_server=dendrite-testuser.duckdns.org sender="@testuser:dendrite-testuser.duckdns.org" type=new_room_event
time="2021-05-22T13:34:50.224773718Z" level=info msg="Sent event to roomserver" func="SendEvent\n\t" file=" [github.com/matrix-org/dendrite/clientapi/routing/sendevent.go:134]" event_id="$KIHgjQGgd8XBX7zXBPLPH-B2ly4enB6K4E1Dzh1hfl4" req.id=exkgh6JuzqDh req.method=PUT req.path="/_matrix/client/r0/rooms/!5bg3zjFEMuDOrLo6:dendrite-testuser.duckdns.org/send/m.room.message/$local.cc290194-c242-46e3-a027-397fa305f511" room_id="!5bg3zjFEMuDOrLo6:dendrite-testuser.duckdns.org" room_version=6 user_id="@testuser:dendrite-testuser.duckdns.org"
time="2021-05-22T13:34:50.299108439Z" level=warning msg="Backing off \"matrix.org\" for 1.764376622s" func="backgroundSend\n\t" file=" [github.com/matrix-org/dendrite/federationsender/queue/destinationqueue.go:301]"
time="2021-05-22T13:34:50.312581419Z" level=warning msg="Backing off \"kde.org\" for 1.721796777s" func="backgroundSend\n\t" file=" [github.com/matrix-org/dendrite/federationsender/queue/destinationqueue.go:301]"
time="2021-05-22T13:34:52.221391737Z" level=warning msg="Backing off \"matrix.org\" for 7.949758566s" func="backgroundSend\n\t" file=" [github.com/matrix-org/dendrite/federationsender/queue/destinationqueue.go:301]"
time="2021-05-22T13:34:52.242750478Z" level=warning msg="Backing off \"matrix.org\" for 7.928399992s" func="backgroundSend\n\t" file=" [github.com/matrix-org/dendrite/federationsender/queue/destinationqueue.go:301]"
time="2021-05-22T13:34:52.331889964Z" level=info msg="Producing to topic 'DendriteOutputClientData'" func="SendData\n\t" file=" [github.com/matrix-org/dendrite/clientapi/producers/syncapi.go:51]" data_type=m.fully_read room_id="!cAof7rphhaWHIAZs:dendrite-testuser.duckdns.org" user_id="@testuser:dendrite-testuser.duckdns.org"
time="2021-05-22T13:34:52.332679896Z" level=info msg="Producing to topic 'DendriteOutputReceiptEvent'" func="InputReceiptEvent\n\t" file=" [github.com/matrix-org/dendrite/eduserver/input/input.go:186]"
time="2021-05-22T13:34:52.335124006Z" level=info msg="received data from client API server" func="onMessage\n\t" file=" [github.com/matrix-org/dendrite/syncapi/consumers/clientapi.go:89]" room_id="!cAof7rphhaWHIAZs:dendrite-testuser.duckdns.org" type=m.fully_read
time="2021-05-22T13:34:52.336553736Z" level=info msg="Sending EDU event" func="SendEDU\n\t" file=" [github.com/matrix-org/dendrite/federationsender/queue/queue.go:310]" destinations=2 edu_type=m.receipt
time="2021-05-22T13:34:52.401338156Z" level=warning msg="Backing off \"matrix.org\" for 7.769817072s" func="backgroundSend\n\t" file=" [github.com/matrix-org/dendrite/federationsender/queue/destinationqueue.go:301]"
time="2021-05-22T13:34:52.496616833Z" level=warning msg="Backing off \"kde.org\" for 7.604975608s" func="backgroundSend\n\t" file=" [github.com/matrix-org/dendrite/federationsender/queue/destinationqueue.go:301]"
time="2021-05-22T13:34:52.622140484Z" level=info msg="Fetching 1 key(s)" func="handleFetcherKeys\n\t" file=" [github.com/matrix-org/dendrite/signingkeyserver/internal/api.go:208]" fetcher_name="perspective server matrix.org"
time="2021-05-22T13:34:52.714432331Z" level=info msg="Updated 1 of 1 key(s) in database (0 keys remaining)" func="handleFetcherKeys\n\t" file=" [github.com/matrix-org/dendrite/signingkeyserver/internal/api.go:262]" fetcher_name="perspective server matrix.org"

Steps to reproduce

  • Set up dendrite
  • ???
  • Profit
@kegsay
Copy link
Member

kegsay commented May 23, 2021

On dendrite.matrix.org we also see problems with kde.org and matrix.org with regards to device update frequency. There must be something about their setups causing dendrite to become upset.

@kegsay
Copy link
Member

kegsay commented Jul 9, 2021

Possibly related: #1601

@S7evinK
Copy link
Contributor

S7evinK commented Oct 20, 2022

This is/was most likely related to invalid device JSON values, which was fixed in matrix-org/gomatrixserverlib#342, we now shouldn't back off servers that long anymore.
Closing now, if this is still happening, I'm happy to re-open this issue.

@S7evinK S7evinK closed this as completed Oct 20, 2022
brianathere pushed a commit to HereNotThere/dendrite that referenced this issue Mar 29, 2023
This PR adds sepolia subroutines to dendrite and the web app. Once we
merge and build this:

[ ] we'll update the CHAIN_ID env var on the prod ECS task definition
[ ] and manually deploy to production render web app (since prod is no
longer associated with a branch)
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants