Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes: rexmit mechanism & hold #48

Merged
merged 6 commits into from
Jul 1, 2024
Merged

Fixes: rexmit mechanism & hold #48

merged 6 commits into from
Jul 1, 2024

Conversation

TMRh20
Copy link
Member

@TMRh20 TMRh20 commented Jul 1, 2024

  • Fix: re-transmit mechanism - this was not working if waiting for an ack due to the hold variable being set.
  • Fix: hold variable was not being set or unset where it should be
  • Remove out_pos & dataPos variables from userdata struct, replace with data_pos variable
  • Add attribute_packed to the userdata struct since it is now mis-aligned
  • Now only transmit one RF24Network payload instead of a retry, since the rexmit mechanism is now working

In summary, the UIP IP stack was calling the rexmit() function, but the u->hold variable would be set, so it wouldn't re-transmit, so I added some logic specific to the rexmit() function and it now works. The hold variable also wasn't always being set/unset when it should be, so I fixed that logic also.

This was all because my MQTT nodes weren't staying connected long enough, and they would mysteriously disconnect after random periods of time. It turns out that a single failure in communication would result in a timeout, so now that the retry mechanism is working, it should be much much longer before a node loses its connection to MQTT.

I also had the RF24Network layer re-transmitting if a payload failed, which could result in duplicate data/duplicate ping results, but that behaviour has been removed due to the rexmit mechanism of the IP stack working now.

- Fix: re-transmit mechanism - this was not working if waiting for an ack due to the `hold` variable being set.
- Fix: `hold` variable was not being set or unset where it should be
- Remove out_pos & dataPos variables from userdata struct, replace with data_pos variable
- Add attribute_packed to the userdata struct since it is now mis-aligned
- Now only transmit one RF24Network payload instead of a retry, since the rexmit mechanism is now working
@TMRh20 TMRh20 added the bug label Jul 1, 2024
@TMRh20 TMRh20 requested a review from 2bndy5 July 1, 2024 12:50
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cpp-linter Review

Click here for the full clang-format patch
diff --git a/RF24Ethernet.cpp b/RF24Ethernet.cpp
index e231644..4792a57 100644
--- a/RF24Ethernet.cpp
+++ b/RF24Ethernet.cpp
@@ -326 +326 @@ void RF24EthernetClass::network_send()
-        RF24Ethernet.network.write(headerOut, uip_buf, uip_len);
+    RF24Ethernet.network.write(headerOut, uip_buf, uip_len);

Have any feedback or feature suggestions? Share it here.

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@github-actions github-actions bot dismissed their stale review July 1, 2024 12:55

outdated suggestion

TMRh20 added 3 commits July 1, 2024 07:03
- Fix one thing, break another...
need to separate incoming and outgoing data counters
@TMRh20 TMRh20 merged commit 992c185 into master Jul 1, 2024
24 checks passed
@TMRh20 TMRh20 deleted the fixOutgoingData branch July 1, 2024 21:04
@TMRh20
Copy link
Member Author

TMRh20 commented Jul 2, 2024

Just as I suspected, this makes a world of difference in how long nodes can remain connected with a given protocol. With MQTT specifically, nodes are not timing out randomly, only when address renewal takes too long etc and/or there is a partial re-convergence of the mesh.

As per the vid, the shortest connection time after leaving overnight is 56minutes, while the longest connection time is 14.37 hours.
https://github.com/nRF24/RF24Ethernet/assets/2610460/bf05a01f-d42b-404c-8349-04f4c5406202

I wouldn't be surprised if some nodes go days now without needing to re-connect to the MQTT server. This affects other protocols like HTTP too. The mesh is staying much more stable now too due to the changes in connection checking, so that adds to this result.

*Edit: It looks like my calculations were a bit off, only working up to around 18-hours because of a uint16_t value in the code, so some nodes showing around 4-hours connectivity were actually connected for around 22-hours. (18 * 60 * 60 = 64,800)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants