Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attributable failures #2256

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from
Draft

Conversation

joostjager
Copy link
Contributor

@joostjager joostjager commented May 2, 2023

Implements lightning/bolts#1044

This PR implements the update_fail_htlc optional attribution_data field as tlv type 101 rather than type 1 until interop with other node software is achieved.

Until that time, the feature bit doesn't need to be set either.

@joostjager
Copy link
Contributor Author

Single hop interop testing passes.

Multi hop interop testing seems to be blocked by gossip propagation issues between lnd and rust-lightning.

@TheBlueMatt
Copy link
Collaborator

Nice! That's not too bad, thanks for working on this. I'll dig into the crypto in a day or two. Have a number of comments on the code itself and structure, but I assume its not really worth digging into until we support both old and new errors? I'm happy to give more code-level feedback now if you prefer.

Multi hop interop testing seems to be blocked by gossip propagation issues between lnd and rust-lightning.

Oh? Is this some known issue on the lnd end? I'm not aware of any gossip errors.

@joostjager
Copy link
Contributor Author

Have a number of comments on the code itself and structure, but I assume its not really worth digging into until we support both old and new errors? I'm happy to give more code-level feedback now if you prefer.

This doesn't surprise me. This is my first venture into Rust land. It is a bit of a switch from golang, and I need to get used to how things are done. The language is not bad though, I can see why people fall in love with it! But yes, can hold off on code-level comments for now.

Oh? Is this some known issue on the lnd end? I'm not aware of any gossip errors.

Not sure if it is a known issue. I've had nagging gossip issues before when I tried to do a pathfinding benchmark. For this PR, I did apply a patch to rust-lightning somewhere to force-send node announcement always and then it worked better. Will try to come up with a reasonable repro scenario.

@TheBlueMatt
Copy link
Collaborator

What are you using to do the testing? I assume ldk-sample of some form? If you change the timer at https://github.com/lightningdevkit/ldk-sample/blob/main/src/main.rs#L791 it will rebroadcast a fresh node announcement more aggressively.

@joostjager
Copy link
Contributor Author

Yes, ldk-sample. It was doing the timer alright, but somehow it got filtered out in the timer handler.

@TheBlueMatt
Copy link
Collaborator

Would be happy to take a look at logs. At the TRACE level we should basically be writing everything that is going out or coming in on the wire.

@joostjager
Copy link
Contributor Author

Yes, so I did see that the announcement wasn't going out. Will continue tomorrow and get back with more data.

@TheBlueMatt
Copy link
Collaborator

Errr, right, so to avoid dumping tons of gossip crap that message is only really logged at a high level (Broadcasting NodeAnnouncement after passing it to our own RoutingMessageHandler.), with the remaining logs on a per-peer level being at the GOSSIP level (disabled by default cause its verbose) - Sending message to all peers except {:?} or the announced node: {:?} and Skipping broadcast message to {:?} as its outbound buffer is full

@joostjager
Copy link
Contributor Author

Looks like timer is at 60s. I definitely waited much longer than that, so doesn't seem to be the problem. Enabled gossip logging and saved log files. Continuing in #2259

@joostjager
Copy link
Contributor Author

With gossip fixed with the hint in #2259, I was able to run through a few of the multi-hop inter-op scenarios:
LND -> LDK -> LDK
LDK -> LDK -> LND
LND -> LDK (intermediate failure)

Obviously there are more, but I think this is a good enough sanity check for now.

As mentioned above, attention should go to the crypto part of this feature first.

@joostjager joostjager changed the title Convert to attributable errors Attributable failures Mar 11, 2025
@joostjager joostjager marked this pull request as ready for review March 11, 2025 15:41
@joostjager
Copy link
Contributor Author

Pushed 2025 version of the code to this PR.

Copy link
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One quick comment from a glance

@@ -597,12 +597,18 @@ impl InvoiceBuilder<tb::False, tb::False, tb::False, tb::False, tb::False, tb::F
/// Construct new, empty `InvoiceBuilder`. All necessary fields have to be filled first before
/// `InvoiceBuilder::build(self)` becomes available.
pub fn new(currency: Currency) -> Self {
let mut features = Bolt11InvoiceFeatures::empty();
features.set_attributable_failures_optional();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally we set features elsewhere, afaiu. Setting it here forces everyone using this to include the set features, but its possible someone builds an invoice using this logic that isn't terminating at a ChannelManager.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a different place in LDK where this feature can be set, or is it left fully up to the caller to not forget?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The invoice builder mostly has setters that set individual flags called by invoice_utils/channelmanager::create_bolt11_invoice. eg basic_mpp.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will leave it out of this PR except for the definition of the bits. Until interop is completed we don't need the feature bits.

@joostjager joostjager marked this pull request as draft March 11, 2025 16:37
@joostjager joostjager force-pushed the attr-errs branch 2 times, most recently from f8199d5 to 75959ae Compare March 12, 2025 13:13
Signal to senders that the node will return attributable failures. When
the sender makes sure all path nodes supports this, they will be able to
attribute every failure that may occur.
@joostjager
Copy link
Contributor Author

Ready for a first pass

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants