Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decrypt Trampoline error onions #3657

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

arik-so
Copy link
Contributor

@arik-so arik-so commented Mar 10, 2025

No description provided.

@ldk-reviews-bot
Copy link

ldk-reviews-bot commented Mar 10, 2025

👋 Thanks for assigning @joostjager as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

Copy link
Contributor Author

@arik-so arik-so left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Draft for now due to the clones, but am open to suggestions.

@ldk-reviews-bot
Copy link

👋 The first review has been submitted!

Do you think this PR is ready for a second reviewer? If so, click here to assign a second reviewer.

@arik-so arik-so requested a review from joostjager March 10, 2025 06:38
@arik-so arik-so force-pushed the arik/trampoline/error-decryption branch from dd8098b to af31dd1 Compare March 10, 2025 06:47
Copy link

codecov bot commented Mar 10, 2025

Codecov Report

Attention: Patch coverage is 96.69211% with 13 lines in your changes missing coverage. Please review.

Project coverage is 89.25%. Comparing base (4c43a5b) to head (e7d5c53).

Files with missing lines Patch % Lines
lightning/src/ln/onion_utils.rs 96.66% 7 Missing and 6 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3657      +/-   ##
==========================================
+ Coverage   89.24%   89.25%   +0.01%     
==========================================
  Files         155      155              
  Lines      119280   119584     +304     
  Branches   119280   119584     +304     
==========================================
+ Hits       106446   106735     +289     
- Misses      10240    10247       +7     
- Partials     2594     2602       +8     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@arik-so arik-so force-pushed the arik/trampoline/error-decryption branch 2 times, most recently from 9caa2b6 to a2dc11f Compare March 10, 2025 07:25
// Handle packed channel/node updates for passing back for the route handler
let callback = |shared_secret, _, _, route_hop_opt: Option<&RouteHop>, route_hop_idx| {
for (route_hop_idx, (route_hop_option, shared_secret)) in onion_keys.into_iter().enumerate() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this refactoring. And I am also wondering if you can take it another step further by extracting this main for loop into a function that returns the FailureLearnings. To avoid this modification of variables outside of the loop.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

definitely doable, but I first wanna make sure the approach is sound

Copy link
Contributor

@joostjager joostjager left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall clean approach where indeed failure message processing isn't touched all that much. Perhaps add some more comments explaining the code, especially for devs who do not fully understand trampoline, and isolate a bit more refactor into separate commits.

One question that I haven't fully answered is how this intersects with attributable failures. That pertains mainly to the spec level of course.

if res.is_some() {
return;
break;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

res.is_some is never going to happen now with the breaks?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

turns out it actually was, but I made it a bit more obvious

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think all of these breaks would be avoidable with a simple return statement though once the loop is in its own function

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, that would also help with accidentally missing a break in some future change

cltv_expiry_delta: 36,
}
],
hops: vec![
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. So here a combination is made of three trampoline nodes, where the last one is also the start of the blinded path towards the final destination? Maybe useful to add a high level comment explaining this test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added comments explaining that that particular unit test is just for testing cryptograph, and not for the test vectors. I added a separate unit test for the test vectors, but we still need intermediate Trampoline hop forwarding support to properly handle those components of the test vector.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can both tests be merged, possibly by extending the test vectors?

path.blinded_tail.as_ref(),
session_priv,
blinded_tail,
outer_session_priv.as_ref().unwrap_or(session_priv),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to look twice at why outer_session_priv is set to None for non-trampoline, and is again unwrap_or'ed here. Perhaps it is easier to follow if setting this specific key also near let (blinded_tail, outer_session_priv) = ...?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, optimally all of these Trampoline-specific overrides would be done in one location, I'm looking how to best combine them

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe something can be gained just with naming. By always have an outer_session_priv and optionally also an inner_session_priv?

},
)
.expect("Route we used spontaneously grew invalid keys in the middle of it?");

path.blinded_tail.as_ref().map(|bt| {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments explaining what's happening could be helpful. The goal is I think to create a list of onion keys that is the concatenation of the path to the first trampoline, the trampoline nodes, and then the blinded path?

@joostjager
Copy link
Contributor

joostjager commented Mar 10, 2025

It might be nice to link to a tracking issue that links all trampoline work together?

@arik-so arik-so mentioned this pull request Mar 3, 2025
30 tasks
@arik-so arik-so force-pushed the arik/trampoline/error-decryption branch from a2dc11f to e7d5c53 Compare March 11, 2025 05:39
@@ -943,6 +943,7 @@ fn decrypt_onion_error_packet(packet: &mut Vec<u8>, shared_secret: SharedSecret)
#[inline]
pub(super) fn process_onion_failure<T: secp256k1::Signing, L: Deref>(
secp_ctx: &Secp256k1<T>, logger: &L, htlc_source: &HTLCSource, mut encrypted_packet: Vec<u8>,
secondary_session_priv: Option<SecretKey>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is just for testing, it shouldn't be on the public interface?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps an internal façade would do the trick?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, actually, the interface isn't public, it's just visible to the crate

@arik-so arik-so force-pushed the arik/trampoline/error-decryption branch 2 times, most recently from d3cef50 to 3e45cb2 Compare March 11, 2025 20:54
arik-so added 2 commits March 11, 2025 14:22
In an upcoming commit, we will need to decrypt error onions constructed
from multiple session_privs. In order to simplify the code legibility,
we move from a single-iteration model to one where we first aggregate
the shared secrets, and then use them for the error decryption.
We currently check whether our hop is the last in the path by accessing
the hops vector by the next index. However, once we start handling
Trampoline hops that will become inadequate. Instead, we switch it to
check whether there is a subsequent element in the iterator.
@arik-so arik-so force-pushed the arik/trampoline/error-decryption branch from 3e45cb2 to 36f8514 Compare March 11, 2025 21:34
@arik-so arik-so marked this pull request as ready for review March 11, 2025 21:36
arik-so added 4 commits March 11, 2025 18:26
When we start handling Trampoline, the hops in our error decryption path
could be either `RouteHop`s or `TrampolineHop`s. To avoid excessive code
duplication, we introduce an enum with some methods for common accessors.
We don't need to recalculate the blinded hop count on each iteration,
and clarify the meaning of `is_from_final_hop` to mean that it refers
to all non-blinded hops, including Trampoline.
Rather than solely iterating over `RouteHop`s, we now also append the
shared secrets from the inner onion containing `TrampolineHop`s.
Create unit tests covering the hybrid outer and inner onion shared
secrets, as well as the Trampoline error test vectors.

Additionally, we allow the `outer_session_priv` to be overridden to
accommodate the test vector requirements.
@arik-so arik-so force-pushed the arik/trampoline/error-decryption branch from 36f8514 to 83e58e9 Compare March 12, 2025 01:27
@arik-so arik-so requested a review from joostjager March 12, 2025 06:36
Copy link
Contributor

@joostjager joostjager left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No major points. Still really happy with the commit structure.

if res.is_some() {
return;
}
let mut onion_keys = Vec::with_capacity(path.hops.len());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the length of the blinded tail be added here too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when I move out the blinded_hop_count to optimize the initial vector capacity allocation, the pre-Trampoline optimization commit becomes just the is_final_hop rename, which I think should just be squashed into the subsequent commit that sees the introduction of Trampoline nodes

onion_keys.push((route_hop_option.cloned(), shared_secret))
},
)
.expect("Route we used spontaneously grew invalid keys in the middle of it?");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose it doesn't matter what text is in here, but isn't this an implementation detail that the caller can't really know?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is. Really it is for us, I think, because this should never be getting hit

@@ -1008,13 +1009,13 @@ where
// from the current hop (i.e., the next hop's inbound channel).
let num_blinded_hops = path.blinded_tail.as_ref().map_or(0, |bt| bt.hops.len());
// For 1-hop blinded paths, the final `path.hops` entry is the recipient.
is_from_final_node = route_hop_idx + 1 == path.hops.len() && num_blinded_hops <= 1;
is_from_final_node = iterator.peek().is_none() && num_blinded_hops <= 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

iterator.peek().is_none() - is this really the same as route_hop_idx + 1 == path.hops.len(), because the latter doesn't include the blinded tail?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is only the same when num_blinded_hops <= 1, but because that's also a condition, the end result is equivalent

match path.hops.get(route_hop_idx + 1) {
Some(hop) => hop,
None => {
match iterator.peek() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Store peek in a variable instead of peeking twice (here and above)?

If you're touching this code anyway, it might benefit from a comment explaining why we are now getting the next hop.

path.blinded_tail.as_ref(),
session_priv,
blinded_tail,
outer_session_priv.as_ref().unwrap_or(session_priv),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe something can be gained just with naming. By always have an outer_session_priv and optionally also an inner_session_priv?

|shared_secret, _, _, route_hop_option: Option<&RouteHop>, _| {
onion_keys.push((route_hop_option.map(|rh| ErrorHop::RouteHop(rh)), shared_secret))
},
)
.expect("Route we used spontaneously grew invalid keys in the middle of it?");

if path.has_trampoline_hops() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this method can become get_trampoline_hops returning an option, and then only calling it once at the top and reusing the result? That would avoid &path.blinded_tail.as_ref().unwrap().trampoline_hops below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we have a blinded tail, the number of Trampoline hops within it could still be 0. Should this method then return None, or Some with an empty array?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah okay, there are two cases. Yes, this is just a thought. If you think it doesn't really solve anything in terms or readability, just ignore.

let session_priv_hash = Sha256::hash(&session_priv.secret_bytes()).to_byte_array();
SecretKey::from_slice(&session_priv_hash[..]).expect("You broke SHA-256!")
secondary_session_priv.unwrap_or_else(|| {
let session_priv_hash = Sha256::hash(&session_priv.secret_bytes()).to_byte_array();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps an alternative is to now move the hashing to the caller, so that this function just always gets a secondary key (not just for testing)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, I think it's better for the hashing to happen inside the method so the caller needs not be aware of the process, and I can see if I can hide the secondary_session_priv behind a test cfg.

payment_id: PaymentId([1; 32]),
};

{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment per test case would be nice here. Reorg worked out well.

if res.is_some() {
return;
break;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, that would also help with accidentally missing a break in some future change

cltv_expiry_delta: 36,
}
],
hops: vec![
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can both tests be merged, possibly by extending the test vectors?

@@ -987,7 +987,8 @@ where
.expect("Route we used spontaneously grew invalid keys in the middle of it?");

// Handle packed channel/node updates for passing back for the route handler
for (route_hop_idx, (route_hop_option, shared_secret)) in onion_keys.into_iter().enumerate() {
let mut iterator = onion_keys.into_iter().peekable();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if this commit is really the best idea. I've been (incompletely) rebasing attributable failures on top, and there I need that route_hop_idx again to index into the hmacs and payloads.

main...joostjager:rust-lightning:attr-errs-on-trampoline

Copy link
Contributor Author

@arik-so arik-so Mar 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here's an example commit that adds an index to the peekable enumeration: 2ef4f27

Would that at all be helpful for attribution? Given our offline discussion, you'd still need to check whether the hop is regular or Trampoline, I think, but it should make it simpler.

Copy link
Contributor

@valentinewallace valentinewallace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Don't have any feedback on the first pass. Commit history is indeed very reviewable. Will take another look and more closely at the tests tomorrow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants