Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor!: make the API harder to misuse #261

Merged
merged 17 commits into from
Feb 6, 2025
Merged

Conversation

MarcoPolo
Copy link
Contributor

@MarcoPolo MarcoPolo commented Jan 11, 2025

This seeks to refactor the codebase to make it much harder to hit nil pointer dereference panics.

This takes a different approach to how we've treated multiaddrs in the past. Instead of attempting to make them a general and performant datastructure, we focus on treating them as just an encoding scheme. Users of multiaddrs are expected to parse the multiaddr into some struct that is suitable for their use case, and use the multiaddr form when interoperating. By treating Multiaddrs as just an encoding scheme we can make a number of simplifications in the codebase. Specifically this PR does the following:

  • Removes the Multiaddrinterface.
  • Multiaddr is now a concrete type of []Component
  • Components no longer implement the Multiaddrinterface as there is none.

Background

This library has had multiple issues related to Multiaddr being an interface. Many methods use and return nil as the zero value, which behaves poorly when the user forgets to do a nil check on every returned value and attempts to call a method on the nil pointer. For example, using Split to split a Multiaddr and then using Join to rebuild the original Multiaddr historically would panic in case one side of the split was nil. Using an interface also leads to incorrect usages of == to check if two Multiaddrs were equal (would only work for pointer equality) and incorrectly using Multiaddr as a key for a map.

Using an interface is typically done to provide a consistent API surface for multiple implementing types. In practice however, the Multiaddr interface was only implemented for multiaddr and component (with arguably some awkwardness when using a component as a Multiaddr).

The better approach is to use a concrete type for a Multiaddr. This lets pointer receiver methods work even if the pointer is nil, since the compiler already knows which function to call. Most methods now take a value rather than a pointer which avoids the issue of a nil pointer dereference completely.

Migration

Refer to ./v015-MIGRATION.md for breaking changes and migration tips

@MarcoPolo MarcoPolo marked this pull request as ready for review January 21, 2025 18:45
@MarcoPolo MarcoPolo requested a review from sukunrt January 21, 2025 18:45
@MarcoPolo MarcoPolo force-pushed the marco/multiaddr-refactor branch from c652d4a to f43a278 Compare January 29, 2025 19:56
@MarcoPolo MarcoPolo requested a review from sukunrt January 29, 2025 19:59
Copy link
Member

@sukunrt sukunrt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great. ❤️

Left some comments; I don't feel strongly about them.

Comment on lines +214 to +229
// validateComponent MUST be called after creating a non-zero Component.
// It ensures that we will be able to call all methods on Component without
// error.
func validateComponent(c Component) (Component, error) {
_, err := c.valueAndErr()
if err != nil {
return Component{}, err

}
if c.protocol.Transcoder != nil {
err = c.protocol.Transcoder.ValidateBytes([]byte(c.bytes[c.offset:]))
if err != nil {
return Component{}, err
}
}
return c, nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT

I'd move this logic inside newComponent and ensure that all components are created by newComponent.

Currently there are 4 places in code where we need to do validateComponent(component{...}) and possibly all of them can be changed to newComponent(...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only see two usages of validateComponent:

  1. In newComponent.
  2. In readComponent (it happens twice in that method)

These two functions are a bit different. newComponent is expecting a creating a Component from parts, while readComponent is creating a Component from a byte array. Changing these methods to be the same is a bit awkward as Component.bytes is the full bytes representation of the component, so we would introduce a copy somewhere if we did so.

I think it's fine to leave as is.

addr, err := ma.NewMultiaddr("/ip4/1.2.3.4/tcp/80")
// err non-nil when parsing failed.
*/
type Multiaddr interface {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

Comment on lines +180 to +188
func (m Multiaddr) EncapsulateC(c Component) Multiaddr {
if c.Empty() {
return m
}
out := make([]Component, 0, len(m)+1)
out = append(out, m...)
out = append(out, c)
return out
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the multiaddr is immutable, can we just appent to the existing slice?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the argument to remove this method or change this to append?

The caller is always free to use append directly. This method does a copy on purpose to avoid sharing the underlying slice. If the caller is okay with that they can append directly.

@MarcoPolo MarcoPolo changed the title refactor code to make the API harder to misuse refactor!: make the API harder to misuse Feb 6, 2025
@MarcoPolo MarcoPolo merged commit 1ef63b5 into master Feb 6, 2025
7 checks passed
MarcoPolo added a commit that referenced this pull request Feb 6, 2025
* add comment

* decorate err

* rename offset field

* more validation checks

* Add benchmark for component validation

* Use *Protocol in Component
@p-shahi p-shahi mentioned this pull request Feb 26, 2025
9 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants