refactor!: make the API harder to misuse #261

MarcoPolo · 2025-01-11T20:08:26Z

This seeks to refactor the codebase to make it much harder to hit nil pointer dereference panics.

This takes a different approach from how we've treated multiaddrs in the past. Instead of attempting to make them a general and performant data structure, we treat them as just an encoding scheme. Users of multiaddrs are expected to parse the multiaddr into a struct that is suitable for their use case, and use the multiaddr form when interoperating. Treating Multiaddrs as just an encoding scheme lets us make a number of simplifications in the codebase. Specifically, this PR does the following:

Removes the Multiaddr interface.
Multiaddr is now a concrete type, []Component.
Components no longer implement the Multiaddr interface, as there is none.
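
As a rough illustration of the "parse into a struct" workflow described above, here is a minimal sketch. It uses toy stand-in types rather than this library's API: `component`, `tcpEndpoint`, and `toTCPEndpoint` are hypothetical names invented for the example.

```go
package main

import (
	"fmt"
	"strconv"
)

// component is a toy stand-in for one decoded multiaddr component.
// It is NOT the library's Component type.
type component struct {
	proto string
	value string
}

// tcpEndpoint is a hypothetical application-level struct.
type tcpEndpoint struct {
	IP   string
	Port int
}

// toTCPEndpoint walks the components once and extracts the fields the
// application cares about, rejecting anything it does not understand.
func toTCPEndpoint(addr []component) (tcpEndpoint, error) {
	var ep tcpEndpoint
	var haveIP, havePort bool
	for _, c := range addr {
		switch c.proto {
		case "ip4":
			ep.IP, haveIP = c.value, true
		case "tcp":
			port, err := strconv.Atoi(c.value)
			if err != nil {
				return tcpEndpoint{}, fmt.Errorf("invalid tcp port %q: %w", c.value, err)
			}
			ep.Port, havePort = port, true
		default:
			return tcpEndpoint{}, fmt.Errorf("unexpected protocol %q", c.proto)
		}
	}
	if !haveIP || !havePort {
		return tcpEndpoint{}, fmt.Errorf("need both ip4 and tcp components")
	}
	return ep, nil
}

func main() {
	// Toy equivalent of /ip4/1.2.3.4/tcp/80.
	addr := []component{{"ip4", "1.2.3.4"}, {"tcp", "80"}}
	ep, err := toTCPEndpoint(addr)
	fmt.Println(ep, err)
}
```

The application then works with `tcpEndpoint` internally and only converts back to the multiaddr form at the boundary where it talks to other systems.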

Background

This library has had multiple issues related to Multiaddr being an interface. Many methods use and return nil as the zero value, which behaves poorly when the user forgets to do a nil check on every returned value and attempts to call a method on the nil pointer. For example, using Split to split a Multiaddr and then using Join to rebuild the original Multiaddr historically would panic in case one side of the split was nil. Using an interface also leads to incorrect usages of == to check if two Multiaddrs were equal (would only work for pointer equality) and incorrectly using Multiaddr as a key for a map.
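
The pitfalls above can be reproduced with a small self-contained sketch. The types here (`toyAddr`, `ptrAddr`) are hypothetical stand-ins for an interface-based design and are not the library's old definitions.

```go
package main

import "fmt"

// toyAddr is a toy stand-in for an interface-based address type.
type toyAddr interface {
	String() string
}

// ptrAddr is a hypothetical pointer-backed implementation.
type ptrAddr struct{ s string }

func (a *ptrAddr) String() string { return a.s }

// parse returns a fresh *ptrAddr wrapped in the interface.
func parse(s string) toyAddr { return &ptrAddr{s: s} }

// callOnNil calls a method on a nil interface value and reports
// whether the call panicked.
func callOnNil() (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	var a toyAddr // nil interface, e.g. a forgotten nil check
	_ = a.String()
	return false
}

func main() {
	a := parse("/ip4/1.2.3.4/tcp/80")
	b := parse("/ip4/1.2.3.4/tcp/80")

	// == on interfaces holding pointers compares the pointers,
	// so two equal addresses are reported as different.
	fmt.Println("a == b:", a == b)
	fmt.Println("same string:", a.String() == b.String())

	// The same problem applies to map keys.
	seen := map[toyAddr]bool{a: true}
	fmt.Println("b found in map:", seen[b])

	// Calling a method on a nil interface panics at runtime.
	fmt.Println("nil call panicked:", callOnNil())
}
```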

Using an interface is typically done to provide a consistent API surface for multiple implementing types. In practice however, the Multiaddr interface was only implemented for multiaddr and component (with arguably some awkwardness when using a component as a Multiaddr).

The better approach is to use a concrete type for a Multiaddr. This lets pointer receiver methods work even if the pointer is nil, since the compiler already knows which function to call. Most methods now take a value rather than a pointer which avoids the issue of a nil pointer dereference completely.
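
A minimal sketch of why a concrete type helps, assuming a slice-backed type similar in shape to the one in this PR. `toyComponent` and `toyMultiaddr` are hypothetical stand-ins, not the library's types.

```go
package main

import (
	"fmt"
	"strings"
)

// toyComponent and toyMultiaddr are illustrative stand-ins only.
type toyComponent struct{ proto, value string }

type toyMultiaddr []toyComponent

// String uses a value receiver. Ranging over a nil slice is a no-op,
// so the zero value is safe to use.
func (m toyMultiaddr) String() string {
	var sb strings.Builder
	for _, c := range m {
		sb.WriteString("/" + c.proto + "/" + c.value)
	}
	return sb.String()
}

// Len uses a pointer receiver. Because the type is concrete, the call
// is resolved at compile time and the method body runs even when the
// receiver is nil, so it can handle that case itself.
func (m *toyMultiaddr) Len() int {
	if m == nil {
		return 0
	}
	return len(*m)
}

func main() {
	var zero toyMultiaddr    // nil slice
	var nilPtr *toyMultiaddr // nil pointer

	// Neither call panics.
	fmt.Printf("%q %d\n", zero.String(), nilPtr.Len())

	full := toyMultiaddr{{"ip4", "1.2.3.4"}, {"tcp", "80"}}
	fmt.Println(full.String(), full.Len())
}
```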

Migration

Refer to ./v015-MIGRATION.md for breaking changes and migration tips

sukunrt

This looks great. ❤️

Left some comments; I don't feel strongly about them.

sukunrt · 2025-02-04T14:35:19Z

component.go

+// validateComponent MUST be called after creating a non-zero Component.
+// It ensures that we will be able to call all methods on Component without
+// error.
+func validateComponent(c Component) (Component, error) {
+	_, err := c.valueAndErr()
+	if err != nil {
+		return Component{}, err
+
+	}
+	if c.protocol.Transcoder != nil {
+		err = c.protocol.Transcoder.ValidateBytes([]byte(c.bytes[c.offset:]))
+		if err != nil {
+			return Component{}, err
+		}
 	}
+	return c, nil


NIT

I'd move this logic inside newComponent and ensure that all components are created by newComponent.

Currently there are 4 places in code where we need to do validateComponent(component{...}) and possibly all of them can be changed to newComponent(...)

I only see two usages of validateComponent:

In newComponent.

In readComponent (it happens twice in that method)

These two functions are a bit different. newComponent creates a Component from parts, while readComponent creates a Component from a byte array. Making these methods the same is a bit awkward: Component.bytes is the full byte representation of the component, so we would introduce a copy somewhere if we did so.

I think it's fine to leave as is.

sukunrt · 2025-02-04T14:35:44Z

interface.go

-	addr, err := ma.NewMultiaddr("/ip4/1.2.3.4/tcp/80")
-	// err non-nil when parsing failed.
-*/
-type Multiaddr interface {


matest/matest.go

sukunrt · 2025-02-04T14:45:58Z

multiaddr.go

+func (m Multiaddr) EncapsulateC(c Component) Multiaddr {
+	if c.Empty() {
 		return m
 	}
+	out := make([]Component, 0, len(m)+1)
+	out = append(out, m...)
+	out = append(out, c)
+	return out
+}


As the multiaddr is immutable, can we just append to the existing slice?

Is the argument to remove this method or change this to append?

The caller is always free to use append directly. This method does a copy on purpose to avoid sharing the underlying slice. If the caller is okay with that they can append directly.
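
To make the trade-off concrete, here is a small sketch of the aliasing that a plain append can cause when the slice has spare capacity. It uses `[]string` instead of `[]Component` to stay self-contained; `appendTwice` and `encapsulateCopy` are hypothetical helpers written for this example.

```go
package main

import "fmt"

// appendTwice appends two different values to the same base slice
// using the built-in append directly.
func appendTwice(base []string) (a, b []string) {
	a = append(base, "quic")
	b = append(base, "tls")
	return a, b
}

// encapsulateCopy mirrors the copy-then-append shape of EncapsulateC:
// the result never shares a backing array with m.
func encapsulateCopy(m []string, c string) []string {
	out := make([]string, 0, len(m)+1)
	out = append(out, m...)
	out = append(out, c)
	return out
}

func main() {
	// len 2, cap 4: there is room for append to write in place.
	base := make([]string, 2, 4)
	base[0], base[1] = "ip4", "tcp"

	a, b := appendTwice(base)
	// Both results share base's backing array, so the second append
	// overwrote the element written by the first.
	fmt.Println(a[2], b[2]) // tls tls

	c := encapsulateCopy(base, "quic")
	d := encapsulateCopy(base, "tls")
	fmt.Println(c[2], d[2]) // quic tls
}
```

Whether the aliasing shows up depends on the capacity of the original slice, which is why it is easy to miss in tests.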

MarcoPolo added 8 commits January 10, 2025 16:14

Remove Multiaddr interface

7bc8264

Refactor to make Multiaddr = []Component

eeaa191

update net package

a0b10fa

remove extra code

ef9995f

Add breaking text

0e45aea

remove panic

2b74265

add test for using nil multiaddr

4c69e16

Handle other error cases

4ab593a

2color mentioned this pull request Jan 13, 2025

[DISCUSSION] Exposing the underlying struct #100

Open

MarcoPolo added 8 commits January 14, 2025 14:01

skip empty components

c268d44

Remove ForEach usage

5c40c40

Encapsulate is the same as Join

fbef51e

Use nil as zero value for Multiaddr

3ca4833

Add matest package for multiaddr testing utilities

f9bebb2

remove ForEach usages

e32f70d

undo the deprecate ForEach until we have meg

3663997

explicit err check

293f2c0

MarcoPolo marked this pull request as ready for review January 21, 2025 18:45

MarcoPolo requested a review from sukunrt January 21, 2025 18:45

sukunrt reviewed Jan 27, 2025

nits

f43a278

MarcoPolo force-pushed the marco/multiaddr-refactor branch from c652d4a to f43a278 Compare January 29, 2025 19:56

MarcoPolo requested a review from sukunrt January 29, 2025 19:59

sukunrt approved these changes Feb 4, 2025

MarcoPolo changed the title ~~refactor code to make the API harder to misuse~~ refactor!: make the API harder to misuse Feb 6, 2025

MarcoPolo merged commit 1ef63b5 into master Feb 6, 2025
7 checks passed

MarcoPolo added a commit that referenced this pull request Feb 6, 2025

refactor: Follows up on #261 (#264)

46805b0

* add comment
* decorate err
* rename offset field
* more validation checks
* Add benchmark for component validation
* Use *Protocol in Component

p-shahi mentioned this pull request Feb 26, 2025

implement a new multiaddress API #198

Open

9 tasks
