Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/json: bad encoding of field with MarshalJSON method #22967

Open
larhun opened this issue Dec 1, 2017 · 12 comments · May be fixed by #68920
Open

encoding/json: bad encoding of field with MarshalJSON method #22967

larhun opened this issue Dec 1, 2017 · 12 comments · May be fixed by #68920
Labels
early-in-cycle A change that should be done early in the 3 month dev cycle. NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Milestone

Comments

@larhun
Copy link

larhun commented Dec 1, 2017

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

version 1.9

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

windows/amd64

What did you do?

package main

import (
	"encoding/json"
	"os"
)

type T bool

func (v *T) MarshalJSON() ([]byte, error) {
	return []byte{'1'}, nil
}

type S struct {
	X T
}

func main() {
	v := S{true}
	e := json.NewEncoder(os.Stderr)

	e.Encode(v)  // should print {"X":true}
	e.Encode(&v) // should print the same value
}

What did you expect to see?

{"X":true}
{"X":true}

What did you see instead?

{"X":true}
{"X":1}

Issue

The json.Marshal documentation states that:

Pointer values encode as the value pointed to. A nil pointer encodes as the null JSON value.

Thus, &v should be marshaled to the same JSON value as v:

{"X":true}

Moreover, it states that:

If an encountered value implements the Marshaler interface and is not a nil pointer, Marshal calls its MarshalJSON method to produce JSON.

Therefore, Marshal should not call the MarshalJSON method to produce JSON for the X field, because its T type does not implement the json.Marshaler interface. In fact, MarshalJSON has *T receiver and the Go documentation states:

The method set of an interface type is its interface. The method set of any other type T consists of all methods declared with receiver type T. ... The method set of a type determines the interfaces that the type implements ...

As a final remark:

  1. from the source code, the most relevant difference between cases v and &v is that X field becomes addressable in case &v, changing the encoding generated by condAddrEncoder;

  2. implementing MarshalJSON with T value receiver makes T a json.Marshaler interface, with X values properly encoded by MarshalJSON:

    {"X":1}
    {"X":1}
  3. if the intended behavior is that Marshal should anyway call the MarshalJSON method on encoding T values, whenever *T implements the json.Marshaler interface, that should be clearly documented.

@dsnet dsnet added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Dec 1, 2017
@dsnet dsnet added this to the Go1.11 milestone Dec 1, 2017
@dsnet
Copy link
Member

dsnet commented Dec 1, 2017

This does seem like surprising behavior. It should either be documented or the encoder will need to jump through some hoops to copy the value so that it can be addressable.

@larhun
Copy link
Author

larhun commented Dec 3, 2017

This issue is more general and affects all custom encoders accepted by json and xml packages (maybe other?):

    encoding.TextMarshaler
    json.Marshaler
    xml.Marshaler
    xml.MarshalerAttr

More examples at:

It occurs whenever a type T has a custom encoder with *T pointer receiver. In that case something weird and not documented happens on encoding a value x of type T: it does try to get the x address and call the custom encoder. However, if x is not addressable, it cannot succeed.

Is the address resolution really necessary? I do not think so. The custom encoder should only be called if properly implemented, as clearly stated by the Go documentation:

The method set of an interface type is its interface. The method set of any other type T consists of all methods declared with receiver type T. ... The method set of a type determines the interfaces that the type implements ...

The idiomatic and recommended implementation for a custom encoder should be with T value receiver. However, if *T pointer receiver is used, &x (addressable) should be encoded as x (not addressable): a varying behavior is confusing and not acceptable.

@larhun
Copy link
Author

larhun commented Dec 3, 2017

A minimal solution could be: clarify the behavior of each custom encoders by adding that T value receiver is the expected usage and *T pointer receiver may fail to be used.

A definitive solution could be: change the documentation and implementation of the encoding process by first checking if the value under encoding is a pointer, i.e. the first step should be:

Pointer values encode as the value pointed to.

and, therefore, call a custom encoder only if implemented with T value receiver. That way, a custom encoder with *T pointer receiver will never be called.

@rsc
Copy link
Contributor

rsc commented Nov 28, 2018

It seems clear that Encode(&v) should return {"X":1}.
What's less clear is Encode(v). It can't return {"X":1} and today it returns {"X":true} but perhaps it should instead return an error (I can see there's a pointer method but I can't take the address of the value to get a pointer).
Leaving that decision - is this OK or is should Encode return an error? - for Go 1.13.

@rsc rsc modified the milestones: Go1.12, Go1.13 Nov 28, 2018
@rsc rsc added the early-in-cycle A change that should be done early in the 3 month dev cycle. label Nov 28, 2018
@andybons andybons modified the milestones: Go1.13, Go1.14 Jul 8, 2019
@rsc rsc modified the milestones: Go1.14, Backlog Oct 9, 2019
@whyrusleeping
Copy link

I think that I would prefer returning an error in this case rather than exposing such weird behavior, definitely feels like classic 'undefined behavior'.

@josharian
Copy link
Contributor

I just got bitten by this. Another option is to have vet check that MarshalJSON uses a value receiver.

josharian added a commit to tailscale/tailscale that referenced this issue Apr 29, 2021
Pointer receivers used with MarshalJSON are code rakes.

golang/go#22967
dominikh/go-tools#911

I just stepped on one, and it hurt. Turn them over.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
josharian added a commit to tailscale/tailscale that referenced this issue Apr 29, 2021
Pointer receivers used with MarshalJSON are code rakes.

golang/go#22967
dominikh/go-tools#911

I just stepped on one, and it hurt. Turn it over.
While we're here, optimize the code a bit.

name           old time/op    new time/op    delta
MarshalJSON-8     184ns ± 0%      44ns ± 0%  -76.03%  (p=0.000 n=20+19)

name           old alloc/op   new alloc/op   delta
MarshalJSON-8      184B ± 0%       80B ± 0%  -56.52%  (p=0.000 n=20+20)

name           old allocs/op  new allocs/op  delta
MarshalJSON-8      4.00 ± 0%      1.00 ± 0%  -75.00%  (p=0.000 n=20+20)

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
josharian added a commit to tailscale/tailscale that referenced this issue Apr 29, 2021
Pointer receivers used with MarshalJSON are code rakes.

golang/go#22967
dominikh/go-tools#911

I just stepped on one, and it hurt. Turn it over.
While we're here, optimize the code a bit.

name           old time/op    new time/op    delta
MarshalJSON-8     184ns ± 0%      44ns ± 0%  -76.03%  (p=0.000 n=20+19)

name           old alloc/op   new alloc/op   delta
MarshalJSON-8      184B ± 0%       80B ± 0%  -56.52%  (p=0.000 n=20+20)

name           old allocs/op  new allocs/op  delta
MarshalJSON-8      4.00 ± 0%      1.00 ± 0%  -75.00%  (p=0.000 n=20+20)

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
josharian added a commit to tailscale/tailscale that referenced this issue Apr 29, 2021
Pointer receivers used with MarshalJSON are code rakes.

golang/go#22967
dominikh/go-tools#911

I just stepped on one, and it hurt. Turn it over.
While we're here, optimize the code a bit.

name           old time/op    new time/op    delta
MarshalJSON-8     184ns ± 0%      44ns ± 0%  -76.03%  (p=0.000 n=20+19)

name           old alloc/op   new alloc/op   delta
MarshalJSON-8      184B ± 0%       80B ± 0%  -56.52%  (p=0.000 n=20+20)

name           old allocs/op  new allocs/op  delta
MarshalJSON-8      4.00 ± 0%      1.00 ± 0%  -75.00%  (p=0.000 n=20+20)

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
@ash2k
Copy link
Contributor

ash2k commented Aug 19, 2021

Ran into this today and made a quiz for this issue: https://play.golang.org/p/8WejDUNc907

Some of the people I asked guessed the correct answer but nobody was able to explain why it works like that (doesn't call MarshalJSON()). Some were close, but nobody seemed to know about addressability.

I think it should return an error, like Russ is suggesting above.

SaveTheRbtz added a commit to SaveTheRbtz/flow-go that referenced this issue Nov 21, 2021
Using a pointer receiver for `MarshalJSON` is error-prone since custom marshal logic would only be used if addresssible variable is passed, otherwise the default implementaion is used.

See: golang/go#22967
SaveTheRbtz added a commit to SaveTheRbtz/flow-go that referenced this issue Nov 21, 2021
Using a pointer receiver for `MarshalJSON` is error-prone since custom marshal logic would only be used if addressable variable is passed, otherwise the default implementation is used.

See: golang/go#22967
tynes added a commit to ethereum-optimism/optimism that referenced this issue May 5, 2024
github-merge-queue bot pushed a commit to ethereum-optimism/optimism that referenced this issue May 7, 2024
* cannon: remove final dep on bindings

Removes the last dependency that cannon has on `op-bindings/bindings`.
This is done my creating reusable code for reading foundry artifacts
from disk. This code should be reusable between any service that wants
to read the foundry artifacts from disk. This includes roundtrip
tests for JSON serialization of the foundry artifacts.

* op-chain-ops: address semgrep

golang/go#22967
github-merge-queue bot pushed a commit to ethereum-optimism/optimism that referenced this issue May 7, 2024
* cannon: remove final dep on bindings

Removes the last dependency that cannon has on `op-bindings/bindings`.
This is done my creating reusable code for reading foundry artifacts
from disk. This code should be reusable between any service that wants
to read the foundry artifacts from disk. This includes roundtrip
tests for JSON serialization of the foundry artifacts.

* op-chain-ops: address semgrep

golang/go#22967
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/606495 mentions this issue: encoding: add full support for marshalers with pointer receivers

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
early-in-cycle A change that should be done early in the 3 month dev cycle. NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Projects
None yet
Development

Successfully merging a pull request may close this issue.