Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/json: decoding into existing map pointer values unexpectedly reallocates them #31924

Closed
as opened this issue May 8, 2019 · 13 comments
Closed
Labels
NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@as
Copy link
Contributor

as commented May 8, 2019

When decoding JSON data into an existing map object, an existing key's value is reallocated. I can't determine conclusively whether or not this is expected behavior, but the doc seems to be
unclear on this case.

The example below should explain this in a better way:

https://play.golang.org/p/y_VMAgevTNg

It is unexpected that B and A (as variables) lose coherence after the call to json.Unmarshal. The call reallocates "A.B", creating a copy of it. After that call, the objects are independent, which may be unexpected behavior.

What did you expect to see?

1
{"B":5}
{"A":{"B":5}}
2
{"B":6}
{"A":{"B":6}}
3
2009/11/10 23:00:00 addr(A=0x414030 C=0x43e260 C[A]=0x414030)
2009/11/10 23:00:00 addr(A=0x414030 C=0x43e260 C[A]=0x414030)
{"B":7}
{"A":{"B":7}}
4
{"B":16}
{"A":{"B":16}}

What did you see instead?

1
{"B":5}
{"A":{"B":5}}
2
{"B":6}
{"A":{"B":6}}
3
2009/11/10 23:00:00 addr(A=0x414030 C=0x43e260 C[A]=0x414030)
2009/11/10 23:00:00 addr(A=0x414030 C=0x43e260 C[A]=0x414140)
{"B":6}
{"A":{"B":7}}
4
{"B":16}
{"A":{"B":7}}

Possibly relevant godoc from encoding/json

Specifically, this part:

Unmarshal unmarshals the JSON into the value pointed at by the pointer.

To me implies that it does not allocate a new value and set the pointer to point to it, but instead uses the existing value pointed at.

    Unmarshal uses the inverse of the encodings that Marshal uses,
    allocating maps, slices, and pointers as necessary, with the following
    additional rules:

    To unmarshal JSON into a pointer, Unmarshal first handles the case of
    the JSON being the JSON literal null. In that case, Unmarshal sets the
    pointer to nil. Otherwise, Unmarshal unmarshals the JSON into the value
    pointed at by the pointer. If the pointer is nil, Unmarshal allocates a
    new value for it to point to.

    To unmarshal a JSON object into a map, Unmarshal first establishes a map
    to use. If the map is nil, Unmarshal allocates a new map. Otherwise
    Unmarshal reuses the existing map, keeping existing entries. Unmarshal
    then stores key-value pairs from the JSON object into the map. The map's
    key type must either be a string, an integer, or implement
    encoding.TextUnmarshaler.
@as as changed the title encoding/json: decoding existing map pointer values unexpectedly reallocates them encoding/json: decoding into existing map pointer values unexpectedly reallocates them May 8, 2019
@josharian
Copy link
Contributor

cc @mvdan

@cuonglm
Copy link
Member

cuonglm commented May 9, 2019

If you try:

json.Unmarshal([]byte(`{"F":{"B":7}}`), &C)

You will see that C["A"] was kept.

So writing new value in this case seems to be ok for me. The spec is not clear about it.

@seebs
Copy link
Contributor

seebs commented May 9, 2019

If a map is already present, we expect it to populate provided keys but not remove existing keys that weren't overwritten.

If a struct is already present, we sometimes expect it to populate provided keys but not zero out existing keys.

But if a map contains a struct (or pointer to a struct), it appears that the outcome is to replace the key entirely, rather than to populate recursively. Contrast with what you might expect for type A struct { X, Y int }; type B struct { A1, A2 A }. In that case, I'd expect {A1: {"X": 1}} not to overwrite A1.Y...

And it sort of makes sense to me that, since map values aren't addressable, a map[string]struct would just replace the struct with a new struct, because trying to behave otherwise is actually pretty hard. But when it's a map[string]*struct, it's not insane to think it should act like populating a struct would normally.

Seems to be the samme with map[string]map[string]int, etc., which is to say, the recursion of decoding is not the same as you'd get by recursing yourself. If you have a map key-value pair "A": ..., and the map already has a pointer in A, the behavior you get from calling decode with that is not the same as you'd get from calling decode with the ... on the pointer stored in ["A"]. Which is different from how structs behave.

@andybons andybons added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label May 13, 2019
@andybons andybons added this to the Unplanned milestone May 13, 2019
@mvdan
Copy link
Member

mvdan commented May 29, 2019

Here is what I think is a simpler example: https://play.golang.org/p/gnyc9t1kleB

It shows that a map value is replaced, and thus the original value is left untouched, and the pointers differ. Edit: it's replaced with a zero value, so all other fields are lost.

When a struct is used, the same pointer is reused, so the original value changes.

I tend to agree that both cases should behave the same here, but it's hard to tell if that's a good idea before digging into the code.

@mvdan
Copy link
Member

mvdan commented May 29, 2019

I've convinced myself that the current behavior is simply by accident. I've worked on a fix, and no existing json tests fail. I also like the new logic much better, as it's more consistent with structs and the general package behavior.

I've also made non-pointer elements keep their existing values too, by making a copy of the value and assigning it back to its key.

@mvdan mvdan modified the milestones: Unplanned, Go1.14 May 29, 2019
@mvdan mvdan self-assigned this May 29, 2019
@mvdan mvdan added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels May 29, 2019
@gopherbot
Copy link
Contributor

Change https://golang.org/cl/179337 mentions this issue: encoding/json: reuse values when decoding map elements

@rsc rsc modified the milestones: Go1.14, Backlog Oct 9, 2019
@mvdan
Copy link
Member

mvdan commented Oct 10, 2019

This has a CL ready for review, and I think there's consensus that we should do this, so I'm moving it back to the 1.14 milestone.

@dsnet
Copy link
Member

dsnet commented May 19, 2020

FYI, this changes made for this issue are possibly going to be reverted. See #39149.

@ianlancetaylor
Copy link
Member

The fix was rolled back, so reopening this issue.

@ianlancetaylor ianlancetaylor modified the milestones: Go1.15, Backlog May 29, 2020
@mvdan
Copy link
Member

mvdan commented Jun 6, 2020

I'm not sure if there's much else to do here for the current encoding/json. As the revert issue points out, too much existing code depends on the current behavior, so we can't possibly change it at this point without breaking the compatibility promise.

I would definitely add this issue to the bag of design issues to consider in a future json API redesign, though. How should we keep track of those? I've seen some others in the form of proposals in "held" status, but I don't think this issue should be a proposal. It's just a design inconsistency bug.

@stupidjohn
Copy link

how about use a special function to indicate not allocate when already allocated. just like UseNumber ?

@mvdan
Copy link
Member

mvdan commented Aug 19, 2020

I don't think adding options for historical design tech debt is a good idea. The problem with that kind of debt is that the API is less consistent and a bit harder to use, but one can usually still write code around the issue. If we add an option, to fix the debt and avoid breaking backwards compatibility, the package is still not consistent (by default) and, arguably, it's even harder to use for someone new to Go.

@rsc rsc unassigned mvdan Jun 23, 2022
moskyb added a commit to buildkite/go-buildkite that referenced this issue Oct 16, 2024
… than unmarshalling into the input

encoding/json behaves surprisingly ([#27172](golang/go#27172), [#31924](golang/go#31924), [#26946](golang/go#26946), [#21092](golang/go#21092)) when unmarshalling into non-empty structs, and this behaviour is unexpected in this library anyway; no other methods do any user-facing unmarshalling or pointer mutation.

This commit changes all of the `Update` methods that unmarshall into their inputs to no longer do that, and return a newly-allocated struct instead.
moskyb added a commit to buildkite/go-buildkite that referenced this issue Oct 16, 2024
… than unmarshalling into the input

encoding/json behaves surprisingly ([#27172](golang/go#27172), [#31924](golang/go#31924), [#26946](golang/go#26946), [#21092](golang/go#21092)) when unmarshalling into non-empty structs, and this behaviour is unexpected in this library anyway; no other methods do any user-facing unmarshalling or pointer mutation.

This commit changes all of the `Update` methods that unmarshall into their inputs to no longer do that, and return a newly-allocated struct instead.
moskyb added a commit to buildkite/go-buildkite that referenced this issue Oct 16, 2024
… than unmarshalling into the input

encoding/json behaves surprisingly ([#27172](golang/go#27172), [#31924](golang/go#31924), [#26946](golang/go#26946), [#21092](golang/go#21092)) when unmarshalling into non-empty structs, and this behaviour is unexpected in this library anyway; no other methods do any user-facing unmarshalling or pointer mutation.

This commit changes all of the `Update` methods that unmarshall into their inputs to no longer do that, and return a newly-allocated struct instead.
moskyb added a commit to buildkite/go-buildkite that referenced this issue Oct 16, 2024
… than unmarshalling into the input

encoding/json behaves surprisingly ([#27172](golang/go#27172), [#31924](golang/go#31924), [#26946](golang/go#26946), [#21092](golang/go#21092)) when unmarshalling into non-empty structs, and this behaviour is unexpected in this library anyway; no other methods do any user-facing unmarshalling or pointer mutation.

This commit changes all of the `Update` methods that unmarshall into their inputs to no longer do that, and return a newly-allocated struct instead.
moskyb added a commit to buildkite/go-buildkite that referenced this issue Oct 16, 2024
… than unmarshalling into the input

encoding/json behaves surprisingly ([#27172](golang/go#27172), [#31924](golang/go#31924), [#26946](golang/go#26946), [#21092](golang/go#21092)) when unmarshalling into non-empty structs, and this behaviour is unexpected in this library anyway; no other methods do any user-facing unmarshalling or pointer mutation.

This commit changes all of the `Update` methods that unmarshall into their inputs to no longer do that, and return a newly-allocated struct instead.
moskyb added a commit to buildkite/go-buildkite that referenced this issue Oct 16, 2024
… than unmarshalling into the input

encoding/json behaves surprisingly ([#27172](golang/go#27172), [#31924](golang/go#31924), [#26946](golang/go#26946), [#21092](golang/go#21092)) when unmarshalling into non-empty structs, and this behaviour is unexpected in this library anyway; no other methods do any user-facing unmarshalling or pointer mutation.

This commit changes all of the `Update` methods that unmarshall into their inputs to no longer do that, and return a newly-allocated struct instead.
moskyb added a commit to buildkite/go-buildkite that referenced this issue Oct 16, 2024
… than unmarshalling into the input

encoding/json behaves surprisingly ([#27172](golang/go#27172), [#31924](golang/go#31924), [#26946](golang/go#26946), [#21092](golang/go#21092)) when unmarshalling into non-empty structs, and this behaviour is unexpected in this library anyway; no other methods do any user-facing unmarshalling or pointer mutation.

This commit changes all of the `Update` methods that unmarshall into their inputs to no longer do that, and return a newly-allocated struct instead.
@seankhliao
Copy link
Member

closing as working as intended then.

@seankhliao seankhliao closed this as not planned Won't fix, can't repro, duplicate, stale Dec 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests