Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Empty CSV string deserialized as String containing "null" instead of null String #330

Closed
hanlonjohn23 opened this issue Aug 29, 2022 · 7 comments
Labels
csv to-evaluate Issue that has been received but not yet evaluated

Comments

@hanlonjohn23
Copy link

hanlonjohn23 commented Aug 29, 2022

Describe the bug
When attempting to deserialize a CSV with empty strings to an object with string fields, the fields are populated with "null" instead of the expected null. Note that this is using the CsvParser.Feature.EMPTY_STRING_AS_NULL feature.

Version information
2.13.3

To Reproduce
My project is using this code to deserialize CSV into a set of POJO objects:

public static <T> Set<T> deserializeSetFromCsvAtPath(Path path, Class<T> clazz) throws IOException {
    Set<T> set = new HashSet<>();

    try (InputStream is = Files.newInputStream(path)) {
        CsvMapper mapper = new CsvMapper();
        mapper.enable(CsvParser.Feature.EMPTY_STRING_AS_NULL);

        CsvSchema headerSchema = CsvSchema.emptySchema().withHeader();

        MappingIterator<T> iterator = mapper
                .readerFor(clazz)
                .with(headerSchema)
                .readValues(is);

        while (iterator.hasNextValue()) {
            set.add(iterator.nextValue());
        }

    } catch (IOException e) {
        throw new IOException("Failed to input stream file at " + path, e);
    }

    return set;
}

We are expecting that given a class like so:

public class Row {
    private final Integer id;
    private final String value;
. . .
}

We would deserialize a CSV like this:

id,value
1,

Into an object containing

id: 1
value: null

Instead, we are experiencing

id: 1
value: "null"

Expected behavior
That given the code above, empty strings in a CSV will be deserialized to a null String object, instead of a String containing "null".

Additional context
Similar to this issue here, only in a different context. Furthermore, I assume that the underlying issue is the same:

Use of parser.getText() unfortunately did include "null" as value. So better choice is parser.getValueAsString() which only coerces scalars, but returns null for JSON null as well as for structured tokens (START_ARRAY, START_OBJECT).

@hanlonjohn23 hanlonjohn23 added the to-evaluate Issue that has been received but not yet evaluated label Aug 29, 2022
@hanlonjohn23 hanlonjohn23 changed the title null String for empty CSV string deserialized as String "null" instead of null Empty CSV string deserialized as String containing "null" instead of null String Aug 30, 2022
@hanlonjohn23 hanlonjohn23 changed the title Empty CSV string deserialized as String containing "null" instead of null String Empty CSV string deserialized as String containing "null" instead of null String Aug 30, 2022
@cowtowncoder
Copy link
Member

Will move to the right repository for CSV format issues.

@cowtowncoder cowtowncoder transferred this issue from FasterXML/jackson-databind Aug 31, 2022
@cowtowncoder
Copy link
Member

Thank you for reporting this: it sounds like a possible bug. Thank you for also including reproduction.
I'll see if I can add a unit test and possibly resolve this (or at very least understand what is going on).

@cowtowncoder
Copy link
Member

cowtowncoder commented Aug 31, 2022

Hmmmh. Unfortunately I am unable to reproduce this on my side, with test like:

    final CsvMapper MAPPER = mapperForCsv();

    static class Row330 {
        public Integer id;
        public String value = "default";
    }

    public void testEmptyStringAsNull330() throws Exception
    {
        CsvSchema headerSchema = CsvSchema.emptySchema().withHeader();
        final String DOC = "id,value\n"
                + "1,\n";

        MappingIterator<Row330> iterator = MAPPER
                .readerFor(Row330.class)
                .with(CsvParser.Feature.EMPTY_STRING_AS_NULL)
                .with(headerSchema)
                .readValues(DOC);        
        Row330 row = iterator.next();

        assertEquals(Integer.valueOf(1), row.id);
        assertNull(row.value);
    }

so I think I'd need some help here.

cowtowncoder added a commit that referenced this issue Aug 31, 2022
@hanlonjohn23
Copy link
Author

hanlonjohn23 commented Aug 31, 2022

Thanks for your speedy response @cowtowncoder!

So I looked in to this some more this morning, and I managed to track down the source of the issue I was experiencing.
As part of my environment, I was using a custom deserializer like so
Row330.java:

import com.fasterxml.jackson.databind.annotation.JsonDeserialize;

@JsonDeserialize(using = Row330Deserializer.class)
public class Row330 {
    public Integer id;
    public String value = "default";

    public Row330(
            Integer id,
            String value
    ) {
        this.id = id;
        this.value = value;
    }
}

Row330Deserializer.java:

import com.fasterxml.jackson.core.JacksonException;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.databind.DeserializationContext;
import com.fasterxml.jackson.databind.JsonDeserializer;
import com.fasterxml.jackson.databind.JsonNode;

import java.io.IOException;

public class Row330Deserializer extends JsonDeserializer<Row330> {
    @Override
    public Row330 deserialize(JsonParser p, DeserializationContext ctxt) throws IOException, JacksonException {
        JsonNode node = p.readValueAsTree();
        return new Row330(
                node.get("id").asInt(),
                node.get("value").asText()
        );
    }
}

When using this set up, I believe your test above will fail, as value will be deserialized as "null".
Note that using .textValue() in place of .asText() results in a null String as expected, so I don't know if this is a bug, or simply expected behavior that I couldn't find documented.

@cowtowncoder
Copy link
Member

I think it works as expected: asText() would give String "null" for NullNode; this should be specified by Javadocs in JsonNode (haven't checked; PR against Javadocs would be welcome if not).

Whether that makes sense or not would be an issue for jackson-databind, but not something CSV module would have control over.

@hanlonjohn23
Copy link
Author

Understandable. Thanks for helping me with my problem @cowtowncoder, even though it turned out to not be a bug!

@cowtowncoder
Copy link
Member

No problem @jhanlonhg ! Just glad we were able to figure it out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
csv to-evaluate Issue that has been received but not yet evaluated
Projects
None yet
Development

No branches or pull requests

2 participants