Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Requried elements in construction and optional attributes (use="optional") are seialized #555

Closed
amal-khailtash opened this issue Jul 14, 2021 · 11 comments
Labels
enhancement New feature or request

Comments

@amal-khailtash
Copy link

I have the following example XSD file with two simple elements and an optional attribute:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="person">
    <xs:annotation>
      <xs:documentation></xs:documentation>
    </xs:annotation>
    <xs:complexType>
      <xs:sequence>
        <xs:element name="firstname" type="xs:string"/>
        <xs:element name="lastname" type="xs:string"/>
      </xs:sequence>
      <xs:attribute name="smokes" type="xs:boolean" use="optional" default="false">
      </xs:attribute>
    </xs:complexType>
  </xs:element>
</xs:schema>

I used the latest version 21.7 for this example.

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Person:
    class Meta:
        name = "person"

    firstname: Optional[str] = field(
        default=None,
        metadata={
            "type": "Element",
            "namespace": "",
            "required": True,
        }
    )
    lastname: Optional[str] = field(
        default=None,
        metadata={
            "type": "Element",
            "namespace": "",
            "required": True,
        }
    )
    smokes: bool = field(
        default=False,
        metadata={
            "type": "Attribute",
        }
    )

I am not sure why firstname and lastname are of type Optional[str] and not just str and required. And smokes attribute is not optional!

  1. Creating an object of type Person without specifying firstname or lastname seems to work even though.
    I expected not specifying lastname be an error, but leaving out smokes should be fine.
>>> person = Person("John")
>>> print(person)
Person(firstname='John', lastname=None, smokes=False)
  1. Serializing person object to either XML or JSON would produce:
<?xml version="1.0" encoding="UTF-8"?>
<person smokes="false">
  <firstname>John</firstname>
</person>
{
  "firstname": "John",
  "lastname": null,
  "smokes": false
}

For the XML case, I expected lastname to always be serialized in XML, and smokes attribute be dropped.
For the JSON case, because smokes attribute has a use="optional" and I think it should dropped.

@tefra
Copy link
Owner

tefra commented Jul 14, 2021

Hi @amal-khailtash

I am not sure why firstname and lastname are of type Optional[str] and not just str and required. And smokes attribute is not optional!

dataclasses require that fields with default values are defined before fields without. There is no way around that since xsdata relies on the mro ordering of the fields for the serialization. That's why xsdata marks them all as optional and adds a hint in the metadata that field is actually required. Currently the hint is only for your eyes only its not validated during parsing/serializing.

I am currently working on an attrs plugins that might help with that but I am open to suggestions on how to tackle the issue for dataclasses as well.

The attribute smokes is optional but also has a default value, which is ambiguous, even when the attribute is missing in the document the model field has to be initialized with the default value, that's why the attribute is not marked as optional.

JSON bindings are tricky because roundtrip conversions should always work and derived types make that hard that's why all fields are serialized. The json serializer supports custom dict factories if you want to drop None values.

@tefra
Copy link
Owner

tefra commented Jul 14, 2021

Apparently kw_only was in added in python 3.10 🕺

For the fields with the default values, I will add a new property in the SerializerConfig to ignore elements/attributes with default values.

@tefra tefra added the enhancement New feature or request label Jul 14, 2021
@amal-khailtash
Copy link
Author

I am not sure if I get this right. My understanding is that constructor, or function calls with default values should always come after the ones without default. So I just made a couple of minor changes and re-coded the dataclass as:

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Person:
    class Meta:
        name = "person"

    firstname: str = field(
        metadata={
            "type": "Element",
            "namespace": "",
            "required": True,
        }
    )
    lastname: str = field(
        metadata={
            "type": "Element",
            "namespace": "",
            "required": True,
        }
    )
    smokes: Optional[bool] = field(
        default=False,
        metadata={
            "type": "Attribute",
            "required": False,
        }
    )

Note the addition of Optional and "required": False to smokes and removal of the Optional from firs/last names. Now, I can do this:

from xsdata.formats.dataclass.context import XmlContext
from xsdata.formats.dataclass.parsers import JsonParser, XmlParser
from xsdata.formats.dataclass.serializers import JsonSerializer, XmlSerializer
from xsdata.formats.dataclass.serializers.config import SerializerConfig
from xsdata.formats.dataclass.parsers import XmlParser

from person import Person

config = SerializerConfig(
    pretty_print=True,
    xml_declaration=True,
    no_namespace_schema_location=None,
)

XML_SERIALIZER = XmlSerializer(config=config)

JSON_SERIALIZER = JsonSerializer(context=XmlContext(), indent=2)

person = Person("John", "Doe")
print("Original:")
print(f"  {person}\n")

print("=" * 100)
xml_text = XML_SERIALIZER.render(person)
print("Serialized XML:\n")
print(xml_text)

parser = XmlParser(context=XmlContext())

person_from_xml = parser.from_string(xml_text, Person)
print("Deserialized XML:")
print(f"  {person_from_xml}\n")

print("=" * 100)
json_text = JSON_SERIALIZER.render(person)
print("Serialized JSON:")
print(json_text)

parser_json = JsonParser(context=XmlContext())
person_from_json = parser_json.from_string(json_text, Person)
print("\nDeserialized JSON:")
print(f"  {person_from_json}")

And the output:

Original:
  Person(firstname='John', lastname='Doe', smokes=False)

====================================================================================================
Serialized XML:

<?xml version="1.0" encoding="UTF-8"?>
<person>
  <firstname>John</firstname>
  <lastname>Doe</lastname>
</person>

Deserialized XML:
  Person(firstname='John', lastname='Doe', smokes=False)

====================================================================================================
Serialized JSON:
{
  "firstname": "John",
  "lastname": "Doe"
}

Deserialized JSON:
  Person(firstname='John', lastname='Doe', smokes=False)

I can create a PR for you to review the changes, but basically I added a "required" class member to model.elements.XmlVar, and skip attributes and elements that have required set to False.

@tefra
Copy link
Owner

tefra commented Jul 14, 2021

yeah sure give it a try

@nmrtv
Copy link
Contributor

nmrtv commented Jul 21, 2021

I started converting one project to xsdata and noticed the same problem. For example, UblExtension.ExtensionContent:

    extension_content: Optional[ExtensionContent] = field(
        default=None,
        metadata={
            "name": "ExtensionContent",
            "type": "Element",
            "namespace": "urn:oasis:names:specification:ubl:schema:xsd:CommonExtensionComponents-2",
            "required": True,
        }
    )

should be generated like this:

    extension_content: ExtensionContent = field(
        metadata={
            "name": "ExtensionContent",
            "type": "Element",
            "namespace": "urn:oasis:names:specification:ubl:schema:xsd:CommonExtensionComponents-2",
            "required": True,
        }
    )

Oh, never mind, I found the problem is with kw arguments.

But we can introduce default sentinel values for such fields, at least mypy will detect accidental None:

from dataclasses import dataclass, field
from typing import Any, Optional, cast


class _MISSING_REQUIRED:
    """Missing parameter sentinel for dataclass objects."""

    def __repr__(self) -> str:
        return "MISSING"


MISSING_REQUIRED = cast(Any, _MISSING_REQUIRED())


@dataclass
class Test:
    param1: Optional[str] = None
    param2: str = field(default=MISSING_REQUIRED)


test = Test(param1="****", param2=None)
print(test)

@nmrtv
Copy link
Contributor

nmrtv commented Jul 30, 2021

I tried with attrs plugin and for me this issue is not actual anymore.

@amal-khailtash
Copy link
Author

@neriusmika how do you mean? Is there such attr plugin for xsdata?

@nmrtv
Copy link
Contributor

nmrtv commented Jul 31, 2021

@amal-khailtash, yes, repository is here https://github.com/tefra/xsdata-attrs/ I think it's still not finished, but I didn't find any serious issues.

@tefra
Copy link
Owner

tefra commented Aug 2, 2021

Thanks for reporing @amal-khailtash

I just merged #575 to ignore optional attributes with default values that you can enable in order to achieve half of what you 've been looking. You are correct optional attributes with default values can be omitted and the document still validates.

Elements are a bit more complicated, most of the time they are part of a sequence which means they need to appear in the document in the correct order we can't just omit them. That's why in your pull request all the W3C tests failed.

That being said, I added the required property to the xml variables, you can extend the internal serializes and apply whatever logic you want.

Also as @neriusmika said in order to get truly required properties you need to switch to either python 3.10 or the attrs plugins and enable the kw_only generator config. That way you will get a proper exception if you try to initialize an object without a required property.

I will try to investigate a solution for python <= 3.9 maybe we can do something similar as to what @neriusmika suggested, but for now my priority is the attrs plugins.

@tefra tefra closed this as completed Aug 2, 2021
@tefra
Copy link
Owner

tefra commented Aug 2, 2021

For a demo on the new config option check the samples repo
tefra/xsdata-samples#68

@amal-khailtash
Copy link
Author

That is awesome! Thanks @tefra, I will give those a try.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants