Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v2.0.0[deprecated] #49

Merged
merged 72 commits into from
Jul 11, 2024
Merged

v2.0.0[deprecated] #49

merged 72 commits into from
Jul 11, 2024

Conversation

MBueschelberger
Copy link
Member

@MBueschelberger MBueschelberger commented Apr 11, 2024

  • introduce new simplified mapping file schema
  • introduce qudt and csvw for table annotation
  • introduce pydantic into the package
  • make method graph optional
  • introduce new standard method graph using provo
  • general package maintainance
  • refactor modules
  • add utility to read qudt-mapping from unit-symbols
  • add utility to generate subgraph for quantity description using qudt
  • support json files as input for mapping files
  • drop unneeded intermediate file outputs
  • update excel parser
  • add json parser
  • update documentation

@MBueschelberger MBueschelberger marked this pull request as draft April 11, 2024 13:35
@yoavnash
Copy link
Member

I see it's still marked as draft so I'm waiting with my review for now.

@MBueschelberger MBueschelberger marked this pull request as ready for review May 2, 2024 12:14
)

@property
def suffix(cls) -> str:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't self instead of cls be used for the property annotation?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right. I don't know how I came to cls.

See the official built-in docs for Python

iri: AnyUrl = Field(
..., description="Ontological class related to this concept"
)
config: Config = Field(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't it mean that the config object will be linked to each mapping entry? If so, that seems quite redundant.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the mapping models in this module should be treated as standalone in order to be independently reused in different parsers, especially the QuantityMappings.

The Config object here is inherited to all child classes of the BasicConceptMapping since all of them are depending on it for building the subgraph. In order to ensure that the same config is used everywhere in the overall package, the same config object is later passed from the pipeline to the parser and from the parser to the mapping.

Does this make sense to you?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered these approaches?

  1. Have an upper class that contains the config and the entries (i.e., the ConceptMapping instances) as fields.
  2. Put the config instance in the __init__.py so that it accessible to all the instances.

I do think your approach is valid, but was just wondering about the alternatives.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overwriting the __init__ for a pydantic BaseModel is rather something which should be avoided and only be done if really necessary, like we do e.g in the DSMS SDK.

The default_factory in this model here is making an Config-instance for your model if not provided by the child classes/ by the user. This is generlly a better practise in pydantic.

This actually does the same job as instanciating at __init__ if not provided as kwarg by the child classes.

Is this what you tried to explain as alternative?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be removed now that we decided we don't want to support chowlk?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, still have to do this.

from data2rdf import BasicConceptMapping


class AnnotationPipeline(BaseModel):
Copy link
Member

@yoavnash yoavnash May 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about renaming this class name to Data2Rdf to make it more straightforward?

"""Return dict of json-ld for graph"""
return {
"@context": {
"fileid": make_prefix(cls.config),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fileid is somewhat confusing. Can a better name be found?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe something like base_uri?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe something that is more self-explanatory. How about simply data?

default_factory=Config, description="Configuration object"
)

extra_triples: Optional[Union[str, Graph]] = Field(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly additional sounds better in English than extra.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding at the beginning of this file a text block with a table of contents and a short explanation will be helpful.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The v6 could be removed from the filename.

None, description="Symbol or QUDT IRI for the mapping"
)
annotation: Optional[Union[str, AnyUrl]] = Field(
None, description="Base IRI with which the value shall be concatenated"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use concatenation and not simply provide the full IRI?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is because we want to concatenate with the value of the respective concept.

e.g.
value ="DP800"
annotation = "https://w3id.org/steel/ProcessOntology/"
-> iri = https://w3id.org/steel/ProcessOntology/DP800

@github-actions github-actions bot merged commit bf185b0 into main Jul 11, 2024
3 checks passed
@MBueschelberger MBueschelberger changed the title v2.0.0 v2.0.0[deprecated] Jul 11, 2024
@MBueschelberger MBueschelberger mentioned this pull request Jul 11, 2024
14 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants