-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
v2.0.0[deprecated] #49
Conversation
MBueschelberger
commented
Apr 11, 2024
•
edited
Loading
edited
- introduce new simplified mapping file schema
- introduce qudt and csvw for table annotation
- introduce pydantic into the package
- make method graph optional
- introduce new standard method graph using provo
- general package maintainance
- refactor modules
- add utility to read qudt-mapping from unit-symbols
- add utility to generate subgraph for quantity description using qudt
- support json files as input for mapping files
- drop unneeded intermediate file outputs
- update excel parser
- add json parser
- update documentation
I see it's still marked as draft so I'm waiting with my review for now. |
data2rdf/models/mapping.py
Outdated
) | ||
|
||
@property | ||
def suffix(cls) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't self
instead of cls
be used for the property annotation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are right. I don't know how I came to cls
.
iri: AnyUrl = Field( | ||
..., description="Ontological class related to this concept" | ||
) | ||
config: Config = Field( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't it mean that the config object will be linked to each mapping entry? If so, that seems quite redundant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, the mapping models in this module should be treated as standalone in order to be independently reused in different parsers, especially the QuantityMappings
.
The Config
object here is inherited to all child classes of the BasicConceptMapping
since all of them are depending on it for building the subgraph. In order to ensure that the same config is used everywhere in the overall package, the same config object is later passed from the pipeline to the parser and from the parser to the mapping.
Does this make sense to you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you considered these approaches?
- Have an upper class that contains the config and the entries (i.e., the
ConceptMapping
instances) as fields. - Put the config instance in the
__init__.py
so that it accessible to all the instances.
I do think your approach is valid, but was just wondering about the alternatives.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overwriting the __init__
for a pydantic BaseModel
is rather something which should be avoided and only be done if really necessary, like we do e.g in the DSMS SDK.
The default_factory
in this model here is making an Config
-instance for your model if not provided by the child classes/ by the user. This is generlly a better practise in pydantic.
This actually does the same job as instanciating at __init__
if not provided as kwarg
by the child classes.
Is this what you tried to explain as alternative?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be removed now that we decided we don't want to support chowlk?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, still have to do this.
from data2rdf import BasicConceptMapping | ||
|
||
|
||
class AnnotationPipeline(BaseModel): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about renaming this class name to Data2Rdf
to make it more straightforward?
"""Return dict of json-ld for graph""" | ||
return { | ||
"@context": { | ||
"fileid": make_prefix(cls.config), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fileid
is somewhat confusing. Can a better name be found?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe something like base_uri
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe something that is more self-explanatory. How about simply data
?
default_factory=Config, description="Configuration object" | ||
) | ||
|
||
extra_triples: Optional[Union[str, Graph]] = Field( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly additional
sounds better in English than extra
.
examples/pipeline_demo.ipynb
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding at the beginning of this file a text block with a table of contents and a short explanation will be helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The v6
could be removed from the filename.
None, description="Symbol or QUDT IRI for the mapping" | ||
) | ||
annotation: Optional[Union[str, AnyUrl]] = Field( | ||
None, description="Base IRI with which the value shall be concatenated" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why use concatenation and not simply provide the full IRI?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is because we want to concatenate with the value of the respective concept.
e.g.
value ="DP800"
annotation = "https://w3id.org/steel/ProcessOntology/"
-> iri = https://w3id.org/steel/ProcessOntology/DP800
…eries, both, or none of them exists
… and metadata optional
…atatype, add code coverage