How to reference a Short doi? #120

Closed
adamml opened this issue Oct 29, 2020 · 4 comments · Fixed by #121
adamml commented Oct 29, 2020

Hi

At the Marine Institute, Ireland we are looking to update our Schema.org offering for datasets to follow the ESIP guidelines here.

One thing I haven't seen mentioned in the guidelines relates to identifiers in general, and to short DOIs in particular.

We always create the short DOI when we create the DOI for a dataset, and tag the dataset with both.

How would the Science on Schema group go about modelling this? I can see two options (I think!):

  1. Use a pattern like (note the sameAs):
{
    ...
    "identifier": {
        "@id": "https://doi.org/10.20393/f0257b6e-c55e-4aff-9a68-1c755e3dd8ed",
        "@type": "PropertyValue",
        "propertyID": "https://registry.identifiers.org/registry/doi",
        "value": "doi:10.20393/f0257b6e-c55e-4aff-9a68-1c755e3dd8ed",
        "url": "https://doi.org/10.20393/f0257b6e-c55e-4aff-9a68-1c755e3dd8ed",
        "sameAs": "http://doi.org/csgf"
    }
}
  2. Repeat the entire identifier block, but for the short DOI.
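
For comparison, the two options could be sketched in Python roughly as follows. This is only an illustration: the helper name `doi_identifier` is hypothetical, and the short DOI handle form `10/csgf` used for option 2 is an assumption inferred from the short DOI URL, not something stated in this issue.

```python
import json

FULL_DOI = "10.20393/f0257b6e-c55e-4aff-9a68-1c755e3dd8ed"
SHORT_DOI_URL = "http://doi.org/csgf"  # short DOI URL as given in the issue
SHORT_DOI = "10/csgf"  # assumption: handle form of the short DOI


def doi_identifier(doi, same_as=None):
    """Build a schema.org PropertyValue block for a DOI (hypothetical helper)."""
    url = f"https://doi.org/{doi}"
    block = {
        "@id": url,
        "@type": "PropertyValue",
        "propertyID": "https://registry.identifiers.org/registry/doi",
        "value": f"doi:{doi}",
        "url": url,
    }
    if same_as is not None:
        block["sameAs"] = same_as
    return block


# Option 1: a single identifier block, with the short DOI attached via sameAs.
option_1 = {"identifier": doi_identifier(FULL_DOI, same_as=SHORT_DOI_URL)}

# Option 2: a second, full identifier block for the short DOI.
option_2 = {"identifier": [doi_identifier(FULL_DOI), doi_identifier(SHORT_DOI)]}

print(json.dumps(option_1, indent=4))
```

Option 1 adds one property to the existing block, while option 2 adds a second resource to the graph.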

I'd be interested in your thoughts on this.

Thanks!

Adam Leadbetter

ashepherd added the enhancement (New feature or request), help wanted (Extra attention is needed), and Update (Documentation updates to the guidance docs) labels Oct 29, 2020
ashepherd added this to the v1.2 milestone Oct 29, 2020
Member

ashepherd commented Oct 29, 2020

I think using "sameAs": "http://doi.org/csgf" is a nice way to go, since it avoids adding another resource to the graph (which would add weight to the page with respect to Google's KB harvest limit). If we did add another resource, would we want some equivalency statement between the two as well? sameAs seems elegant.

And if harvesters traverse the sameAs property, they can retrieve schema.org for this short DOI URL:

curl --location --request GET 'http://doi.org/csgf' --header 'Accept: application/ld+json'

{
    "@context": "http://schema.org",
    "@type": "Dataset",
    "@id": "https://doi.org/10.20393/f0257b6e-c55e-4aff-9a68-1c755e3dd8ed",
    "url": "http://data.marine.ie/geonetwork/srv/eng/catalog.search#/metadata/ie.marine.data:dataset.2740",
    "additionalType": "Dataset",
    "name": "Midnight profiles of chlorophyll fluorescence data from Lough Furnace, 2009-2014 and associated phytoplankton and descriptive data",
    "author": [
        {
            "name": "Elvira de Eyto",
            "givenName": "Elvira",
            "familyName": "de Eyto",
            "@type": "Person"
        },
        {
            "name": "Mary Dillane",
            "givenName": "Mary",
            "familyName": "Dillane",
            "@type": "Person"
        },
        {
            "name": "Joseph Cooney",
            "givenName": "Joseph",
            "familyName": "Cooney",
            "@type": "Person"
        },
        {
            "name": "Pat Hughes",
            "givenName": "Pat",
            "familyName": "Hughes",
            "@type": "Person"
        },
        {
            "name": "Michael Murphy",
            "givenName": "Michael",
            "familyName": "Murphy",
            "@type": "Person"
        },
        {
            "name": "Pat Nixon",
            "givenName": "Pat",
            "familyName": "Nixon",
            "@type": "Person"
        },
        {
            "name": "David Sweeney",
            "givenName": "David",
            "familyName": "Sweeney",
            "@type": "Person"
        },
        {
            "name": "Russell Poole",
            "givenName": "Russell",
            "familyName": "Poole",
            "@type": "Person"
        },
        {
            "name": "Martin Rouen",
            "givenName": "Martin",
            "familyName": "Rouen",
            "@type": "Person"
        },
        {
            "name": "Elizabeth Ryder",
            "givenName": "Elizabeth",
            "familyName": "Ryder",
            "@type": "Person"
        },
        {
            "name": "Sile Daly",
            "givenName": "Sile",
            "familyName": "Daly",
            "@type": "Person"
        },
        {
            "name": "Donncha O'Cathain",
            "givenName": "Donncha",
            "familyName": "O'Cathain",
            "@type": "Person"
        },
        {
            "name": "Lorraine Archer",
            "givenName": "Lorraine",
            "familyName": "Archer",
            "@type": "Person"
        }
    ],
    "description": "Chlorophyll fluorescence (ChlF) is measured routinely in Lough Furnace as part of an ongoing LTER (long term ecological research) program of monitoring. Furnace is a coastal lagoon in the Burrishoole catchment, with a permanently moored AWQMS (automatic water quality monitoring station). ChlF is considered to be a good proxy measurement of phytoplankton biomass. This dataset comprises midnight profiles of ChlF, water temperature, dissolved oxygen concentration and saturation, pH, conductivity and salinity (1. Midnight profiles.csv). The dataset also includes the biomass of phytoplankton groups estimated from spot samples (2. Phytoplankton biovolumes.csv) along with hydrological and meteorological variables describing the environmental conditions over the time period (3. Descriptive data. csv).",
    "version": "1",
    "keywords": "Inland waters, Oceans",
    "inLanguage": "en",
    "encodingFormat": "text/csv",
    "datePublished": "2018",
    "publisher": {
        "@type": "Organization",
        "name": "Marine Institute, Ireland"
    },
    "provider": {
        "@type": "Organization",
        "name": "datacite"
    }
}
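
The same content-negotiated request could be sketched in Python using only the standard library. The function names `build_request` and `fetch_json_ld` are hypothetical, and the sketch assumes the resolver keeps answering `Accept: application/ld+json` the way the curl call above shows.

```python
import json
import urllib.request

JSON_LD = "application/ld+json"


def build_request(url):
    """Prepare a GET request that asks the DOI resolver for JSON-LD."""
    return urllib.request.Request(url, headers={"Accept": JSON_LD}, method="GET")


def fetch_json_ld(url, timeout=30):
    """Send the request (urlopen follows redirects) and parse the JSON body."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A harvester would call `fetch_json_ld("http://doi.org/csgf")` and could then compare the returned `@id` with the full DOI URL it already holds.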

Author

adamml commented Oct 30, 2020

@ashepherd Thanks for the feedback. I definitely prefer the sameAs approach, as I think repeating the identifier block is overkill for what is essentially an alias of the "full" DOI. Two things I hadn't appreciated:

  1. That there's a limit on the harvest entities (but that kind of makes sense)
  2. That the sameAs would allow for a related harvest across from DataCite - which I think is really neat and ties the whole Knowledge Graph together really well.
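
As a rough sketch of how a harvester might pick up those links, the function below collects every sameAs URL from a record's identifier block(s). The name `same_as_links` is hypothetical, and the sketch assumes the record has already been parsed into a Python dict.

```python
def same_as_links(doc):
    """Collect sameAs URLs from the identifier block(s) of a JSON-LD dict."""
    identifiers = doc.get("identifier", [])
    if not isinstance(identifiers, list):
        identifiers = [identifiers]
    links = []
    for ident in identifiers:
        if not isinstance(ident, dict):
            continue  # plain-string identifiers carry no sameAs
        same_as = ident.get("sameAs", [])
        if isinstance(same_as, str):
            same_as = [same_as]
        links.extend(same_as)
    return links


dataset = {
    "@type": "Dataset",
    "identifier": {
        "@type": "PropertyValue",
        "value": "doi:10.20393/f0257b6e-c55e-4aff-9a68-1c755e3dd8ed",
        "sameAs": "http://doi.org/csgf",
    },
}
print(same_as_links(dataset))  # → ['http://doi.org/csgf']
```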

So, in summary - I'm supportive of the sameAs approach!

Member

ashepherd commented Oct 30, 2020

Sounds good. I think the action item for us here is to incorporate the short DOI in 1) the guidelines and 2) the example JSON-LD files. Thanks @adamml, we'll close this shortly and get it in the pipeline for v1.2, to be released in January.

ashepherd self-assigned this Oct 30, 2020
ashepherd removed the help wanted (Extra attention is needed) label Oct 30, 2020
mbjones linked a pull request Jan 22, 2021 that will close this issue
mbjones added the accepted decision (Issues on which a decision was accepted for release.) label Jan 22, 2021
Collaborator

mbjones commented Jan 27, 2021

PR #121 was merged, so this short-DOI guidance will be included in the 1.2 release. Closing this issue.
