Exact authorship match with multiple authorship #126
Comments
We use authorship matching for sorting of returned results. For example, here we see only the 'best' results for each name, and they look as expected. It also manages to figure out the 'best' result if authors are abbreviated. I was thinking of adding another 'all-matches' option that would return the best match for each data source, but was not sure whether it would be confusing or helpful.
Got it! Here in my example with I guess that for some taxa I would want all the matches and for others not, but I can't know in advance (I am querying for a lot of different taxa)...
Would it work for you to pick the first result for each data source from the returned list, if you do want to keep all-matches for all your queries? Results are not grouped by data source; they are sorted only by the quality algorithm. Still, the first result that appears for a given data source is always the 'best' result for that data source.
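The "first result per data source" approach described above can be sketched in a few lines of Python. This is only an illustration: the field names `dataSourceId` and `matchedName` are assumptions about the shape of each result record, and the sample records are made up from the names in this issue, not copied from real API output.

```python
from typing import Any


def best_per_data_source(
    results: list[dict[str, Any]], key: str = "dataSourceId"
) -> list[dict[str, Any]]:
    """Keep the first result seen for each data source, preserving order.

    Relies on the list already being sorted by match quality, so the
    first record for a data source is that source's best match.
    """
    first_seen: dict[Any, dict[str, Any]] = {}
    for result in results:
        # setdefault stores the record only if the key is not present yet
        first_seen.setdefault(result[key], result)
    return list(first_seen.values())


# Illustrative records only (assumed field names, hypothetical values).
sample = [
    {"dataSourceId": 147, "matchedName": "Agrostis tenuis Sibthorp"},
    {"dataSourceId": 1, "matchedName": "Agrostis tenuis Sibth."},
    {"dataSourceId": 147, "matchedName": "Agrostis tenuis Vasey"},
]
best = best_per_data_source(sample)
```

With the sample above, the second record for data source 147 is dropped and the order of the remaining records is unchanged.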
Maybe I do need to add a flag, 'best-by-data-source' or something of this sort?
The thing is, we always want to keep all the results for all the data sources in the query, to have the synonyms, or in case the taxon we queried for is spelled inexactly, etc., unless the result is simply "wrong" like in my example (Agrostis tenuis Sibthorp matching to Agrostis idahoensis Nash is wrong based on VASCAN). But we can't know before querying that the taxon may have a "wrong" match and that we should use only the best match.
I can imagine 2 things that might help:
Yes, but we want the matches for all the data sources queried, even for matches with an authorship. I think we'll manage to find a way to work around this on our side after the query, using the authorship and another field that we have ( I think we can close the issue.
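One possible shape for such a post-query workaround is sketched below in Python. It is not necessarily what the commenter had in mind: it only compares the queried authorship against the end of each matched name, the field name `matchedName` is an assumption, and the sample records are illustrative. A plain string comparison like this will not recognise abbreviated authors (for example "Sibth."), so it is a starting point at most.

```python
from typing import Any


def split_by_authorship(
    results: list[dict[str, Any]], author: str, name_key: str = "matchedName"
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Split results into (same authorship, different authorship).

    A result counts as 'same' when its matched name ends with the
    queried author string, ignoring case and surrounding whitespace.
    """
    wanted = author.strip().casefold()
    same, different = [], []
    for result in results:
        name = str(result.get(name_key, "")).strip().casefold()
        (same if name.endswith(wanted) else different).append(result)
    return same, different


# Illustrative records only (assumed field name, hypothetical values).
records = [
    {"matchedName": "Agrostis tenuis Sibthorp"},
    {"matchedName": "Agrostis tenuis Vasey"},
]
kept, flagged = split_by_authorship(records, "Sibthorp")
```

Nothing is discarded here: the second list holds the matches whose authorship differs, so they can still be reviewed rather than silently lost.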
I am not sure whether this is normal behaviour, so I just want to confirm.
The plant Agrostis tenuis isn't valid as per the VASCAN database, and links to two different species depending on the authorship that described Agrostis tenuis (https://data.canadensys.net/vascan/name/Agrostis%20tenuis): Agrostis tenuis Sibthorp links to Agrostis capillaris Linnaeus, and Agrostis tenuis Vasey links to Agrostis idahoensis Nash.
When we query specifically for Agrostis tenuis Sibthorp from VASCAN:
https://verifier.globalnames.org/api/v1/verifications/Agrostis+tenuis+Sibthorp?capitalize=true&all_matches=true&data_sources=147
we get two results, one for Agrostis capillaris and another for Agrostis idahoensis.
I would have expected only one result, since the authorship is provided and is supposed to link only to Agrostis capillaris Linnaeus.
Is this normal behaviour?
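For anyone reproducing the query, the request URL can be assembled as in the Python sketch below. The base URL and the three query parameters are taken from the link in this issue; the helper name `verification_url` is hypothetical, and the sketch handles a single data-source ID only.

```python
from urllib.parse import quote_plus, urlencode

# Base endpoint as it appears in the URL quoted in this issue.
BASE = "https://verifier.globalnames.org/api/v1/verifications/"


def verification_url(
    name: str, data_source: int, all_matches: bool = True, capitalize: bool = True
) -> str:
    """Build the GET URL for verifying one name against one data source."""
    params = {
        "capitalize": str(capitalize).lower(),    # True -> "true"
        "all_matches": str(all_matches).lower(),
        "data_sources": str(data_source),
    }
    # quote_plus encodes spaces as '+', matching the URL in the issue
    return BASE + quote_plus(name) + "?" + urlencode(params)


url = verification_url("Agrostis tenuis Sibthorp", 147)
```

The resulting string can be passed to any HTTP client; fetching and parsing the response is left out because the response schema is not shown in this thread.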