Update ScoreVariant to support exact PGS Catalog standard #46
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #46      +/-   ##
==========================================
+ Coverage   87.98%   90.44%   +2.45%
==========================================
  Files          20       42      +22
  Lines        1049     2596    +1547
==========================================
+ Hits          923     2348    +1425
- Misses        126      248     +122
Re |
With regard to |
I thought the solution was to always read effect weight and position columns as str and then check that they can be coerced into a float? |
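A minimal sketch of what that check could look like, using only the standard library. The function name `check_numeric_string` is hypothetical and not part of `pgscatalog.core`, and rejecting NaN and infinity is an assumption on my part rather than something agreed in this thread. The value is kept as the original string so that no precision is lost by round-tripping through a float.

```python
import math


def check_numeric_string(value: str) -> str:
    """Return the string unchanged if it can be coerced to a finite float.

    Hypothetical helper: the text is preserved as-is, only its
    coercibility is checked.
    """
    try:
        parsed = float(value)
    except (TypeError, ValueError):
        raise ValueError(f"{value!r} cannot be coerced to a float") from None
    # Assumption: NaN and infinity are not meaningful weights or positions
    if not math.isfinite(parsed):
        raise ValueError(f"{value!r} is not a finite number")
    return value


# The original text survives, including trailing zeros and exponents
print(check_numeric_string("0.1000"))
print(check_numeric_string("-1.5e-3"))
```

Anything that `float()` rejects, such as an empty string or `"abc"`, raises a `ValueError` here.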
Also missing the version bump?
This reverts commit 0d5c6e5.
Background
We had some problems including:
New approach
check_effect_weight)
Gradual typing
Pydantic uses type hints to build data models, so pgscatalog.core now has type hints checked (if they exist) with mypy.
Test notes
Running the entire Catalog:
PGS002253 contains both effect weights and dosage weight columns, so I removed the restriction of exclusive weight types (if the effect weight column is present, we'll use it)
PGS002263 contains a peculiar variant on row 223
ValidationError because there's no positional information
Submission validation
CatalogScoreVariant would be a good place to implement validation logic using field_validator or model_validator so we have unified data models across submission validation and calculator normalisation.
Outstanding questions:
effect weights: treat them as strings that are castable to numeric, normal floats, or decimals with a defined precision? (thinking of plink limits) (be consistent internally using strings that can be coerced to floats)
Closes PGScatalog/pgsc_calc#370