Meeting Minutes & Thoughts | April 26 2013
Let's go over progress this week:
Tuesday Class: I should have taken a lot more initiative in responding to emails and moving forward with the fresh water data. I put too much time into the use case documents. I stated my current questions and where I wanted to go next. We agreed upon the next steps and I drafted an email of questions and such to Jeremy et al.
Wed: Exchanged emails with Winkler and Jeremy and made lots of good progress on understanding the data we have at hand.
Wed Meeting w/ Patrice: We went over what we had in front of us. What is still unknown. What is still holding us back. Where do we go now that we know what we know. We went over the data files column by column and if we didn't know everything we thought we should know about the data in the column (units, what the column even refers to, what the values or absence of the value means, etc) we noted it. It is now my job to try and learn about this missing data. First on my own, but what I cannot find in the related materials I should field to Jeremy and Winkler again.
We consolidated all the relevant materials of the project on GitHub: data, papers, and such. The USGS data (the .nt files) is missing since it is quite large and I have slow internet. I will try to upload the compressed 500 MB file of the data soon.
Write out what I understand based off the answers. Anything missing, formulate questions and field to Jeremy et al.
- Hydrologic Type
- DEC code?
- What does a yellow cell mean, since we do know what a red cell means (I think)?
- Watershed and Ratio stuff
- Anything else that we don't know
- Where exactly are they pulling these DEC codes from? I cannot find their access point.
Missing Columns (purple ones) are still unknown: Watershed, Ratio, Hydrologic Type.
- See if the referenced PDF (the readable one) has anything on this or do some basic research. How does a lake relate to a watershed?
- This sort of feature information and the connection of multiple lakes to a watershed may be defined in the geospatial data that is contained in the NHD data we were sent. See if it is in there. Can we use it? If I need help, ask the contact from the USGS or email the USGS directly. Maybe request new data on just the watersheds if I can find it in their viewer tool.
- Upload the Zip of the NHD data to GitHub when I get a chance.
End of meeting thoughts
Near the end of the meeting (which overlapped a little with Katie's meeting) we began to discuss, now that the data is mostly understandable, how to integrate this data into the semantEco world and how we can use it.
Let me add the email Patrice sent to me recently:
... what is your current understanding of the data based on answered questions, what are the next set of questions you plan on posing to further clarify their study and data, and how this all fits into (geo) semantic search by 1) converting the data into RDF based on the more explicit representations, 2) linking the organism and geospatial feature classes to ontologies 3) capturing the geometry information of the lakes and watersheds.
Actually given your current understanding of Darrin data, you could also include more specific versions of the geospatial measurement queries on water and species measurement than what you described in the use case you submitted last week. You can use this description also to update your use case, which is one of the primary products of your project.
This does a good job of breaking down my next steps for the project. Thanks to reviewing the paper and stepping through the data with Patrice, I have a good understanding of the data at hand (the things still missing are listed above in the meeting notes). The next step is to draft more questions to Jeremy and Winkler and run them by Deborah and Patrice first. This is planned to happen tomorrow (Sat).
Now the end-of-meeting integration I mentioned above
1) converting the data into RDF based on the more explicit representations
I have DEC codes for these lakes, along with depth, size, name, associated water table, etc. Can I connect to an existing dataset on these lakes and use the metadata supplied by an existing ontology to properly reference each lake? I can extend the ontology to add any missing data that I may want to store (like the zmax or something).
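To make the conversion idea concrete, here is a minimal sketch of turning one lake record into RDF (N-Triples lines) using only the Python standard library. Everything named in it is an assumption for illustration: the `example.org` namespaces, the field names (`lake_code`, `name`, `zmax`, `watershed`), and the sample values are made up and do not come from the actual data files or from any existing ontology. In practice the predicates would point at terms from whichever ontology we end up reusing.

```python
from urllib.parse import quote

# Hypothetical namespaces; placeholders until a real ontology is chosen.
BASE = "http://example.org/lake/"
VOCAB = "http://example.org/vocab#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples string literal."""
    return str(value).replace("\\", "\\\\").replace('"', '\\"')


def lake_to_ntriples(row):
    """Turn one lake record (a dict) into a list of N-Triples lines.

    The lake code becomes the subject IRI; every other non-empty field
    becomes a plain string literal under a hypothetical vocabulary term.
    """
    subject = f"<{BASE}{quote(str(row['lake_code']))}>"
    triples = [f"{subject} <{RDF_TYPE}> <{VOCAB}Lake> ."]
    for key, value in row.items():
        if key == "lake_code" or value in (None, ""):
            continue  # skip the identifier itself and any missing cells
        triples.append(f'{subject} <{VOCAB}{key}> "{escape_literal(value)}" .')
    return triples


# Made-up sample record, only to show the shape of the output.
example_row = {"lake_code": "X123", "name": "Example Lake", "zmax": 12.5, "watershed": ""}
lines = lake_to_ntriples(example_row)
for line in lines:
    print(line)
```

Missing cells are skipped rather than written as empty literals, so the unknown columns (Watershed, Ratio, Hydrologic Type) can be added later once we know what they mean.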
2) linking the organism and geospatial feature classes to ontologies
We played with Bio-portal and there already exists a lot of information on the various organisms that were sampled. We can link to this, and add in our own data where things are missing. We don't have much on the organisms besides their name and their ppm in the samples, whereas for the lakes we have a lot more information to store. We want to leverage the existing ontologies as much as possible.
Let's consider the geospatial data now. Where can we leverage existing information? This is the heart of the USGS data. I have some data, and I am trying to see what is contained in it and how I can use it. I am playing with querying it to see what information I can use, but in the meeting it was decided I should make an effort to contact our USGS contact (the one who sent us the data) and ask him directly how I can properly use the data.
3) capturing the geometry information of the lakes and watersheds.
It is true. The geometry of the lakes and watersheds is not in the data we currently have. It may be in the USGS data. I need to see how I can not only capture their polygons (from existing data, or by doing it myself), but also how I can relate lakes to watersheds so we can apply reasoning to queries. The example we used in our meeting: if two different lakes share the same watershed, then when someone requests information on one lake, we could suggest expanding the selection to include the other connected lake. This is still early, mostly because I don't know much about watersheds and I currently don't know where to get watershed data. I will check the USGS web viewer and see if there is something there. Otherwise, I am not sure.
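The shared-watershed example from the meeting can be sketched in a few lines of Python. This is only an illustration of the reasoning step, not how semantEco does it: the lake and watershed names are invented, and it assumes we already have a simple lake-to-watershed mapping, which is exactly the data we are still missing.

```python
from collections import defaultdict


def build_watershed_index(lake_to_watershed):
    """Invert a lake -> watershed mapping into watershed -> set of lakes."""
    index = defaultdict(set)
    for lake, watershed in lake_to_watershed.items():
        index[watershed].add(lake)
    return index


def suggest_related_lakes(lake, lake_to_watershed):
    """Return the other lakes that share a watershed with the given lake."""
    watershed = lake_to_watershed.get(lake)
    if watershed is None:
        return []  # unknown lake, nothing to suggest
    index = build_watershed_index(lake_to_watershed)
    return sorted(index[watershed] - {lake})


# Invented names, only to show the idea.
lakes = {
    "Lake A": "Watershed 1",
    "Lake B": "Watershed 1",
    "Lake C": "Watershed 2",
}
print(suggest_related_lakes("Lake A", lakes))
```

With RDF in place, the same suggestion could instead come from a query over a lake-to-watershed property, but that depends on finding a source for the watershed data first.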
Brother visit
My brother visited me Wed. after my meeting and Thursday; he left this morning. He is going to Iraq this summer so he won't be here for my graduation. Thus, I didn't see Patrice's email until this morning.
Didn't get to
We didn't get to working on the coding of a semantAqua plugin with Evan and Katie since Evan was hard to find and I needed to go see my brother.
Tl;Dr:
Emails sometime tomorrow with an effort to make more progress by next class.