[1pt] PR: nws_lid creation (Dev-nws-lid-creation) #372

Merged
merged 82 commits into from
May 3, 2021
82 commits
339001d
converting catfim pipeline to open source
Mar 3, 2021
d980c1b
Merge branch 'dev' into dev-catfim-workflow
Mar 3, 2021
efdf609
updating aggregate grid blocksize
Mar 3, 2021
3fff6c2
parallelizing aggregation process
Mar 3, 2021
19e364e
cleanup
Mar 3, 2021
3814ecb
Merge branch 'dev' into dev-catfim-workflow
Mar 3, 2021
8ae9021
Merge branch 'dev-catfim-workflow' into dev-agg-blocksize
Mar 3, 2021
1743351
updated comment in generate_categorical_fim.py
Mar 3, 2021
61a866c
reprojecting rasters to Web Mercator
Mar 3, 2021
5c108e2
adding jobs to fim_run.sh
Mar 3, 2021
b57dff6
removing multiple util folders
Mar 4, 2021
9c45d3a
merging with remote branch
Mar 4, 2021
cf867df
Merge branch 'dev-catfim-workflow' into dev-agg-blocksize
Mar 4, 2021
f74f7d2
removing comment in inundation_wrapper_custom_flow.py
Mar 4, 2021
6197f32
removing comment in inundation_wrapper_nwm_flow.py
Mar 4, 2021
dd8952c
formatting eval_plots.py
Mar 4, 2021
1bdcdb4
Merge branch 'dev-catfim-workflow' into dev-agg-blocksize
Mar 4, 2021
a7f7e2c
adding usgs pixel catchment ID crosswalk
Mar 5, 2021
2cbe061
adding dem value samples
Mar 8, 2021
816c1b7
refactoring tables and adding evelation values
Mar 9, 2021
b91ba04
adding tables to prod whitelist
Mar 9, 2021
7708f3b
moving usgs gage shp to inputs
Mar 9, 2021
957c037
fixed var name
Mar 10, 2021
8e36fab
handles no nearby hydroids
Mar 10, 2021
6d42f61
rounding elevation values
Mar 11, 2021
76d4aa2
formatting
Mar 11, 2021
430a0e7
temporary patch for BED run
Mar 12, 2021
bc59ad2
Merge branch 'dev' into dev-agg-patch
Mar 15, 2021
7ea4e44
merging with dev and increasing agg jobs from 4 to 6
Mar 15, 2021
1bb7020
adding post-processing script to gather elevation values and calculat…
Mar 25, 2021
51fd5ed
merging with dev
Mar 25, 2021
0ff56b5
fixing merge conflict
Mar 25, 2021
b50f0fb
Merge branch 'dev' into dev-agg-patch
Mar 25, 2021
2064583
Merge branch 'dev-agg-patch' into dev-usgs-crosswalk
Mar 25, 2021
eb670e0
updating args and renaming crosswalk
Mar 25, 2021
be7543c
fixing bug in rem.py
Mar 25, 2021
aab613b
str_order object issue - still not resolved
Mar 26, 2021
d139c1d
switching to numpy to get str_order
Mar 26, 2021
5e07adb
partial update of stats function using slice arg (no VPN right now)
Mar 26, 2021
4832246
adding group arg for stat grouping
Mar 26, 2021
1fb39fa
saving final agg stats
Mar 26, 2021
d667c4f
tidy up for PR
Mar 29, 2021
f76f977
adding back tools/generate_categorical_fim.py - thought that was an o…
Mar 29, 2021
090339a
addressing comments in PR review
Mar 30, 2021
437db29
addressing comments in PR review
Mar 31, 2021
1f33dc9
removing comments
Mar 31, 2021
71816d2
Merge branch 'dev' into dev-usgs-crosswalk
Mar 31, 2021
92f315f
commenting out local headwater; refactoring pre-processing
Mar 31, 2021
7169c71
Merge branch 'dev-usgs-crosswalk' into dev-local-headwaters
Mar 31, 2021
7bce663
check_dem_data scratch file
Apr 13, 2021
cdde243
merging dev and resolving conflicts
Apr 19, 2021
e780468
cleaning up scratch code
Apr 19, 2021
4f21b3f
converting to env variables
Apr 21, 2021
780b09c
consolidating fr and ms input layers
Apr 23, 2021
def4944
handling case where two nws lids exist near a single stream segment
Apr 25, 2021
5260721
Merge branch 'dev' into dev-local-headwaters
Apr 25, 2021
267136e
fixing issue with sythesize_test_case.py parallelization
Apr 26, 2021
1d0cf00
fixing bug where synthesize_test_case.py gets hung up in multiprocessing
Apr 27, 2021
968955f
Merge branch 'dev' into dev-eval-hotfix
Apr 27, 2021
6f11126
removing incoming segments to wbd buffer boundary so they will not be…
Apr 27, 2021
b748ca1
fixing indentation
Apr 27, 2021
146235e
Merge branch 'dev' into dev-local-headwaters
Apr 28, 2021
9400ef0
Merge branch 'dev-eval-hotfix' into dev-local-headwaters
Apr 28, 2021
12ef27f
Update CHANGELOG.md
Apr 28, 2021
18d0822
Update CHANGELOG.md
Apr 28, 2021
554efc5
Update CHANGELOG.md
Apr 28, 2021
558fabc
Merge branch 'dev-eval-hotfix' into dev-local-headwaters
Apr 28, 2021
8526f9d
Merge branch 'dev' into dev-local-headwaters
Apr 28, 2021
958a0d9
Update CHANGELOG.md
Apr 28, 2021
c80d4ed
Identify headwater points and co-located points (on same feature id)
TrevorGrout-NOAA Apr 28, 2021
4ac1a47
finalize script to generate nws_lid geopackage layer
TrevorGrout-NOAA Apr 30, 2021
2ef3bf0
rename script to generate_nws_lid.py, and add driver details to save …
TrevorGrout-NOAA Apr 30, 2021
6dcbc61
edit command line arguments
TrevorGrout-NOAA Apr 30, 2021
359c2b1
adding print statements
TrevorGrout-NOAA Apr 30, 2021
5860549
Create output directory prior to saving geopackage
TrevorGrout-NOAA Apr 30, 2021
9b6687a
modify print statements
TrevorGrout-NOAA Apr 30, 2021
0bc2838
Merge branch 'dev' into dev-nws-lid-creation
TrevorGrout-NOAA Apr 30, 2021
96fd1b3
Merge branch 'dev-local-headwaters' into dev-nws-lid-creation
TrevorGrout-NOAA Apr 30, 2021
ea56a67
Update CHANGELOG.md
TrevorGrout-NOAA Apr 30, 2021
eefe775
Merge branch 'dev' into dev-nws-lid-creation
May 3, 2021
0863df9
Update CHANGELOG.md
BradfordBates-NOAA May 3, 2021
015dd89
Update CHANGELOG.md
BradfordBates-NOAA May 3, 2021
8 changes: 8 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,13 @@
All notable changes to this project will be documented in this file.
We follow the [Semantic Versioning 2.0.0](http://semver.org/) format.
<br/><br/>
## v3.0.15.9 - 2021-05-03 - [PR #372](https://github.com/NOAA-OWP/cahaba/pull/372)

Generate `nws_lid.gpkg`.

## Additions
- Generate `nws_lid.gpkg` with attributes indicating whether each site is a headwater `nws_lid` and whether it is co-located with another `nws_lid` referenced to the same `nwm_feature_id` segment.

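The co-location attribute described in the entry above can be illustrated with a small standalone sketch. It is not the script's code: the site codes and segment IDs below are made up, and the real values come from the metadata API. It only shows the grouping idea, where sites that reference the same `nwm_feature_id` are flagged as co-located.

```python
from collections import defaultdict

# Hypothetical (nws_lid, nwm_feature_id) pairs, for illustration only.
sites = [("aaaa1", 101), ("bbbb2", 101), ("cccc3", 202), ("dddd4", None)]

# Group site codes by the NWM segment they reference.
# Sites without an assigned segment are skipped.
sites_by_segment = defaultdict(list)
for lid, feature_id in sites:
    if feature_id:
        sites_by_segment[int(feature_id)].append(lid)

# A site is co-located when its segment is shared with at least one other site.
is_colocated = {
    lid: len(lids) > 1
    for lids in sites_by_segment.values()
    for lid in lids
}
print(is_colocated)
```

Here `aaaa1` and `bbbb2` share segment 101 and are flagged, `cccc3` is alone on its segment, and `dddd4` has no segment and so receives no flag.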
<br/><br/>
## v3.0.15.8 - 2021-04-29 - [PR #371](https://github.com/NOAA-OWP/cahaba/pull/371)

180 changes: 180 additions & 0 deletions tools/generate_nws_lid.py
@@ -0,0 +1,180 @@
#!/usr/bin/env python3

from pathlib import Path
import pandas as pd
import geopandas as gpd
from collections import defaultdict
from tools_shared_functions import aggregate_wbd_hucs, get_metadata
import argparse
from dotenv import load_dotenv
import os
import sys
sys.path.append('/foss_fim/src')
from utils.shared_variables import PREP_PROJECTION


load_dotenv()
#import variables from .env file
API_BASE_URL = os.getenv("API_BASE_URL")
EVALUATED_SITES_CSV = os.getenv("EVALUATED_SITES_CSV")
WBD_LAYER = os.getenv("WBD_LAYER")
#Define path to NWM stream layer
NWM_FILE='/data/inputs/nwm_hydrofabric/nwm_flows.gpkg'


def generate_nws_lid(workspace):
    '''
    Generate the nws_lid layer containing all nws_lid points, attributed with
    whether each site is a headwater and whether it is co-located with another
    site on the same NWM segment.

    Parameters
    ----------
    workspace : STR
        Directory where outputs will be saved.

    Returns
    -------
    None.

    '''

    ##############################################################################
    #Get all nws_lid points
    print('Retrieving metadata ..')

    metadata_url = f'{API_BASE_URL}/metadata/'
    #Trace downstream from all rfc_forecast_point.
    select_by = 'nws_lid'
    selector = ['all']
    must_include = 'nws_data.rfc_forecast_point'
    downstream_trace_distance = 'all'
    fcst_list, fcst_dataframe = get_metadata(metadata_url = metadata_url, select_by = select_by, selector = selector, must_include = must_include, upstream_trace_distance = None, downstream_trace_distance = downstream_trace_distance )

    #Get list of all evaluated sites not in fcst_list
    fcst_list_sites = [record.get('identifiers').get('nws_lid').lower() for record in fcst_list]
    evaluated_sites = pd.read_csv(EVALUATED_SITES_CSV)['Total_List'].str.lower().to_list()
    evaluated_sites = list(set(evaluated_sites) - set(fcst_list_sites))

    #Trace downstream from all evaluated sites not in fcst_list
    select_by = 'nws_lid'
    selector = evaluated_sites
    must_include = None
    downstream_trace_distance = 'all'
    eval_list, eval_dataframe = get_metadata(metadata_url = metadata_url, select_by = select_by, selector = selector, must_include = must_include, upstream_trace_distance = None, downstream_trace_distance = downstream_trace_distance )

    #Trace downstream from all sites in HI/PR.
    select_by = 'state'
    selector = ['HI','PR']
    must_include = None
    downstream_trace_distance = 'all'
    islands_list, islands_dataframe = get_metadata(metadata_url = metadata_url, select_by = select_by, selector = selector, must_include = must_include, upstream_trace_distance = None, downstream_trace_distance = downstream_trace_distance )

    #Append all lists
    all_lists = fcst_list + eval_list + islands_list

    ###############################################################################
    #Compile NWM segments from all_lists

    #Get dictionary of downstream segment (key) and target segments (values)
    #Get dictionary of target segment (key) and site code (value)
    downstream = defaultdict(list)
    target = defaultdict(list)
    #For each lid metadata dictionary in list
    for lid in all_lists:
        site = lid.get('identifiers').get('nws_lid')
        #Get the nwm feature id associated with the location
        location_nwm_seg = lid.get('identifiers').get('nwm_feature_id')
        #Get all downstream segments
        downstream_nwm_segs = lid.get('downstream_nwm_features')
        #If location_nwm_seg is valid, populate both dictionaries.
        if location_nwm_seg:
            #Dictionary with target segment and site
            target[int(location_nwm_seg)].append(site)
            #Dictionary of key (2nd to last element) and value (target segment)
            #2nd to last element used because last element is always 0 (ocean) and the 2nd to last allows for us to get the river 'tree' (Mississippi, Colorado, etc)
            value = location_nwm_seg
            if not downstream_nwm_segs:
                #Special case, no downstream nwm segments are returned (PR/VI/HI).
                key = location_nwm_seg
            elif len(downstream_nwm_segs) == 1:
                #Special case, the nws_lid is within 1 segment of the ocean (0)
                key = location_nwm_seg
            elif len(downstream_nwm_segs) > 1:
                #Otherwise, 2nd to last element used to identify proper river system.
                key = downstream_nwm_segs[-2]
            #Dictionary with key of 2nd to last downstream segment and value of site nwm segment
            downstream[int(key)].append(int(value))
    ###############################################################################
    #Walk downstream the network and identify headwater points
    print('Traversing network..')

    #Import NWM file and create the NWM network dictionary (segment ID -> downstream segment ID).
    nwm_gdf = gpd.read_file(NWM_FILE)
    network = nwm_gdf.groupby('ID')['to'].apply(list).to_dict()

    #Walk through network and find headwater points
    all_dicts = {}
    for tree, targets in downstream.items():
        #All targets are initially assigned headwater status
        sub_dict = {i:'is_headwater' for i in targets}
        #Walk downstream of each target
        for i in targets:
            #Skip targets already flagged as not headwater; their downstream targets were flagged on an earlier walk
            if sub_dict[i] == 'not_headwater':
                continue
            #Get from_node and to_node.
            from_node = i
            [to_node] = network[from_node]
            #Walk downstream from target
            while to_node > 0:
                #Check if to_node is a target before advancing, so the segment immediately downstream is not skipped
                if to_node in sub_dict:
                    sub_dict[to_node] = 'not_headwater'
                #Advance to the next downstream segment
                [to_node] = network[to_node]
        #Append status to master dictionary
        all_dicts.update(sub_dict)

    #Create dictionaries of nws_lid (key) and headwater status (value) and nws_lid (key) and co-located with same feature_id (value)
    final_dict = {}
    duplicate_dict = {}
    for key, status in all_dicts.items():
        site_list = target[key]
        for site in site_list:
            final_dict[site] = status
            if len(site_list) > 1:
                duplicate_dict[site] = 'is_colocated'
            else:
                duplicate_dict[site] = 'not_colocated'

    ##############################################################################
    #Get Spatial data and populate headwater/duplicate attributes
    print('Attributing nws_lid layer..')

    #Geodataframe from all_lists, reproject, and reset index.
    _, nws_lid_gdf = aggregate_wbd_hucs(all_lists, WBD_LAYER, retain_attributes = False)
    nws_lid_gdf.columns = [name.replace('identifiers_','') for name in nws_lid_gdf.columns]
    nws_lid_gdf.to_crs(PREP_PROJECTION, inplace = True)
    #reset_index returns a new object unless inplace is set
    nws_lid_gdf.reset_index(drop = True, inplace = True)

    #Create DataFrames of headwater and duplicates and join.
    final_dict_pd = pd.DataFrame(list(final_dict.items()), columns = ['nws_lid','is_headwater'])
    duplicate_dict_pd = pd.DataFrame(list(duplicate_dict.items()), columns = ['nws_lid','is_colocated'])
    attributes = final_dict_pd.merge(duplicate_dict_pd, on = 'nws_lid')
    attributes.replace({'is_headwater': True,'is_colocated': True,'not_headwater': False,'not_colocated':False}, inplace = True)

    #Join attributes, remove sites with no assigned nwm_feature_id and write to file
    joined = nws_lid_gdf.merge(attributes, on='nws_lid', how = 'left')
    joined.dropna(subset =['nwm_feature_id'], inplace = True)
    Path(workspace).mkdir(parents = True, exist_ok = True)
    joined.to_file(Path(workspace) / 'nws_lid.gpkg', driver = 'GPKG')



if __name__ == '__main__':
    #Parse arguments
    parser = argparse.ArgumentParser(description = 'Create spatial data of nws_lid points attributed with headwater and co-located status.')
    parser.add_argument('-w', '--workspace', help = 'Workspace where all data will be stored.', required = True)
    args = vars(parser.parse_args())

    #Run generate_nws_lid
    generate_nws_lid(**args)
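
The headwater walk in `generate_nws_lid` can be hard to follow without data, so here is a minimal standalone sketch of the same idea on a toy network. The network, the segment IDs, and the helper name `flag_headwaters` are all hypothetical and chosen only for illustration. The sketch assumes each segment has exactly one downstream segment and that `0` marks the ocean, as the script's comments describe.

```python
# Hypothetical toy network: segment ID -> downstream segment ID (0 = ocean).
# Segments 1 -> 2 -> 3 -> 4 -> 0 form one river; segment 5 joins it at 3.
NETWORK = {1: 2, 2: 3, 3: 4, 4: 0, 5: 3}


def flag_headwaters(network, targets):
    """Return {segment: bool}; True means no other target lies upstream."""
    is_headwater = {seg: True for seg in targets}
    for seg in targets:
        # A demoted target needs no walk of its own: the upstream target
        # that demoted it already walked through it down to the ocean.
        if not is_headwater[seg]:
            continue
        node = network.get(seg, 0)
        while node > 0:
            # Test every downstream segment, including the first one.
            if node in is_headwater:
                is_headwater[node] = False
            node = network.get(node, 0)
    return is_headwater


result = flag_headwaters(NETWORK, targets=[2, 4, 5])
print(result)
```

With targets on segments 2, 4, and 5, segment 4 lies downstream of both other targets and is flagged as not a headwater, while 2 and 5 keep their headwater status. Using `network.get(..., 0)` is a choice made for the sketch so that a segment missing from the toy network ends the walk instead of raising a `KeyError`.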