Fields to be imported? #3

bcamper · 2014-10-08T19:13:46Z

Hi, could you let us know which of the fields in the original dataset you plan to import, and what the corresponding proposed OSM tags will be? Apologies if I missed this somewhere!

almccon · 2014-10-08T19:55:49Z

We haven't gotten that far yet! Would love some help figuring that out.

almccon · 2014-12-12T01:38:39Z

The building data looks like this:

Official documentation for the building outlines: http://egis3.lacounty.gov/dataportal/2011/04/28/countywide-building-outlines/

Of those attributes, I propose we keep the height, and discard the rest. Source and date (LARIAC, 2008) will be added to the changesets.

The address data looks like this:

Official documentation for the address points is here: http://egis3.lacounty.gov/dataportal/2012/06/19/la-county-address-points

I'm not an expert on OSM's addressing scheme, so any input on this part would be useful. We will need to make extensive changes to the convert.py script (I just uploaded the script from the NYC import repo).

bcamper · 2014-12-12T01:42:59Z

Amazing timing, I just remembered about and showed this issue to @meetar this afternoon. We'll take a look and chime in with any thoughts on buildings, and @missinglink or @sevko might have some input on addressing from their work on Pelias.

sevko · 2014-12-12T16:35:11Z

@almccon, the dataset should map nicely onto the addr:* tags in OSM (see the wiki). There are quite a few of them, but we mostly just use the following to filter incoming data in our address imports:

addr:housename
addr:housenumber
addr:street
addr:city
addr:district
addr:postcode
addr:country

The LA dataset looks pretty granular, so you might want to investigate the more esoteric tags (UnitType/UnitName might be used for addr:unit, for instance), and aggregate some of the others to fit one field (Numprefix, Number, and Numsuffix into addr:housenumber). Note that individual buildings may have multiple addresses, which perhaps @bcamper knows more about?
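A minimal sketch of the aggregation described above, assuming each address point arrives as a dict keyed by the source column names (Numprefix, Number, Numsuffix, UnitType, UnitName). The function name `address_tags` and the choice to join the parts with a single space are illustrative assumptions, not what convert.py does.

```python
def address_tags(row):
    """Build a partial set of OSM addr:* tags from one address-point record.

    `row` is assumed to be a dict keyed by the source column names.
    """
    def clean(key):
        # Treat missing values as empty and strip stray whitespace.
        value = row.get(key)
        return str(value).strip() if value is not None else ""

    tags = {}

    # Prefix, number, and suffix collapse into one house number.
    parts = [clean("Numprefix"), clean("Number"), clean("Numsuffix")]
    housenumber = " ".join(p for p in parts if p)
    if housenumber:
        tags["addr:housenumber"] = housenumber

    # Assumption: unit type and unit name are joined into addr:unit.
    unit = " ".join(p for p in (clean("UnitType"), clean("UnitName")) if p)
    if unit:
        tags["addr:unit"] = unit

    return tags
```

Whether addr:unit should carry the unit type (for example "APT 4") or only the unit name is a tagging decision that would still need to be checked against the wiki.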

almccon · 2014-12-16T06:23:12Z

Here's a map of the building heights.

Looks like Pasadena and Glendale are missing height data (the red areas in the north). I propose we ignore any height tags under 1 foot.

Also, should we convert the data from feet to meters, or leave it as-is (with appropriate units)?

bcamper · 2014-12-16T14:19:31Z

According to the wiki, the height tag should be in meters (https://wiki.openstreetmap.org/wiki/Key:building). I have seen some height data with units appended, but meters are more compatible and easier to work with (NYC building data is in meters).

Makes sense to ignore height < 1 and just leave off the height tag for those.
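As a rough sketch of the two rules agreed on here (convert feet to meters, and leave off the height tag when the source height is under 1 foot), something like the following could work. The function name `height_tag` and the rounding to one decimal place are assumptions for illustration.

```python
FEET_TO_METERS = 0.3048  # exact, by definition of the international foot


def height_tag(height_feet):
    """Return {'height': '<meters>'}, or {} when the height should be skipped."""
    try:
        feet = float(height_feet)
    except (TypeError, ValueError):
        # Missing or non-numeric source value: no height tag.
        return {}

    # Heights under 1 foot are treated as missing data (this also rejects NaN).
    if not feet >= 1:
        return {}

    # Assumption: one decimal place is enough precision for the tag value.
    meters = round(feet * FEET_TO_METERS, 1)
    return {"height": str(meters)}
```

For example, `height_tag(0.5)` and `height_tag(None)` both return an empty dict, so the building simply gets no height tag.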

bcamper · 2014-12-16T14:40:41Z

Regarding building fields, it may be worth preserving an id field (BLD_ID?) to tie back to the original data. NYC import did something similar in the nycdoitt:bin field: http://www.openstreetmap.org/way/265301868

I also suggest preserving the elevation in an ele tag: https://wiki.openstreetmap.org/wiki/Key:ele. It isn't as common a use case as building height, but the import is really the time to do it if we are ever going to have this data readily available, and it would be useful for terrain + 3d modeling (we would use it in Tangram).

almccon · 2014-12-16T19:33:26Z

Regarding the building id, there's both BLD_ID and AIN. I'm not sure which one is more stable, or what they're used for. I know that the NYC import was a special case in that the city had a clear plan about how they would update their own data following changes in OSM. If LA County doesn't have such a plan, then building IDs may not be as useful. Generally there seems to be a consensus forming in OSM that IDs are not very useful in imported data. But we can finalize that decision after we've consulted with the imports committee.

For now, we should find out whether BLD_ID and/or AIN will be the same in the 2014 data. I'm tracking that possibility in issue #6. Can someone who knows more about the 2014 data (@cityhubla?) find out which IDs will be the same?

cityhubla · 2014-12-16T19:55:16Z

The AIN should be the same. These are the legal numbers of the parcels and anything pertaining to them, which includes the buildings. These numbers rarely change; they change only if the owner of the property decides to split it into pieces (subdivide). So the data should remain the same. The parcel data contains attributes for the type of building (single story, mixed use, commercial, retail, church, school, theater, parking lots, government buildings, etc.), which we could add. I've emailed UCLA and the County for their data use disclaimer regarding issue #2.

I'll shoot an email to the County to ask whether they plan to keep the BLD_ID consistent in the not-yet-released 2014 data.

almccon · 2014-12-28T04:51:03Z

Okay, check cdd4342 for my first pass at the conversion code. I know for sure I'm not catching all of the fields that I should be. I'd love someone else to take a look at it. I'm happy with how the address points are joining with the buildings (where possible).

If you use my chunks_venice.zip sample files, and run merge.py and convert.py you'll get some .osm files to play with.

cityhubla · 2015-01-26T19:50:10Z

Regarding the unique building IDs: the NYC import included them, so it may be best to add them as well. I could contact LA County to see whether the 2014 LARIAC building outlines (non-public) have completely new IDs, different from the 2008 public set we're using. I'm anticipating that the new set will have outlines of recently constructed buildings replacing existing ones; we could then automate this in the future by replacing the IDs that are no longer there with the new ones. I'm also wondering if there is a way to retag the removed outlines for archiving.

cityhubla · 2015-01-26T20:19:36Z

Also, the elevation data is off at a number of buildings; this 3D demo I made of Hollywood (ThreejsQGIS demo) shows the discrepancy. You will see some buildings elevated way off from a normally sloping area. At the office where I work, we use the outlines to build our 3D contextual site models for conceptual architectural work. I usually have to recenter the object axes and then intersect them with the terrain in software like Rhino3d or Sketchup.

cityhubla · 2015-01-26T20:26:03Z

It may take quite a bit of time to average them out if we script something that takes each feature and checks whether its ELEV is within the average of the closest surrounding buildings.
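One way such a check might look, as a brute-force sketch: for each building, average the ELEV of its k nearest neighbors and flag the building if it differs by more than a tolerance. The record layout (`id`, `x`, `y`, `elev`), the function name, and the default values of `k` and `tolerance` are all assumptions; coordinates are assumed to be planar centroids, and the tolerance is in the same unit as ELEV.

```python
import math


def flag_elevation_outliers(buildings, k=5, tolerance=10.0):
    """Return ids of buildings whose elevation is far from their neighbors' mean.

    `buildings` is a list of dicts with hypothetical keys id, x, y, elev.
    """
    flagged = []
    for b in buildings:
        # Brute force: sort every other building by distance to this one.
        others = [o for o in buildings if o is not b]
        others.sort(key=lambda o: math.hypot(o["x"] - b["x"], o["y"] - b["y"]))
        nearest = others[:k]
        if not nearest:
            continue

        mean_elev = sum(o["elev"] for o in nearest) / len(nearest)
        if abs(b["elev"] - mean_elev) > tolerance:
            flagged.append(b["id"])
    return flagged
```

This compares every pair of buildings, so a county-wide run would need a spatial index. A single bad neighbor also pulls the mean toward itself, which is why the tolerance has to be fairly generous; a median would be more robust but is not what was proposed here.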

cityhubla · 2015-03-25T07:43:45Z

We may have some luck with the incoming LA County Open Data Initiative. It seems the county assessor is actually going to release the roll data for each parcel for free (more up to date than what I found from UCLA). Would this help in the code development? One attribute that would be very beneficial is the "building type": the roll data identifies the building as a single family home, commercial, mixed, civic, school, church, etc. The text says that the data portal will be accessible on or after March 30th.

cityhubla · 2015-03-25T07:46:34Z

Here's a description of the use codes in that dataset

almccon · 2015-03-26T21:04:43Z

Oh, that does look quite interesting! I'd love to be able to use building=apartments and building=house instead of just building=yes.

http://wiki.openstreetmap.org/wiki/Tag:building%3Dhouse
http://wiki.openstreetmap.org/wiki/Tag:building%3Dapartments

In fact, many of those, like churches, schools, hospitals, could also be captured.

And @lyzidiamond and @joeyklee would love Single family residence with pool, even though we probably can't use that one meaningfully for the import.

almccon · 2015-08-23T01:44:03Z

http://wiki.openstreetmap.org/wiki/Tag:building%3Dgarage would also be great for all those garages behind single-family homes

almccon · 2015-08-23T03:18:05Z

Just had a thought: does the assessor data contain information about the number of floors in the building? If so, we can populate the building:levels tag as well as the height which comes from the building footprint data.

joeyklee · 2015-08-23T04:48:13Z

For the case of LA, I don't think it is included (http://assessor.lacounty.gov/local-roll/), but I do think the "building:levels" tag would be worth adding, since it could be derived from geotagged photos or google streetview imagery :)

In cases like the Chicago building footprint data, they include building:levels for some of their buildings.

almccon · 2015-08-23T04:59:19Z

Bummer. If it's not in the assessor data then we can't include it in the import. But there's nothing stopping people from adding it later based on ground-level imagery or surveying on-the-ground.

We can't use Google Street View for OSM mapping because of the license, but we can use Mapillary where available: http://www.mapillary.com/map/im/2FuBwfL320GgjY5amgiCgw/photo

joeyklee · 2015-08-23T05:01:38Z

That is no bueno. And definitely good point about the licensing. Always something to keep in mind ;)

cityhubla · 2015-08-23T05:26:59Z

@jschleuss and I made this google spreadsheet to figure out the fields and their OSM tags:

Attributes to be imported (Google Spreadsheets). We can use this spreadsheet to determine which values are kept or omitted.

The 2014 Assessor's data (also referred to in issue #15) has three distinct use fields:

GeneralUseType (general use of the building, e.g., Residential, Commercial, Industrial, Institutional)
SpecificUseType (more specific use of the building, e.g., Single Family Residence, Church, Bowling Alley, Golf Course)
SpecificUseDetail1 (additional detail about the building, e.g., Dance Hall, Supermarket)

Jon and I propose that these tags be attached during the import.

GeneralUseType || building tag
SpecificUseType || building:use tag
SpecificUseDetail1 || amenity tag

For example, if a building is a detached single family residence, its tags would be as follows:

building=residential
building:use=Single Family Residence

OSM's taginfo shows residential as a general tag, so we could treat the building tag as the general one. We think there should be a general tag for general designations like residential, commercial, or industrial, with a subtag like building:use recording whether a residential building is single family, duplex, triplex, or condominium.

The assessor's data is fine-grained, and we're open to tagging it differently. Thoughts?

cityhubla · 2015-08-23T05:34:34Z

@joeyklee @almccon,

The SpecificUseDetail2 field of the 2014 Assessor's data has these values:

14 to 20 Stories
6 to 13 Stories
Five Stories
Four Stories
One Story
Over 20 Stories
Three Stories
Two Stories

We could update the script to add these to the building:levels tag

joeyklee · 2015-08-23T05:36:22Z

@cityhubla Nice find! I'm sorry I missed that. It's been awhile since I revisited the data. Better two brains than one!

cityhubla · 2015-08-23T05:38:37Z

The only issue is that this category has values like Modular, pool, vacant land, fast food etc. It would have to be sorted during scripting. Those identified as 14-20 or 6-13 would need to be omitted as the tag is number based, right?
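The sorting described above could be sketched like this: only the exact single-number values ("One Story" through "Five Stories") become a building:levels tag, while ranges such as "6 to 13 Stories" and unrelated values such as "Modular" or "pool" produce no tag. The function name `levels_tag` is hypothetical.

```python
# Only the single-number story values listed in SpecificUseDetail2.
STORY_WORDS = {"One": 1, "Two": 2, "Three": 3, "Four": 4, "Five": 5}


def levels_tag(detail):
    """Map a SpecificUseDetail2 value to a building:levels tag, or {}."""
    words = (detail or "").split()

    # Accept only the two-word forms "<Number> Story" / "<Number> Stories".
    # Ranges ("6 to 13 Stories", "Over 20 Stories") have more words and are
    # skipped because building:levels expects a single number.
    if (
        len(words) == 2
        and words[0] in STORY_WORDS
        and words[1] in ("Story", "Stories")
    ):
        return {"building:levels": str(STORY_WORDS[words[0]])}
    return {}
```

For instance, `levels_tag("Two Stories")` yields a levels tag, while `levels_tag("14 to 20 Stories")` returns an empty dict.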

cityhubla · 2015-08-23T05:52:22Z

Here is a breakdown of all the unique values in the use categories, Google Spreadsheet

cityhubla · 2015-08-23T07:21:02Z

I'm figuring out the markdown syntax for adding tables to the readme. We could transition the gdoc to the readme or create a git_wiki to simplify what attributes we're importing.

almccon · 2015-08-23T16:43:28Z

This is great! I think we can add these tables to the README in Markdown, or else put them in the wiki on osm.org. Eventually we'll have to document everything on OSM's wiki anyway. But we shouldn't use the github wiki, since that will just confuse things.

jschleuss · 2015-08-24T17:47:52Z

@cityhubla okay. I added the 2015 Assessor data to that spreadsheet. I also started thinking about the OSM tags, starting with our thoughts above. I think we could work with OSM's tags a bit with building=house or building=school. And then go generic when we don't have more information.

We could augment the script to look for both values before making a "decision" on the tags to add. Maybe we do something similar for SpecificUseType1 and 2?

jschleuss · 2015-08-24T19:48:36Z

@almccon you know the assessor's data also has address fields, right? But we're opting for the address points because some buildings will have multiple addresses? Is that right?

almccon · 2015-08-24T22:03:59Z

No, I did not know that the assessor's data has addresses. But you're correct, if the assessor only has one address per parcel then the address points will be preferable.

cityhubla · 2015-08-25T17:57:42Z

The table has been added to the readme, and I'll update the osm wiki. From the looks of it, and from what Jon and I talked about last Sunday, our import will be really comprehensive. Lots of great data.

cityhubla · 2015-08-25T18:38:08Z

If anyone has time, the assessor's data has values in the USE columns that could be sorted into specific OSM tags, like the building:levels tag. Here is a gdoc of values to check for any that could be sorted that way; otherwise the script would tag them with building:use. @jschleuss also prepared a sheet in the gdoc to rename some values to match the ones in OSM.

planemad · 2015-12-30T11:22:40Z

Just went through the discussions and it looks like we can categorize the fields into two parts:

Building attributes specific to the footprint (area): If the building has been reconstructed since 2008, these are useless
Address attributes specific to the geographic location (point): These are usually valuable even if the building footprint has changed

Per this trial #18 (comment) I did notice that around 1 in 20 buildings had changed and were not good to be imported. If we tie in the address fields to the building footprints while importing, we will have to discard both the footprints and address together, but ideally we would want to import the address as a point property or add it to a newer footprint if possible.

How about we split the dataset into polygons with just the footprint and building attributes, and a point dataset of address attributes extracted from building centroids?

This will allow us to better fill data gaps in OSM rather than trying to throw all the data in at once:

Import only the footprints and building attributes that match with imagery
Update any existing OSM footprint geometry if needed
Conflate the address points to the latest building footprint on OSM if they overlap, manually inspect addresses which did not overlap
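The conflation step in the list above might be sketched with a plain point-in-polygon test: an address point that falls inside exactly one footprint is attached to it, and everything else goes to a pile for manual inspection. The data layout (footprints as a dict of id to vertex list, addresses as dicts with `x`, `y`, `tags`) and the function names are assumptions, and the test ignores holes and multipolygons.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test; `ring` is a list of (x, y) vertices, closed or not."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def conflate(addresses, footprints):
    """Attach address tags to the one footprint containing each point.

    Returns (matched, unmatched): matched maps footprint id to a list of
    address tag dicts; unmatched holds addresses needing manual inspection.
    """
    matched, unmatched = {}, []
    for addr in addresses:
        hits = [
            fid
            for fid, ring in footprints.items()
            if point_in_polygon(addr["x"], addr["y"], ring)
        ]
        if len(hits) == 1:
            matched.setdefault(hits[0], []).append(addr["tags"])
        else:
            unmatched.append(addr)
    return matched, unmatched
```

Keeping a list of tag dicts per footprint leaves room for buildings with several addresses, which would then still need a rule for whether they stay as separate address nodes.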

maning · 2016-01-15T12:28:43Z

I took a stab at reviewing the Assessor categories. I think they contain a lot of information that can fine-tune the building tag beyond what is in the GeneralUse fields. For one, OSM doesn't have Institutional and Miscellaneous.

What I did was first compare GeneralUse and SpecificUse and categorize them according to which tags exist in OSM according to Taginfo. Then, I compared SpecificUse and Specific_1 and categorized anything that was left out.

In most cases, we will override the GeneralUse in favor of SpecificUse. For example,
A feature which has GeneralUse = Commercial; SpecificUse=Department Store will be
building=department_store. Or GeneralUse=Recreational; SpecificUse=Athletic and Amusement Facility; Specific_1 = Dance Hall will be building=recreational; building:use=dance_hall.
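The override rule in these two examples could be sketched as a pair of lookups: try SpecificUse first, fall back to GeneralUse, and go generic with building=yes when neither is known. The tables below contain only the two examples given here and are illustrative; the function and table names are hypothetical, and treating every non-empty Specific_1 as a building:use value is an assumption.

```python
# Illustrative entries only, taken from the two examples in this comment.
SPECIFIC_USE_TO_BUILDING = {"Department Store": "department_store"}
GENERAL_USE_TO_BUILDING = {"Recreational": "recreational"}


def slug(value):
    """Lowercase a source value and join its words with underscores."""
    return "_".join(value.lower().split())


def building_tags(general_use, specific_use="", specific_1=""):
    """Pick building and building:use tags, preferring the more specific field."""
    tags = {}
    if specific_use in SPECIFIC_USE_TO_BUILDING:
        # SpecificUse overrides GeneralUse when a mapping exists.
        tags["building"] = SPECIFIC_USE_TO_BUILDING[specific_use]
    elif general_use in GENERAL_USE_TO_BUILDING:
        tags["building"] = GENERAL_USE_TO_BUILDING[general_use]
    else:
        # No usable mapping: fall back to the generic value.
        tags["building"] = "yes"

    # Assumption: any non-empty Specific_1 value becomes building:use.
    if specific_1:
        tags["building:use"] = slug(specific_1)
    return tags
```

A value such as Institutional, which has no OSM equivalent, falls through to building=yes in this sketch unless its SpecificUse has an entry in the first table.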

Tags we can include are building, building:use and building:levels. I don't think adding the amenity and shop tags is appropriate, since they would also be applied to the smaller buildings within the property/parcel.

Pros - we adopt the common OSM convention for tagging buildings. We translate as much info as possible from the source.
Cons - we lose the actual source attributes from the Assessor database.

Next Actions

Run a second pass of review of the spreadsheet to refine tagging rules.
Modify convert.py to sanitize the tags.

@almccon @jschleuss @cityhubla @planemad @batpad

maning · 2016-01-18T12:12:20Z

PR for review here. #28

maning · 2016-02-09T10:03:35Z

There are cases where a different building category from Assessor data is assigned to two parts of a building.

talllguy · 2016-02-12T02:10:32Z

@maning in #3 (comment) what are the two categories*?

maning · 2016-02-12T02:19:01Z

@talllguy, the yellow = residential, purple = industrial. The area northwest is an industrial complex.

talllguy · 2016-02-12T03:10:14Z

@maning interesting. I suspect a zoning boundary caused this. Having worked for a County gov't GIS team, I wouldn't be surprised if such a method was used to tag building uses by an assessor. You might try loading the zoning boundary if you can find it on LA county's open data site to compare.

My prediction is that the zoning line will transect the building right where the split is, because zoning lines were historically mapped on small scale maps by hand, long before GIS was around. Then when GIS came around, GIS analysts were forced to digitize the zoning lines exactly where they were on the map, because changing them at all requires a law change. I digress. Check that zoning line, and if it is the culprit, you might just want to use an algorithm that merges into the larger piece, or ignore this zoning-inferred tagging.

Also, regarding institutional, my county used that as well. In my import, I believe I changed them all to building=yes, because there was no positive way to say what they translated to. Typically they're school, college or government though.
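The "merge into the larger piece" idea could be sketched as: sum the area of the parts per building value and keep the value covering the most area. This assumes planar coordinates and simple rings without holes; the function names and the (ring, value) input layout are hypothetical.

```python
def ring_area(ring):
    """Unsigned area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def dominant_building_tag(parts):
    """Return the building value that covers the largest total area.

    `parts` is a list of (ring, building_value) pairs for one split building.
    Returns None for an empty list.
    """
    area_by_tag = {}
    for ring, value in parts:
        area_by_tag[value] = area_by_tag.get(value, 0.0) + ring_area(ring)
    if not area_by_tag:
        return None
    return max(area_by_tag, key=area_by_tag.get)
```

This only picks a tag; actually merging the geometries, and deciding when a split is a genuine boundary between two buildings, would still need review.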

maning · 2016-02-12T04:51:46Z

> I suspect a zoning boundary caused this. Having worked for a County gov't GIS team, I wouldn't be surprised if such a method was used to tag building uses by an assessor. You might try loading the zoning boundary if you can find it on LA county's open data site to compare.

You are correct. We used LA County's Assessor data by joining the building shapefile and the assessor csv, using AIN as the join field.

> Also, regarding institutional, my county used that as well. In my import, I believe I changed them all to building=yes, because there was no positive way to say what they translated to. Typically they're school, college or government though.

You are correct again, building=institutional is not an OSM tag. For this, we created a lookup table using the secondary use type of the building. For example, where the assessor data has generaluse=building and specificuse=school, we used the tag building=school. See: https://github.com/osmlab/labuildings/tree/master/mappings_csv

maning · 2016-03-01T12:39:47Z

Closing.
For building parts to which the Assessor data assigned different classifications, the importer should merge them manually.

almccon mentioned this issue Dec 12, 2014

How to divide up the tasks? #4

Closed

almccon mentioned this issue Dec 17, 2014

Where to share processed data? #7

Closed

cityhubla mentioned this issue Dec 31, 2014

Conflate addresses? #2

Closed

almccon added a commit that referenced this issue Jan 10, 2015

Start to describe which fields to import for #3

7c47822

almccon added a commit that referenced this issue Jan 15, 2015

Convert addresses for #3. Deal with all fields correctly.

bd3d598

almccon mentioned this issue Aug 23, 2015

Modify convert.py to translate assessor data into OSM tags #14

Closed

cityhubla mentioned this issue Aug 25, 2015

Define update strategy #6

Closed

This was referenced Dec 31, 2015

Building parts instead of building footprints? #19

Closed

Suppress Pasadena #20

Closed

maning closed this as completed Mar 1, 2016

maning mentioned this issue Apr 6, 2016

Building shapes split in two by parcel, city or some other boundaries #71

Closed

almccon mentioned this issue May 7, 2016

Can we figure out building type using some other dataset? almccon/bellingham-wa-buildings#6

Closed
