Commit 664c428

committed
initial commit
0 parents  commit 664c428

52 files changed

+24200
-0
lines changed

.travis.yml

+12
@@ -0,0 +1,12 @@
1+
language: go
2+
3+
go:
4+
- 1.9
5+
- tip
6+
7+
after_script:
8+
- FIXED=$(go vet ./... 2>&1 | wc -l); if [ $FIXED -gt 0 ]; then echo "go vet - $FIXED issue(s), please fix." && exit 2; fi
9+
- FIXED=$(go fmt ./... | wc -l); if [ $FIXED -gt 0 ]; then echo "gofmt - $FIXED file(s) not formatted correctly, please run gofmt to fix this." && exit 2; fi
10+
11+
script:
12+
- go test -v ./...

LICENSE.md

+20
@@ -0,0 +1,20 @@
1+
The MIT License (MIT)
2+
3+
Copyright (c) 2017 Paul Mach
4+
5+
Permission is hereby granted, free of charge, to any person obtaining a copy of
6+
this software and associated documentation files (the "Software"), to deal in
7+
the Software without restriction, including without limitation the rights to
8+
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
9+
the Software, and to permit persons to whom the Software is furnished to do so,
10+
subject to the following conditions:
11+
12+
The above copyright notice and this permission notice shall be included in all
13+
copies or substantial portions of the Software.
14+
15+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
17+
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
18+
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
19+
IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
20+
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

README.md

+262
@@ -0,0 +1,262 @@
1+
osmzen [![Build Status](https://travis-ci.org/paulmach/osmzen.png?branch=master)](https://travis-ci.org/paulmach/osmzen) [![Godoc Reference](https://godoc.org/github.com/paulmach/osmzen?status.png)](https://godoc.org/github.com/paulmach/osmzen)
2+
======
3+
4+
This is a port of [tilezen/vector-datasource](https://github.com/tilezen/vector-datasource) developed by
5+
[Mapzen](https://mapzen.com/). It converts [OpenStreetMap](https://www.openstreetmap.org/) data
6+
directly into GeoJSON with properties that are understood by [Mapzen house
7+
styles](https://mapzen.com/products/maps/).
8+
9+
10+
A Postgres database is not required to evaluate the logic that is originally defined in a combination
11+
of SQL and Python. This allows for the quick mapping of any OSM element(s) to a `kind`/`kind_detail`
12+
normalization. Such a normalization is non-trivial given the "diversity" of OSM tagging so projects
13+
like tilezen/vector-datasource (and many others) are necessary.
14+
15+
The port currently implements almost all features applicable to evaluating zoom 14+ tile data.
16+
These features include:
17+
18+
* all filter, min_zoom and output logic defined in the `yaml/*.yaml` files,
19+
* all transforms that apply (implementation-specific data transforms are skipped),
20+
* the CSV matcher post processor to set the `scale_rank` and `sort_rank` properties,
21+
* geometry clipping and label placement logic.
22+
23+
A lot of post processors still need to be ported, but only a few of the missing ones apply
24+
to zooms 14+. Missing post processors include: landuse_kind intercuts, merging line strings
25+
and merging buildings with building parts.
26+
27+
It would also be nice to port some of the integration tests as they would give confidence that
28+
things are really working as expected. Right now there are just some unit tests and some
29+
high level sanity checks.
30+
31+
#### Changes from the original tilezen/vector-datasource
32+
33+
The goal is for there to be no functional differences for zooms 14+. The YAML definition files are
34+
unchanged; there are just a few minor changes to the post processor filtering in `queries.yaml`. See
35+
the [github diff](https://github.com/tilezen/vector-datasource/compare/master...paulmach:master).
36+
37+
The port is based on the [v1.4.0ish](https://github.com/tilezen/vector-datasource/releases/tag/v1.4.0)
38+
version of the vector-datasource. The [fork](https://github.com/paulmach/vector-datasource) or the
39+
[github diff](https://github.com/paulmach/vector-datasource/compare/master...tilezen:master) between
40+
it and upstream/master are kept as the intended "reference".
41+
42+
Usage
43+
-----
44+
45+
1. Load and compile the `queries.yaml`, `yaml/*.yaml` and `spreadsheets/*_rank/*.csv` files. This can
46+
be done by loading the files directly using the implied directory structure:
47+
48+
config, err := osmzen.Load("config/queries.yaml")
49+
50+
or if you want to use the "official" ported config files but don't want to distribute them with
51+
the binary you can make use of the `embeddedconfig` subpackage which uses
52+
[go-bindata](https://github.com/jteeuwen/go-bindata) to "compile in" the files:
53+
54+
config, err := osmzen.LoadEmbeddedConfig(embeddedconfig.Asset)
55+
56+
If there are mistakes in the YAML the error will contain a lot of information to help debug:
57+
58+
if err, ok := errors.Cause(err).(*filter.CompileError); ok {
59+
log.Printf("error: %v", err.Error())
60+
log.Printf("cause: %v", err.Cause)
61+
log.Printf("yaml:\n%s", err.YAML()) // chunk of marshalled YAML with the issue
62+
} else if err != nil {
63+
log.Printf("other err: %v", err)
64+
}
65+
66+
2. Process some OSM data:
67+
68+
data := osm.OSM{}
69+
layers, err := config.Process(data, geo.Bound(-180, 180, -90, 90), zoom)
70+
71+
// layers is defined as `map[string]*geojson.FeatureCollection`
72+
73+
Layers can also be processed individually:
74+
75+
featureCollection, err := config.Layers["buildings"].Process(data, zoom)
76+
77+
The result is a GeoJSON feature collection with `kind`, `kind_detail` etc. properties that
78+
are understood by [Mapzen house styles](https://mapzen.com/products/maps/).
79+
80+
## Example
81+
82+
A more complete example that loads a zoom 16 area from the OSM API and
83+
then processes the tile (minus error checking):
84+
85+
```go
86+
package main
87+
88+
import (
89+
"context"
90+
"encoding/json"
91+
"fmt"
92+
93+
"github.com/paulmach/osmzen"
94+
"github.com/paulmach/osmzen/embeddedconfig"
95+
96+
"github.com/paulmach/orb/maptile"
97+
"github.com/paulmach/osm"
98+
"github.com/paulmach/osm/osmapi"
99+
)
100+
101+
func main() {
102+
tile := maptile.New(19613, 29310, 16)
103+
104+
// load osmzen config
105+
config, _ := osmzen.LoadEmbeddedConfig(embeddedconfig.Asset)
106+
107+
// get osm data for a tile from the official api.
108+
bounds, _ := osm.NewBoundsFromTile(tile)
109+
data, _ := osmapi.Map(context.Background(), bounds)
110+
111+
// process the data
112+
// The tile bound will be used to exclude interesting nodes
113+
// and labels outside the tile.
114+
layers, _ := config.Process(data, tile.Bound(), tile.Z)
115+
116+
// pretty print the json
117+
pretty, _ := json.MarshalIndent(layers, "", " ")
118+
fmt.Println(string(pretty))
119+
}
120+
```
121+
122+
Implementation details
123+
----------------------
124+
125+
At a high level [tilezen/vector-datasource](https://github.com/tilezen/vector-datasource) filters and
126+
processes its data using the following steps:
127+
128+
1. find relevant elements for a layer using the SQL query defined in `data/{layer_name}.jinja`,
129+
2. filter the elements using filter *conditions* defined in `yaml/{layer_name}.yaml`,
130+
3. generate properties for each element using the matching filter's output *expressions*,
131+
4. apply *transforms* to each element independently,
132+
5. apply *post processes* to all the layers together.
133+
134+
The transforms and post processes that apply to each layer and zoom are defined in `queries.yaml`.
135+
For a lot more details see the official tilezen/vector-datasource [project
136+
overview](https://github.com/tilezen/vector-datasource/blob/master/CONTRIBUTING.md).
137+
138+
As this package is a port of that code it follows the same steps, except for step 1 since the data
139+
is passed in directly.
140+
141+
### Loading and compiling config
142+
143+
During the loading of the YAML+CSV config files everything is compiled to make sure all the
144+
expressions and function references are known. If there is a typo, or something new/unsupported, an
145+
error will be returned. See above for how to get useful information from the error. The initial
146+
compile step allows for the checking of config errors at startup. Also since the types are converted
147+
up front there is a nice performance boost of about 10x.
148+
149+
The filters and outputs defined in the `yaml/*.yaml` files are basically a set of statements that
150+
act like: "if the element tags look like this, output these kind, kind_detail, etc. properties".
151+
152+
The filters define a condition, yes/no matching, that evaluates into a boolean. During the compile
153+
step these are converted into concrete types that implement the `filter.Condition` interface. The
154+
interface is defined as:
155+
156+
type filter.Condition interface {
157+
Eval(*filter.Context) bool
158+
}
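
To make the idea of a compiled condition more concrete, here is a minimal sketch of two types that satisfy an interface of the same shape. The `Context` here is a simplified stand-in that only holds tags, and `tagEquals` and `allOf` are hypothetical names chosen for illustration. They are not taken from the package.

```go
package main

import "fmt"

// Context is a simplified, hypothetical stand-in for filter.Context.
type Context struct {
	Tags map[string]string
}

// Condition has the same shape as the filter.Condition interface.
type Condition interface {
	Eval(*Context) bool
}

// tagEquals matches when a tag has exactly the given value.
type tagEquals struct {
	Key, Value string
}

func (c tagEquals) Eval(ctx *Context) bool {
	return ctx.Tags[c.Key] == c.Value
}

// allOf matches only when every child condition matches.
type allOf []Condition

func (a allOf) Eval(ctx *Context) bool {
	for _, c := range a {
		if !c.Eval(ctx) {
			return false
		}
	}
	return true
}

func main() {
	cond := allOf{
		tagEquals{Key: "building", Value: "yes"},
		tagEquals{Key: "amenity", Value: "school"},
	}

	ctx := &Context{Tags: map[string]string{
		"building": "yes",
		"amenity":  "school",
	}}
	fmt.Println(cond.Eval(ctx))
}
```

Because each condition is a concrete value built once, evaluating an element is just a chain of method calls with no YAML parsing at runtime.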
159+
160+
The output for each filter defines what properties should be assigned to the element's GeoJSON
161+
feature. They output things such as booleans (is_tunnel), strings (kind), numbers (area) or nil to
162+
be ignored. The interface is defined as:
163+
164+
type filter.Expression interface {
165+
Eval(*filter.Context) interface{}
166+
}
167+
168+
type filter.NumExpression interface {
169+
filter.Expression
170+
EvalNum(*filter.Context) float64
171+
}
172+
173+
The `filter.NumExpression` is also implemented by expressions that must be a number (e.g. area,
174+
building height). Using it helps avoid a type indirection when we know we need numbers, for example in
175+
the `min` and `max` expressions.
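
The following sketch shows why a numeric interface can be convenient for something like `min`. It assumes simplified local copies of the two interfaces. The `constNum` and `minExpr` types are hypothetical and only illustrate the pattern; the package's own implementations may differ.

```go
package main

import "fmt"

// Context is a simplified, hypothetical stand-in for filter.Context.
type Context struct {
	Tags map[string]string
}

// Expression and NumExpression have the same shape as the interfaces above.
type Expression interface {
	Eval(*Context) interface{}
}

type NumExpression interface {
	Expression
	EvalNum(*Context) float64
}

// constNum is a numeric constant expression.
type constNum float64

func (n constNum) Eval(*Context) interface{} { return float64(n) }
func (n constNum) EvalNum(*Context) float64  { return float64(n) }

// minExpr returns the smallest of its arguments. It calls EvalNum on
// each argument, so no interface{} value has to be type-asserted.
type minExpr []NumExpression

func (m minExpr) EvalNum(ctx *Context) float64 {
	if len(m) == 0 {
		return 0
	}

	result := m[0].EvalNum(ctx)
	for _, e := range m[1:] {
		if v := e.EvalNum(ctx); v < result {
			result = v
		}
	}
	return result
}

func (m minExpr) Eval(ctx *Context) interface{} { return m.EvalNum(ctx) }

func main() {
	expr := minExpr{constNum(17), constNum(14)}
	fmt.Println(expr.EvalNum(&Context{}))
}
```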
176+
177+
The `filter.Context` is passed in at runtime and contains info about the element being evaluated
178+
like the OSM tags and geometry. It also caches "expensive" things like the area and volume that can
179+
be used by multiple filters.
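
A minimal sketch of the caching idea, assuming a context that holds a single planar ring and computes its area with the shoelace formula. The `Context`, `Point` and `Area` names are hypothetical, and the real package may compute area differently (for example on projected geometry).

```go
package main

import (
	"fmt"
	"math"
)

// Point is a planar x, y pair.
type Point [2]float64

// Context is a simplified, hypothetical stand-in for filter.Context.
type Context struct {
	Ring []Point

	area *float64 // nil until Area is first called
}

// Area returns the planar area of the ring. The value is computed once
// and then reused by every filter that asks for it.
func (ctx *Context) Area() float64 {
	if ctx.area != nil {
		return *ctx.area
	}

	sum := 0.0
	n := len(ctx.Ring)
	for i := 0; i < n; i++ {
		a, b := ctx.Ring[i], ctx.Ring[(i+1)%n]
		sum += a[0]*b[1] - b[0]*a[1]
	}

	area := math.Abs(sum) / 2
	ctx.area = &area
	return area
}

func main() {
	ctx := &Context{Ring: []Point{{0, 0}, {2, 0}, {2, 2}, {0, 2}}}
	fmt.Println(ctx.Area()) // computed
	fmt.Println(ctx.Area()) // served from the cache
}
```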
180+
181+
#### Transforms and post processes
182+
183+
After elements for a layer are matched and GeoJSON features are created, a set of transforms is
184+
applied. The transforms edit the element properties based on some logic, sometimes requiring the
185+
set of relations the original OSM element is a member of.
186+
187+
The **transforms** are matched while loading the config to a function of the form:
188+
189+
func(*filter.Context, *geojson.Feature)
190+
191+
Transforms can only change a feature; they can't remove a feature if it's "bad" for any reason, like
192+
too small for the zoom. Transforms also don't know about other features so they can't be used to
193+
remove duplicates or merge features, like parts of the same road. However, transforms can be used to
194+
do things like fix one-way direction, add the correct highway shield text, abbreviate road names,
195+
etc.
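
As an illustration of a transform with this shape, the sketch below abbreviates the last word of a road name. The `Context` and `Feature` types are simplified stand-ins, and `abbreviateName` and its abbreviation table are hypothetical. The real road name abbreviation logic is more involved.

```go
package main

import (
	"fmt"
	"strings"
)

// Context and Feature are simplified, hypothetical stand-ins for
// filter.Context and geojson.Feature.
type Context struct {
	Tags map[string]string
}

type Feature struct {
	Properties map[string]interface{}
}

// Transform has the same shape as the transform functions above.
type Transform func(*Context, *Feature)

// abbreviations is a tiny illustrative table, not the real one.
var abbreviations = map[string]string{
	"Street": "St",
	"Avenue": "Ave",
}

// abbreviateName shortens the last word of the name property if it has
// a known abbreviation. It only edits the feature it is given.
func abbreviateName(_ *Context, f *Feature) {
	name, ok := f.Properties["name"].(string)
	if !ok {
		return
	}

	words := strings.Fields(name)
	if len(words) == 0 {
		return
	}

	if short, ok := abbreviations[words[len(words)-1]]; ok {
		words[len(words)-1] = short
		f.Properties["name"] = strings.Join(words, " ")
	}
}

func main() {
	var t Transform = abbreviateName

	f := &Feature{Properties: map[string]interface{}{"name": "Market Street"}}
	t(&Context{}, f)
	fmt.Println(f.Properties["name"])
}
```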
196+
197+
The **post processes** are compiled to load files and check the parameters. They are mapped to an
198+
object implementing the `postprocess.Function` interface defined as:
199+
200+
type postprocess.Function interface {
201+
Eval(*postprocess.Context, map[string]*geojson.FeatureCollection)
202+
}
203+
204+
The function takes all the layers as input. Some examples of post processing are clipping to the
205+
tile bounds, setting sort_rank and scale_rank, removing duplicate features, removing small areas,
206+
merging lines, etc.
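
A rough sketch of a post process that removes small areas, assuming simplified stand-in types. The `Area` field, the `removeSmallAreas` type and its parameters are hypothetical and only show how a function with access to whole layers can drop features, which a transform cannot do.

```go
package main

import "fmt"

// Context, Feature and FeatureCollection are simplified, hypothetical
// stand-ins for the postprocess and geojson types.
type Context struct {
	Zoom int
}

type Feature struct {
	Properties map[string]interface{}
	Area       float64 // precomputed area, for illustration only
}

type FeatureCollection struct {
	Features []*Feature
}

// Function has the same shape as the postprocess.Function interface.
type Function interface {
	Eval(*Context, map[string]*FeatureCollection)
}

// removeSmallAreas drops features in one layer below a minimum area.
type removeSmallAreas struct {
	Layer   string
	MinArea float64
}

func (r removeSmallAreas) Eval(_ *Context, layers map[string]*FeatureCollection) {
	fc, ok := layers[r.Layer]
	if !ok || fc == nil {
		return
	}

	// filter in place, reusing the backing array
	kept := fc.Features[:0]
	for _, f := range fc.Features {
		if f.Area >= r.MinArea {
			kept = append(kept, f)
		}
	}
	fc.Features = kept
}

func main() {
	layers := map[string]*FeatureCollection{
		"landuse": {Features: []*Feature{{Area: 5}, {Area: 500}}},
	}

	var pp Function = removeSmallAreas{Layer: "landuse", MinArea: 100}
	pp.Eval(&Context{Zoom: 16}, layers)
	fmt.Println(len(layers["landuse"].Features))
}
```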
207+
208+
### Evaluating some data
209+
210+
Once everything is set up we can start evaluating data against the filters and applying the
211+
transforms and post processes. The input is OSM data, a bound, plus a zoom. The bound is used to
212+
clip geometry and check if a label should be included. The zoom is used to filter out
213+
things that are "too small" as defined by the `min_zoom` output in the `yaml/*.yaml` files. To
214+
include everything, use a high zoom, such as 20.
215+
216+
The evaluation proceeds in the following steps:
217+
218+
1. Convert OSM data to GeoJSON
219+
220+
The data is run through [osm/osmgeojson](https://github.com/paulmach/osm/tree/master/osmgeojson)
221+
which is a port of the [osmtogeojson](https://github.com/tyrasd/osmtogeojson) node.js library.
222+
This groups nodes into ways and ways into polygons. For example, we don't care about the 4 nodes
223+
that define a building, we just want the building polygon.
224+
225+
2. Run each OSM element GeoJSON feature through the filters
226+
227+
We find the first filter in each layer to match and then compute the filter's outputs. Note
228+
that an element can match in multiple layers, for example a building polygon and a POI.
229+
The input and output are both GeoJSON, however, the input contains properties based on OSM tags,
230+
but the output has properties from the filter, such as `kind`, `kind_detail`, etc.
231+
232+
3. Apply the transforms
233+
234+
The new GeoJSON object is updated a bit. This can include reversing the geometry or simplifying
235+
the name.
236+
237+
4. Apply the post processes to all the layers.
238+
239+
The end result is a layer, or set of layers, that matches those produced by `tilezen`.
240+
Note that this whole process can be applied to a single element.
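
The per-element part of the steps above (first matching filter, `min_zoom` check, then transforms) can be sketched roughly as follows. All types, the example filters and their `min_zoom` values are hypothetical. The sketch also assumes a feature is kept when its `min_zoom` is at or below the requested zoom, which is an assumption and not something this README specifies.

```go
package main

import "fmt"

// Feature is a simplified, hypothetical stand-in for geojson.Feature.
type Feature struct {
	Properties map[string]interface{}
}

// Filter pairs a match condition with its outputs.
type Filter struct {
	Match   func(tags map[string]string) bool
	MinZoom float64
	Output  map[string]interface{}
}

// Layer holds ordered filters plus the transforms for the layer.
type Layer struct {
	Filters    []Filter
	Transforms []func(tags map[string]string, f *Feature)
}

// Process evaluates one element. It returns nil if no filter matches
// or if the first matching filter has a min zoom above the given zoom.
func (l Layer) Process(tags map[string]string, zoom float64) *Feature {
	for _, flt := range l.Filters {
		if !flt.Match(tags) {
			continue
		}

		// only the first matching filter is used
		if flt.MinZoom > zoom {
			return nil
		}

		f := &Feature{Properties: map[string]interface{}{}}
		for k, v := range flt.Output {
			f.Properties[k] = v
		}
		for _, t := range l.Transforms {
			t(tags, f)
		}
		return f
	}
	return nil
}

// buildingsLayer returns an illustrative layer, not the real config.
func buildingsLayer() Layer {
	return Layer{
		Filters: []Filter{
			{
				Match:   func(t map[string]string) bool { return t["building"] == "school" },
				MinZoom: 13,
				Output:  map[string]interface{}{"kind": "building", "kind_detail": "school"},
			},
			{
				Match:   func(t map[string]string) bool { return t["building"] != "" },
				MinZoom: 14,
				Output:  map[string]interface{}{"kind": "building"},
			},
		},
		Transforms: []func(map[string]string, *Feature){
			// copy the name tag onto the feature, if present
			func(t map[string]string, f *Feature) {
				if name, ok := t["name"]; ok {
					f.Properties["name"] = name
				}
			},
		},
	}
}

func main() {
	tags := map[string]string{"building": "school", "name": "Lincoln High"}
	f := buildingsLayer().Process(tags, 16)
	fmt.Println(f.Properties)
}
```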
241+
242+
### Benchmarks
243+
244+
The first two benchmarks evaluate a single element against ALL the filters and outputs
245+
in that layer. Normally you can stop after the first match and only evaluate that one output.
246+
The third benchmark is more typical of normal usage and converts data from a zoom 16 tile.
247+
248+
```
249+
BenchmarkBuildings-4 200000 9969 ns/op 1040 B/op 42 allocs/op
250+
BenchmarkPOIs-4 10000 171457 ns/op 6816 B/op 450 allocs/op
251+
BenchmarkFullTile-4 100 11292314 ns/op 3611916 B/op 26555 allocs/op
252+
```
253+
254+
These benchmarks were run on a 2017 MacBook Pro with a 3.1 GHz processor and 8 GB of RAM.
255+
No concurrency is used in this package.
256+
257+
#### This library makes use of the following packages:
258+
259+
* [github.com/pkg/errors](https://github.com/pkg/errors) - for rich errors with stack traces
260+
* [gopkg.in/yaml.v2](http://gopkg.in/yaml.v2) - YAML parsing
261+
* [github.com/paulmach/orb](https://github.com/paulmach/orb) - geometry area, centroid, clipping, etc.
262+
* [github.com/paulmach/osm](https://github.com/paulmach/osm)
