Add support for geojson #1147

Merged
merged 6 commits into from
Jan 17, 2025

Conversation

HarelM
Contributor

@HarelM HarelM commented Jan 14, 2025

This is very similar to #413.
It uses geotools' geojson capabilities (very similar to shapefile).
I've added some minimal tests. I think they might be enough, but I'm not sure.
I was able to run the tests I added, but I'm getting an error when running mvn clean install on the entire project.
Let me know if anything is missing.

@HarelM HarelM mentioned this pull request Jan 14, 2025
4 tasks
@HarelM
Contributor Author

HarelM commented Jan 14, 2025

I think the tests are failing due to a download issue from Geofabrik:

Error: Exception in thread "main" java.lang.IllegalStateException: java.net.ConnectException: Connection refused
	at com.onthegomap.planetiler.util.Geofabrik.getAndCacheIndex(Geofabrik.java:59)

I'm not sure this is related to the changes I made...

github-actions bot commented Jan 16, 2025

This Branch a2f0cc5 Base 59c3abd
0:01:06 DEB [archive] - Tile stats:
0:01:06 DEB [archive] - Biggest tiles (gzipped)
1. 14/4942/6092 (157k) https://onthegomap.github.io/planetiler-demo/#14.5/41.82864/-71.40015 (poi:85k)
2. 9/154/190 (144k) https://onthegomap.github.io/planetiler-demo/#9.5/41.77078/-71.36719 (landcover:85k)
3. 10/308/380 (136k) https://onthegomap.github.io/planetiler-demo/#10.5/41.90214/-71.54297 (landcover:66k)
4. 10/308/381 (135k) https://onthegomap.github.io/planetiler-demo/#10.5/41.63994/-71.54297 (landcover:71k)
5. 14/4941/6092 (113k) https://onthegomap.github.io/planetiler-demo/#14.5/41.82864/-71.42212 (poi:64k)
6. 14/4941/6093 (112k) https://onthegomap.github.io/planetiler-demo/#14.5/41.81227/-71.42212 (building:62k)
7. 14/4940/6092 (101k) https://onthegomap.github.io/planetiler-demo/#14.5/41.82864/-71.44409 (building:92k)
8. 11/616/762 (98k) https://onthegomap.github.io/planetiler-demo/#11.5/41.7057/-71.63086 (landcover:71k)
9. 14/4942/6091 (97k) https://onthegomap.github.io/planetiler-demo/#14.5/41.84501/-71.40015 (building:79k)
10. 11/616/761 (95k) https://onthegomap.github.io/planetiler-demo/#11.5/41.83679/-71.63086 (landcover:72k)
0:01:06 DEB [archive] - Max tile sizes
                      z0    z1    z2    z3    z4    z5    z6    z7    z8    z9   z10   z11   z12   z13   z14   all
           boundary  151   336   409   544   872   332   437   552   802  1.6k    2k  6.9k  6.2k  5.6k  4.5k  6.9k
              water 7.7k  3.7k  8.6k  5.5k  2.6k  5.1k   15k   18k   16k   26k   15k   13k   17k   15k   12k   26k
              place    0     0   441   441   441   640   714    1k  1.6k  3.1k  5.8k  3.4k  1.7k   803   948  5.8k
            landuse    0     0     0     0   549   695  1.6k  6.7k   17k   44k   59k   50k   38k   19k   12k   59k
     transportation    0     0     0     0   313   776  1.2k    4k  5.6k   17k   13k   17k   62k   47k   33k   62k
           waterway    0     0     0     0   112   119     0     0     0    3k  2.3k    2k  2.1k  4.9k  2.4k  4.9k
               park    0     0     0     0     0     0  1.3k  4.3k  9.7k   18k   13k  8.2k  3.7k  3.4k  4.4k   18k
transportation_name    0     0     0     0     0     0   287   364  1.1k  1.9k  5.5k  4.7k  3.9k  3.4k   18k   18k
          landcover    0     0     0     0     0     0     0  9.9k   29k   85k   71k   81k   53k   30k   25k   85k
      mountain_peak    0     0     0     0     0     0     0  1.1k  1.8k  3.4k  4.3k  2.8k  1.4k  1.4k   869  4.3k
         water_name    0     0     0     0     0     0     0     0     0   486   461   433   452  1.2k  1.5k  1.5k
    aerodrome_label    0     0     0     0     0     0     0     0     0     0   666   328   273   221   221   666
            aeroway    0     0     0     0     0     0     0     0     0     0  1.6k  2.1k    3k  3.4k  2.8k  3.4k
                poi    0     0     0     0     0     0     0     0     0     0     0     0   568   565   85k   85k
           building    0     0     0     0     0     0     0     0     0     0     0     0     0   59k   92k   92k
        housenumber    0     0     0     0     0     0     0     0     0     0     0     0     0     0   35k   35k
          full tile 7.9k    4k  9.5k  6.4k  3.7k    6k   20k   41k   82k  195k  181k  134k  113k  127k  247k  247k
            gzipped 6.2k  3.5k  7.1k  5.2k  3.1k  4.8k   14k   29k   59k  144k  136k   98k   83k   91k  157k  157k
0:01:06 DEB [archive] -    Max tile: 247k (gzipped: 157k)
0:01:06 DEB [archive] -    Avg tile: 5.4k (gzipped: 4k) using weighted average based on OSM traffic
0:01:06 DEB [archive] -     # tiles: 4,115,039
0:01:06 DEB [archive] -  # features: 5,519,297
0:01:06 INF [archive] - Finished in 20s cpu:1m13s avg:3.7
0:01:06 INF [archive] -   read    1x(3% 0.6s wait:18s done:1s)
0:01:06 INF [archive] -   encode  4x(56% 11s wait:2s done:1s)
0:01:06 INF [archive] -   write   1x(21% 4s wait:13s)
0:01:06 INF [archive] - Finished in 1m7s cpu:3m40s gc:1s avg:3.3
0:01:06 INF [archive] - FINISHED!
0:01:06 INF [archive] - 
0:01:06 INF [archive] - ----------------------------------------
0:01:06 INF [archive] - data errors:
0:01:06 INF [archive] - 	render_snap_fix_input	16,734
0:01:06 INF [archive] - 	osm_multipolygon_missing_way	360
0:01:06 INF [archive] - 	osm_boundary_missing_way	55
0:01:06 INF [archive] - 	merge_snap_fix_input	12
0:01:06 INF [archive] - 	feature_centroid_if_convex_osm_invalid_multipolygon_empty_after_fix	2
0:01:06 INF [archive] - 	render_snap_fix_input2	1
0:01:06 INF [archive] - 	omt_fix_water_before_ne_intersect	1
0:01:06 INF [archive] - 	feature_polygon_osm_invalid_multipolygon_empty_after_fix	1
0:01:06 INF [archive] - 	feature_point_on_surface_osm_invalid_multipolygon_empty_after_fix	1
0:01:06 INF [archive] - ----------------------------------------
0:01:06 INF [archive] - 	overall          1m7s cpu:3m40s gc:1s avg:3.3
0:01:06 INF [archive] - 	lake_centerlines 3s cpu:7s avg:2.4
0:01:06 INF [archive] - 	  read     1x(18% 0.5s done:2s)
0:01:06 INF [archive] - 	  process  4x(0% 0s done:2s)
0:01:06 INF [archive] - 	  write    1x(0% 0s done:2s)
0:01:06 INF [archive] - 	water_polygons   15s cpu:43s avg:2.8
0:01:06 INF [archive] - 	  read     1x(41% 6s done:7s)
0:01:06 INF [archive] - 	  process  4x(26% 4s wait:4s done:5s)
0:01:06 INF [archive] - 	  write    1x(3% 0.5s wait:10s done:5s)
0:01:06 INF [archive] - 	natural_earth    7s cpu:14s avg:2.1
0:01:06 INF [archive] - 	  read     1x(96% 6s)
0:01:06 INF [archive] - 	  process  4x(13% 0.8s wait:6s)
0:01:06 INF [archive] - 	  write    1x(0% 0s wait:6s)
0:01:06 INF [archive] - 	osm_pass1        2s cpu:6s avg:3.2
0:01:06 INF [archive] - 	  read     1x(2% 0s wait:2s)
0:01:06 INF [archive] - 	  parse    4x(35% 0.7s)
0:01:06 INF [archive] - 	  process  1x(66% 1s)
0:01:06 INF [archive] - 	osm_pass2        18s cpu:1m13s avg:4
0:01:06 INF [archive] - 	  read     1x(0% 0s wait:11s done:8s)
0:01:06 INF [archive] - 	  process  4x(77% 14s)
0:01:06 INF [archive] - 	  write    1x(2% 0.4s wait:18s)
0:01:06 INF [archive] - 	ne_lakes         0s cpu:0s avg:0
0:01:06 INF [archive] - 	boundaries       0s cpu:0s avg:1.2
0:01:06 INF [archive] - 	agg_stop         0s cpu:0s avg:15.7
0:01:06 INF [archive] - 	sort             1s cpu:4s avg:2.6
0:01:06 INF [archive] - 	  worker  1x(50% 0.7s)
0:01:06 INF [archive] - 	archive          20s cpu:1m13s avg:3.7
0:01:06 INF [archive] - 	  read    1x(3% 0.6s wait:18s done:1s)
0:01:06 INF [archive] - 	  encode  4x(56% 11s wait:2s done:1s)
0:01:06 INF [archive] - 	  write   1x(21% 4s wait:13s)
0:01:06 INF [archive] - ----------------------------------------
0:01:06 INF [archive] - 	archive	108MB
0:01:06 INF [archive] - 	features	284MB
-rw-r--r-- 1 runner docker 88M Jan 16 13:02 run.jar
0:01:06 DEB [archive] - Tile stats:
0:01:06 DEB [archive] - Biggest tiles (gzipped)
1. 14/4942/6092 (157k) https://onthegomap.github.io/planetiler-demo/#14.5/41.82864/-71.40015 (poi:85k)
2. 9/154/190 (144k) https://onthegomap.github.io/planetiler-demo/#9.5/41.77078/-71.36719 (landcover:85k)
3. 10/308/380 (136k) https://onthegomap.github.io/planetiler-demo/#10.5/41.90214/-71.54297 (landcover:66k)
4. 10/308/381 (135k) https://onthegomap.github.io/planetiler-demo/#10.5/41.63994/-71.54297 (landcover:71k)
5. 14/4941/6092 (113k) https://onthegomap.github.io/planetiler-demo/#14.5/41.82864/-71.42212 (poi:64k)
6. 14/4941/6093 (112k) https://onthegomap.github.io/planetiler-demo/#14.5/41.81227/-71.42212 (building:62k)
7. 14/4940/6092 (101k) https://onthegomap.github.io/planetiler-demo/#14.5/41.82864/-71.44409 (building:92k)
8. 11/616/762 (98k) https://onthegomap.github.io/planetiler-demo/#11.5/41.7057/-71.63086 (landcover:71k)
9. 14/4942/6091 (97k) https://onthegomap.github.io/planetiler-demo/#14.5/41.84501/-71.40015 (building:79k)
10. 11/616/761 (95k) https://onthegomap.github.io/planetiler-demo/#11.5/41.83679/-71.63086 (landcover:72k)
0:01:06 DEB [archive] - Max tile sizes
                      z0    z1    z2    z3    z4    z5    z6    z7    z8    z9   z10   z11   z12   z13   z14   all
           boundary  151   336   409   544   872   332   437   552   802  1.6k    2k  6.9k  6.2k  5.6k  4.5k  6.9k
              water 7.7k  3.7k  8.6k  5.5k  2.6k  5.1k   15k   18k   16k   26k   15k   13k   17k   15k   12k   26k
              place    0     0   441   441   441   640   714    1k  1.6k  3.1k  5.8k  3.4k  1.7k   803   948  5.8k
            landuse    0     0     0     0   549   695  1.6k  6.7k   17k   44k   59k   50k   38k   19k   12k   59k
     transportation    0     0     0     0   313   776  1.2k    4k  5.6k   17k   13k   17k   62k   47k   33k   62k
           waterway    0     0     0     0   112   119     0     0     0    3k  2.3k    2k  2.1k  4.9k  2.4k  4.9k
               park    0     0     0     0     0     0  1.3k  4.3k  9.7k   18k   13k  8.2k  3.7k  3.4k  4.4k   18k
transportation_name    0     0     0     0     0     0   287   364  1.1k  1.9k  5.5k  4.7k  3.9k  3.4k   18k   18k
          landcover    0     0     0     0     0     0     0  9.9k   29k   85k   71k   81k   53k   30k   25k   85k
      mountain_peak    0     0     0     0     0     0     0  1.1k  1.8k  3.4k  4.3k  2.8k  1.4k  1.4k   869  4.3k
         water_name    0     0     0     0     0     0     0     0     0   486   461   433   452  1.2k  1.5k  1.5k
    aerodrome_label    0     0     0     0     0     0     0     0     0     0   666   328   273   221   221   666
            aeroway    0     0     0     0     0     0     0     0     0     0  1.6k  2.1k    3k  3.4k  2.8k  3.4k
                poi    0     0     0     0     0     0     0     0     0     0     0     0   568   565   85k   85k
           building    0     0     0     0     0     0     0     0     0     0     0     0     0   59k   92k   92k
        housenumber    0     0     0     0     0     0     0     0     0     0     0     0     0     0   35k   35k
          full tile 7.9k    4k  9.5k  6.4k  3.7k    6k   20k   41k   82k  195k  181k  134k  113k  127k  247k  247k
            gzipped 6.2k  3.5k  7.1k  5.2k  3.1k  4.8k   14k   29k   59k  144k  136k   98k   83k   91k  157k  157k
0:01:06 DEB [archive] -    Max tile: 247k (gzipped: 157k)
0:01:06 DEB [archive] -    Avg tile: 5.4k (gzipped: 4k) using weighted average based on OSM traffic
0:01:06 DEB [archive] -     # tiles: 4,115,039
0:01:06 DEB [archive] -  # features: 5,519,297
0:01:06 INF [archive] - Finished in 20s cpu:1m13s avg:3.7
0:01:06 INF [archive] -   read    1x(3% 0.6s wait:18s done:1s)
0:01:06 INF [archive] -   encode  4x(55% 11s wait:2s)
0:01:06 INF [archive] -   write   1x(21% 4s wait:14s)
0:01:06 INF [archive] - Finished in 1m7s cpu:3m40s gc:1s avg:3.3
0:01:06 INF [archive] - FINISHED!
0:01:06 INF [archive] - 
0:01:06 INF [archive] - ----------------------------------------
0:01:06 INF [archive] - data errors:
0:01:06 INF [archive] - 	render_snap_fix_input	16,734
0:01:06 INF [archive] - 	osm_multipolygon_missing_way	360
0:01:06 INF [archive] - 	osm_boundary_missing_way	55
0:01:06 INF [archive] - 	merge_snap_fix_input	12
0:01:06 INF [archive] - 	feature_centroid_if_convex_osm_invalid_multipolygon_empty_after_fix	2
0:01:06 INF [archive] - 	render_snap_fix_input2	1
0:01:06 INF [archive] - 	omt_fix_water_before_ne_intersect	1
0:01:06 INF [archive] - 	feature_polygon_osm_invalid_multipolygon_empty_after_fix	1
0:01:06 INF [archive] - 	feature_point_on_surface_osm_invalid_multipolygon_empty_after_fix	1
0:01:06 INF [archive] - ----------------------------------------
0:01:06 INF [archive] - 	overall          1m7s cpu:3m40s gc:1s avg:3.3
0:01:06 INF [archive] - 	lake_centerlines 2s cpu:5s avg:2.4
0:01:06 INF [archive] - 	  read     1x(22% 0.5s done:2s)
0:01:06 INF [archive] - 	  process  4x(0% 0s done:2s)
0:01:06 INF [archive] - 	  write    1x(0% 0s done:2s)
0:01:06 INF [archive] - 	water_polygons   15s cpu:42s avg:2.8
0:01:06 INF [archive] - 	  read     1x(40% 6s done:7s)
0:01:06 INF [archive] - 	  process  4x(26% 4s wait:4s done:5s)
0:01:06 INF [archive] - 	  write    1x(4% 0.5s wait:10s done:5s)
0:01:06 INF [archive] - 	natural_earth    6s cpu:13s avg:2.1
0:01:06 INF [archive] - 	  read     1x(95% 6s)
0:01:06 INF [archive] - 	  process  4x(12% 0.8s wait:6s)
0:01:06 INF [archive] - 	  write    1x(0% 0s wait:6s)
0:01:06 INF [archive] - 	osm_pass1        2s cpu:6s avg:3.1
0:01:06 INF [archive] - 	  read     1x(2% 0s wait:2s)
0:01:06 INF [archive] - 	  parse    4x(33% 0.6s)
0:01:06 INF [archive] - 	  process  1x(71% 1s)
0:01:06 INF [archive] - 	osm_pass2        19s cpu:1m16s avg:3.9
0:01:06 INF [archive] - 	  read     1x(0% 0s wait:12s done:8s)
0:01:06 INF [archive] - 	  process  4x(76% 15s)
0:01:06 INF [archive] - 	  write    1x(2% 0.4s wait:19s)
0:01:06 INF [archive] - 	ne_lakes         0s cpu:0s avg:0
0:01:06 INF [archive] - 	boundaries       0s cpu:0s avg:1.3
0:01:06 INF [archive] - 	agg_stop         0s cpu:0s avg:20
0:01:06 INF [archive] - 	sort             1s cpu:3s avg:2.5
0:01:06 INF [archive] - 	  worker  1x(50% 0.7s)
0:01:06 INF [archive] - 	archive          20s cpu:1m13s avg:3.7
0:01:06 INF [archive] - 	  read    1x(3% 0.6s wait:18s done:1s)
0:01:06 INF [archive] - 	  encode  4x(55% 11s wait:2s)
0:01:06 INF [archive] - 	  write   1x(21% 4s wait:14s)
0:01:06 INF [archive] - ----------------------------------------
0:01:06 INF [archive] - 	archive	108MB
0:01:06 INF [archive] - 	features	284MB
-rw-r--r-- 1 runner docker 87M Jan 16 13:03 run.jar

Full logs: https://github.com/onthegomap/planetiler/actions/runs/12809430105

Contributor

@msbarry msbarry left a comment

Thanks for adding this! Newline-delimited geojson is also a common format supported by tools like tippecanoe. Do you think there's any way for this to support that as well?

GeoJsonReader(String sourceName, Path input) {
super(sourceName);
store = new GeoJSONDataStore(input.toFile());
layer = input.getFileName().toString().replaceAll("\\.geojson$", "");
Contributor

I could see these having a .json extension too. Could we default to just removing the file extension?

Contributor Author

Sure, no problem.

Contributor Author

Fixed in 24ec3ba
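
A minimal sketch of the extension-agnostic layer name suggested above, assuming the layer is simply the file name up to its last dot. The class and method names (`LayerNames`, `layerNameFor`) are hypothetical, and this is not necessarily what the fix commit does.

```java
import java.nio.file.Path;

final class LayerNames {
  /** Returns the file name without its last extension, whatever that extension is. */
  static String layerNameFor(Path input) {
    String fileName = input.getFileName().toString();
    int dot = fileName.lastIndexOf('.');
    // Names without a dot, and dotfiles such as ".hidden", are returned unchanged.
    return dot > 0 ? fileName.substring(0, dot) : fileName;
  }

  public static void main(String[] args) {
    System.out.println(layerNameFor(Path.of("data/roads.geojson"))); // roads
    System.out.println(layerNameFor(Path.of("data/roads.json")));    // roads
  }
}
```

Compared with `replaceAll("\\.geojson$", "")`, this handles `.json` and any other extension the same way, at the cost of also trimming the last segment of a name like `a.b`.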

Comment on lines 78 to 80
SimpleFeature simpleFeature = SimpleFeature.create((Geometry) feature.getDefaultGeometry(), HashMap.newHashMap(feature.getProperties().size()),
sourceName, layer, id++);
feature.getProperties().forEach(property -> {
Contributor

minor: extract var properties = feature.getProperties(); once in case it's doing anything expensive to compute the properties map.

Also you can omit the curly braces on forEach(property -> {

Contributor Author

Sure, I'll fix that.

Contributor Author

Fixed in 636dcd1
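
The two review suggestions above (read the properties once, and drop the braces on a single-expression lambda) can be sketched roughly as follows. `Property`, `Feature`, and `toTags` are hypothetical stand-ins so the example compiles without GeoTools; they are not the real GeoTools or Planetiler types. The sketch also uses the plain `HashMap` constructor rather than `HashMap.newHashMap` so it runs on older JDKs.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PropertyCopy {
  // Hypothetical stand-ins for the feature/property types (not the real GeoTools API).
  record Property(String name, Object value) {}

  interface Feature {
    Collection<Property> getProperties();
  }

  static Map<String, Object> toTags(Feature feature) {
    // Call the getter once, in case building the collection is expensive.
    var properties = feature.getProperties();
    // The size is only used as an initial capacity hint.
    Map<String, Object> tags = new HashMap<>(properties.size());
    // A single-expression lambda needs no curly braces.
    properties.forEach(property -> tags.put(property.name(), property.value()));
    return tags;
  }

  public static void main(String[] args) {
    Feature feature = () -> List.of(new Property("name", "Main St"), new Property("lanes", 2));
    System.out.println(toTags(feature));
  }
}
```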

Comment on lines +68 to +77
<dependency>
<groupId>org.geotools</groupId>
<artifactId>gt-geojson</artifactId>
<version>${geotools.version}</version>
</dependency>
<dependency>
<groupId>org.geotools</groupId>
<artifactId>gt-geojson-store</artifactId>
<version>${geotools.version}</version>
</dependency>
Contributor

I see JTS already has a GeoJsonReader class. Geotools has some licensing issues so I'm trying to avoid adding new dependencies on it. But the JTS reader doesn't seem to handle properties at all so this might be the best one to start with...

Contributor Author

Yes, JTS is only for geometries; it doesn't handle features and feature collections...
Geotools uses it under the hood...

Contributor

Does the geotools version automatically handle parsing/converting CRS of input geojson?

Contributor Author

No clue, probably not. It is very similar to shapefile, I believe, in terms of API.

Contributor Author

I have tried to play with the Crs object of the store in various ways, but it seems to always return WGS84 even if the file says otherwise.
I've opened a ticket here:
https://osgeo-org.atlassian.net/browse/GEOT-7705
The CRS can be read from the file itself, but that seems hacky...

Contributor

@msbarry msbarry Jan 16, 2025

Ok thanks! The jts version handles crs but not properties, and the geotools version handles properties but not crs 🤦‍♂️ If there's just one crs defined for the entire object, it wouldn't be too bad to read it out like jts does (https://github.com/locationtech/jts/blob/master/modules/io/common/src/main/java/org/locationtech/jts/io/geojson/GeoJsonReader.java#L445-L470). But we could also merge this version, then handle crs and newline-delimited in a follow-up pr?

Contributor Author

Sure, I think this is a good addition as is. We can always improve in the future.

@HarelM
Contributor Author

HarelM commented Jan 16, 2025

For people using newline-delimited geojson, I hope it is an easy task to add the surrounding feature collection before running planetiler.
As far as I can tell, geotools doesn't support newline-delimited geojson, but one could easily test this using the code I added here and find out.
We could try to create an issue there, though I'm not sure it will help. We could also try to find a different geojson parser. I don't know; I'm not an expert in the Java ecosystem for GIS...
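
As a rough illustration of the pre-processing step described above, here is a sketch that wraps newline-delimited GeoJSON in a FeatureCollection before handing it to planetiler. It assumes every non-blank line is one complete Feature object and does no validation. `NdjsonToFeatureCollection` and `wrap` are hypothetical names, not part of this PR.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.file.Files;
import java.nio.file.Path;

final class NdjsonToFeatureCollection {
  /** Streams one-feature-per-line input into a single FeatureCollection document. */
  static void wrap(BufferedReader in, Writer out) {
    try {
      out.write("{\"type\":\"FeatureCollection\",\"features\":[");
      boolean first = true;
      String line;
      while ((line = in.readLine()) != null) {
        // Remove any RS (0x1E) record separators used by GeoJSON text sequences (RFC 8142).
        String feature = line.replace("\u001e", "").strip();
        if (feature.isEmpty()) {
          continue; // skip blank lines
        }
        out.write(first ? "\n" : ",\n");
        out.write(feature);
        first = false;
      }
      out.write("\n]}\n");
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) throws IOException {
    // Usage: java NdjsonToFeatureCollection input.geojsonl output.geojson
    try (BufferedReader in = Files.newBufferedReader(Path.of(args[0]));
         Writer out = Files.newBufferedWriter(Path.of(args[1]))) {
      wrap(in, out);
    }
  }
}
```

The conversion itself runs in constant memory, but the resulting file is a single JSON document, so whatever reads it afterwards may still need to load it whole.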

@msbarry
Contributor

msbarry commented Jan 16, 2025

OK thanks! The main benefit of newline-delimited geojson is that a very large dataset can be read line-by-line instead of loading the whole thing in memory. I could take a stab at auto-detecting which kind it is and handling the newline-delimited variant in a follow-up PR...
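
One possible shape for the auto-detection mentioned above, purely as a heuristic sketch and not what Planetiler implements: treat the input as newline-delimited when the first non-blank line is already a bracket-balanced JSON value and more content follows it. `GeoJsonFlavor` and its methods are hypothetical names. The check consumes the reader, so a caller would need to reopen the file before parsing.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

final class GeoJsonFlavor {
  /** True if the line, on its own, has balanced JSON brackets outside of strings. */
  static boolean isCompleteJsonValue(String line) {
    int depth = 0;
    boolean inString = false, escaped = false, sawBracket = false;
    for (int i = 0; i < line.length(); i++) {
      char c = line.charAt(i);
      if (inString) {
        if (escaped) escaped = false;
        else if (c == '\\') escaped = true;
        else if (c == '"') inString = false;
      } else if (c == '"') {
        inString = true;
      } else if (c == '{' || c == '[') {
        depth++;
        sawBracket = true;
      } else if (c == '}' || c == ']') {
        if (--depth < 0) return false; // closing bracket with no opener
      }
    }
    return sawBracket && depth == 0 && !inString;
  }

  private static String nextNonBlank(BufferedReader in) throws IOException {
    String line;
    while ((line = in.readLine()) != null) {
      // Ignore RS (0x1E) record separators and surrounding whitespace.
      String trimmed = line.replace("\u001e", "").strip();
      if (!trimmed.isEmpty()) return trimmed;
    }
    return null;
  }

  /** Heuristic: a complete JSON value on the first line, followed by more content. */
  static boolean looksNewlineDelimited(BufferedReader in) {
    try {
      String first = nextNonBlank(in);
      return first != null && isCompleteJsonValue(first) && nextNonBlank(in) != null;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    String ndjson = "{\"type\":\"Feature\"}\n{\"type\":\"Feature\"}\n";
    System.out.println(looksNewlineDelimited(new BufferedReader(new StringReader(ndjson)))); // true
  }
}
```

A file whose only content is one single-line document is reported as not newline-delimited, which is harmless because a regular GeoJSON parser can read it either way.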

@msbarry
Contributor

msbarry commented Jan 17, 2025

Looks good now! Thanks for adding!

@msbarry msbarry merged commit 5588fca into onthegomap:main Jan 17, 2025
13 checks passed
@HarelM
Contributor Author

HarelM commented Jan 17, 2025

Great! Let me know when a new version with this is available.

zstadler added a commit to zstadler/planetiler that referenced this pull request Feb 8, 2025
msbarry pushed a commit that referenced this pull request Feb 9, 2025