-
Notifications
You must be signed in to change notification settings - Fork 58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Aggregation duplicates points at tile borders #889
Comments
You could do WHERE the_geom_webmercator && _cdb_params.bbox
AND ST_X(the_geom_webmercator) < ST_XMAX(bbox) -- If you have the number it's better
AND ST_Y(the_geom_webmercator) < ST_YMAX(bbox) -- If you have the number it's better This approach a) only works for points; b) Some artefacts are to be expected if the right border of the most to the right tile falls exactly at the border of the screen; same for the bottom, since the tile covering the point will not be requested. Keeping in mind I don't know anything at all about the renderer, could it be easier to return the |
See #889 FOr centroid and point-grid the cartodb_id wasn't unique across tiles.
See #889 FOr centroid and point-grid the cartodb_id wasn't unique across tiles.
I'm closing it by now because we can handle the duplicated border points in the clients easily. |
I'm reopening this because I think we should avoid the duplicate points: it would avoid potential confusions in the use of the tiles, and, commented above shouldn't have a performance impact (if we add the filtering condition in addition to the existing one which uses indices). As stated above, border points are assigned to the correct cluster so that only in one tile the resulting cluster belongs to the tile. But the other cases are not only out of the tile, but can contain incomplete aggregations (only the border points are aggregated). This is not a problem if the client filters the out-of-the-tile clusters geometrically, but it could be a problem if duplicates are avoided by cartodb_id instead. In any case it seems inconsistent to include those partially aggregated clusters in the tile. |
I've realised that the simple filtering described above is only valid with no tile extent/buffer, or when the MVT extent is a multiple of the aggregation resolution. To exclude all aggregation cells that are only partially included in the bbox we must first compute CEIL(ST_XMIN(bbox)/res)*res AS xmin,
FLOOR(ST_XMAX(bbox)/res)*res AS xmax,
CEIL(ST_YMIN(bbox)/res)*res AS ymin,
FLOOR(ST_YMAX(bbox)/res)*res AS ymax And then filter with: (the_geom_webmercator && bbox) AND
ST_X(the_geom_webmercator) >= xmin AND
ST_X(the_geom_webmercator) < xmax AND
ST_Y(the_geom_webmercator) >= ymin AND
ST_Y(the_geom_webmercator) < ymax I've been trying this and I've found a problem when zoom level is high and the tile is away from the origin: in such cases,the inaccuracies in Mapnik's provided bbox can make us drop cells on the border of the area. This wouldn't happen if bbox was computed with full double accuracy, as For example, for the tile z=20, x=1000000, y=1000000, here's the difference between the minimum Y of the tile and the correct value -18181044.018313024:
This discrepancy makes the filtering formulas above skip the lower row of cells in the tile. Note: I have yet to test this with production Mapnik. Also, @Algunenano, I'll ask you to run some tests I have in your local configuration. Also to be able to execute the tests I've prepared accurately we need to fix the problem described in #1001 so I will include combine those fixes in a PR |
For a moment I thought that we better include all the cells intersected by the bbox, which would be more robust against bbox lack of precision (taking floor of min, ceil of max). But the problem is that the spatial filters that Mapnik wraps the queries with will prevent all the necessary data from being aggregated. |
I've notice the error in the example |
This applies the idea in my last comment and seems to work. Further testing will be needed, though. |
See CartoDB/carto-vl#91 (comment)
Our aggregation queries select the points in the tile with this filter:
So, points at tile borders can be selected in multiple tiles.
The aggregation expression, since it uses
Floor
functions should never aggregate a point to the wrong cluster, so those border points will appear clusters outside the tile in all except one tile.We should fix the queries to avoid returning clusters that don't belong to the tile.
This can be done either by properly filtering points to be aggregated or by filtering after aggregation.
In the first case we could replace:
by:
But this could degrade the performance (or not, because the
the_geom_webmercator && _cdb_params.bbox
is added by Mapnik anyway in its query wrapping).A solution could be add both conditions (we'd need to check).
The other option would be to filter the aggregation coordinates, for example:
Which would not use indices, but would operate on relatively few rows...
The text was updated successfully, but these errors were encountered: