-
Notifications
You must be signed in to change notification settings - Fork 211
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add support for STR bulk packing #67
Conversation
The CI checks failed with a buffer overflow on Java6 However, all tests passed after a re-run... |
Codecov Report
@@ Coverage Diff @@
## master #67 +/- ##
============================================
+ Coverage 77.9% 78.33% +0.42%
- Complexity 640 641 +1
============================================
Files 57 57
Lines 1824 1869 +45
Branches 241 248 +7
============================================
+ Hits 1421 1464 +43
- Misses 355 357 +2
Partials 48 48
Continue to review full report at Codecov.
|
Thanks for the PR, looks interesting! I'll review this weekend. |
I fixed a bug but the CI failed again with openjdk6. After trying the updated workaround on travis-ci/travis-ci#5227 it seems to be working. |
What's your performance conclusion about using bulk loading?
…On Sat, 4 Mar 2017, 18:38 Ambling ***@***.***> wrote:
I fixed a bug but the CI failed again with openjdk6.
After trying the updated workaround on travis-ci/travis-ci#5227
<travis-ci/travis-ci#5227> it seems to be
working.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#67 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AATa6_db_RUuBX_YXmLI8Bxp8cBa7JY9ks5riRTdgaJpZM4MR-A7>
.
|
@davidmoten
The bulk loading is about 10x faster than the default RTree creation and about 20x faster than the R*-tree creation. The FlatBuffers-based RTree is much faster is because the creation process just loads the basic context info and one root node. Besides, it is interesting in the results that bulk packing with 70% full nodes is a little faster than that with 100% full nodes, while in fact there should be less nodes created if the nodes are 100% full. I may dig this deeper for the reason. As for the insertion/deletion/search performance on the STR-RTree, it would be reasonably worse than that of the R*-tree, but I haven't check this yet. And for geometry types other than the point, the performance of STR-RTree maybe even more worse, because the packing is based on just the central position, leading to more overlapping among nodes. So actually I still recommend the R*-tree implementation in most cases, but for very large dataset indexing or readonly applications the STR-Rtree can be offered as a good option. |
return packingSTR(nodes, false, size, context); | ||
} | ||
|
||
private class MidComparator implements Comparator<HasGeometry> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make static final
} | ||
|
||
private class MidComparator implements Comparator<HasGeometry> { | ||
private short dimension; // leave space for multiple dimensions, 0 for x, 1 for y, ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make final
/** | ||
* Create an RTree by bulk loading, using the STR method. | ||
* STR: a simple and efficient algorithm for R-tree packing | ||
* http://ieeexplore.ieee.org/abstract/document/582015/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add javadoc comment about the fact that the method mutates the input list
@@ -713,25 +755,33 @@ public void testStarTreeReturnsSameAsStandardRTree() { | |||
rectangle(1, 0.252, 50, 69.23), rectangle(13.12, 23.123, 50.45, 80.9), | |||
rectangle(10, 10, 50, 50) }; | |||
|
|||
List<Entry<Integer, Geometry>> entris = new ArrayList<Entry<Integer, Geometry>>(10000); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
entries
} | ||
|
||
@Override | ||
public boolean equals(Object obj) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are you implementing equals
? I'd say remove.
Thanks for the review! I have made the changes, please have a look at that. |
Thanks @ambling. So I suppose the remaining task is to come up with some reasonably comprehensive advice about the pros and cons of using the STR bulk load. Would you be able to do a bit more perf testing so that we have something concrete to advise users? |
Yes @davidmoten . I think maybe I can add one or more test datasets and more benchmarking cases to verify the performance. I may finish the task in several days. |
Thanks @ambling. Let's merge this now and when you have more detail we'll add it the docs. Thanks very much for your work! |
For Issue #45
add support of Rtree creation through STR packing, according to the paper:
STR: a simple and efficient algorithm for R-tree packing
add a recursive function in RTree.Builder and set the default filling factor as 0.7 (1.0 is better for readonly RTree while 0.7 reserves space for insertion).
note that this implementation is only for 2D RTree.