use osrm-datastore for testing, keep osrm-routed running #889
I only tried the branch on Mac so far. |
There's a ton of failed cucumber tests on Travis, but they seem related to osmosis. However, the build is still reported as passing. Is that because the build itself succeeds and the cucumber result is not considered? |
Travis appears to have changed the environment. No idea why osmosis is broken there ATM. Nevertheless, the Jenkins server runs the tests too. |
I have tried this branch on my home Ubuntu 12.04 machine and get multiple errors (manual loading and serving seem to work). On the first run (when files are generated and delays are big) the first ~20 tests seem to pass, but on later tests and after restarting cucumber there are many failing tests followed by "osrm-routed is not running" errors. (The machine has 3 GB of RAM and no swap partition; this may not be enough, but the old-way tests pass without any problems.) There should be some way to increase the reliability of osrm-datastore and osrm-routed under frequent reloading... |
On Windows (waitpid-modified script from this branch and sources from #880) there are also correct results, then some incorrect results, then crashes of osrm-routed. |
doesn't really seem to work on Ubuntu (i'm running v 13).
|
This does not actually depend on the test-running configuration. A huge timeout is not the way to solve this problem :) When the next request comes, the reloading process in osrm-routed is initialized, and the name of the fileIndex file is correct... Errors may be the result of reading 1) previous data, 2) still-changing data or 3) incorrectly loaded data. |
@alex85k I haven't looked at the code, but this sounds like a race condition to me. The swapping of the data in memory should be safeguarded by mutexes. |
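To illustrate the kind of safeguard suggested in this comment, here is a minimal sketch in Python. It is not OSRM's C++ code; the `DataHolder` class and its method names are hypothetical, and an in-process lock stands in for whatever synchronization the real shared-memory reload would need.

```python
import threading


class DataHolder:
    """Holds the current dataset; readers and the reloader share one lock."""

    def __init__(self, data):
        self._lock = threading.Lock()
        self._data = data

    def swap(self, new_data):
        # The reloader replaces the whole dataset while holding the lock,
        # so no reader can observe a half-updated state.
        with self._lock:
            self._data = new_data

    def query(self, key):
        # Readers take the same lock, so they see either the old dataset
        # or the new one, never a mix of both.
        with self._lock:
            return self._data.get(key)


holder = DataHolder({"name": "old"})
holder.swap({"name": "new"})
print(holder.query("name"))  # prints "new"
```

Without the lock, a query arriving during a reload could read any of the three failure cases listed earlier in the thread: previous data, still-changing data, or incorrectly loaded data.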
@DennisOSRM, can you get the experimental/cuke_datastore branch to run tests successfully on ubuntu? |
@emiltin will do tomorrow morning. |
Sorry. Got delayed. Will get to that asap |
Tests run fine on my Ubuntu dev machine. The first run takes 3m37s while the second (cached) run takes only 0m15s. The only downside is that the following warning is produced for every test: [warn] Process ../build/osrm-datastore could not request RAM lock. I am not yet sure what the reason is. |
So, after digging a bit deeper I found out why it is warning. The OS does not allow the data to be locked into RAM because a limit is being hit. To view the limit try $ ulimit -l On my system it says $ sudo vi /etc/security/limits.conf and then add the following two lines at the bottom, where is your user name:
Log out and back in (or even reboot) and the warning should be gone. While the message is certainly nagging, it is one that can safely be ignored during tests. |
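The limit reported by `ulimit -l` can also be read from a script, which may help when checking whether a change to /etc/security/limits.conf has taken effect. The following is a sketch using Python's standard `resource` module (Unix only); the `describe` helper is a hypothetical name introduced for illustration.

```python
import resource


def describe(limit):
    # RLIM_INFINITY means "unlimited"; otherwise the value is in bytes,
    # while `ulimit -l` prints the same limit in units of 1024 bytes.
    if limit == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{limit // 1024} KiB"


# Soft and hard limits on memory the current process may lock into RAM.
soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
print("soft:", describe(soft), "hard:", describe(hard))
```

A soft limit of 65536 bytes would correspond to the value 64 that `ulimit -l` reports later in this thread.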
interesting, because they don't run at all on my ubuntu machine. |
I am running the code from this pull request on Ubuntu 13.10. What kind of error do you get? |
here you can see the errors: editing /etc/security/limits.conf did not seem to make a difference, i'm still getting the warning, and ulimit -l still reports 64.
|
You need to replace |
oh.. i see! |
i got rid of the warning by modifying /etc/security/limits.conf. but cucumber still reports tons of errors. if i run "cucumber -t @basic" (consisting of 11 scenarios), i will get anything from 2-9 failed scenarios, either because the routing is incorrect, or because osrm-routed doesn't respond. sometimes osrm-datastore seems to hang for 10-30 seconds, making the whole machine unresponsive, as if huge amounts of memory are being allocated. |
i added some debug info, so you can see the order in which datastore is called, and routed is launched/shutdown. |
the develop branch runs all tests without errors on my machine |
rebased on latest develop. (still same errors) |
when i run the cucumber tests, and then use 'rake pid' to monitor the osrm-routed process, i can see that at some point it changes from mode S to mode Z (Defunct "zombie" process, terminated but not reaped by its parent). from then on, cucumber reports '*** osrm-routed is not running'. so it seems that reloading data with osrm-datastore somehow causes osrm-routed to die? it would be nice if osrm-routed would output something to the log when new data has been loaded |
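For readers unfamiliar with the Z state: a terminated child stays a zombie until its parent collects the exit status, and that status also reveals whether the child exited normally or was killed by a signal. The sketch below shows this with Python's `subprocess` module; the short-lived Python child is a stand-in for a crashing server, and `exit_reason` is a hypothetical helper.

```python
import subprocess
import sys


def exit_reason(returncode):
    # On POSIX, Popen reports death by signal N as the return code -N.
    if returncode < 0:
        return f"killed by signal {-returncode}"
    return f"exited with status {returncode}"


# Start a child that exits immediately with status 3.
child = subprocess.Popen([sys.executable, "-c", "import sys; sys.exit(3)"])

# Until the parent collects the exit status, the terminated child is shown
# as a zombie (state Z). wait() reaps it and returns the status.
status = child.wait()
print(exit_reason(status))  # prints "exited with status 3"
```

A test harness that reaps the server this way could log the reason for its death instead of only reporting that it is not running.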
The point of the entire data store thingy is that |
yes it's odd |
I have seen the Defunct osrm-routed too when running those tests (I guess it was after getting segmentation faults)... |
I tried to compile and run the tests (cuke_datastore branch) on a FreeBSD 10 virtual machine with Clang 3.3. All 251 tests passed without any shmem configuration. The second run took 1m17.311s (on a VM, Core2Duo, 2 GB RAM) and did not show any errors (a first in my experience). This is extremely strange. |
Don't think this is related to boost. The testing code is in ruby and should not interfere with boost (as linked into the OSRM binaries). @emiltin Is the routed process dying from a segfault or is it because of some other exception? |
Errors are caused by routed faults, not testing environment... Now rebuilding with custom boost 1.55 on FreeBSD to check my hypothesis :) |
Cancelled the builds for 14eac50 to get results for the latest commit earlier. |
all green on travis. but we should still add a test for direct data load. |
AppVeyor is looking good, too, while it is halfway through. once we have a test for direct loading, this is looking really good to merge. We are close. |
I have tested this branch after rebase (on 6-core Xeon E5-1650 with SSD):
Thank you! |
added a test of direct data load. this required some changes to the test infrastructure. you can now use "Given the data is loaded directly" or "Given the data is loaded with datastore" to specify for each scenario how data is loaded. To minimize the risk of hard-to-debug problems, only one instance of osrm-routed runs at any one time. the default is to load data with datastore and keep osrm-routed running for all tests, but osrm-routed is relaunched when needed, i.e. every time a scenario uses direct data load, or when you go from direct to datastore. Direct data is tested with these scenarios: |
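The relaunch rule described in this comment can be sketched as a small state tracker. This is an illustration in Python, not the Ruby test code from the branch; `ServerState`, `prepare` and the mode strings are hypothetical names.

```python
class ServerState:
    """Tracks how the single running server instance got its data."""

    def __init__(self):
        self.running_mode = None  # None, "datastore" or "directly"
        self.launches = 0

    def prepare(self, wanted_mode):
        # Relaunch when nothing is running, when the scenario wants a direct
        # load (the server reads its files at startup), or when the load
        # mode changes from the previous scenario.
        relaunch = (
            self.running_mode is None
            or wanted_mode == "directly"
            or wanted_mode != self.running_mode
        )
        if relaunch:
            self.launches += 1
            self.running_mode = wanted_mode
        return relaunch


state = ServerState()
decisions = [
    state.prepare(mode)
    for mode in ["datastore", "datastore", "directly", "directly", "datastore"]
]
print(decisions)  # [True, False, True, True, True]
```

Consecutive datastore scenarios reuse the running server, which is where the speedup comes from; every direct-load scenario pays for a relaunch.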
Cool! |
@alex85k does the latest commit work on windows? |
21.5s on my linux box for 353 scenarios / 1467 steps :-) |
It should work, I'll check tomorrow. But why is a 0.1 timeout for shutdown too big? There is only one shutdown in the testing process, if I understand correctly. |
yes, it's only used very seldom. but it's a retry delay, not a timeout. |
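The distinction between a retry delay and a timeout can be shown with a small polling loop. This is a sketch in Python with hypothetical names; only the 0.1 second delay comes from the discussion, and the 5 second timeout is an assumed value chosen for illustration.

```python
import time


def wait_until(condition, retry_delay=0.1, timeout=5.0):
    """Poll `condition` every `retry_delay` seconds; give up after `timeout`."""
    deadline = time.monotonic() + timeout
    while not condition():
        if time.monotonic() >= deadline:
            return False  # the timeout bounds the total wait
        time.sleep(retry_delay)  # the retry delay is the pause between checks
    return True


# Stand-in for "has the server process gone away yet?": true on the third check.
checks = iter([False, False, True])
print(wait_until(lambda: next(checks)))  # prints True after two 0.1 s pauses
```

A shorter retry delay makes the loop notice the shutdown sooner without changing how long it is willing to wait overall.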
Seems to work fine on Windows with the latest commit (partial run, but a full run should be the same) |
actually you need to be sure to include feature/testbot/load.feature as well as other tests, to make sure you cover loading data both with datastore and directly. but appveyor seems happy. |
uhm guess appveyor doesn't run the cucumber tests? |
not yet |
I had a prototype of a testing environment for AppVeyor; maybe now it can fit within the time limit (at least for some tests). |
Yay! for caching |
@alex85k could you provide the output of |
This time no test failures, of course :) (first run 27 min on Core2Duo, Debug). The previous error was a non-existing path. @DennisOSRM: are you sure that the error will not show up on some circular isolated road or so on? |
appveyor debug build failed due to 30 min timeout |
what's left to do? |
I think we are good to merge. Great job, everyone. |
use osrm-datastore for testing, keep osrm-routed running
Thank you! |
This branch uses osrm-datastore to load data during cucumber testing, resulting in a speedup of more than 3x on cached tests. (The first run takes about the same time, since data must be converted with osmosis.)
Before each scenario is run, osrm-datastore is used to load the new data into shared memory. osrm-routed is then launched if it's not already running. As cucumber exits, osrm-routed is shut down.
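The lifecycle above (load data, launch the server only if needed, shut it down at exit) can be sketched as follows. This is an illustration in Python rather than the branch's Ruby cucumber code; the `Server` class and `before_scenario` hook are hypothetical, and a sleeping Python child stands in for osrm-routed.

```python
import atexit
import subprocess
import sys


class Server:
    def __init__(self, command):
        self.command = command
        self.process = None

    def is_running(self):
        # poll() returns None while the child is still alive.
        return self.process is not None and self.process.poll() is None

    def ensure_running(self):
        if not self.is_running():
            self.process = subprocess.Popen(self.command)

    def shutdown(self):
        if self.is_running():
            self.process.terminate()
            self.process.wait()  # reap the child so no zombie is left behind


# Stand-in command: a child that just sleeps, in place of the real server.
server = Server([sys.executable, "-c", "import time; time.sleep(60)"])
atexit.register(server.shutdown)  # shut the server down when the run exits


def before_scenario(load_data):
    load_data()              # stands in for running the data loader
    server.ensure_running()  # launch only if not already running


loaded = []
before_scenario(lambda: loaded.append("scenario 1 data"))
first_pid = server.process.pid
before_scenario(lambda: loaded.append("scenario 2 data"))
print(server.process.pid == first_pid)  # True: the server was reused
```

Keeping a single long-lived server and swapping only the data is what avoids the per-scenario startup cost.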
We might want to add some testing of the good ol' way of loading data directly with osrm-routed, since with the current version of this branch, osrm-routed never loads data directly.
I did experience some weird behaviour when trying to launch osrm-routed manually from the command line and then running osrm-datastore from the cuke scripts, including errors in datastore indicating failure to free data, osrm-routed not returning correct routes, and osrm-routed throwing exceptions. This should perhaps be investigated. But when launching both routed and datastore from the cuke scripts it seems to work fine.