-
-
Notifications
You must be signed in to change notification settings - Fork 535
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GAP: Apparent problem with workspace initialization #27681
Comments
This comment has been minimized.
This comment has been minimized.
comment:2
Some additional observations:
|
Branch: u/embray/ticket-27681 |
comment:3
This seems to fit it, with the aid of #27680. Ironically, it was while testing #27680 that I discovered this issue, and at first I thought it was caused somehow by my changes in that ticket. But I can confirm that it's not related at all. The problem stems (for reasons I still do not understand and are disturbing) from the fact that libgap was actually being used in the process of initializing a new workspace by pexpect GAP. Somehow this was causing problems; perhaps related to package loading. It's still disturbing that this could cause such a problem, and I can't seem to reproduce the issue in other examples of using both GAP interfaces simultaneously. It must come down to some subtle resource contention issue within GAP itself, and we can make some progress toward avoiding it by taking more care not to use both GAPs simultaneously during workspace caching. New commits:
|
Commit: |
Dependencies: #27680 |
Author: Erik Bray |
comment:4
The attached fix worked around the issue superficially for one use case, but it still seems to be a problem after all as I am still able to reproduce it in other cases even with this fix. |
comment:5
I am not able to reproduce the problem in the ticket description. Is it platform-specific? |
comment:6
OK, I see it on Fedora 26 now (but not on Gentoo or Debian!). I don't even need to remove the workspaces. |
comment:7
already in 8.7beta - I think I saw this earlier. I'm sure it's a doctest-related race condition - I have no idea why GAP workspace is updated mid-test, this alone is fishy. |
comment:10
I'm not sure it has nothing to do with libgap. In 0b6d94ee I fixed a problem where libgap was being used during pexpect-gap initialization in the This seemed to alleviate the problem somewhat, but it seems to be more of a bandaid. What's really weird is that the way the problem manifests itself is in the form of some pretty severe memory corruption affecting just random things in Sage. And I would think that only the libgap interface would be capable of breaking things that badly. I haven't looked at the issue in a few weeks though so I don't have any good ideas at the moment. I need to start looking at it again... |
comment:11
As usual with race conditions, it's extremely fluid, changes somewhere unrelated might "fix" it. First of all, there is no reason for GAP workspace to get updated during this test, yet is updated (not only on systems where these errors are seen, it seems it happens everywhere). |
comment:12
After rebasing this branch on 8.8.beta4 I can't reproduce the issue anymore, at least not in the same way. I'll keep trying things though. |
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
|
comment:14
Rebased the branch on 8.8.beta4. Did a full Can you try this updated version of the branch and see if it's fixed on Fedora as well? Anyways, try as I might and I can't get this to come back, at least not in the way I originally reproduced it. I'll try again on Cygwin too... |
comment:15
The error is still there, slightly different:
GAP workspace still changes, as before. |
comment:16
Note that if I wipe out ~/.sage/ then after that the 1st run the test passes, but on the 2nd, etc. runs it does not. |
comment:17
Right, that's the behavior I was getting before too, but now I'm not getting it anymore, even on the exact same system as before :( |
comment:18
How does one check the caller stack while running tests? I would like to see where --- a/src/sage/interfaces/gap.py
+++ b/src/sage/interfaces/gap.py
@@ -1575,6 +1575,11 @@ def gap_reset_workspace(max_workspace_size=None, verbose=False):
sage: sage.interfaces.gap.WORKSPACE = ORIGINAL_WORKSPACE
"""
# Create new workspace with filename WORKSPACE
+ import inspect
+ curframe = inspect.currentframe()
+ calframe = inspect.getouterframes(curframe, 2)
+ print('caller name:', calframe[1][3])
+
g = Gap(use_workspace_cache=False, max_workspace_size=None)
g.eval('SetUserPreference("HistoryMaxLines", 30)')
from sage.tests.gap_packages import all_installed_packages yes the GAP workspace is changed during the test run! |
comment:28
Replying to @dimpase:
I'm not sure how you concluded that, but it is a possibility I suppose. I'd be more concerned that it's a problem with forking, which does happen when running the doctests, but doesn't normally happen when running interactively. So it can make things difficult to reproduce. Nevertheless, I can't reliably reproduce the issue anymore either, and the one thing I did fix here seems to help a bit... |
Changed branch from u/embray/ticket-27681 to |
comment:30
I have just noticed this ticket. And it gave me a little bit of fun chasing it (I don't know how Volker merged it, it doesn't show where I expect). So I got that interesting failure in sage-on-gentoo
After some inspection
The leading In the meantime, should we amend the test so that |
Changed commit from |
comment:31
Just to document so I can point to it next time. In gap-4.10.1 you have the file
So if
and finally more interestingly we have this at line 1077
and in sage we have in
At this stage I don't know if that option is a remnant of gap-4.8 and lower or if it is really needed. |
comment:32
Follow up at #27878 |
comment:33
Using 8.8.rc2 with Python 2, I still consistently get the same error as in comment:15. This is on CentOS Linux release 7.6.1810 (Core). Using |
comment:34
I was able to reproduce this again in #18267 specifically. |
comment:35
Replying to @mwageringel:
Thanks for the info; that's interesting (and strange). I'm still struggling to find a way to reliably reproduce this, though it's good to know that |
comment:36
I've opened #28106 to track this issue further. |
Changed branch from |
Commit: |
comment:37
Reopening as it breaks the buildbot |
comment:39
Setting this back to positive review assuming that it'll be fixed in #28106 |
Changed branch from u/embray/ticket-27681 to |
Changed commit from |
Changed dependencies from 27680, 28106 to #27680 |
comment:41
Wrong ticket, sorry for the noise |
I don't fully understand the problem yet but there seems to be an issue in 8.8.beta2 (if not earlier) related to GAP workspaces.
The only way I've been able to reproduce it reliably is by running
and then
at some point during this test (possibly others but this is the only where I've reliably reproduced the issue) things start to go horribly wrong:
ISTM like maybe some memory corruption occurs as a result of trying to load a corrupt or partial GAP workspace. It's really bad, whatever it is.
I should add, if I subsequently run the test it passes. But then I can reproduce the problem again by deleting workspaces and re-rerunning the test.
Depends on #27680
CC: @jdemeyer
Component: packages: standard
Keywords: gap libgap
Author: Erik Bray
Branch:
ecacac8
Reviewer: Dima Pasechnik
Issue created by migration from https://trac.sagemath.org/ticket/27681
The text was updated successfully, but these errors were encountered: