Flask app can't debug within jpype #279

Closed
zhangjiajie023 opened this issue Dec 26, 2017 · 3 comments

Comments

@zhangjiajie023

zhangjiajie023 commented Dec 26, 2017

I have a Flask app that starts the JVM and calls attachThreadToJVM().
When I use app.run(threaded=True, debug=True) to debug the app, the exception is as below.

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007fb74759d090, pid=8686, tid=140425282705152
#
# JRE version: Java(TM) SE Runtime Environment (7.0_79-b15) (build 1.7.0_79-b15)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (24.79-b02 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  [_jpype.cpython-35m-x86_64-linux-gnu.so+0x65090]  JPJavaEnv::FindClass(char const*)+0x20
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# xxx/xxxx/xxx/bin/hs_err_pid8686.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#

When I use app.run(debug=True) to debug the app, the exception is as below:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f8cc4ccb7c8, pid=89303, tid=140242661144320
#
# JRE version: Java(TM) SE Runtime Environment (7.0_79-b15) (build 1.7.0_79-b15)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (24.79-b02 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  [_jpype.cpython-35m-x86_64-linux-gnu.so+0x307c8]  JPJavaEnv::NewGlobalRef(_jobject*)+0x58
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /tmp/hs_err_pid89303.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#

The threaded argument controls whether the process handles each request in a separate thread,
so I think this is caused by the JVM not being attached to the Flask app's threads.

@zhangjiajie023 zhangjiajie023 changed the title Gunicorn can't debug within jpype Flask app can't debug within jpype Dec 26, 2017
@Thrameos
Contributor

Thrameos commented Feb 7, 2018

The best approach would be to enable core dumps and then load the core dump to get a backtrace. The crash reported in the first log is in FindClass, so we can look at the possible sources there.

jclass JPJavaEnv::FindClass(const char* a0)
{
    jclass res;

    JNIEnv* env = getJNIEnv();
    void* _save = JPEnv::getHost()->gotoExternal();

    res = env->functions->FindClass(env, a0);

    JPEnv::getHost()->returnExternal(_save);
    JAVA_CHECK("FindClass");
    return res;
}

One possible cause is that JPEnv::getHost() returned null. This can only happen if the JVM is not started, but there are checks to prevent that which would trigger much earlier. If it does happen, something would have had to corrupt the state by marking the JVM as shutting down, leading to getHost becoming null while running. Another possibility is that env->functions is null. Again, this table should be set up once the JVM has started and should never be corrupted.

I am not familiar with the functions in that layer of the code. They appear to deal with threading.

void* PythonHostEnvironment::gotoExternal()
{
        PyThreadState *_save;
        _save = PyEval_SaveThread();
        return (void*)_save;
}

void PythonHostEnvironment::returnExternal(void* state)
{
        PyThreadState *_save = (PyThreadState *)state;
        PyEval_RestoreThread(_save);
}

My best guess is that the JVM is being used from more than one thread in the app and that something has gone wrong in the bookkeeping in the jpype C++ module. I know that we have tests for threading safety, but this is not a feature that I have used.

The only other way to debug this would be to compile with the jpype tracing feature on. It may give you a log of what the jpype calls were. Then, if the calls before the crash look suspect, we could figure out which call corrupted the state.

Looking a bit deeper, it says pretty much what I thought. Unless the jvm is null we can't hit that point. But it does show that jvm->functions can't be null itself, as we would crash in this function.

jint JPJavaEnv::GetEnv(JNIEnv** env)
{
        if (jvm == NULL)
        {
                *env = NULL;
                return JNI_EDETACHED;
        }

        // This function must not be put in GOTO_EXTERNAL/RETURN_EXTERNAL because it is called from WITHIN such a block already
        jint res;
        res = jvm->functions->GetEnv(jvm, (void**)env, JNI_VERSION_1_2);
        return res;
}

@marscher I do see one bug in this path. FindClass assumes that env is a valid object, but JPJavaEnv::GetEnv can return JNI_EDETACHED. Nothing checks for that error code, so it will segfault if we ever get to that point. We could add guard statements throughout the autogen file to prevent it, but I don't know where in the code path an error would get reported, nor how we would get into that state.

JNIEnv* JPJavaEnv::getJNIEnv()
{
        JNIEnv* env;
        GetEnv(&env);  // <<< NOTHING IS LOOKING AT THE RETURN CODE
        return env;
}

This also looks like another case where we just have too many layers in the onion. The GetEnv call serves only the getJNIEnv call and thus is nothing more than a dead layer which hides the potential for errors.

$ find . -name "*.cpp" -o -name "*.h" | xargs grep GetEnv
./common/include/jp_javaenv.h:  jint GetEnv(JNIEnv** env);
./common/jp_javaenv.cpp:        GetEnv(&env);
./common/jp_javaenv.cpp:jint JPJavaEnv::GetEnv(JNIEnv** env)
./common/jp_javaenv.cpp:        res = jvm->functions->GetEnv(jvm, (void**)env, JNI_VERSION_1_2);
./jni_include/jni.h:    jint        (*GetEnv)(JavaVM*, void**, jint);
./jni_include/jni.h:    jint GetEnv(void** env, jint version)
./jni_include/jni.h:    { return functions->GetEnv(this, env, version); }
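
For illustration, the unchecked return code in getJNIEnv could be guarded roughly as follows. This is only a sketch under stated assumptions, not the change that went into jpype: the JNI types are replaced by small stand-ins so it compiles without jni.h, GetEnv is a free-function stand-in for JPJavaEnv::GetEnv, and the names g_env, g_available and getJNIEnvChecked are hypothetical. Throwing std::runtime_error is also an assumption; jpype would presumably raise its own exception type.

```cpp
#include <stdexcept>
#include <string>

// Stand-ins for the JNI declarations so the sketch compiles without <jni.h>.
// The numeric values match the constants defined in jni.h.
typedef int jint;
struct JNIEnv {};
const jint JNI_OK = 0;
const jint JNI_EDETACHED = -2;

// Hypothetical state used only by this sketch to switch between the
// "environment available" and "no environment" cases.
static JNIEnv g_env;
static bool g_available = false;

// Stand-in for JPJavaEnv::GetEnv: reports JNI_EDETACHED and a null env when
// no environment is available, as the original does when jvm == NULL.
jint GetEnv(JNIEnv** env)
{
    if (!g_available)
    {
        *env = nullptr;
        return JNI_EDETACHED;
    }
    *env = &g_env;
    return JNI_OK;
}

// Checked variant of getJNIEnv: it never hands a null env back to the caller,
// so a caller such as FindClass cannot dereference an invalid pointer.
JNIEnv* getJNIEnvChecked()
{
    JNIEnv* env = nullptr;
    jint res = GetEnv(&env);
    if (res != JNI_OK || env == nullptr)
    {
        throw std::runtime_error(
            "GetEnv failed with JNI error code " + std::to_string(res));
    }
    return env;
}
```

With a guard like this, a thread without a valid environment would surface as an exception at the call site instead of a SIGSEGV inside FindClass or NewGlobalRef.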

@marscher
Member

We should definitely fix the handling of that GetEnv error code and raise something meaningful. Thanks for the catch!

@Thrameos
Contributor

This should be corrected in master.
