Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JS thread randomly freezes for 5-20 seconds when using Hermes #295

Closed
SimpleCreations opened this issue Jul 15, 2020 · 11 comments
Closed

JS thread randomly freezes for 5-20 seconds when using Hermes #295

SimpleCreations opened this issue Jul 15, 2020 · 11 comments
Labels
question Further information is requested

Comments

@SimpleCreations
Copy link

Bug Description

After switching to Hermes I noticed an unusual behavior. Every couple of minutes my RN app stops responding to user interactions, more specifically the JS thread freezes for a couple of seconds. Meanwhile everything than runs natively on Android side (scrolling, typing text, etc.) works just fine, only JS is blocked from execution. After 5-20 seconds it "unfreezes" and all the code that was supposed to be run while the app was unresponsive (while the user was tapping different things) executes immediately. Then everything goes back to normal until the next freeze.

This stops reproducing after switching from Hermes to JSC.
Also I've noticed that using the version that supports Proxy (0.4.2-rc1, which I need because the app uses MobX 5) makes the freezes occur more frequently.

What I observed so far is:

  • The problem is not affected by dev mode (persists in release builds)
  • It is not specific to my device (reproduced on both Samsung device running Android 10 and Xiaomi device running Android 7.1.2)

I don't know at all how to debug this issue or how to narrow it down. It doesn't seem to be caused by any specific user behavior inside the app - it's usually enough to just launch the app and keep it open for a couple of minutes.
There are no errors, no crashes, nothing in Logcat, etc. The app just stops responding and then goes back to normal.

That said, I cannot provide any reproducible demo, but I am willing to help finding the issue. I just need help with finding what causes it. Maybe there is some debug info from Hermes internal that I can collect?
What also makes it harder to narrow it down is that the app I'm working on is relatively big (over 20 000 lines of own code and a lot of dependencies).

Hermes version: 0.3.0, 0.4.1, 0.4.2-rc1
React Native version (if any): 0.62.2
Android version (if any): 10.0, 7.1.2
Platform (most likely one of arm64-v8a, armeabi-v7a, x86, x86_64): arm64-v8a

@tmikov
Copy link
Contributor

tmikov commented Jul 15, 2020

I think the best way to find out for sure is to break with the debugger when such a pause occurs and inspect the call stack of the JS thread. Unfortunately this is not so easy to do at the moment, because Hermes ships without debug symbols. We are preparing a patch release (v0.5.1) where the symbols will be available in the release tarball. It should be out in a few days at the most. Note that you will have to upgrade to RN 0.63.

Without the stack trace, I would speculate that a possible reason could be excessive allocation. Can you try calling HermesInternal.getInstrumentedStats() periodically and log the results?

@tmikov tmikov added the need more info Awating additional info before proceeding label Jul 15, 2020
@SimpleCreations
Copy link
Author

I put HermesInternal.getInstrumentedStats() into 3 second interval and logged the results.
Here's what I've collected:

https://pastebin.com/kjczA8Ks
See line 208 (15:11:18.097). The thread was unresponsive for an entire minute, and then all the messages that were supposed to be recorded while it was frozen got all flushed at once.
I can see that the heap grew by 100 Mb, some other numbers (like js_markStackOverflows) increased as well, but I'm not sure what they mean.

https://pastebin.com/3eRQyvyC
Line 277. 3-6 second freeze. The heap grew by 60 Mb.

https://pastebin.com/Rcd9MYWw
Line 162. 20 second freeze. Heap growth: 130 Mb.

In all of these cases nothing was done inside the app - no heavy computations or anything, it was just open.

@tmikov
Copy link
Contributor

tmikov commented Jul 16, 2020

Thanks for collecting the logs, they are very useful! So far our working theory is that your heap size is very close to the limit of 512MB, causing a lot of garbage collection cycles. We are discussing internally the best way to address it. There is currently not easy way to change the heap limit without rebuilding either Hermes or RN. Are you able to do any of these and trying again?

@dulinriley
Copy link
Contributor

dulinriley commented Jul 16, 2020

Hi @SimpleCreations, thanks for the detailed information. It looks like you're experiencing a lot of full collection cycles, which will completely halt the JS thread until they complete. The pause will increase with the heap size, so as the heap gets bigger, the pauses will take longer.

Your first example (line 208), shows an increase in a few stats:

  • "js_numGCs": 5359 -> 5370
  • "js_gcTime": 62.4 -> 122.9 seconds
  • "js_heapSize": 415 MiB -> 513 MiB
  • "js_markStackOverflows": 905 -> 1538

What these mean is that during that minute pause you experienced, some JS code allocated (at least) 100 MiB of live data. I believe the current max heap size is around 512 MiB, so trying to allocate 100 MiB into the old gen will hit the max. Once we hit the max heap size, we have to do a full collection to reclaim some space. In that pause, it appears about all of the time was spent in doing multiple consecutive full GCs. This is the worst possible case you can experience.

@dulinriley
Copy link
Contributor

@SimpleCreations there's something you can do in the meantime while waiting for the size bump to fix your issue.

Hermes supports the Chrome Debugger Protocol, and can also profile memory usage with both heap snapshots and timeline profiles.

Follow the instructions here: https://reactnative.dev/docs/hermes#debugging-hermes-using-google-chromes-devtools to connect Chrome to your running Hermes instance.
Once you have it connected, click on the "Memory" tab, and you'll see these buttons:

Screen Shot 2020-07-16 at 1 50 26 PM

Click on "Allocation instrumentation on timeline" and make sure the "Record allocation stacks" box is checked.
Then perform the series of interactions that cause the pauses.
You can click the big red recording button to stop the profiler, and it'll have a list of objects that were allocated.

Sorting by "Retained Size" will show the largest objects allocated since profiling began (including the things they point to).
Expanding the object type will show you a list of objects allocated in that type category. Click on an individual object to see the stack frame where it was allocated. See this screenshot for more detail:
Screen Shot 2020-07-16 at 1 57 14 PM

You can change the view from "Summary" to "Allocation" and see which functions are allocating the most memory.
You can also save this profile as a file on your computer, and load it back into Chrome whenever you want.

Let me know if you run into any trouble generating that profile.

@SimpleCreations
Copy link
Author

SimpleCreations commented Jul 17, 2020

First of all, thank you for the detailed explanation why this was happening.

After playing around with debugger I found what caused the memory leak.
It was this package I was using:
https://www.npmjs.com/package/react-native-content-loader
which is an old fork of react-native-svg-animated-linear-gradient. The issue was fixed in the original repo, and before the fix it looked like this:
https://github.com/FullstackStation/react-native-svg-animated-linear-gradient/blob/e87d8fc6a948f2590840737f3dc5b669222c5406/src/index.js
The animation continued to play internally even when the component was unmounted (there was no componentWillUnmount).

After resolving this, the memory leak was gone (the heap is around 50 Mb right now and doesn't grow) and I haven't experienced a single pause since.
There is still one thing I'm not entirely sure of. I don't have that many of those content loaders - around 20 at most across the whole app. Even if the animation was running in the background, I don't understand why would it need to allocate hundreds of megabytes of memory and why wasn't the garbage collected properly.

Let me know if you run into any trouble generating that profile.

For some reason after I click the stop button it gets stuck at "Snaphotting..."
ss+(2020-07-17+at+04 49 52)

@dulinriley
Copy link
Contributor

For some reason after I click the stop button it gets stuck at "Snaphotting..."

When that happened, was the app still responsive? Or did it crash or not receive inputs?

At the top of the Chrome inspect window, there should be a three dot menu (next to a settings wheel and an X).
The second option in the menu will be "Show console drawer".
When it's stuck in the snapshotting state, there might be some red text in that console drawer describing the error.

@SimpleCreations
Copy link
Author

There are no crashes and no errors when it's stuck at "Snapshotting...", and the app is fully responsive. The console works too.

@tmikov
Copy link
Contributor

tmikov commented Jul 17, 2020

I think we should get to the bottom of the heap Snapshot Profiler not working, but since the immediate issue was resolved, we should probably open a new issue for it and continue the discussion there. In this way others can also find it and possibly comment.

@dulinriley
Copy link
Contributor

@SimpleCreations can you file a separate issue about the memory profiler not working? I'll investigate to see if I can reproduce the problem.

@dulinriley
Copy link
Contributor

Closing this as resolved

@dulinriley dulinriley added question Further information is requested and removed need more info Awating additional info before proceeding labels Jul 17, 2020
facebook-github-bot pushed a commit to facebook/react-native that referenced this issue Jul 23, 2020
Summary:
The max heap size currently is 512 MB for OSS apps using Hermes on Android.

Some users (see facebook/hermes#295) are experiencing
long pauses when reaching this ceiling.
Increasing this limit to 1 GB should reduce the frequency of these pauses occurring
for apps where the expected heap usage is near 512 MB.

Changelog: [Internal] Set Hermes's default max heap size to 1 GB

Reviewed By: mhorowitz

Differential Revision: D22577343

fbshipit-source-id: 2d7d688e38e95a082692dca52d010d0449a6e64b
@Huxpro Huxpro mentioned this issue Jul 20, 2021
1 task
facebook-github-bot pushed a commit that referenced this issue Aug 14, 2021
Summary:
The default max heap size GC config of Hermes, 512MiB, is too small to
be the default config for React Native apps from the nowadays standard.

It casued memory issues on Android (#295) and was mediated by manually
[overriding this number to 1024MiB from the RN Android code path](facebook/react-native@63d20d3),
and a similar OOM issue was reported on iOS recently (#551).

As Hermes being adopted by more platforms (Windows, macOS), it make no
sense to override this number again and again in each platforms.
Rather, the default max heap should be changed once and for all,
The RN side of change is tracked at facebook/react-native#31900.

We choose 3GiB which is "unlimited" in practice, making Hermes closer
to JSC which doesn't have such limit as well.

Reviewed By: neildhar

Differential Revision: D30181817

fbshipit-source-id: 54a442e43b7884a9f5d7ed099cb9b9866ee4f109
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

3 participants