Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Android] Decouple runtime initialization and entry point execution for Android sample #111742

Merged
merged 11 commits into from
Feb 5, 2025

Conversation

matouskozak
Copy link
Member

@matouskozak matouskozak commented Jan 23, 2025

Problem Description

On Android, the startup is measured by scanning the logcat output for the following message:

Displayed <name> for user 0: <time>

In the case of dotnet/runtime sample app it would be e.g.:

Displayed net.dot.HelloAndroid/net.dot.MainActivity for user 0: +505ms

More information regarding startup measurements can be found at https://developer.android.com/topic/performance/vitals/launch-time#retrieve-TTID.

In the previous design, the onCreate method would finish before the runtime would get initialized (MonoRunner.initialize). This means that when we're reading the Displayed <name> for user 0: <time>, only the Java code had run, rendering all startup measurements useless as we couldn't measure any impact from our runtime.

Proposed Solution

By decoupling the initialization of runtime and execution of entry point, we can run the runtime initialization synchronously, forcing the onCreate callback to report the Displayed <name> for user 0: <time> after the runtime is fully initialized.

The execution of the entry point is still performed asynchronously to prevent the freeze of main thread for strenuous executables.

Note, this is solution doesn't replicate real-world scenarios, but it allows us to measure the startup time accurately with only the dotnet/runtime code.


Future Work

@akoeplinger
Copy link
Member

akoeplinger commented Jan 23, 2025

If I remember correctly this was intentional since MonoRunner.initialize doesn't just initialize the runtime, it essentially runs the full managed program until it exits so doing that in onCreate would freeze the UI if the program takes too long.

@matouskozak
Copy link
Member Author

matouskozak commented Jan 23, 2025

If I remember correctly this was intentional since MonoRunner.initialize doesn't just initialize the runtime, it essentially runs the full managed program until it exits so doing that in onCreate would freeze the UI if the program takes too long.

Indeed, that is what I observed as well. The startup time reported would be more TTFD than TTID. We could probably split

MonoDomain *domain = mono_jit_init_version ("dotnet.android", "mobile");
assert (domain);
MonoAssembly *assembly = mono_droid_load_assembly (executable, NULL);
assert (assembly);
LOG_INFO ("Executable: %s", executable);
int res = mono_jit_exec (domain, assembly, managed_argc, managed_argv);
LOG_INFO ("Exit code: %d.", res);
mono_jit_cleanup (domain);
to a separate function call and call just this one asynchronously. (if the UI freeze is a problem for the dotnet/runtime sample app)

@matouskozak
Copy link
Member Author

/azp run runtime-androidemulator

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@matouskozak matouskozak force-pushed the synchronize-runtime-init-android branch 2 times, most recently from 83ea6f5 to f77a717 Compare January 23, 2025 15:24
@matouskozak matouskozak changed the title [Android][wip] Synchronize onCreate and runtime init for Android [Android] Decuple Android runtime initialization and entry point execution Jan 23, 2025
@matouskozak matouskozak force-pushed the synchronize-runtime-init-android branch from f77a717 to c4e4b64 Compare January 23, 2025 15:32
@matouskozak
Copy link
Member Author

/azp run runtime-androidemulator

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@matouskozak matouskozak changed the title [Android] Decuple Android runtime initialization and entry point execution [Android] Decuple runtime initialization and entry point execution for Android sample Jan 23, 2025
@matouskozak
Copy link
Member Author

/azp run runtime-android

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@matouskozak matouskozak added the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Jan 24, 2025
@matouskozak matouskozak marked this pull request as ready for review January 24, 2025 09:24
@vitek-karas
Copy link
Member

Just curious: I understand that with this change we will now measure startup as "the time it takes for the runtime to init far enough to be able to run some code" - but it won't run "Main" yet. I'm afraid that comparing this to CoreCLR will be tricky, since the two runtimes will probably do different things in "init" versus "run main". For example, I'm pretty sure CoreCLR actually runs managed code during "init". It's possible mono delays it until "run main".

I think a much more representative measurement would be to log something from managed Main - and measure how long it takes to do that - in that case we're measuring the same behavior not relying on runtime specific implementation differences.

@matouskozak
Copy link
Member Author

matouskozak commented Jan 24, 2025

I think a much more representative measurement would be to log something from managed Main - and measure how long it takes to do that - in that case we're measuring the same behavior not relying on runtime specific implementation differences.

I would rather avoid any custom logging inside the dotnet/runtime sample. By using custom log message, we would have to frame the performance startup scripts around the custom logging. Currently the scripts work by detecting the Displayed <name> for user 0: <time> which is a default standard working with any APK file.

Just curious: I understand that with this change we will now measure startup as "the time it takes for the runtime to init far enough to be able to run some code" - but it won't run "Main" yet. I'm afraid that comparing this to CoreCLR will be tricky, since the two runtimes will probably do different things in "init" versus "run main".

I definitely agree with this part. Right now, I'm not 100% sure how to solve this other than synchronizing both runtime initialization and execution of main.

For example, I'm pretty sure CoreCLR actually runs managed code during "init". It's possible mono delays it until "run main".

After the #110471 gets merged I can run it locally for CoreCLR and see what numbers we get.

@matouskozak
Copy link
Member Author

One posibility to explore is using TTFD for the dotnet/runtime sample instead of TTID. We could call the reportFullyDrawn() after the completion of main execution and use this time as "startup" for dotnet/runtime Android sample.

TTID: Displayed net.dot.HelloAndroid/net.dot.MainActivity for user 0: +433ms
TTFD: Fully drawn net.dot.HelloAndroid/net.dot.MainActivity: +1s397ms

@ivanpovazan
Copy link
Member

ivanpovazan commented Jan 24, 2025

Just curious: I understand that with this change we will now measure startup as "the time it takes for the runtime to init far enough to be able to run some code" - but it won't run "Main" yet. I'm afraid that comparing this to CoreCLR will be tricky, since the two runtimes will probably do different things in "init" versus "run main". For example, I'm pretty sure CoreCLR actually runs managed code during "init". It's possible mono delays it until "run main".

The comparison won't be 100% "fair" and that is fine. We use Hello* samples measurements to get the general idea about performance between runtimes, while the actual real measurement/comparison comes with dotnet new maui/ios/android samples, where full target-language interop is included. Additionally, we get Hello* measurements much sooner as we depend only on dotnet/runtime, so their "real" value is in early regression detection.

One example of another "unfair" measurement is for size for HelloiOS apps between NativeAOT and Mono where on Mono we can't strip symbols in the same way: and this gives us a comparison of: 1MB vs 5MB, which isn't really what we are getting in reality - MAUI apps :)

The only problem with TTID/TTFD and HelloAndroid is that we will have noise from system/Java in the numbers, but again with this change we will be in much better position than what we have currently.

Having said all this, I agree with the change. For more precise measurements of "when is the Main executed" we can execute measurement manually.

Move mono_jit_init_version and mono_droid_load_assembly to mono_droid_runtime_init
type safety around strings
@matouskozak matouskozak force-pushed the synchronize-runtime-init-android branch from 9501676 to d2f611e Compare January 31, 2025 15:59
@matouskozak
Copy link
Member Author

/azp run runtime-android,runtime-androidemulator

Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@@ -41,11 +41,13 @@ protected void onCreate(Bundle savedInstanceState)
}

final Activity ctx = this;
MonoRunner.initializeRuntime(entryPointLibName, ctx);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this splitting these actions across two threads now? I guess it is also possible this thread is also executing the handler below post other actions. I'm not sure the Android Activity life cycle.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on my understanding, both initializeRuntime and the handler below will run on the same thread (cc: @jonathanpeppers). The motivation behind splitting and using the post handler is that the onCreate from MainActivity can finish and report "Displayed..." to the logcat which is captured for TTID measurement. Currently, the TTID should include all the Java parts + the initializeRuntime. The post handler below will execute with a delay (after onCreate finished) and run the executable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wondered if you should remove the new Handler(Looper.getMainLooper()).postDelayed(new Runnable() call completely. .NET for Android apps do not call post() to run user code on startup.

This also would make TTFD less important, and maybe you wouldn't even need to record that number. TTFD will include time Android spends doing things that we have little control over.

The reason you have for using postDelayed() is:

The execution of the entry point is still performed asynchronously to prevent the freeze of main thread for strenuous executables.

We are doing this in the product in customers' apps, so seems reasonable to do it in this sample as well.

Removing postDelayed() could be done in another PR, though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jonathanpeppers for the comment and insights.

Indeed, .NET Android follows a different approach to how the runtime is initialized and code ran but I wasn't sure if the effort to rewrite the dotnet/runtime sample is worth for our use case. Splitting the initialization and execution of managed code seemed like a good compromise to get some meaningful information from TTID and TTFD. However, we could definitely create a proposal for the re-work if that's wanted.

void
Java_net_dot_MonoRunner_setEnv (JNIEnv* env, jobject thiz, jstring j_key, jstring j_value)
{
LOG_INFO ("Java_net_dot_MonoRunner_setEnv:");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would assert that coreclr hasn't been initialized yet. In fact, it might make sense to return a non-zero value if someone calls this and coreclr has been initialized. The setenv is a tricky API and rather untrustworthy from a threading perspective.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it important that the runtime hasn't been initialized yet for setting env variables?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My assumption here is that this is primarily for setting environment variables for use during coreclr initialization, that could be wrong. The issue with that is there are some environment variables that aren't obvious when they are read/consumed and it can cause confusion if all DOTNET_ related variables are set a priori.

On a general note for setenv, it isn't thread safe. In coreclr we have a big lock around it for that very reason. However, we can't control other callers of that API and one can get into weird situation when reading the environment block and other threads are setting it. This could have been addressed by the Android system, but as far as I'm aware it is still a common issue with the API not being thread safe.

Copy link
Member

@AaronRobinsonMSFT AaronRobinsonMSFT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Copy link
Member

@ivanpovazan ivanpovazan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, nice work! :)

Java_net_dot_MonoRunner_initRuntime (JNIEnv* env, jobject thiz, jstring j_files_dir, jstring j_cache_dir, jstring j_testresults_dir, jstring j_entryPointLibName, jobjectArray j_args, long current_local_time);
Java_net_dot_MonoRunner_initRuntime (JNIEnv* env, jobject thiz, jstring j_files_dir, jstring j_cache_dir, jstring j_testresults_dir, jstring j_entryPointLibName, long current_local_time);

// Signature for parity with monodroid.c (and monodroid-coreclr.c), entry point is not used
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment is not correct, as we do call invoke_netlibrary_entrypoints from it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, it was meant to hint that the signature arguments are not really used as we invoke the entry point from the library. I'll remove the comment to not introduce confusion there.

@matouskozak
Copy link
Member Author

/azp run runtime-android,runtime-androidemulator

Copy link

Azure Pipelines successfully started running 2 pipeline(s).

Co-authored-by: Ivan Povazan <55002338+ivanpovazan@users.noreply.github.com>
@matouskozak
Copy link
Member Author

/azp run runtime-android,runtime-androidemulator

Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@matouskozak matouskozak merged commit c9af66c into dotnet:main Feb 5, 2025
149 checks passed
grendello added a commit to grendello/runtime that referenced this pull request Feb 6, 2025
* main: (23 commits)
  add important remarks to NrbfDecoder (dotnet#111286)
  docs: fix spelling grammar and missing words in clr-code-guide.md (dotnet#112222)
  Consider type declaration order in MethodImpls (dotnet#111998)
  Add a feature flag to not use GVM in Linq Select (dotnet#109978)
  [cDAC] Implement ISOSDacInterface::GetMethodDescPtrFromIp (dotnet#110755)
  Restructure JSImport/JSExport generators to share more code and utilize more Microsoft.Interop.SourceGeneration shared code (dotnet#107769)
  Add more detailed explanations to control-flow RegexOpcode values (dotnet#112170)
  Add repo-specific condition to labeling workflows (dotnet#112169)
  Fix bad assembly when a nested exported type is marked via link.xml (dotnet#107945)
  Make `CalculateAssemblyAction` virtual. (dotnet#112154)
  JIT: Enable reusing profile-aware DFS trees between phases (dotnet#112198)
  Add support for LDAPTLS_CACERTDIR \ TrustedCertificateDirectory (dotnet#111877)
  JIT: Support custom `ClassLayout` instances with GC pointers in them (dotnet#112064)
  Factor positive lookaheads better into find optimizations (dotnet#112107)
  Add ImmutableCollectionsMarshal.AsMemory (dotnet#112177)
  [mono] ILStrip write directly to the output filestream (dotnet#112142)
  Allow the NativeAOT runtime pack to be specified as the ILC runtime package (dotnet#111876)
  JIT: some reworking for conditional escape analysis (dotnet#112194)
  Replace HELPER_METHOD_FRAME with DynamicHelperFrame in patchpoints (dotnet#112025)
  [Android] Decouple runtime initialization and entry point execution for Android sample (dotnet#111742)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants