-
Notifications
You must be signed in to change notification settings - Fork 217
Conversation
This adds a proposal for an early startup hook that allows managed code to run before the entry point.
@brianrob Does this look like something you would want to use for injecting Performance Controller into existing apps? |
|
||
Specifically, hostpolicy starts up coreclr and sets up a new AppDomain | ||
with `ManagedHost.dll` on the TPA list. If it read these settings from | ||
environment variables, it clears them so that child processes do not |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
|
||
## Example | ||
|
||
This could be used with AssemblyLoadContext APIs to resolve |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
|
||
Specifically, hostpolicy starts up coreclr and sets up a new AppDomain | ||
with `ManagedHost.dll` on the TPA list. If it read these settings from | ||
environment variables, it clears them so that child processes do not |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
The producer of `ManagedHost.dll` needs to ensure that | ||
`ManagedHost.dll` is compatible with the dependencies specified in the | ||
main application's deps.json, since those dependencies are put on the | ||
TPA list when the runtime is started. |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
|
||
Specifically, hostpolicy starts up coreclr and sets up a new AppDomain | ||
with `ManagedHost.dll` on the TPA list. If it read these settings from | ||
environment variables, it clears them so that child processes do not |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
|
||
## Example | ||
|
||
This could be used with AssemblyLoadContext APIs to resolve |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
centralized, while still allowing user code to do its own thing if it | ||
so desires. | ||
|
||
The producer of `ManagedHost.dll` needs to ensure that |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
``` | ||
{ | ||
"runtimeOptions": { | ||
"managedHostAssembly": "/path/to/ManagedHost.dll", |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
|
||
This would allow hosting providers to put configuration and policy in | ||
a single location, including settings that potentially influence load | ||
behavior of the main entry point such as the AssemblyLoadContext |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
## Proposed behavior | ||
|
||
Environment variables or `<appname>.runtimeconfig.json` can be used to | ||
specify a managed assembly and type that contains an `Initialize` |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
|
||
The environment variables, if set, take precedence over the | ||
`<appname>.runtimeconfig.json` settings. These settings result in | ||
`ManagedHostType.Initialize()` being called in hostpolicy, before the |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This does seem like something that could help with injecting the performance controller. One thing that comes to mind: This mechanism will probably need to support multiple entries into the start-up hook. People will use these hooks in different ways, and I can see them wanting to have two or more. For example, one hook is used to set assembly loading policy, while the other is used to start-up the performance controller or some other telemetry service before Main() is called. These two cases might be implemented by two different parties (say the assembly loading policy is implemented by the hoster while the performance controller policy is implemented by Microsoft). Thus, they can't be combined. |
My biggest feedback is the same as @brianrob, which is that users that are injecting their startup code have not reason to believe they are the ONLY ones. Thus the scheme has to work when more than one system is doing this. Thus it is not a DLL/Method name, but a LIST of those specifications. Thus I would recommend instead having NOT having a tuple, but a syntax. I would recommend that you have PATH!Type, but I would also recommend making a default type name (e.g. DotNetStartup). This allows people to only specify the PATH (since they will be building the type specificaly for this purpose, so can conform to whatever convention we wish). In either case as single string represents the 'beforeStartup' point. The environment variable/config entry is then a semicolon separated list of these. This does bring up the question of ordering of these things (everybody wants to be first). Certainly we would order them by the order in the semicolon delimited list, but if we get this data from multiple places (some from env vars, some from config), then it is not so clear. Even if it was clear, it is not obvious that the software updating the config/env vars knows enough to do a good job. To address this, we could have convention if the DLL name (before suffix) ends in a _N* where N is a digit, that it is a desire to be point in that order (lexographic) Thus X_12.dll comes before A_2.dll ...). The main point, however is that multiple 'things' will want to use this hook, and they need to be able to do so without collision. Comments? |
With respect to what @vancem says about ordering, there is precedent for using a priority value to handle ordering. GCC does this with usages of From https://gcc.gnu.org/onlinedocs/gcc-4.3.0/gcc/Function-Attributes.html:
I'd recommend making it part of the config and not part of the DLL name, just because it will be easier to change when needed. |
@vancem @brianrob we discussed multiple possible entries and decided to go with a similar model to what we have in .NET framework. https://msdn.microsoft.com/en-us/library/system.appdomainmanager(v=vs.110).aspx Do you know how the performance scenarios operate today in .NET Framework? Given this is an advanced scenario it may be that the developer has to wrap these 3rd party providers and call them in the order they choose explicitly. This is not ideal, but hopefully this is not incredibly common. |
@vancem @brianrob @jeffschwMSFT Having said that, we (service fabric) need a stop-gap solution where there is a single application/orchestration platform that launches the user micro-service and allow them to load shared service fabric assemblies from the platform (rather than app-local), as if we are carrying our own GAC for those user services. We are also hoping this feature can be back ported to .NET Core 2.0, so a simpler solution with less dependencies/complexity is preferable at this point. |
FWIW, I agree with @vancem that we should allow more than one of these to be specified (as you know I have mentioned it during our discussions). I do not think we need to have a way to tweak the order. The moment you get multiple of these and they are colliding, you are in trouble. |
Thanks all for the feedback! @brianrob, @vancem I can see why people would want multiple such hooks, and I don't think the native host is the right place. Rather, I'd hope that the ManagedHost.dll can set up further hooks in managed code if that's needed. This startup hook exists to solve the bootstrapping problem of "how can we (host provider or user code) control, in managed code, how the entry point dependencies are loaded, given that those dependencies are usually loaded before any managed code runs?". This hook makes that possible by separating the concepts of "user entry point" and "first managed code to run", while imposing some constraints on the host implementation (it needs to be compatible with the app's dependencies). In other words, this setting specifies which managed code runs first, and we're recommending it as a solution for folks who need to solve the bootstrapping problem because they don't control the managed entry point. Anyone who does control the entry point needn't do this - and anyone who uses this setting is expected to understand that it only works in a cooperative environment. I agree that if our recommendation is ignored (if a bunch of folks go and try to use this hook to control which managed code runs first), we'll hit issues like this. Perhaps we can design this in a way that would allow chained "first managed code" settings (which inevitably has the same problem - who is first of the first?), but I hope we never have to go there. @vancem How about this: we put the setting in one variable, so it just becomes a path to an assembly and a type. This would make it possible to extend with additional syntax if we ever need to. But in this iteration we only allow one "first managed code" setting. Thoughts? edit: thanks @AaronRobinsonMSFT for our chat in the hall about this! |
@yizhang82 I think the long-term solution that allows managed hosts to be shipped as nuget packages is just to provide a managed hosting library that has to be explicitly initialized from Main. This feature is intended more as the mentioned stop-gap solution for cases where, for whatever reason, the entry point is a bad place to put this. |
This startup hook is very powerful. I had seen people coming to me with all sorts of problems that could have been solved by it. I am pretty sure it will be used for more just the scenario motivating it.
I do not see this as stop-gap solution. If this is a stop-gap solution, we should not build this. |
@jkotas I should say "narrow use case" rather than "stop-gap" in my previous post. But if a hook like this is likely to be used for other reasons, I agree that we need to consider those. Do you have any examples of other problems solved by it, that wouldn't be better solved by adding managed hooks to user code? |
@sbomer - The obvious use case for this is telemetry injection. Tools which to be able to 'install' (wrap) an application so that it logs telemetry automatically. They can do all the magic once they are in the process, but they need to be started, and the sooner the better. Note these tools do not wish to modify the apps (but modifying config is acceptable). Note that there could be multiple telemetry tools, each of which want their hook. A Debugger is a kind of telemetry tool and would probably want its own hook. But even without two telemetry systems, the telemetry system wants to be able to inject even if you are already using the injection point. I know that there is a desire to keep things 'simple' but frankly we are not talking about alot of code here. If we simpy change the syntax (@sbomer seemed to be OK with that (It is better anyway, modifying one thing is less error prone that modifying two things). GIven that, then the actual implementation difference is trivial (you split the string and make it a loop). It is not hard. Yes, there are issues, but as you point out most of them go away in the singleton case. Even in the multi-client case, often it is not a problem (last modifier gets to make the choice (and it will probably be that his is first)). We can defer the question of priorities for now (the mechanism I described before can be added after the fact if we need it). So I don't see the problem. There is an obvious way forward. |
One requirement : (a) Is there any existing api in .Net Core that gives precedence to a path over app local (as mentioned in deps.json) to load dlls ? This is what GAC does in .Net Framework.
Without (a), upgrading to next Service Fabric runtime (which use this feature) will break existing apps. This is required for our Linux cluster where customers have deployed FDD coreclr apps and using dotnet installed by Service Fabric runtime. For new apps, we can change our SDK to put X.dll in ref folder of nupkg. X.dll will not become part of deps.json. So, this problem does not occur with new apps. |
@ashishnegi Why can't you leave the existing service fabric applications alone and not use any of this for them? |
@jkotas Leaving older coreclr apps as is mean : I believe one of the motivation for this startup hook is to solve following scenario :
As you see, Not breaking older coreclr apps, mean |
How are you dealing with this problem for the non-coreclr apps?
It sounds like that you are rolling forward all coreclr apps to the latest .NET Core runtime today. It has potential to break them. If you want to have high confidence in not breaking older coreclr apps, you should keep running them against the runtime that they were built for. We do not guarantee bug-for-bug compatibility between major .NET Core versions. How does this work for non .NET apps? If I have say Java service fabric app, are you going to force a particular Java runtime version on me, or can I choose which Java runtime to run on? |
In full desktop everything is loaded from GAC - even though some SDK assemblies are app-local they are still loaded from GAC. |
I meant non-.NET apps - e.g. Java apps? |
Java apps don't have this problem today because they are on an entirely different technology stack that is based on our native store/replicator/logger. The "managed" SDK layer is very thin and the majority work is done through JNI into native layer with a well defined ABI. So having everything app local works great there, and native libraries are loaded from platform path (from enviroment variable when process is launched) The new .NET SDK is also based on the native store/replicator/logger so it is also app-local and mostly p/invoke to native code. As you can see, the new model is better suited for CoreCLR and all "managed" componets can be app-local. However, the context here is the existing "old" .NET SDK where the implementation is mostly managed (yes, we have both managed and native implementation of the same thing, and the native implementation being the new thing) and exist in platform folder (internally we call it runtime folder but I intentionally avoid it here...), and that's where the app-local vs platform folder diverge happens. I can imagine if we built this technology entirely in Java it would have the same problem/requirement. |
of any startup hook. | ||
|
||
-- Invalid syntax. This will throw an `ArgumentException` with "The |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
existing structured dependency injection framework in place. An | ||
example of such a use case is a hook that injects logging, telemetry, | ||
or profiling into an existing deployed application at run tim.e |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
It is prone to ordering issues when multiple hooks are used, and does | ||
nothing to attempt to make dependencies of hooks easy to | ||
manage. Multiple hooks, if used, should be fairly independent of each |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
dependency manager that loads components in a specific order. If this | ||
kind of behavior is required, a proper dependency injection framework | ||
should be used instead of multpile startup hooks. |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
@davidfowl similar issues exist for this startup hook. The hook on its own does not introduce any loading policy, so whoever is deploying the hook/app needs to ensure that their load policies work well together. This feature is more low-level than
I'm hoping that folks won't see this as a solution to their plugin problems. I think the use case for the startup hooks is a bit more narrow than that, and more work needs to go into providing a proper plugin model.
I added a section with some guidance to start with. Please let me know if you think something needs to be added, changed, clarified, etc.! |
|
||
Problems with the startup hook should be fairly straightforward to | ||
diagnose. All of these exceptionts will contain the startup hook path | ||
(`System.StartupHookProvider.ProcessStartupHooks`) on the stack |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
} | ||
``` | ||
|
||
## Error handling details |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
The type must now be called StartupHook, and can not be set by the environment variable. The descriptions and examples were updated to reflect this. Also: - add a few notes about the visibility of this type - describe behavior of uncaught exceptions in startup hooks - give example with platform-specific separators - fix some typos
@jeffschwMSFT please give this another look when you get a chance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Instead of the host creating a delegate and inoking it, ExecuteMainMethod in the runtime will call the System.Private.CoreLib function that processes startup hooks. This allows the thread apartment state to be set based on any attributes in the Main method, before startup hooks execute.
in the `Main` method of the app, before startup hooks execute. As a | ||
result, attemps to explicitly set the thread apartment state in a | ||
startup hook will likely fail. | ||
|
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
@@ -51,13 +51,14 @@ will be inherited by child processes by default. It is up to the | |||
desired. | |||
|
|||
Specifically, hostpolicy starts up coreclr and sets up a new | |||
AppDomain. It then invokes a private method in | |||
`System.Private.CoreLib`, which will call each | |||
AppDomain, passing in the startup hook variable if it was |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
Two more concerns:
The takeaway is that we're back in a place where the startup hook alone won't be useful as a way to modify load behavior of dependencies of main on the TPA list. @yizhang82 for Service Fabric, it could be used to set a resolving event that applies to non-TPA assemblies. It sounds like this would still enable you to add new API surface for newly published apps by having them reference the new assembly, without putting it on the TPA list or deploying it app-local. Can you confirm that this is your intended use? |
@sbomer can we share the modified components with @yizhang82? I was talking with @jkotas about the Service Fabric scenario and it seems they may have a viable way of using this feature to enable their scenario. |
@jeffschwMSFT That's what I was mentioning. I'll get in touch with Service Fabric and see if they want to test some private bits. I'd still like confirmation from @yizhang82 that I've understood their use case. |
Talked to @yizhang82 - so far it sounds like we'll enable their scenario as I described. I'm going to work with them to test this, since this area has given us a few unexpected surprises so far... |
From a security perspective this could be a major risk factor. Malicious dll injection made easy? Are there any plans to protect against unwanted loading? |
These kind of vectors already exist in coreclr. See the |
* Add startup hook design document (#4421) * Add startup hook design document This adds a proposal for an early startup hook that allows managed code to run before the entry point. * Change env var behavior and clarify version constraints * Fix formatting, specify Initialize() signature * Update spec for multiple hooks in environment variable - Move the configuration to a single environment variable, DOTNET_STARTUP_HOOKS - Introduce syntax to allow multiple hooks to be defined - Remove .runtimeconfig file options This will only support the environment variable, since this is a low-level feature that should only be used when implicit injection of code is desired. - Update implementation details * Remove default type name and other fixes - Require an explicit type name for each startup hook - Fix Initialize() signature (it must be static) - Clarify that the Initialize() methods are called synchronously * Require full path, clarify syntax, and change TPA behavior - Relative paths to the assemblies are not allowed - Use platform-specific path separator - Clarify restrictions on the syntax (no empty entries, trailing separator) - Startup assemblies will not be placed on TPA list Instead, they will be loaded directly using LoadFromAssemblyPath in the implementation. * Add guidance, error behavior, and thread behavior * Remove specific error text, fix typos and formatting * Remove type name from synax The type must now be called StartupHook, and can not be set by the environment variable. The descriptions and examples were updated to reflect this. Also: - add a few notes about the visibility of this type - describe behavior of uncaught exceptions in startup hooks - give example with platform-specific separators - fix some typos * Change startup hook to run closer to Main Instead of the host creating a delegate and inoking it, ExecuteMainMethod in the runtime will call the System.Private.CoreLib function that processes startup hooks. This allows the thread apartment state to be set based on any attributes in the Main method, before startup hooks execute. * Clarify apartment threading state behavior * Specify the property name passed to the runtime * Add host startup hook and tests (#4465) * Expose coreclr_create_delegate function * Add startup hook call Read DOTNET_STARTUP_HOOKS environment variable, and pass it along to a private function in System.Private.CoreLib (StartupHookProvider.ProcessStartupHookS). * Add startup hook tests * Add some more syntax checks * Reorder the tests * Add test for exclamation mark in assembly path * Fix error behavior for missing methods MissingMethodException is now only thrown when the Initialize method doesn't exist at all. This also reorganizes the existing tests for the invalid signature, and adds some cases. * Change test behavior to not expect eager file exists check * Expect full paths for startup hook. Fixes this in existing tests, and adds a new check for the error behavior when a relative path is passed. * Add test for startup hook trace output * Update tests to use new syntax (without type name) * Pass STARTUP_HOOKS property key to coreclr_initialize STARTUP_HOOKS is now passed to the clr during initialization, instead of using create_delegate in the host. This will allow the clr to call the startup hooks closer to the execution of the main method, allowing the threading apartment state to be set based on the main method's attributes. * Allow non-public Initialize method * Remove coreclr::create_delegate * Add comments clarifying purpose of startup hook test projects * Add test with ALC assembly resolve event The startup hook will inject a dependency into the app. * Fix the ALC startup hook testcase - Don't set ReferenceOutputAssembly in the projectreference, because that results in it not being published. - Dispose the new projects - Don't throw exceptions from the resolver, since they are swallowed.
This adds a proposal for an early startup hook that allows managed
code to run before the entry point.
@jeffschwMSFT @vitek-karas @swaroop-sridhar @AaronRobinsonMSFT @richlander @jkotas