
Way to increase x64 GC aggressiveness? (as per x86) #7991

Closed
benaadams opened this issue May 2, 2017 · 23 comments
Labels
area-GC-coreclr enhancement Product code improvement that does NOT require public API changes/additions
@benaadams
Member

So I want to use x64 (including the x64 server GC), but I'd like to turn up its recycling aggressiveness a little.

For example, if I run the same server app as x64 it will hover around 1GB, whereas as x86 it will hover around 175MB (private working set). The actual active managed memory in both is similar, and an order of magnitude lower than even the x86 working set.

Having 1GB of ambient use isn't a great problem, as I'm running on a server with plenty of RAM.

However, it makes it hard to determine the correct multi-service loading/homing balance.

i.e. when working out where to re-home services across a cluster, the two major coarse-grained metrics you'll use are the general behaviours around CPU use and RAM use; with RAM use being much higher, estimating this becomes harder and increasing service density more tricky.

I'm aware that allowing more ambient RAM use means less frequent GCs, so it wouldn't apply to all services. I'm also aware you can limit memory using Job objects.

However, I'm looking for something more like an API or configuration setting, akin to ThreadPool min threads, but for memory: below that threshold it's happy to do as it does now, and above it, it moves into a more aggressive reclamation mode. Kind of like a Job object, but without the hard stop and without the extra setup.
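To illustrate the shape (ThreadPool.SetMinThreads is the real API used as the analogy; GC.TrySetMemoryTarget is entirely hypothetical - just the kind of knob being asked for):

using System;
using System.Threading;

// Real API: below the minimum, the pool creates threads on demand
// without throttling; the ask is a memory analogue of this threshold.
ThreadPool.SetMinThreads(workerThreads: 100, completionPortThreads: 100);

// Hypothetical memory equivalent (does not exist): below targetBytes the
// GC behaves as it does now; above it, it reclaims more aggressively,
// with no hard stop.
// GC.TrySetMemoryTarget(targetBytes: 512L * 1024 * 1024);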

I don't have a formal proposal for what the right approach would be, as I'm not sure. Hoping others might...

/cc @Maoni0

@benaadams
Member Author

Should I file an API request upstream in corefx?

@Maoni0
Member

Maoni0 commented May 2, 2017

You don't need to file a request right now as this may not be an API - it could be a config if it's per process. Let's first discuss it a bit and iron out what kind of options might be best to provide to users.

This is something that I've been thinking about as job objects didn't work as well as I had expected as IIS/asp.net handles job objects in a weird way (originally I did this for Azure folks and it worked great for them as they just use them the normal way).

I was thinking of providing this as a config where you could say "I only want this process to use X% of the physical memory I have on the machine". And of course, X% may not be enough on small machines so you could see a lot more compacting GCs on them than on bigger machines. But you can get this info via ETW events (that tell you this GC happened because we were short on memory).
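As an aside, a minimal in-process sketch of watching those GC-start reasons (the provider name and GC keyword are the standard CLR ones; treat the exact event/payload names as an assumption to verify, and note this style of listener needs a newer .NET Core than existed when this was written):

using System;
using System.Diagnostics.Tracing;

// Sketch: listen to the runtime's GC events in-process and print the
// Reason payload of each GC start event (which includes "low memory").
class GcReasonListener : EventListener
{
    protected override void OnEventSourceCreated(EventSource source)
    {
        if (source.Name == "Microsoft-Windows-DotNETRuntime")
            EnableEvents(source, EventLevel.Informational, (EventKeywords)0x1); // GC keyword
    }

    protected override void OnEventWritten(EventWrittenEventArgs e)
    {
        if (e.EventName != null && e.EventName.StartsWith("GCStart") && e.PayloadNames != null)
        {
            int i = e.PayloadNames.IndexOf("Reason");
            if (i >= 0)
                Console.WriteLine($"GC started, reason code: {e.Payload[i]}");
        }
    }
}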

Now, would this be a hard stop? Do you want to actually get an OOM when you are using more than X%, even after we did a full compacting GC? These kinds of things are up for discussion. I'd like to keep the GC configs few, so if we are going to add more they should be truly justified.

Another possible option is to just specify the absolute amount that you want this process to use, so you don't need to adjust the X% when the workload runs on a different machine. Of course there are pros and cons - the 2nd option is more adaptable if you do want to use more memory on bigger machines; the 1st option is in theory less work - you tune once and run everywhere. But that's in theory; in practice you could be in a situation where something on the machine is affecting your process - I've seen over and over again that folks don't throttle on their servers.

I can also see this being an API if you want to provide more flexibility while you are running - if the process has distinct phases you could imagine during some phase you want the memory footprint to be small whereas in other phases you want to use much more. Again, I am not looking to provide something that's as flexible as possible - I would like something that makes sense for reasonable scenarios.

@benaadams
Member Author

benaadams commented May 2, 2017

it could be a config if it's per process.

Works for me

Now, would this be a hard stop?

Not necessarily; but it should free memory back to the OS (when free) if it's gone past the threshold, to try and bring it back down. I think OOMs are bad, as they are exceptions no one expects and no one handles.

Whichever would be easier? It can always be refined later. If it OOMs, then you'd always overstate the value, which would reduce its usefulness.

Another possible option is to just specify the absolute amount that you want this process to use so you don't need to adjust the X% when the workload runs on a different machine.

I like the fixed value, as I know what the process does and how much memory it uses to do it. However, I could see others scaling up the machine/VM and expecting it to have more memory automatically.

If it's config, could it be both? X% is converted to a fixed value internally at startup; an MB value is kept as a fixed value. It won't work for hot-add of memory - but that's probably not a big issue.
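A sketch of that startup conversion (the config values here are invented for illustration; GC.GetGCMemoryInfo().TotalAvailableMemoryBytes is the real .NET Core 3.0+ API for the physical/container memory limit):

using System;

// Hypothetical config, for illustration only.
bool isPercent = true;   // gcTargetMaxMemory enabled="percentage"
long percentValue = 25;  // value="25"
long fixedMb = 512;      // or enabled="fixedMB" value="512"

// Real API (.NET Core 3.0+): total physical (or container) memory.
long total = GC.GetGCMemoryInfo().TotalAvailableMemoryBytes;

// X% is converted to a fixed byte value once, at startup; an MB value
// is taken as-is. Hot-added memory would not be re-read.
long targetBytes = isPercent
    ? total * percentValue / 100
    : fixedMb * 1024 * 1024;

Console.WriteLine($"GC target: {targetBytes / (1024 * 1024)} MB");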

I can also see this being an API if you want to provide more flexibility while you are running

That's definitely interesting; maybe a later phase?

@benaadams
Member Author

Something like?

<gcTargetMaxMemory enabled="disabled|percentage|fixedMB" value="X" />

Enables garbage collection that tries to keep process memory usage under the target value. This may increase the number of collections your application performs.

percentage: value is a percentage of the machine's total(?) memory.
fixedMB: value is a fixed amount of memory in megabytes.

@Maoni0
Member

Maoni0 commented May 4, 2017

We could support both configs (in usage, you should only specify one of them). For the percentage, it could be either the percentage at startup or dynamically adjusted as the process runs. Of course the former would be easier. The latter is less deterministic - if other processes on the machine happen to take up more memory, then you might be surprised.

Re throwing OOM or not - from the GC's POV, it's a bit easier to throw OOM if we ever observe memory usage greater than what you specified, but it's not that much more work to allow it to grow larger if needed. So it's really just about the usage. As you pointed out, throwing OOM might be the less attractive option.

@benaadams
Member Author

There might be a use for both use cases? E.g. a scenario with a specific hard stop might be something with a security focus: opening an unknown zip file and purposefully dying if it hits a zip bomb, rather than absorbing all available memory.

@ryanerdmann

I have a scenario right now that would definitely benefit from having a fixed target value set, ideally without OOMs if the process went over the limit.

I have a process running on a server that's pretty densely packed. Like Ben mentioned above, I know what this process does and how much memory it needs, and would like to indicate a target memory limit that I know will keep it in a good state (and keep its neighbors happy). I could put it in a job object, but would prefer not to ever OOM. Being able to specify a fixed (but soft) memory target through the app.config, or even an environment variable, would be really valuable.

@benaadams
Member Author

benaadams commented May 8, 2017

Perhaps a pattern like the ThreadPool?

 |   normal behavior
min
 |   faster recycling/release
max
 X   OOM error past this

with min and max defaulting to the max value
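As a sketch (names invented; this is just the band diagram above as code):

// Hypothetical: classify heap size against the proposed min/max band.
enum GcPressure { Normal, Aggressive, OutOfMemory }

static class GcBanding
{
    // min and max default to "no limit", matching the proposal above.
    public static GcPressure Classify(long heapBytes, long min = long.MaxValue, long max = long.MaxValue)
        => heapBytes < min ? GcPressure.Normal        // normal behavior
         : heapBytes < max ? GcPressure.Aggressive    // faster recycling/release
         : GcPressure.OutOfMemory;                    // OOM error past this
}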

@Maoni0
Member

Maoni0 commented May 18, 2017

Sorry I hadn't responded - but I have been pondering this.

@ryanerdmann what you described sounds like you know how much memory your process consumes normally, but it may go over unexpectedly. Can you describe what you do when it goes over? If your process's mem usage is more than what it normally uses by quite a bit, on a densely packed server surely it will affect other processes.

There's another scenario: multiple processes living on the same machine, where one might be a dominant process that you want to consume the majority of memory on the machine. Our current tuning would make sense for that one (being more aggressive at 90+%, meaning we might start doing full blocking GCs more often). But there might be a few coexisting small processes, and for those, 90% might be too early to react (if 10% of available memory means 10GB, for processes that only have a 1GB heap it wouldn't really make sense to keep doing blocking GCs when there's still 10GB of memory available). So for them, perhaps a config that specifies how much free memory (in absolute terms) there should be before the GC starts being very aggressive makes sense.

@Maoni0
Member

Maoni0 commented May 18, 2017

#6919

@benaadams
Member Author

benaadams commented May 18, 2017

The scenario I'm thinking of for overflow is demand bursting, with a regular "known" (+ headroom) lower level of demand (for example, on a web site).

A website under moderate load is X MB, so set the target to 3 × X MB, but allow it to burst to X GB for a front-page-of-HN level of demand, for example; afterwards it returns to 3 × X MB.

Maybe a sliding value that decays back to the min/baseline, so it's not necessarily more constant and frequent GCs (as that would increase CPU work, which might not be desirable during the burst periods), but it will try to free memory back to the OS when the burst has lessened.

Then when approaching max (90%+) it moves to much more aggressive behaviour, as now.
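As a sketch of the decay (invented names; each tick the allowed footprint eases back toward the baseline rather than snapping down):

// Hypothetical sliding target: after a burst, decay the allowed
// footprint back toward the baseline (e.g. 3 x X MB), so the GC isn't
// forced into constant collections while the burst is still tailing off.
static long NextTarget(long currentTarget, long baselineBytes, double decay = 0.9)
    => baselineBytes + (long)((currentTarget - baselineBytes) * decay);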

@JustArchi
Contributor

JustArchi commented Jun 16, 2017

This is exactly what I was looking for in dotnet/coreclr#12286 - I don't want to hijack this issue with my own case, so I'm only going to add a few points I think are important.

I think it should basically work like soft-heap-limit from Mono: we declare the expected physical memory usage for the process, up to which the server GC works the same as now; if memory reaches the target, it tries to stop expanding (nearly at all costs) by cleaning more aggressively. If even after cleanup there is not enough room, the runtime is free to ask for more memory to proceed with execution, without throwing any OOM (unless it runs out of memory entirely, as happens right now). When it's done, it immediately tries to release all unused chunks above our limit back to the OS, while still keeping everything below the target for future use.

The expected usage is the same as above: we expect our application to use more or less X (percentage/megabytes), but we do want to allow it to go above the limit ("burst" mode) for a short period of time, if this is absolutely needed for operation. This is entirely different from putting a hard limit on the heap/process, which leads to OOM when the app goes above it.
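For reference, Mono's sgen exposes this knob via MONO_GC_PARAMS (the parameter is documented; the exact size syntax below is from memory and worth checking against Mono's docs):

export MONO_GC_PARAMS=soft-heap-limit=512m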

Thank you for considering this feature.

@benaadams
Member Author

Could this make use of the changes for containers (for a max-memory config)? dotnet/coreclr#10064

@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@msftgits msftgits added this to the Future milestone Jan 31, 2020
@benaadams
Member Author

I think all this goodness is in .NET Core 3.1+, so closing.

@JustArchi
Contributor

@benaadams What exactly can be used today in .NET Core 3.1+ for increasing x64 GC aggressiveness, if you don't mind answering?

The only settings I'm aware of are COMPlus_GCLatencyLevel and COMPlus_gcTrimCommitOnLowMemory.

@benaadams
Member Author

There's much more now; it's documented here: https://docs.microsoft.com/en-us/dotnet/core/run-time-config/garbage-collector
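For example, a runtimeconfig.template.json sketch using a few of the settings documented at that link (values are illustrative, not recommendations):

{
  "configProperties": {
    "System.GC.Server": true,
    "System.GC.HeapHardLimitPercent": 75,
    "System.GC.HighMemoryPercent": 85,
    "System.GC.RetainVM": false
  }
}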

@JustArchi
Contributor

Thank you a lot, I somehow missed that link. Cheers! 👍

@danstur

danstur commented Oct 6, 2020

@benaadams From reading the documentation I don't see anything similar to Mono's soft-heap-limit, only hard limits.

Is there any way with 3.1 to allow the GC to use as much memory as it needs during exceptional circumstances, but afterwards when idling greatly reduce resource usage again?

@JustArchi
Contributor

JustArchi commented Oct 6, 2020

@danstur export COMPlus_GCLatencyLevel=0 is the best you can do in .NET Core; it made a huge difference for me. You don't really need a soft heap limit - if you want to artificially limit generation sizes, just specify a hard limit instead; same deal.

This is my setup:

export COMPlus_GCHeapHardLimitPercent=75
export COMPlus_GCLatencyLevel=0
export COMPlus_gcTrimCommitOnLowMemory=1

@benaadams
Member Author

You can set up a lot of the GC settings via csproj: https://docs.microsoft.com/en-us/dotnet/core/run-time-config/garbage-collector
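For example (documented MSBuild properties; values are illustrative):

<PropertyGroup>
  <ServerGarbageCollection>true</ServerGarbageCollection>
  <ConcurrentGarbageCollection>true</ConcurrentGarbageCollection>
  <RetainVMGarbageCollection>false</RetainVMGarbageCollection>
</PropertyGroup>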

Would "High memory percent" be what you are looking for as soft-heap-limit?

High memory percent

Memory load is indicated by the percentage of physical memory in use. By default, when the physical memory load reaches 90%, garbage collection becomes more aggressive about doing full, compacting garbage collections to avoid paging. When memory load is below 90%, GC favors background collections for full garbage collections, which have shorter pauses but don't reduce the total heap size by much. On machines with a significant amount of memory (80GB or more), the default load threshold is between 90% and 97%.

The high memory load threshold can be adjusted by the COMPlus_GCHighMemPercent environment variable or System.GC.HighMemoryPercent JSON configuration setting. Consider adjusting the threshold if you want to control heap size. For example, for the dominant process on a machine with 64GB of memory, it's reasonable for GC to start reacting when there's 10% of memory available. But for smaller processes, for example, a process that only consumes 1GB of memory, GC can comfortably run with less than 10% of memory available. For these smaller processes, consider setting the threshold higher. On the other hand, if you want larger processes to have smaller heap sizes (even when there's plenty of physical memory available), lowering this threshold is an effective way for GC to react sooner to compact the heap down.

@Maoni0
Member

Maoni0 commented Oct 7, 2020

Is there any way with 3.1 to allow the GC to use as much memory as it needs during exceptional circumstances, but afterwards when idling greatly reduce resource usage again?

The problem is how you decide what "idling" is. You know when your app is idling, but this is not something that's expressed to the GC. The GC can try to figure it out, but that's purely based on guesses - we could use the amount of allocation, or (more expensively) check the CPU time, or the time since we last did a full blocking collection. None of these are ideal, because the situation could suddenly change drastically. Now, if you do know the process will be idling, this is one of the few cases that warrants inducing a GC yourself.
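If you do know an idle phase is starting, the standard induced-GC pattern looks like this (real APIs; LOH compaction is opt-in per collection):

using System;
using System.Runtime;

// One induced, full, blocking, compacting collection when entering an
// idle phase; also compact the large object heap this one time.
GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced, blocking: true, compacting: true);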

@danstur

danstur commented Oct 7, 2020

@Maoni0 Fair enough, it's definitely a hard problem to solve in general.

Is there a way to get the GC to release memory back to the OS? Doing a full GC with a commit of 1.5GB when only 150MB is in use doesn't release memory. I have to run GC.Collect() multiple times in a row to give memory back.

@benaadams That seems reasonable, but even with that setting I can't seem to get the process to hand back memory.

@Maoni0
Member

Maoni0 commented Oct 8, 2020

@danstur The GC does release memory back to the OS naturally; this is described in this section of mem-doc. I'm not sure how you counted the "150MB in use" - if you are counting it at the end of the GC, this section would explain it.

@ghost ghost locked as resolved and limited conversation to collaborators Dec 23, 2020