Add hybrid FVM compositional multiphase solver #901

Merged

Conversation

@francoishamon (Contributor) commented Apr 22, 2020

This PR implements a mixed hybrid solver for compositional multiphase flow with buoyancy. I organized the compositional multiphase flow solver as follows (a rough sketch of the resulting class hierarchy is shown after the list):

  • A base class CompositionalMultiphaseBase that contains the property update functions and the computation of the accumulation term.
  • A derived class CompositionalMultiphaseFVM that implements the cell-centered TPFA/MPFA flux assembly and boundary conditions.
  • A derived class CompositionalMultiphaseHybridFVM that implements the mixed hybrid flux assembly.
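
A rough sketch of this class layout, with placeholder method names (the actual GEOSX interfaces differ), could look like this:

    // Illustrative sketch only: MeshLevel and the method names are stand-ins,
    // not the real GEOSX declarations; the point is the split between shared
    // and discretization-specific functionality.
    #include <iostream>

    struct MeshLevel {}; // stand-in for the GEOSX mesh abstraction

    class CompositionalMultiphaseBase
    {
    public:
      virtual ~CompositionalMultiphaseBase() = default;

      // shared by both discretizations: property updates and accumulation term
      void updateFluidState( MeshLevel & ) { std::cout << "update densities, mobilities, ...\n"; }
      void assembleAccumulationTerms( MeshLevel & ) { std::cout << "cell-local accumulation\n"; }

      // flux assembly and boundary conditions are discretization-specific
      virtual void assembleFluxTerms( MeshLevel & ) = 0;
    };

    class CompositionalMultiphaseFVM : public CompositionalMultiphaseBase
    {
    public:
      void assembleFluxTerms( MeshLevel & ) override
      { std::cout << "cell-centered TPFA/MPFA flux assembly\n"; }
    };

    class CompositionalMultiphaseHybridFVM : public CompositionalMultiphaseBase
    {
    public:
      void assembleFluxTerms( MeshLevel & ) override
      { std::cout << "mixed hybrid flux assembly with face pressure unknowns\n"; }
    };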

Note that the large number of lines changed is due to an integrated test using an imported mesh file (i.e., not actual code changes).

To-do list:

  • Mixed-hybrid compositional multiphase solver working
  • Integrated tests
  • Unit tests pass on Lassen
  • Implement MGR recipe
  • Add buoyancy

https://github.com/GEOSX/integratedTests/pull/119

@francoishamon francoishamon force-pushed the feature/hamon/mimeticCompositionalMultiphaseSolver branch 3 times, most recently from 7ce4890 to 97f3858 Compare November 3, 2020 00:16
@francoishamon francoishamon force-pushed the feature/hamon/mimeticCompositionalMultiphaseSolver branch from 97f3858 to d09c568 Compare November 4, 2020 05:41
@francoishamon francoishamon marked this pull request as ready for review December 22, 2020 23:19

@klevzoff (Contributor) left a comment

Great work!
I hope the compile times are ok ;)

Comment on lines 157 to 163
    FaceManager * const faceManager = meshLevel.getFaceManager();
    {
      faceManager->registerWrapper< array1d< real64 > >( viewKeyStruct::facePressureString )->
        setPlotLevel( PlotLevel::LEVEL_0 )->
        setRegisteringObjects( this->getName() )->
        setDescription( "An array that holds the pressures at the faces." );
    }

Contributor

Seeing as this is done in the base class, do we want to always register face pressure, even for classical FVM? It could be useful for face-based BC, but it's not clear if anyone would want to implement them in the compositional solver.

Contributor Author

I think it would be useful to have face-based boundary conditions for the classical FVM compositional solver as well. For instance, it would be convenient to have these face-based BCs for a rigorous refinement study or a scalability test. That being said, I don't really know how much work it would take to implement them for the compositional case. I can have a look into that.

Collaborator

I think that to have face b.c. for the compositional solver and use them for refinement studies we would have to specify all other fields as well (i.e., faceCompositions and eventually faceTemperature). I do agree that it could be useful though.
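
For illustration only, specifying those extra face fields could mirror the facePressure registration quoted above; the faceComposition and faceTemperature view keys below are hypothetical, not existing GEOSX keys:

    // Hypothetical sketch extending the registration pattern quoted above;
    // the view keys, array shapes, and descriptions are illustrative only.
    faceManager->registerWrapper< array2d< real64 > >( viewKeyStruct::faceCompositionString )->
      setPlotLevel( PlotLevel::LEVEL_0 )->
      setRegisteringObjects( this->getName() )->
      setDescription( "An array that holds the component fractions at the faces." );

    faceManager->registerWrapper< array1d< real64 > >( viewKeyStruct::faceTemperatureString )->
      setPlotLevel( PlotLevel::LEVEL_0 )->
      setRegisteringObjects( this->getName() )->
      setDescription( "An array that holds the temperatures at the faces." );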

@@ -651,19 +601,21 @@ void CompositionalMultiphaseFlow::BackupFields( MeshLevel & mesh ) const
    {
      phaseDensOld[ei][ip] = phaseDens[ei][0][ip];
      phaseVolFracOld[ei][ip] = phaseVolFrac[ei][ip];
      phaseMobOld[ei][ip] = phaseMob[ei][ip];

Contributor

Where is previous time step mobility used, if fluxes are always evaluated implicitly?
(I'm reading top to bottom, so hopefully I will find the answer below and won't forget to delete this comment.)

Contributor

I did find the answer btw; it was in the residual normalizer.

@francoishamon (Contributor Author)

I hope the compile times are ok ;)

@klevzoff here are some compilation timings to evaluate the impact of this PR:

I am doing:

    make clean
    time make -j

and reporting the real time below:

Quartz with gcc@8.1.0-release:
develop: 4m7.167s
feature/hamon/mimeticCompositionalMultiphaseSolver: 4m53.977s

Lassen with clang@upstream-release:
develop: 11m53.218s
feature/hamon/mimeticCompositionalMultiphaseSolver: 15m40.051s

I do see on Travis that the Ubuntu18.04_clang8.0.0_cuda10.1.243_debug build is hard to pass because it takes quite a bit of time.

One temporary thing that may increase the compilation time of the hybrid solver(s) is the switchyard over the number of faces before the kernel launch (but I am not sure):
https://github.com/GEOSX/GEOSX/blob/b93db15549e9b6af094e731b31995d92d2561b1f/src/coreComponents/physicsSolvers/fluidFlow/CompositionalMultiphaseHybridFVMKernels.hpp#L640-L641
which gives 3 possible values for the NF template (4, 5, 6) that have to be combined with all the possible (NP, NC) pairs. But ultimately, for the general polyhedral case, NF will become the maximum number of faces (at least that was my plan, but I am not sure it is a good one), and I will remove the KernelLaunchSelectorFaceSwitch. Do you think that could decrease the compilation time? Or are there other things that can be done to reduce the compilation time?
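
For reference, a minimal sketch of this kind of face-count switchyard, assuming the real KernelLaunchSelectorFaceSwitch follows a similar pattern (the GEOSX implementation differs in its details):

    #include <type_traits>
    #include <iostream>

    // Dispatch a runtime face count to a compile-time NF template parameter.
    template< typename LAMBDA >
    void kernelLaunchSelectorFaceSwitch( int const numFacesInElem, LAMBDA && lambda )
    {
      switch( numFacesInElem )
      {
        case 4: lambda( std::integral_constant< int, 4 >() ); break; // tetrahedra
        case 5: lambda( std::integral_constant< int, 5 >() ); break; // pyramids / wedges
        case 6: lambda( std::integral_constant< int, 6 >() ); break; // hexahedra
        default: std::cerr << "unsupported number of faces\n";
      }
    }

    int main()
    {
      // Each NF value selected here is a separate compile-time instantiation of
      // the flux kernel, further multiplied by every supported (NC, NP) pair.
      kernelLaunchSelectorFaceSwitch( 6, []( auto NF )
      {
        std::cout << "launch flux kernel with NF = " << decltype( NF )::value << "\n";
      } );
      return 0;
    }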

@klevzoff (Contributor) commented Jan 28, 2021

which gives 3 possible values for the NF template (4, 5, 6) that have to be combined with all the possible (NP, NC) pairs

Yes, this is exactly why I asked about compile times. Thanks for checking!

But ultimately, for the general polyhedral case, NF will become the maximum number of faces (at least that was my plan, but I am not sure it is a good one), and I will remove the KernelLaunchSelectorFaceSwitch.

I'm not yet sure how well that's going to work. As we know, in a general polyhedral grid the number of faces can get quite large, and the kernel stack space requirements grow quadratically (NF x NF local arrays), so even with a limit of 12 or 16 faces it has the potential to severely limit the occupancy (number of thread blocks executing simultaneously) or even make the kernel impossible to launch on device (I've faced a similar issue with some of the ProppantTransport kernels).

I would be in favor of keeping the switchyard so we can keep optimal performance for fixed element type grids (hex/tets/wedges). We could have the switchyard dispatch only the general polyhedral subregions to the "runtime" version of the kernel (one that takes maxNF as a template parameter for stack array sizing and actualNF as a runtime parameter). That said, I don't know whether it's possible to somehow share code between the compile-time-NF and runtime-NF versions; I've tried that before with NC/NP and haven't figured it out. Having to copy-paste code for such complex kernels is less than ideal.
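
A sketch, under stated assumptions, of the "runtime NF" kernel variant described here: stack arrays are sized by a compile-time bound while the loops use the actual face count (names and signatures are illustrative, not the GEOSX kernel interface):

    #include <cstdio>

    // MAX_NF fixes the stack array size at compile time; actualNF is the number
    // of faces of the element being processed, known only at runtime.
    template< int MAX_NF >
    void assembleLocalSystem( int const actualNF, double const * const transMatrix )
    {
      // worst-case stack storage; only the leading actualNF x actualNF block is used,
      // which is what can hurt GPU occupancy when MAX_NF grows
      double localJacobian[ MAX_NF ][ MAX_NF ] = {};

      for( int i = 0; i < actualNF; ++i )
      {
        for( int j = 0; j < actualNF; ++j )
        {
          localJacobian[ i ][ j ] = transMatrix[ i * actualNF + j ];
        }
      }

      std::printf( "assembled %d x %d local system in %d x %d storage (J[0][0] = %f)\n",
                   actualNF, actualNF, MAX_NF, MAX_NF, localJacobian[ 0 ][ 0 ] );
    }

    int main()
    {
      double trans[ 25 ] = {};               // transmissibility matrix of a 5-face cell
      assembleLocalSystem< 16 >( 5, trans ); // compile-time bound 16, runtime face count 5
      return 0;
    }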

Or are there other things can be done to reduce the compilation time?

One simple thing to try is forcing explicit template instantiations in .cpp files (similar to how you already do it for PhaseMobilityKernel, but adding NF), breaking them into multiple translation units (e.g. one per NF value) and relying on parallel build doing its thing. Of course, there's a limit to how much this can help - there's also the overhead of re-parsing the templated code and all included headers in each translation unit.

@francoishamon (Contributor Author) commented Jan 29, 2021

One simple thing to try is forcing explicit template instantiations in .cpp files (similar to how you already do it for PhaseMobilityKernel, but adding NF), breaking them into multiple translation units (e.g. one per NF value) and relying on parallel build doing its thing. Of course, there's a limit to how much this can help - there's also the overhead of re-parsing the templated code and all included headers in each translation unit.

@klevzoff I just pushed a new version of the code with explicit template instantiation to see if I can now pass the Ubuntu18.04_clang8.0.0_cuda10.1.243_debug test (it was failing before because the compilation was too slow). On Quartz and Lassen I observed a reduction in compilation times on my branch (I re-ran the timings for develop as well because there is apparently a slight variation from one day to the next on Quartz):

Quartz with gcc@8.1.0-release:
develop: 3m47.451s
feature/hamon/mimeticCompositionalMultiphaseSolver: 3m50.207s

Lassen with clang@upstream-release:
develop: 11m52.883s
feature/hamon/mimeticCompositionalMultiphaseSolver: 12m29.325s

To make this work, I moved all the kernel functions to CompositionalMultiphaseHybridFVMKernels.cpp/hpp (which is now pretty big ;) ) because I had ptxas compilation/linking errors on Lassen otherwise.

One thing that I did not manage to do is to make the explicit template instantiation work with the "fourth" template on the type of inner product in FluxKernel::launch. The compilation was fine, but I was getting linking errors that I could not resolve. I don't understand why, because the explicit template instantiation is working fine with the STENCIL_TYPE in the FVM solver, and I was doing (almost) the same thing... So for now I have commented out the dispatch before the FluxKernel::launch in CompositionalMultiphaseHybridFVM.cpp, and if I pass all the Travis tests like this, I will give the inner product template another try.

@francoishamon (Contributor Author)

One simple thing to try is forcing explicit template instantiations in .cpp files (similar to how you already do it for PhaseMobilityKernel, but adding NF), breaking them into multiple translation units (e.g. one per NF value) and relying on parallel build doing its thing.

@klevzoff I managed to reintroduce the four templates that I initially had in FluxKernel::launch when you reviewed this PR (inner product type, number of faces, number of components, number of phases). To make sure that compilation times do not explode on Lassen:

  1. I use explicit template instantiation for all the flux assembly kernel functions that I introduced in this PR.
  2. I limit the number of possible inner product types that can be used in CompositionalMultiphaseHybridFVM.

With these two changes, the compilation times on Lassen are now fine (comparable with develop) and I can pass all the Travis checks. Thanks for your help!

@klevzoff (Contributor) commented Feb 4, 2021

2. I limit the number of possible inner product types that can be used in CompositionalMultiphaseHybridFVM.

Does it significantly hurt our capability to experiment with different variations of hybrid FVM? Seems like you've put a good amount of work into it and it would be a shame to have that code sit unused.

This is probably something to do in a separate PR (this one should go in soon!), but we can still try the multiple-translation-units idea, unless you already have. Basically, it would go this way (all names are arbitrary):

  • The definitions of FluxKernel and the INST_FluxKernel macro go into another header, something like CompositionalMultiphaseHybridFVMKernels-IMPL.hpp, which is not meant to be included by the solver
  • Then there are multiple cpp files, say one per inner product type, e.g. CompositionalMultiphaseHybridFVMKernels-INST-TPFA.cpp, CompositionalMultiphaseHybridFVMKernels-INST-QuasiTPFA.cpp, etc., each including the -IMPL.hpp file above and then listing the INST_FluxKernel macro calls for that specific type (a rough sketch follows this list). This is what gives us the compilation speedup, as each of these cpp files can be processed by make -j in parallel.
  • All of the above perhaps goes in its own subdirectory under physicsSolvers/fluidFlow to avoid clutter
  • Optional, but we could also have a config-time option, GEOSX_HYBRIDFVM_EXTENDED_IP, that controls how many of those instantiations get included in the build and in the mimeticDispatch. I'm not sure this is very useful, as it would have to always be ON for most developers who run tests, but it would be a way to control compilation times somewhat.
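
For concreteness, a single-file sketch of how the -IMPL.hpp header and the per-inner-product translation units could fit together, assuming an INST_FluxKernel macro along these lines (all type names and signatures are placeholders):

    // --- CompositionalMultiphaseHybridFVMKernels-IMPL.hpp (not included by the solver) ---
    struct TPFAInnerProduct {};       // placeholder inner product tag types
    struct QuasiTPFAInnerProduct {};

    template< typename IP_TYPE, int NF, int NC, int NP >
    void FluxKernel_launch( double * const localFlux )
    {
      // the full templated kernel body lives only in the -IMPL header
      for( int i = 0; i < NF * NC; ++i )
      {
        localFlux[ i ] = 0.0;
      }
    }

    #define INST_FluxKernel( IP, NF, NC, NP ) \
      template void FluxKernel_launch< IP, NF, NC, NP >( double * const )

    // --- CompositionalMultiphaseHybridFVMKernels-INST-TPFA.cpp -------------------------
    // #include "CompositionalMultiphaseHybridFVMKernels-IMPL.hpp"
    INST_FluxKernel( TPFAInnerProduct, 4, 2, 2 );
    INST_FluxKernel( TPFAInnerProduct, 5, 2, 2 );
    INST_FluxKernel( TPFAInnerProduct, 6, 2, 2 );

    // --- CompositionalMultiphaseHybridFVMKernels-INST-QuasiTPFA.cpp --------------------
    // #include "CompositionalMultiphaseHybridFVMKernels-IMPL.hpp"
    // Each translation unit instantiates only its own inner product type,
    // so make -j can compile them in parallel.
    INST_FluxKernel( QuasiTPFAInnerProduct, 4, 2, 2 );
    INST_FluxKernel( QuasiTPFAInnerProduct, 5, 2, 2 );
    INST_FluxKernel( QuasiTPFAInnerProduct, 6, 2, 2 );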

Again, this is just an idea to try in the future. This PR is good as it is!

@francoishamon added the labels "ci: run CUDA builds" and "flag: requires rebaseline" on Feb 23, 2021

@francoishamon (Contributor Author)

  2. I limit the number of possible inner product types that can be used in CompositionalMultiphaseHybridFVM.

Does it significantly hurt our capability to experiment with different variations of hybrid FVM?

@klevzoff Thanks for the plan outlined above; I am going to start working on it as soon as this PR is merged (hopefully today). I think limiting the number of inner product types as a temporary workaround is fine for now, as long as we have the TPFA inner product and a consistent inner product, which is the case in the current version of this PR.

@francoishamon francoishamon merged commit a243521 into develop Feb 23, 2021
@francoishamon francoishamon deleted the feature/hamon/mimeticCompositionalMultiphaseSolver branch February 23, 2021 20:13