[vecgeom] Disable aggressive compiler optimizations #7992

smuzaffar · 2022-07-11T11:32:56Z

@hahnjo

cmsbuild · 2022-07-11T11:33:23Z

A new Pull Request was created by @smuzaffar (Malik Shahzad Muzaffar) for branch IB/CMSSW_12_5_X/master.

@cmsbuild, @smuzaffar, @aandvalenzuela, @iarspider can you please review it and eventually sign? Thanks.
@perrotta, @dpiparo, @qliphy, @rappoccio you are the release manager for this.
cms-bot commands are listed here

smuzaffar · 2022-07-11T11:33:52Z

please test

smuzaffar · 2022-07-11T11:34:29Z

assign simulation

cmsbuild · 2022-07-11T11:34:41Z

New categories assigned: simulation

@mdhildreth,@civanch you have been requested to review this Pull request/Issue and eventually sign? Thanks

cmsbuild · 2022-07-11T22:10:52Z

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-a29b8e/26132/summary.html
COMMIT: c088ce8
CMSSW: CMSSW_12_5_X_2022-07-10-2300/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/7992/26132/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-a29b8e/26132/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-a29b8e/26132/git-merge-result

Comparison Summary

Summary:

No significant changes to the logs found
Reco comparison results: 55557 differences found in the comparisons
DQMHistoTests: Total files compared: 50
DQMHistoTests: Total histograms compared: 3655970
DQMHistoTests: Total failures: 305157
DQMHistoTests: Total nulls: 150
DQMHistoTests: Total successes: 3350641
DQMHistoTests: Total skipped: 22
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: -0.020999999999999963 KiB( 49 files compared)
DQMHistoSizes: changed ( 10224.0 ): -0.352 KiB SiStrip/MechanicalView
DQMHistoSizes: changed ( 11834.0 ): 0.464 KiB SiStrip/MechanicalView
DQMHistoSizes: changed ( 250202.181 ): -0.117 KiB SiStrip/MechanicalView
DQMHistoSizes: changed ( 25202.0 ): -0.012 KiB SiStrip/MechanicalView
DQMHistoSizes: changed ( 312.0 ): -0.004 KiB MessageLogger/Warnings
Checked 208 log files, 45 edm output root files, 50 DQM output files
TriggerResults: found differences in 14 / 49 workflows

smuzaffar · 2022-07-12T06:21:04Z

please test

cmsbuild · 2022-07-12T17:02:40Z

-1

Failed Tests: UnitTests
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-a29b8e/26157/summary.html
COMMIT: c088ce8
CMSSW: CMSSW_12_5_X_2022-07-11-2300/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmsdist/7992/26157/install.sh to create a dev area with all the needed externals and cmssw changes.

Unit Tests

I found errors in the following unit tests:

---> test runtestRecoEgammaElectronIdentification had ERRORS

Comparison Summary

Summary:

No significant changes to the logs found
Reco comparison results: 55553 differences found in the comparisons
DQMHistoTests: Total files compared: 50
DQMHistoTests: Total histograms compared: 3655970
DQMHistoTests: Total failures: 305155
DQMHistoTests: Total nulls: 149
DQMHistoTests: Total successes: 3350644
DQMHistoTests: Total skipped: 22
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: -0.016999999999999963 KiB( 49 files compared)
DQMHistoSizes: changed ( 10224.0 ): -0.352 KiB SiStrip/MechanicalView
DQMHistoSizes: changed ( 11834.0 ): 0.464 KiB SiStrip/MechanicalView
DQMHistoSizes: changed ( 250202.181 ): -0.117 KiB SiStrip/MechanicalView
DQMHistoSizes: changed ( 25202.0 ): -0.012 KiB SiStrip/MechanicalView
Checked 208 log files, 45 edm output root files, 50 DQM output files
TriggerResults: found differences in 14 / 49 workflows

smuzaffar · 2022-07-12T17:12:52Z

assign simulation
@civanch , building vecgeom with -O2 caused many comparison differences. Are these expected?

civanch · 2022-07-12T20:28:08Z

@smuzaffar , differences are uniformly distributed over WFs. All what I see is 0 entries to some histogram versus 1 entry. Differences are in tracker, in HLT, and in muons (am not sure). Much less in EGamma. My interpretation is that it is MC history difference - we have different MC events. The difference may be in a very local code but not easy to find where.

In such cases, we use to ask for PdmV validation - it is similar to compiler change. From common sense point of view -O2 should provide correct results but I afraid, we cannot change the master without PdmV limited test. In other IBs we may switch without a problem.

What I do not understand @hahnjo : are you sure that we should compile VecGeom with -O2 and keep Geant4 and DD4hep with -O3?

civanch · 2022-07-13T05:45:41Z

My previous comment was not precise. The number of differences in the case of sim history change should be much more. What we see likely is not history change - events are nearly the same but in some cases selection of data for histograms is different.

hahnjo · 2022-07-13T07:25:03Z

@smuzaffar @civanch I think the fact that there are differences shows how ubiquitous and critical this problem is. In theory, I would expect identical results when switching between -O2 and -O3, and I get this in practice on my system when compiling Geant4 and VecGeom with gcc8. However, because gcc10 at -O3 miscompiles VecGeom we get different results when switching to -O2 because it's actually doing different work.

are you sure that we should compile VecGeom with -O2 and keep Geant4 and DD4hep with -O3?

This should be fine, yes.

fwyzard · 2022-07-13T07:56:03Z

Out of curiosity, did you check which of the optimisation flags enabled by -O3 causes problems ?

According to the GCC 10 manual,

-O3 turns on all optimizations specified by -O2 and also turns on the following optimization flags:

-fgcse-after-reload 
-fipa-cp-clone
-floop-interchange 
-floop-unroll-and-jam 
-fpeel-loops 
-fpredictive-commoning 
-fsplit-loops 
-fsplit-paths 
-ftree-loop-distribution 
-ftree-loop-vectorize 
-ftree-partial-pre 
-ftree-slp-vectorize 
-funswitch-loops 
-fvect-cost-model 
-fvect-cost-model=dynamic 
-fversion-loops-for-strides

Also, does GCC 10.4 fixes the problem ?

hahnjo · 2022-07-13T08:00:53Z

Just to prove the point, I've now built Geant4 and VecGeom 1.1.18 (to avoid the crash with the change introduced with 1.1.19) and now I see slight differences between -O2 and -O3. So this does actually point to the miscompilation being solved.

@fwyzard

Out of curiosity, did you check which of the optimisation flags enabled by -O3 causes problems ?

No, I didn't so far...

Also, does GCC 10.4 fixes the problem ?

No, both vanilla GCC 10.2 and 10.4 show the problem in VecGeom, as does 10.3 used in CMSSW. GCC versions 9 and 11 are fine though...

hahnjo · 2022-07-13T08:45:41Z

@fwyzard

Out of curiosity, did you check which of the optimisation flags enabled by -O3 causes problems ?

According to the GCC 10 manual,
-O3 turns on all optimizations specified by -O2 and also turns on the following optimization flags:
-fgcse-after-reload 
-fipa-cp-clone
-floop-interchange 
-floop-unroll-and-jam 
-fpeel-loops 
-fpredictive-commoning 
-fsplit-loops 
-fsplit-paths 
-ftree-loop-distribution 
-ftree-loop-vectorize 
-ftree-partial-pre 
-ftree-slp-vectorize 
-funswitch-loops 
-fvect-cost-model 
-fvect-cost-model=dynamic 
-fversion-loops-for-strides

FWIW it's got something to do with -fsplit-loops, switching to -O3 -fno-split-loops is enough to make my very simplified example give identical results to -O2. But it's not working in isolation, ie -O2 -fsplit-loops does not (yet) expose the problem. It's likely some weird interaction between optimization passes... I still think that switching to plain -O2 is the best action right now.

perrotta · 2022-07-19T17:03:52Z

+1

Merge on top of the supposedly already produced CMSSW_12_5_X_2022-07-19-"afternoon" IB, so that it can be compared with the 2300 IB for a quick validation, as suggested by @civanch

perrotta · 2022-07-19T17:12:37Z

@aandvalenzuela

aandvalenzuela · 2022-07-19T17:23:12Z

@perrotta Can you see the build in the IB page? Is this what you needed @civanch?

perrotta · 2022-07-19T18:02:19Z

@aandvalenzuela now I see it, thank you: https://cmssdt.cern.ch/SDT/html/cmssdt-ib/#/ib/CMSSW_12_5_X

For some reason, there are two PRs that wern't picked in that IB, even if they were both merged before the ORP meeting:

However

#38775 includes some protection against corrupted CSC TPs when CCLUT data
#38777 removes some unneeded fwk classes

Therefore, I think that next CMSSW_12_5_X_2022-07-19-2300 plus this PR and the two above can be safely compared with CMSSW_12_5_G4VECGEOM_X_2022-07-19-1800 for what simulation is concerned

Let merge this one, then!

perrotta · 2022-07-19T18:02:32Z

merge

perrotta · 2022-07-19T18:09:19Z

Ah, wait: but that IB was in the VECGEOM stream....
Not clear how CMSSW_12_5_G4VECGEOM_X_2022-07-19-1800 compares with CMSSW_12_5_X_2022-07-19-1100.

Uhm... now this is merged. Maybe we will revert this one after 2300 IB, and tomorrow morning 1100 IB will be the one with "-O3" to compare with. I cannot follow any more by this evening, sorry

civanch · 2022-07-19T19:45:33Z

@perrotta , @bsunanda, CMSSW_12_5_G4VECGEOM_X_2022-07-19-1800 cannot be used for this purpose, because it is build on top of new Geant4, DD4hep, and VecGeom. I think, we should compare CMSSW_12_5_X_2022-07-19-2300 versus CMSSW_12_5_X_2022-07-19-1100

perrotta · 2022-07-20T04:46:02Z

@perrotta , @bsunanda, CMSSW_12_5_G4VECGEOM_X_2022-07-19-1800 cannot be used for this purpose, because it is build on top of new Geant4, DD4hep, and VecGeom. I think, we should compare CMSSW_12_5_X_2022-07-19-2300 versus CMSSW_12_5_X_2022-07-19-1100

Indeed... Sorry, there was some confusion yesterday evening, and then I could not follow any more.
I have created a revert of this PR in #7998 that can be merged for CMSSW_12_5_X_2022-07-20-1100. Therefore the validation can compare:

CMSSW_12_5_X_2022-07-19-2300 with CXX_FLAGS_RELEASE="-O2 -DNDEBUG"
CMSSW_12_5_X_2022-07-20-1100 without that flag set, i.e. "-O3"

As soon as the 1100 IB will become available, we will re-revert the PR, in order to have the "non-aggressive" compiler flag for the remaining of the 12_5_X cycle.

hahnjo · 2022-12-07T14:11:35Z

FWIW the problem resurfaced in #8211 with the next version 1.2.1 of VecGeom and gcc11. After some investigations, I opened https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108008, let's see what the GCC people think.

[vecgeom] Disable aggressive compiler optimizations

c088ce8

as suggested by @hahnjo #7987 (comment)

cmsbuild added externals-pending orp-pending pending-signatures tests-pending labels Jul 11, 2022

smuzaffar mentioned this pull request Jul 11, 2022

Disable aggressive compiler optimizations for VecGeom #7987

Merged

cmsbuild added tests-started generators-pending and removed tests-pending labels Jul 11, 2022

cms-sw deleted a comment from cmsbuild Jul 11, 2022

cmsbuild removed the generators-pending label Jul 11, 2022

cmsbuild added the simulation-pending label Jul 11, 2022

cmsbuild added tests-approved and removed tests-started labels Jul 11, 2022

cmsbuild added tests-started and removed tests-approved labels Jul 12, 2022

cmsbuild added tests-rejected and removed tests-started labels Jul 12, 2022

makortel mentioned this pull request Jul 19, 2022

History of VecGeom validation in gcc10 cms-sw/cmssw#38789

Closed

cmsbuild added orp-approved and removed orp-pending labels Jul 19, 2022

cmsbuild merged commit 848749c into IB/CMSSW_12_5_X/master Jul 19, 2022

civanch mentioned this pull request Jul 19, 2022

[12_4_X][vecgeom] Disable aggressive compiler optimizations for VecGeom #7997

Merged

perrotta mentioned this pull request Jul 20, 2022

Revert "[vecgeom] Disable aggressive compiler optimizations" #7998

Merged

smuzaffar deleted the smuzaffar-patch-5 branch August 22, 2022 14:55

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[vecgeom] Disable aggressive compiler optimizations #7992

[vecgeom] Disable aggressive compiler optimizations #7992

smuzaffar commented Jul 11, 2022

cmsbuild commented Jul 11, 2022

smuzaffar commented Jul 11, 2022

smuzaffar commented Jul 11, 2022

cmsbuild commented Jul 11, 2022

cmsbuild commented Jul 11, 2022

smuzaffar commented Jul 12, 2022

cmsbuild commented Jul 12, 2022

smuzaffar commented Jul 12, 2022

civanch commented Jul 12, 2022

civanch commented Jul 13, 2022

hahnjo commented Jul 13, 2022

fwyzard commented Jul 13, 2022

hahnjo commented Jul 13, 2022

hahnjo commented Jul 13, 2022

perrotta commented Jul 19, 2022

perrotta commented Jul 19, 2022

aandvalenzuela commented Jul 19, 2022 •

edited

Loading

perrotta commented Jul 19, 2022

perrotta commented Jul 19, 2022

perrotta commented Jul 19, 2022

civanch commented Jul 19, 2022

perrotta commented Jul 20, 2022

hahnjo commented Dec 7, 2022 •

edited

Loading

[vecgeom] Disable aggressive compiler optimizations #7992

[vecgeom] Disable aggressive compiler optimizations #7992

Conversation

smuzaffar commented Jul 11, 2022

cmsbuild commented Jul 11, 2022

smuzaffar commented Jul 11, 2022

smuzaffar commented Jul 11, 2022

cmsbuild commented Jul 11, 2022

cmsbuild commented Jul 11, 2022

Comparison Summary

smuzaffar commented Jul 12, 2022

cmsbuild commented Jul 12, 2022

Unit Tests

Comparison Summary

smuzaffar commented Jul 12, 2022

civanch commented Jul 12, 2022

civanch commented Jul 13, 2022

hahnjo commented Jul 13, 2022

fwyzard commented Jul 13, 2022

hahnjo commented Jul 13, 2022

hahnjo commented Jul 13, 2022

perrotta commented Jul 19, 2022

perrotta commented Jul 19, 2022

aandvalenzuela commented Jul 19, 2022 • edited Loading

perrotta commented Jul 19, 2022

perrotta commented Jul 19, 2022

perrotta commented Jul 19, 2022

civanch commented Jul 19, 2022

perrotta commented Jul 20, 2022

hahnjo commented Dec 7, 2022 • edited Loading

aandvalenzuela commented Jul 19, 2022 •

edited

Loading

hahnjo commented Dec 7, 2022 •

edited

Loading