Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LowPtElectrons: final energy regression and ID (back port of 32391) #33589

Merged

Conversation

bainbrid
Copy link
Contributor

@bainbrid bainbrid commented Apr 30, 2021

PR description:

This PR:

  • Provides support for UL re-MINIAOD of both "standard" and BParking data sets, using the modifiers bParking, run2_miniAOD_UL, and run2_miniAOD_devel.

  • Leaves the RECO sequence, and RECO/AOD content, unchanged w.r.t. 10_6_25.

  • Enables energy regression for certain modifier configurations (detailed below). The energy regression performance can be found on slides 6-7 here.

  • Enables new electron ID models for certain modifier configurations (detailed below). The performance of the ID models 2020Nov28 and 2021May17 can be found on slides 4-5 here.

  • Depends on PR LowPtElectrons: 2020Nov28 and 2021May17 models for 10_6_X cms-data/RecoEgamma-ElectronIdentification#19

A detailed summary of the modifier configurations provided in this PR can be found at this TWiki (see table). A brief summary is provided below.

  1. No changes to the RECO sequence are applied.

  2. No-change policy for UL re-MINIAOD of standard data sets, modifier (~bParking & run2_miniAOD_UL):

  • the functionality provided by this logic is preserved w.r.t. release CMSSW_10_6_20, which was used as part of the Winter 2020 UL re-MINIAOD campaign for standard data sets. (No energy regression is performed; the 2020Sept15 ID model is evaluated.)
  1. Changes to PAT for any future (re)MINIAOD campaign for standard data sets, modifier (~bParking & run2_miniAOD_devel):
  • Energy regression is performed as part of the re-MINIAOD sequence.

  • ID is evaluated as part of the re-MINIAOD sequence. A new ID weights file (2020Nov28) has been provided for use with a UL re-MINIAOD campaign. There is no appreciable extra CPU load (w.r.t. the original model), and the memory footprint will reduce (due to the smaller weights file). The 2020Nov28 model is available within PR LowPtElectrons: 2020Nov28 and 2021May17 models for 10_6_X cms-data/RecoEgamma-ElectronIdentification#19.

  • Low-pT PAT electrons are selected based on the following conditions pT > 1.0 GeV and ID score > -0.25, which provides 90% signal efficiency, a 10% mistag rate, resulting in an increase to the MINIAOD footprint of ~1%.

  1. Changes to PAT for any future (re)MINIAOD campaign for the BParking data set, modifier (bParking & run2_miniAOD_UL):
  • Energy regression is performed as part of the re-MINIAOD sequence.

  • ID is evaluated as part of the re-MINIAOD sequence. A new ID weights file (2021May17), has been provided for use with a UL re-MINIAOD campaign. There is no appreciable extra CPU load (w.r.t. the original model), and the memory footprint will reduce (due to the smaller weights file). The 2021May17 model is available within PR LowPtElectrons: 2020Nov28 and 2021May17 models for 10_6_X cms-data/RecoEgamma-ElectronIdentification#19.

  • Low-pT PAT electrons are selected based on the following conditions pT > 1.0 GeV. Computing performance studies can be found below. The raising of the minimum electron pT threshold from 'open' (effectively ~0.5 GeV) to 1.0 GeV reduces the MINIAOD event size by ~20%, based on the 136.898 workflow (data ParkingBPH 2018B).

PR validation:

  • Local checks performed for both the RECO and PAT steps, for all relevant modifier configurations
  • The following workflows were considered: 136.88811, 1325.518, 136.898
  • the 2020Nov28 and 2021May17 model performances in CMSSW were validated against our standalone test bench setup.

if this PR is a backport please specify the original PR and why you need to backport that PR:

This PR is a back port of #32391

Forward port of additional functionality

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @bainbrid for CMSSW_10_6_X.

It involves the following packages:

PhysicsTools/PatAlgos
RecoEgamma/Configuration
RecoEgamma/EgammaElectronProducers

@perrotta, @jpata, @cmsbuild, @slava77 can you please review it and eventually sign? Thanks.
@emilbols, @gouskos, @jainshilpi, @hatakeyamak, @rappoccio, @mbluj, @varuns23, @seemasharmafnal, @mmarionncern, @ahinzmann, @lgray, @jdolen, @ferencek, @Sam-Harper, @rovere, @jdamgov, @nhanvtran, @gkasieczka, @schoef, @mariadalfonso, @clelange, @swozniewski, @JyothsnaKomaragiri, @sobhatta, @lecriste, @afiqaize, @gpetruc, @andrzejnovak, @ram1123 this is something you requested to watch as well.
@silviodonato, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@bainbrid
Copy link
Contributor Author

@bainbrid
Copy link
Contributor Author

@slava77 do I need to squash the commits?

@slava77
Copy link
Contributor

slava77 commented Apr 30, 2021

@slava77 do I need to squash the commits?

It's welcome but not required.

@slava77
Copy link
Contributor

slava77 commented Apr 30, 2021

test parameters:

@slava77
Copy link
Contributor

slava77 commented Apr 30, 2021

@cmsbuild please test

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-5690ec/14752/summary.html
COMMIT: 4911c44
CMSSW: CMSSW_10_6_X_2021-04-30-1100/slc7_amd64_gcc700
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/33589/14752/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 102 differences found in the comparisons
  • DQMHistoTests: Total files compared: 35
  • DQMHistoTests: Total histograms compared: 3215544
  • DQMHistoTests: Total failures: 1
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3215209
  • DQMHistoTests: Total skipped: 334
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 34 files compared)
  • Checked 143 log files, 29 edm output root files, 35 DQM output files
  • TriggerResults: no differences found

@slava77
Copy link
Contributor

slava77 commented May 3, 2021

Reco comparison results: 102 differences found in the comparisons

all differences appear to be in 136.88811 (2018 data UL re-miniAOD). Changes in this workflow are not allowed/expected in this release. It looks like something is not protected by _devel or bParking

@cmsbuild
Copy link
Contributor

cmsbuild commented May 4, 2021

Pull request #33589 was updated. @perrotta, @jpata, @cmsbuild, @slava77 can you please check and sign again.

@bainbrid
Copy link
Contributor Author

bainbrid commented May 4, 2021

Reco comparison results: 102 differences found in the comparisons

all differences appear to be in 136.88811 (2018 data UL re-miniAOD). Changes in this workflow are not allowed/expected in this release. It looks like something is not protected by _devel or bParking

Sorry, there was a typo.

Fixed here 683bf86.

Tested locally with 136.88811 (and can reproduce exactly the ID distribution now...)

@slava77
Copy link
Contributor

slava77 commented May 4, 2021

@cmsbuild please test

@cmsbuild
Copy link
Contributor

cmsbuild commented May 4, 2021

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-5690ec/14843/summary.html
COMMIT: 683bf86
CMSSW: CMSSW_10_6_X_2021-05-04-1100/slc7_amd64_gcc700
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/33589/14843/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 35
  • DQMHistoTests: Total histograms compared: 3215556
  • DQMHistoTests: Total failures: 2
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3215220
  • DQMHistoTests: Total skipped: 334
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 34 files compared)
  • Checked 143 log files, 29 edm output root files, 35 DQM output files
  • TriggerResults: no differences found

Copy link
Contributor

@slava77 slava77 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are quite a few divergent points from what's in #32391 / master version. Backports should be maximally verbatim.

Comment on lines 80 to 81
# For run2_miniAOD_devel ...
# (1) apply energy regression
# (2) rekey seed BDT ValueMaps by reco::GsfElectron
# (3) rerun ID
from Configuration.Eras.Modifier_run2_miniAOD_devel_cff import run2_miniAOD_devel
from RecoEgamma.EgammaElectronProducers.lowPtGsfElectrons_cff import lowPtGsfElectrons
from RecoEgamma.EgammaElectronProducers.lowPtGsfElectronSeedValueMaps_cff import rekeyLowPtGsfElectronSeedValueMaps
from RecoEgamma.EgammaElectronProducers.lowPtGsfElectronID_cff import lowPtGsfElectronID
_makePatLowPtElectronsTask = makePatLowPtElectronsTask.copy()
_makePatLowPtElectronsTask.add(lowPtGsfElectrons)
_makePatLowPtElectronsTask.add(rekeyLowPtGsfElectronSeedValueMaps)
_makePatLowPtElectronsTask.add(lowPtGsfElectronID)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modified to be verbatim w.r.t. master, caveat changes to handle modifiers (and their combinations) according to the TWiki linked in the PR description.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I've reverted back to the order originally used in this PR, as fitst more naturally with the existing code in 10_6_X and also the most recent changes I've made to deal with issues found during testing. I can copy this style to master as part of an upcoming PR (to master) that includes missing items / "forward ports".

Comment on lines 17 to 19
# Modifier for run2_miniAOD_UL (order is important: overrides bParking)
from Configuration.ProcessModifiers.run2_miniAOD_UL_cff import run2_miniAOD_UL
run2_miniAOD_UL.toModify(selectedPatLowPtElectrons,cut = "pt>1. && electronID('ID')>1.5")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this part really needed: it seems more straightforward to keep the older version (the cut in L9/10) to stay unchanged.

Suggested change
# Modifier for run2_miniAOD_UL (order is important: overrides bParking)
from Configuration.ProcessModifiers.run2_miniAOD_UL_cff import run2_miniAOD_UL
run2_miniAOD_UL.toModify(selectedPatLowPtElectrons,cut = "pt>1. && electronID('ID')>1.5")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, reverted as suggested.

@@ -3,16 +3,25 @@
# module to select Electrons
# See https://twiki.cern.ch/twiki/bin/view/CMS/SWGuidePhysicsCutParser
# on how to use the cut-string
#

# By default, no low-pT electrons are stored in MINIAOD
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this comment actually useful?
is it supposed to really refer to what's stored, or more about the module running (or somehow to the fact that the cut is now changed to "" in this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment removed. (Was related to the cut that was changed to "", now also reverted.)

Comment on lines 125 to 128
(bParking | run2_miniAOD_UL | run2_miniAOD_devel).toModify(
MicroEventContent,
outputCommands = MicroEventContent.outputCommands + _lowPt_extraCommands
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changes in this file seem unnecessary.
run2_miniAOD_devel will highly unlikely be applied by itself without either bParking or run2_miniAOD_UL already present.

please remove the commit editing this file if possible

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'run2_miniAOD_devel' is required, as used in the following combination: (~bParking & run2_miniAOD_devel)
This modifier combination is defined in the TWiki linked in the PR description, which is "Recommended for future UL re-miniAOD of non-BParking data sets."

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'run2_miniAOD_devel' is required, as used in the following combination: (~bParking & run2_miniAOD_devel)
This modifier combination is defined in the TWiki linked in the PR description, which is "Recommended for future UL re-miniAOD of non-BParking data sets."

but the point is that run2_miniAOD_devel is needed only where one has to extend the functionality beyond what is already provided by run2_miniAOD_UL

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 185 to 191
phase2_hgcal.toModify( RecoEgammaFEVT, outputCommands = RecoEgammaFEVT.outputCommands + _phase2_hgcal_RecoEgamma_tokeep
)
phase2_hgcal.toModify( RecoEgammaFEVT, outputCommands = RecoEgammaFEVT.outputCommands + _phase2_hgcal_RecoEgamma_tokeep )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be nice(r) not to edit unrelated lines in a backport

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, reverted.

)

from Configuration.Eras.Modifier_run2_miniAOD_devel_cff import run2_miniAOD_devel
run2_miniAOD_devel.toReplaceWith(lowPtGsfElectrons,_lowPtGsfElectrons)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
run2_miniAOD_devel.toReplaceWith(lowPtGsfElectrons,_lowPtGsfElectrons)
(bParking | run2_miniAOD_devel).toReplaceWith(lowPtGsfElectrons,_lowPtGsfElectrons)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

bParking.toModify(_lowPtGsfElectrons,
previousGsfElectronsTag = cms.InputTag("lowPtGsfElectronsPreRegression"),
)
bParking.toReplaceWith(lowPtGsfElectrons,_lowPtGsfElectrons)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bParking.toReplaceWith(lowPtGsfElectrons,_lowPtGsfElectrons)

see above (bParking | run2_miniAOD_devel).toReplaceWith

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 124 to 52
bParking.toModify(_lowPtGsfElectrons,
previousGsfElectronsTag = cms.InputTag("lowPtGsfElectronsPreRegression"),
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bParking.toModify(_lowPtGsfElectrons,
previousGsfElectronsTag = cms.InputTag("lowPtGsfElectronsPreRegression"),
)
bParking.toModify(_lowPtGsfElectrons,
previousGsfElectronsTag = "lowPtGsfElectronsPreRegression"
)

also, please move this up, before _lowPtGsfElectrons is used in toReplaceWith calls

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

run2_miniAOD_devel.toReplaceWith(lowPtGsfElectrons,_lowPtGsfElectrons)

from Configuration.Eras.Modifier_bParking_cff import bParking
lowPtGsfElectronsPreRegression = lowPtGsfElectrons.clone()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this line better be moved before all the .toReplaceWith(lowPtGsfElectrons calls above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I need this line anymore?

Please can you check the updated version of this file to be sure it's organised correctly according to your suggestion.

@@ -68,3 +68,61 @@
from Configuration.Eras.Modifier_pp_on_AA_2018_cff import pp_on_AA_2018
pp_on_AA_2018.toModify(lowPtGsfElectrons.preselection, minSCEtBarrel = 15.0)
pp_on_AA_2018.toModify(lowPtGsfElectrons.preselection, minSCEtEndcaps = 15.0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to propose a less circular implementation with more similarity to what is done in the master version:

  • define lowPtGsfElectronsPreRegression_cfi.py which would contain all of the above except for the module name changed to lowPtGsfElectronsPreRegression
  • keep the rest of this file content in lowPtGsfElectrons_cfi.py
    • start with lowPtGsfElectrons = lowPtGsfElectronsPreRegression.clone()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

I also updated these lines to read as 'lowPtGsfElectronsPreRegression.preselection' - I hope this is correct.

@cmsbuild
Copy link
Contributor

cmsbuild commented Jun 5, 2021

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-5690ec/15677/summary.html
COMMIT: eb4bdf5
CMSSW: CMSSW_10_6_X_2021-06-04-2300/slc7_amd64_gcc700
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/33589/15677/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 2 differences found in the comparisons
  • DQMHistoTests: Total files compared: 35
  • DQMHistoTests: Total histograms compared: 3215543
  • DQMHistoTests: Total failures: 2
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3215207
  • DQMHistoTests: Total skipped: 334
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 34 files compared)
  • Checked 143 log files, 29 edm output root files, 35 DQM output files
  • TriggerResults: no differences found

Co-authored-by: Slava Krutelyov <slava77@gmail.com>
@cmsbuild
Copy link
Contributor

cmsbuild commented Jun 5, 2021

Pull request #33589 was updated. @perrotta, @jpata, @cmsbuild, @slava77 can you please check and sign again.

@bainbrid
Copy link
Contributor Author

bainbrid commented Jun 5, 2021

@slava77 in response to this:

I've just reviewed again all combinations of modifier logic required in 10_6_20 and this PR, and I have this:

None					Autumn18
bParking (only)				Autumn18
(bParking & ~run2_miniAOD_UL)		Autumn18
(~bParking & run2_miniAOD_UL)		2020Sept15
(~bParking & run2_miniAOD_devel) 	2020Nov28
(bParking & run2_miniAOD_UL) 		2021May17

So I believe your change (in conjunction with the remaining cfi) does indeed give the above.

The original logic (bParking | run2_miniAOD_UL) prior to this commit 2a63f28 used the 2020Sept15 model, but the bParking logic (alone) is supposed to use the obsolete Autumn18 model, as we are no longer changing RECO. Your change (should!) fix this.

Explicitly we could write in the cfi:

default -->				Autumn18
(~bParking & run2_miniAOD_UL)	-->	2020Sept15
(~bParking & run2_miniAOD_devel) -->	2020Nov28
(bParking & run2_miniAOD_UL) -->	2021May17

but the existing logic is effectively the same.

@slava77
Copy link
Contributor

slava77 commented Jun 5, 2021

@cmsbuild please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Jun 5, 2021

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-5690ec/15680/summary.html
COMMIT: 2a63f28
CMSSW: CMSSW_10_6_X_2021-06-05-1100/slc7_amd64_gcc700
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/33589/15680/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 35
  • DQMHistoTests: Total histograms compared: 3215543
  • DQMHistoTests: Total failures: 2
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3215207
  • DQMHistoTests: Total skipped: 334
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 34 files compared)
  • Checked 143 log files, 29 edm output root files, 35 DQM output files
  • TriggerResults: no differences found

@slava77
Copy link
Contributor

slava77 commented Jun 6, 2021

+reconstruction

for #33589 2a63f28

  • code changes are in line with the PR description and the follow up review
  • jenkins tests pass and comparisons with the baseline show no (relevant) differences
  • local tests with run2_miniAOD_UL running on reco made in 10_6_25 appear to work OK

@cmsbuild
Copy link
Contributor

cmsbuild commented Jun 6, 2021

This pull request is fully signed and it will be integrated in one of the next CMSSW_10_6_X IBs (tests are also fine) and once validation in the development release cycle CMSSW_12_0_X is complete. This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@qliphy
Copy link
Contributor

qliphy commented Jun 7, 2021

+1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants