-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Random DQM differences in PR tests #31644
Comments
assign dqm |
New categories assigned: dqm @jfernan2,@andrius-k,@fioriNTU,@kmaeshima,@ErnestaP you have been requested to review this Pull request/Issue and eventually sign? Thanks |
A new Issue was created by @makortel Matti Kortelainen. @Dr15Jones, @dpiparo, @silviodonato, @smuzaffar, @makortel, @qliphy can you please review it and eventually sign/assign? Thanks. cms-bot commands are listed here |
Hi @makortel I don't see the differences in the exact PR numbers you quote but I have seen them randomly around for other PRs I have reviewed, specially the SiOuterTracker differences. I have ignored them since they seem to me our famous stat box inconsistency: the top most stat box is from ROOT, and shows a rounded value, the other ones are from the GUI and show the true value from the data. So, it looks to me like a precision problem, indeed if you plot on side instead of overlay, you get the same histo, e.g.: And also because RelMon does NOT spot the difference, just the Bin By Bin tool e.g.: |
In the PRs I quoted these differences are shown in #31546 (comment) and #31642 (comment), and I do see them flagged by RelMon. I agree it looks some kind of precision issue.
The stat box indeed looks the same, but the error bars look different, so maybe something tiny is indeed different between the two? Anyway, I consider this issue more of documenting "known" failures than something needing immediate fix (even if latter would be nice). |
Here is one more #29553 (comment) |
|
Here is one more ( |
#31815 (comment) shows tiny differences in
|
#31858 (comment) shows tiny differences in
|
The same occurs in #31861 (comment) |
@emacdonald16, FYI |
Weird. I'll look into it. It does look like a precision issue... |
#31930 (comment) shows tiny differences in
|
cms-sw/cms-bot#1409 (comment) shows tiny differences in
|
#32217 (comment) shows tiny differences in
|
#32244 (comment) shows tiny differences in
|
#32184 (comment) shows tiny differences in
|
#29553 (comment) shows tiny differences in
|
#32487 (comment) shows tiny differences in
|
#32499 (comment) shows tiny differences in
|
#32510 (comment) shows tiny differences in
|
#32622 (comment) shows tiny differences in
|
#32782 (comment) shows tiny differences in
|
#32947 (comment) shows tiny differences in
|
#33038 (comment) shows tiny differences in
|
@benkrikler @dinyar could you lease have a look at the L1TEMU/L1TdeStage2uGMT/data_vs_emulator_comparison/{mismatchRatio,errorSummaryNum} diffeences? Thanks |
Hi @jfernan2, I'm not sure what is being compared (i.e. what the base comparison is), but my recent PR fixed the incorrect conversion from the hardware code of the unconstrained pT to the physical value, so the unconstrained pT is expected to have changed now. Cheers, |
@dinyar In #33038 (comment) the comparison was IB+PR vs IB in CMSSW_11_3_X_2021-03-02-1100. As far as I can tell no other PRs were included on top of the IB in that test. |
@makortel ok, thanks for the information. I'm a bit mystified by this to be honest I'll try to reproduce locally. |
#33106 (comment) shows tiny differences in
|
Hi again, I ran locally but couldn't reproduce the difference between the commits in question. I did notice that the mismatches are between EMTF muons where in the unpacked uGMT output we see unconstrained pT > 0, however in the emulated uGMT output their upt == 0. I suspect the reason for the data-emu mismatches is that the RegionalMuon packer/unpacker doesn't expect the EMTF to have displaced quantities yet, in this case I have to fix that. @eyigitba should confirm though whether we're even supposed to see displaced quantities yet. As to why we see a difference between base and base+PR in #33038 and #33106 I'm a bit lost. As mentioned above I can't reproduce it locally and it seems to go both ways: in #33038 the mismatches appeared after applying the patch, in #33106 they seem to have gone away again (but in a different workflow.. ). Can there be other moving pieces in the background like the PU sample or something like it? Cheers, |
Hi @dinyar , we don't have the displaced muons in the firmware right now. So you shouldn't see any unpacked displaced quantity from EMTF. I don't know why the unpacked uGMT output has non-zero unconstrained pT for EMTF muons. |
Right, sorry I wasn't clear. Here "unpacked" still refers to MC, so if the EMTF emulator creates displaced information that would qualify if I'm not forgetting something. |
I see ok, my bad. I thought on the emulator side you didn't need any changes to uGMT, but maybe on the unpacker side you still need to (although we don't have a packer for the emulator so I don't know if this applies here). Anyway I don't know what the difference would be between these commits. We didn't change anything on EMTF side. |
Ok, yeah I thought so. I now noticed there was some update/change of TensorFlow a few commits ago which to my knowledge you use.. could this change the upt values in some way? |
Maybe it can. I don't really know. I need to check on my end. |
Actually can you point me to the commit about TensorFlow? |
Sure, I saw mention of it in 502d254. |
I don't think this should change anything. I didn't do any checks in the last month or so though. I'll confirm soon if everything works as expected. |
@dinyar Because the tiny differences are visible in tests of unrelated PRs, they likely have some random origin. The cause could be e.g. an uninitialized variable somewhere. The tests are run such that all other conditions except chances in the PR should be same (some times already merged PRs are included). If anything in the event content would change, we should see many other differences as well. I'm partly reporting any occurrence I see to get a picture how frequent these random differences are. |
+1 |
This issue is fully signed and ready to be closed. |
I've seen tiny changes in
ParticleFlow/JetResponse/slimmedJetsPuppi/JEC/preso_eta30
SiOuterTrackerV/Tracks/FinalResolution/d0Resolution
SiOuterTrackerV/Tracks/FinalResolution/VtxZResolution
in the tests of PRs #31546 and #31642 that should not have any impact on how those histograms are filled.
The text was updated successfully, but these errors were encountered: