Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Xrootd thread-safety issue in quality metric factory #13085

Merged
merged 8 commits into from
Jan 31, 2016

Conversation

bbockelm
Copy link
Contributor

The quality metric factory did not safely access its m_sources map. This never caused problems before as this is only done on file open and we didn't have many concurrent file opens.

Fixes (I hope) issue #13081.

As a bonus, I did a small commit that handles memory management for the response parameter to the HandleResponse virtual function in the same manner as the upstream XrdCl. That is, I delete for callbacks where they delete; I leave response untouched for operations where they leave it untouched.

Laboriously emulate from XrdCl the cases where we own the "response"
pointer and those where we do not.

I suspect this may have been leaking a few bytes per file open -
not going to kill the memory budget, but likely good for those using
valgrind!
@cmsbuild
Copy link
Contributor

A new Pull Request was created by @bbockelm (Brian Bockelman) for CMSSW_8_0_X.

It involves the following packages:

Utilities/XrdAdaptor

@cmsbuild, @smuzaffar, @Dr15Jones, @davidlange6 can you please review it and eventually sign? Thanks.
@Martin-Grunewald, @wddgit this is something you requested to watch as well.
@slava77, @Degano, @smuzaffar you are the release manager for this.

cms-bot commands are list here #13028

@Dr15Jones
Copy link
Contributor

Please test

@cmsbuild
Copy link
Contributor

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/10775/console

@@ -66,7 +66,7 @@ friend class Source;

static QualityMetricFactory *m_instance;

typedef std::unordered_map<std::string, QualityMetricUniqueSource*> MetricMap;
typedef tbb::concurrent_hash_map<std::string, QualityMetricUniqueSource*> MetricMap;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to use tbb::concurrent_unordered_map<std::string, std::shared_ptr<QualityMetricUniqueSource>> as the type. That type is guaranteed to take no visible locks. std::concurrent_hash_map is preferred if one need concurrent erasure, which is not the case here.

https://www.threadingbuildingblocks.org/docs/help/reference/containers_overview/concurrent_unordered_map_cls.htm

@cmsbuild
Copy link
Contributor

@@ -157,16 +157,17 @@ QualityMetricFactory * QualityMetricFactory::m_instance = new QualityMetricFacto
std::unique_ptr<QualityMetricSource>
QualityMetricFactory::get(timespec now, const std::string &id)
{
MetricMap::const_iterator it = m_instance->m_sources.find(id);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we switched to concurrent_unordered_map than I believe the safe way to do this is

auto itFound = m_instance->m_sources.find(id);
if( itFound == m_instance->m_sources.end() ) {
   //try to make a new one 
  auto source = std::make_unique<QualityMetricUniqueSource>(now);
  auto insertResult = m_instance->m_sources.insert(std::make_pair(id, source.get());
  itFound = insertResult.first;
  if(insertResult.second) {
    //this thread successfully inserted
   source.release();
  }
}
return itFound->second->newSource(now);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah - I struggled to find an explanation of the differences and performance considerations between concurrent_unordered_map and concurrent_hash_map. Semantically, either should work in this case. What's the advantage of the above?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind - just saw your comment above.

@cmsbuild
Copy link
Contributor

@cmsbuild
Copy link
Contributor

Pull request #13085 was updated. @cmsbuild, @smuzaffar, @Dr15Jones, @davidlange6 can you please check and sign again.

@cmsbuild
Copy link
Contributor

Pull request #13085 was updated. @cmsbuild, @smuzaffar, @Dr15Jones, @davidlange6 can you please check and sign again.

@bbockelm
Copy link
Contributor Author

@davidlange6 - sorry, there were some issues from developing both Xrootd patches in parallel. Merges fine now.

@Dr15Jones
Copy link
Contributor

please test

@Dr15Jones
Copy link
Contributor

+1

@cmsbuild
Copy link
Contributor

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/10854/console

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next CMSSW_8_0_X IBs after it passes the integration tests. This pull request requires discussion in the ORP meeting before it's merged. @slava77, @davidlange6, @Degano, @smuzaffar

@cmsbuild
Copy link
Contributor

-1

Tested at: 556e64e
I found errors in the following unit tests:

---> test testRecoMETMETProducers had ERRORS
---> test DetectorDescriptionRegressionTestDOMCount had ERRORS
---> test runtestSimCalorimetryHGCalSimProducers had ERRORS
---> test runtestRecoLocalCaloHGCalRecProducers had ERRORS

you can see the results of the tests here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-13085/10854/summary.html

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:
556e64e
44f1aa1
93fb804
d120b1d
81a912b
471f6c6
5bc0097
dd3d2c4
4d57087
227e959
a11d0ac
40bea38
c057e65
3e850ff
840efb4
c0ebb2a
You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-13085/10854/git-log-recent-commits
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-13085/10854/git-merge-result

@cmsbuild
Copy link
Contributor

davidlange6 added a commit that referenced this pull request Jan 31, 2016
Fix Xrootd thread-safety issue in quality metric factory
@davidlange6 davidlange6 merged commit 3c25366 into cms-sw:CMSSW_8_0_X Jan 31, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants