Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

initial pass at controlling model instance and threads for fallback triton servers #41228

Closed
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions HeterogeneousCore/SonicTriton/interface/TritonService.h
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,8 @@

#include "FWCore/ParameterSet/interface/ParameterSet.h"
#include "FWCore/Utilities/interface/GlobalIdentifier.h"
#include "FWCore/ServiceRegistry/interface/SystemBounds.h"
#include "FWCore/ServiceRegistry/interface/ServiceMaker.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ServiceMaker.h include should be unnecessary, and for SystemBounds.h it would be better to forward-declare edm::service::SystemBounds below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed and forward declared


#include <vector>
#include <unordered_set>
Expand Down Expand Up @@ -113,6 +115,7 @@ class TritonService {
static void fillDescriptions(edm::ConfigurationDescriptions& descriptions);

private:
void preallocate(edm::service::SystemBounds const&);
void preModuleConstruction(edm::ModuleDescription const&);
void postModuleConstruction(edm::ModuleDescription const&);
void preModuleDestruction(edm::ModuleDescription const&);
Expand All @@ -128,12 +131,14 @@ class TritonService {
unsigned currentModuleId_;
bool allowAddModel_;
bool startedFallback_;
unsigned int numInstancesForFallbacks_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This member appears to be unused.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry it's been removed

std::string pid_;
std::unordered_map<std::string, Model> unservedModels_;
//this represents a many:many:many map
std::unordered_map<std::string, Server> servers_;
std::unordered_map<std::string, Model> models_;
std::unordered_map<unsigned, Module> modules_;
int numberOfThreads_;
};

#endif
1 change: 1 addition & 0 deletions HeterogeneousCore/SonicTriton/plugins/TritonService.cc
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
#include "FWCore/ServiceRegistry/interface/ServiceMaker.h"
#include "FWCore/ServiceRegistry/interface/SystemBounds.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This #include should not be needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

#include "FWCore/Framework/interface/MakerMacros.h"
#include "HeterogeneousCore/SonicTriton/interface/TritonService.h"

Expand Down
1 change: 1 addition & 0 deletions HeterogeneousCore/SonicTriton/scripts/cmsTriton
Original file line number Diff line number Diff line change
Expand Up @@ -218,6 +218,7 @@ start_singularity(){
for REPO in ${REPOS[@]}; do
MOUNTARGS="$MOUNTARGS -B $REPO"
REPOARGS="$REPOARGS --model-repository=${REPO}"
ls ${REPO}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this line a leftover?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes sorry, it's been removed now

done

# compatibility driver environment
Expand Down
71 changes: 70 additions & 1 deletion HeterogeneousCore/SonicTriton/src/TritonService.cc
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,8 @@
#include "FWCore/ParameterSet/interface/ConfigurationDescriptions.h"
#include "FWCore/ParameterSet/interface/ParameterSetDescription.h"
#include "FWCore/ServiceRegistry/interface/ActivityRegistry.h"
#include "FWCore/ServiceRegistry/interface/SystemBounds.h"
#include "FWCore/ServiceRegistry/interface/ServiceMaker.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ServiceMaker.h include should not be needed here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

#include "FWCore/ServiceRegistry/interface/ProcessContext.h"
#include "FWCore/Utilities/interface/Exception.h"

Expand Down Expand Up @@ -62,6 +64,9 @@ TritonService::TritonService(const edm::ParameterSet& pset, edm::ActivityRegistr
startedFallback_(false),
pid_(std::to_string(::getpid())) {
//module construction is assumed to be serial (correct at the time this code was written)

areg.watchPreallocate(this, &TritonService::preallocate);

areg.watchPreModuleConstruction(this, &TritonService::preModuleConstruction);
areg.watchPostModuleConstruction(this, &TritonService::postModuleConstruction);
areg.watchPreModuleDestruction(this, &TritonService::preModuleDestruction);
Expand Down Expand Up @@ -136,6 +141,10 @@ TritonService::TritonService(const edm::ParameterSet& pset, edm::ActivityRegistr
edm::LogInfo("TritonService") << msg;
}

void TritonService::preallocate(edm::service::SystemBounds const& bounds) {
numberOfThreads_ = bounds.maxNumberOfThreads();
}

void TritonService::preModuleConstruction(edm::ModuleDescription const& desc) {
currentModuleId_ = desc.id();
allowAddModel_ = true;
Expand Down Expand Up @@ -238,7 +247,67 @@ void TritonService::preBeginJob(edm::PathsAndConsumesOfModulesBase const&, edm::
if (fallbackOpts_.wait >= 0)
fallbackOpts_.command += " -w " + std::to_string(fallbackOpts_.wait);
for (const auto& [modelName, model] : unservedModels_) {
fallbackOpts_.command += " -m " + model.path;
//We need to edit the triton config files, need to copy from cvmfs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would actually suggest to handle the copying and editing of model config files in the cmsTriton script (by passing in the numberOfThreads value when the script is called). Two reasons:

  1. most operations here are shell commands, therefore more naturally suited to be implemented in bash
  2. allows this functionality (controlling number of instances per model) to be used outside of CMSSW/TritonService

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes good suggestion, thank you! Let me switch that over

std::string delimiter = "/";
std::string tmp_model_path = model.path;
size_t pos = 0;
std::string token;
std::string model_dir;
while ((pos = tmp_model_path.find(delimiter)) != std::string::npos) {
if (tmp_model_path.substr(0, pos) != "config.pbtxt") {
token = tmp_model_path.substr(0, pos);
model_dir += token + "/";
}
tmp_model_path.erase(0, pos + delimiter.length());
}

std::string remover = "mkdir local_model_repo; rm -rf local_model_repo/" + token;
std::string copier = "cp -r ";
const auto& [output, rv] = execSys(remover + "; " + copier + model_dir + " local_model_repo/");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use std::filesystem API instead of invoking the shell.

mkdir local_model_repo followed by removing something from the newly created directory looks suspicious.

If I understood correctly, the fallback server for each cmsRun process would need its own "local model repo", and that would call for a unique name. Could e.g. the fallbackOpts_.tempDir be used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of this has been moved into the cmsTriton script

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, and the edits to the config files are made in the tempDir now

fallbackOpts_.command += " -m ./local_model_repo/" + token + "/config.pbtxt";

//Need to edit files differently depending on model framework
bool isONNX = false;
bool isTF = false;

size_t offset;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the offset is not really needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has all been deleted

std::string line;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move this declaration inside the if block as it is needed only there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, all deleted and moved to cmsTriton script

std::ifstream Myfile;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable names should start with lower case character.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deleted

Myfile.open("./local_model_repo/" + token + "/config.pbtxt");
if (Myfile.is_open()) {
while (!Myfile.eof()) {
getline(Myfile, line);
if ((offset = line.find("onnx", 0)) != std::string::npos) {
isONNX = true;
Myfile.close();
} else if ((offset = line.find("tensorflow", 0)) != std::string::npos) {
isTF = true;
Myfile.close();
}
}
Myfile.close();
}

//append instance and thread information to config files
std::ofstream out;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest to explicitly close the out file. My personal preference would be to place all the code accessing out inside a block along

{
  std::ofstream out{"./local_model_repo/" + token + "/config.pbtxt", std::ios::app};
  // all code accessing out
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is no longer done here. Now I just echo in the cmsTriton bash script

out.open("./local_model_repo/" + token + "/config.pbtxt", std::ios::app);
out << std::endl
<< "instance_group [" << std::endl
<< " {" << std::endl
<< " count: " << numberOfThreads_ << std::endl
<< " kind: KIND_CPU" << std::endl
<< " }" << std::endl
<< "]" << std::endl;

if (isONNX) {
out << "parameters { key: \"intra_op_thread_count\" value: { string_value: \"1\" } }" << std::endl;
out << "parameters { key: \"inter_op_thread_count\" value: { string_value: \"1\" } }" << std::endl;
}
if (isTF) {
out << "parameters { key: \"TF_NUM_INTRA_THREADS\" value: { string_value: \"1\" } }" << std::endl;
out << "parameters { key: \"TF_NUM_INTER_THREADS\" value: { string_value: \"1\" } }" << std::endl;
out << "parameters { key: \"TF_USE_PER_SESSION_THREADS\" value: { string_value: \"1\" } }" << std::endl;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if neither isONNX or isTF is true?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is another subtlety with triton server configs: they are model-type dependent. Thread controls are only implemented right now for ONNX and TF models, so if neither of those options are true, then nothing else should be included in the config

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should at least emit a warning in this case. something like: "Warning: thread (instance) control not implemented for $PLATFORM" (following one of my suggestions below)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this, thanks!

}
if (!fallbackOpts_.imageName.empty())
fallbackOpts_.command += " -i " + fallbackOpts_.imageName;
Expand Down