Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Serialization] Do less redundant work computing affecting module maps #66933

Merged
merged 1 commit into from
Sep 22, 2023

Conversation

sam-mccall
Copy link
Collaborator

We're traversing the same chains of module ancestors and include
locations repeatedly, despite already populating sets that can detect it!

This is a problem because translateFile() is expensive. I think we can
avoid it entirely, but this seems like an improvement either way.

I removed a callback indirection rather than giving it a more complicated
signature, and accordingly renamed the lambdas to be more concrete.

We're traversing the same chains of module ancestors and include
locations repeatedly, despite already populating sets that can detect it!

This is a problem because translateFile() is expensive. I think we can
avoid it entirely, but this seems like an improvement either way.

I removed a callback indirection rather than giving it a more complicated
signature, and accordingly renamed the lambdas to be more concrete.
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:modules C++20 modules and Clang Header Modules labels Sep 20, 2023
@llvmbot
Copy link
Member

llvmbot commented Sep 20, 2023

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-modules

Changes

We're traversing the same chains of module ancestors and include
locations repeatedly, despite already populating sets that can detect it!

This is a problem because translateFile() is expensive. I think we can
avoid it entirely, but this seems like an improvement either way.

I removed a callback indirection rather than giving it a more complicated
signature, and accordingly renamed the lambdas to be more concrete.


Full diff: https://github.com/llvm/llvm-project/pull/66933.diff

1 Files Affected:

  • (modified) clang/lib/Serialization/ASTWriter.cpp (+24-23)
diff --git a/clang/lib/Serialization/ASTWriter.cpp b/clang/lib/Serialization/ASTWriter.cpp
index 65bee806d2c5571..216ca94111e156b 100644
--- a/clang/lib/Serialization/ASTWriter.cpp
+++ b/clang/lib/Serialization/ASTWriter.cpp
@@ -163,8 +163,6 @@ namespace {
 
 std::set<const FileEntry *> GetAffectingModuleMaps(const Preprocessor &PP,
                                                    Module *RootModule) {
-  std::set<const FileEntry *> ModuleMaps{};
-  std::set<const Module *> ProcessedModules;
   SmallVector<const Module *> ModulesToProcess{RootModule};
 
   const HeaderSearch &HS = PP.getHeaderSearchInfo();
@@ -195,42 +193,45 @@ std::set<const FileEntry *> GetAffectingModuleMaps(const Preprocessor &PP,
   const ModuleMap &MM = HS.getModuleMap();
   SourceManager &SourceMgr = PP.getSourceManager();
 
-  auto ForIncludeChain = [&](FileEntryRef F,
-                             llvm::function_ref<void(FileEntryRef)> CB) {
-    CB(F);
+  std::set<const FileEntry *> ModuleMaps{};
+  auto CollectIncludingModuleMaps = [&](FileEntryRef F) {
+    if (!ModuleMaps.insert(F).second)
+      return;
     FileID FID = SourceMgr.translateFile(F);
     SourceLocation Loc = SourceMgr.getIncludeLoc(FID);
     // The include location of inferred module maps can point into the header
     // file that triggered the inferring. Cut off the walk if that's the case.
     while (Loc.isValid() && isModuleMap(SourceMgr.getFileCharacteristic(Loc))) {
       FID = SourceMgr.getFileID(Loc);
-      CB(*SourceMgr.getFileEntryRefForID(FID));
+      if (!ModuleMaps.insert(*SourceMgr.getFileEntryRefForID(FID)).second)
+        break;
       Loc = SourceMgr.getIncludeLoc(FID);
     }
   };
 
-  auto ProcessModuleOnce = [&](const Module *M) {
-    for (const Module *Mod = M; Mod; Mod = Mod->Parent)
-      if (ProcessedModules.insert(Mod).second) {
-        auto Insert = [&](FileEntryRef F) { ModuleMaps.insert(F); };
-        // The containing module map is affecting, because it's being pointed
-        // into by Module::DefinitionLoc.
-        if (auto ModuleMapFile = MM.getContainingModuleMapFile(Mod))
-          ForIncludeChain(*ModuleMapFile, Insert);
-        // For inferred modules, the module map that allowed inferring is not in
-        // the include chain of the virtual containing module map file. It did
-        // affect the compilation, though.
-        if (auto ModuleMapFile = MM.getModuleMapFileForUniquing(Mod))
-          ForIncludeChain(*ModuleMapFile, Insert);
-      }
+  std::set<const Module *> ProcessedModules;
+  auto CollectIncludingMapsFromAncestors = [&](const Module *M) {
+    for (const Module *Mod = M; Mod; Mod = Mod->Parent) {
+      if (!ProcessedModules.insert(Mod).second)
+        break;
+      // The containing module map is affecting, because it's being pointed
+      // into by Module::DefinitionLoc.
+      if (auto ModuleMapFile = MM.getContainingModuleMapFile(Mod))
+        CollectIncludingModuleMaps(*ModuleMapFile);
+      // For inferred modules, the module map that allowed inferring is not in
+      // the include chain of the virtual containing module map file. It did
+      // affect the compilation, though.
+      if (auto ModuleMapFile = MM.getModuleMapFileForUniquing(Mod))
+        CollectIncludingModuleMaps(*ModuleMapFile);
+    }
   };
 
   for (const Module *CurrentModule : ModulesToProcess) {
-    ProcessModuleOnce(CurrentModule);
+    CollectIncludingMapsFromAncestors(CurrentModule);
     for (const Module *ImportedModule : CurrentModule->Imports)
-      ProcessModuleOnce(ImportedModule);
+      CollectIncludingMapsFromAncestors(ImportedModule);
     for (const Module *UndeclaredModule : CurrentModule->UndeclaredUses)
-      ProcessModuleOnce(UndeclaredModule);
+      CollectIncludingMapsFromAncestors(UndeclaredModule);
   }
 
   return ModuleMaps;

Copy link
Contributor

@jansvoboda11 jansvoboda11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you!

@sam-mccall sam-mccall merged commit 0f05096 into llvm:main Sep 22, 2023
jansvoboda11 pushed a commit to swiftlang/llvm-project that referenced this pull request Mar 28, 2024
llvm#66933)

We're traversing the same chains of module ancestors and include
locations repeatedly, despite already populating sets that can detect
it!

This is a problem because translateFile() is expensive. I think we can
avoid it entirely, but this seems like an improvement either way.

I removed a callback indirection rather than giving it a more
complicated
signature, and accordingly renamed the lambdas to be more concrete.

(cherry picked from commit 0f05096)
jansvoboda11 pushed a commit to swiftlang/llvm-project that referenced this pull request Mar 28, 2024
llvm#66933)

We're traversing the same chains of module ancestors and include
locations repeatedly, despite already populating sets that can detect
it!

This is a problem because translateFile() is expensive. I think we can
avoid it entirely, but this seems like an improvement either way.

I removed a callback indirection rather than giving it a more
complicated
signature, and accordingly renamed the lambdas to be more concrete.

(cherry picked from commit 0f05096)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:modules C++20 modules and Clang Header Modules clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants