Skip to content

Commit

Permalink
[analyzer][ctu] Fix wrong 'multiple definitions' errors caused by spa…
Browse files Browse the repository at this point in the history
…ce characters in lookup names when parsing the ctu index file

This error was found when analyzing MySQL with CTU enabled.

When there are space characters in the lookup name, the current
delimiter searching strategy will make the file path wrongly parsed.
And when two lookup names have the same prefix before their first space
characters, a 'multiple definitions' error will be wrongly reported.

e.g. The lookup names for the two lambda exprs in the test case are
`c:@s@G@F@G#@sa@F@operator int (*)(char)#1` and
`c:@s@G@F@G#@sa@F@operator bool (*)(char)#1` respectively. And their
prefixes are both `c:@s@G@F@G#@sa@F@operator` when using the first space
character as the delimiter.

Solving the problem by adding a length for the lookup name, making the
index items in the format of `<USR-Length>:<USR File> <Path>`.

---

In the test case of this patch, we found that it will trigger a "triple
mismatch" warning when using `clang -cc1` to analyze the source file
with CTU using the on-demand-parsing strategy in Darwin systems. And
this problem is also encountered in D75665, which is the patch
introducing the on-demand parsing strategy.
We temporarily bypass this problem by using the loading-ast-file
strategy.

Refer to the [discourse topic](https://discourse.llvm.org/t/60762) for
more details.

Differential Revision: https://reviews.llvm.org/D102669
  • Loading branch information
Snape3058 committed Mar 22, 2022
1 parent e609417 commit 9f90254
Show file tree
Hide file tree
Showing 12 changed files with 174 additions and 75 deletions.
8 changes: 4 additions & 4 deletions clang/docs/analyzer/user-docs/CrossTranslationUnit.rst
Original file line number Diff line number Diff line change
Expand Up @@ -81,12 +81,12 @@ of `foo.cpp`:
$
The next step is to create a CTU index file which holds the `USR` name and location of external definitions in the
source files:
source files in format `<USR-Length>:<USR> <File-Path>`:

.. code-block:: bash
$ clang-extdef-mapping -p . foo.cpp
c:@F@foo# /path/to/your/project/foo.cpp
9:c:@F@foo# /path/to/your/project/foo.cpp
$ clang-extdef-mapping -p . foo.cpp > externalDefMap.txt
We have to modify `externalDefMap.txt` to contain the name of the `.ast` files instead of the source files:
Expand Down Expand Up @@ -278,12 +278,12 @@ The `invocation list`:
We'd like to analyze `main.cpp` and discover the division by zero bug.
As we are using On-demand mode, we only need to create a CTU index file which holds the `USR` name and location of
external definitions in the source files:
external definitions in the source files in format `<USR-Length>:<USR> <File-Path>`:

.. code-block:: bash
$ clang-extdef-mapping -p . foo.cpp
c:@F@foo# /path/to/your/project/foo.cpp
9:c:@F@foo# /path/to/your/project/foo.cpp
$ clang-extdef-mapping -p . foo.cpp > externalDefMap.txt
Now everything is available for the CTU analysis.
Expand Down
4 changes: 2 additions & 2 deletions clang/include/clang/Basic/DiagnosticCrossTUKinds.td
Original file line number Diff line number Diff line change
Expand Up @@ -12,8 +12,8 @@ def err_ctu_error_opening : Error<
"error opening '%0': required by the CrossTU functionality">;

def err_extdefmap_parsing : Error<
"error parsing index file: '%0' line: %1 'UniqueID filename' format "
"expected">;
"error parsing index file: '%0' line: %1 '<USR-Length>:<USR> <File-Path>' "
"format expected">;

def err_multiple_def_index : Error<
"multiple definitions are found for the same key in index ">;
Expand Down
65 changes: 47 additions & 18 deletions clang/lib/CrossTU/CrossTranslationUnit.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -149,6 +149,35 @@ std::error_code IndexError::convertToErrorCode() const {
return std::error_code(static_cast<int>(Code), *Category);
}

/// Parse one line of the input CTU index file.
///
/// @param[in] LineRef The input CTU index item in format
/// "<USR-Length>:<USR> <File-Path>".
/// @param[out] LookupName The lookup name in format "<USR-Length>:<USR>".
/// @param[out] FilePath The file path "<File-Path>".
static bool parseCrossTUIndexItem(StringRef LineRef, StringRef &LookupName,
StringRef &FilePath) {
// `LineRef` is "<USR-Length>:<USR> <File-Path>" now.

size_t USRLength = 0;
if (LineRef.consumeInteger(10, USRLength))
return false;
assert(USRLength && "USRLength should be greater than zero.");

if (!LineRef.consume_front(":"))
return false;

// `LineRef` is now just "<USR> <File-Path>".

// Check LookupName length out of bound and incorrect delimiter.
if (USRLength >= LineRef.size() || ' ' != LineRef[USRLength])
return false;

LookupName = LineRef.substr(0, USRLength);
FilePath = LineRef.substr(USRLength + 1);
return true;
}

llvm::Expected<llvm::StringMap<std::string>>
parseCrossTUIndex(StringRef IndexPath) {
std::ifstream ExternalMapFile{std::string(IndexPath)};
Expand All @@ -160,24 +189,23 @@ parseCrossTUIndex(StringRef IndexPath) {
std::string Line;
unsigned LineNo = 1;
while (std::getline(ExternalMapFile, Line)) {
StringRef LineRef{Line};
const size_t Delimiter = LineRef.find(' ');
if (Delimiter > 0 && Delimiter != std::string::npos) {
StringRef LookupName = LineRef.substr(0, Delimiter);

// Store paths with posix-style directory separator.
SmallString<32> FilePath(LineRef.substr(Delimiter + 1));
llvm::sys::path::native(FilePath, llvm::sys::path::Style::posix);

bool InsertionOccured;
std::tie(std::ignore, InsertionOccured) =
Result.try_emplace(LookupName, FilePath.begin(), FilePath.end());
if (!InsertionOccured)
return llvm::make_error<IndexError>(
index_error_code::multiple_definitions, IndexPath.str(), LineNo);
} else
// Split lookup name and file path
StringRef LookupName, FilePathInIndex;
if (!parseCrossTUIndexItem(Line, LookupName, FilePathInIndex))
return llvm::make_error<IndexError>(
index_error_code::invalid_index_format, IndexPath.str(), LineNo);

// Store paths with posix-style directory separator.
SmallString<32> FilePath(FilePathInIndex);
llvm::sys::path::native(FilePath, llvm::sys::path::Style::posix);

bool InsertionOccured;
std::tie(std::ignore, InsertionOccured) =
Result.try_emplace(LookupName, FilePath.begin(), FilePath.end());
if (!InsertionOccured)
return llvm::make_error<IndexError>(
index_error_code::multiple_definitions, IndexPath.str(), LineNo);

++LineNo;
}
return Result;
Expand All @@ -187,7 +215,8 @@ std::string
createCrossTUIndexString(const llvm::StringMap<std::string> &Index) {
std::ostringstream Result;
for (const auto &E : Index)
Result << E.getKey().str() << " " << E.getValue() << '\n';
Result << E.getKey().size() << ':' << E.getKey().str() << ' '
<< E.getValue() << '\n';
return Result.str();
}

Expand Down Expand Up @@ -428,7 +457,7 @@ CrossTranslationUnitContext::ASTUnitStorage::getASTUnitForFunction(
ensureCTUIndexLoaded(CrossTUDir, IndexName))
return std::move(IndexLoadError);

// Check if there is and entry in the index for the function.
// Check if there is an entry in the index for the function.
if (!NameFileMap.count(FunctionName)) {
++NumNotInOtherTU;
return llvm::make_error<IndexError>(index_error_code::missing_definition);
Expand Down
Original file line number Diff line number Diff line change
@@ -1 +1 @@
c:@F@testStaticImplicit ctu-import.c.ast
23:c:@F@testStaticImplicit ctu-import.c.ast
17 changes: 17 additions & 0 deletions clang/test/Analysis/Inputs/ctu-lookup-name-with-space.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
void f(void (*)());
void f(void (*)(int));

struct G {
G() {
// multiple definitions are found for the same key in index
f([]() -> void {}); // USR: c:@S@G@F@G#@Sa@F@operator void (*)()#1
f([](int) -> void {}); // USR: c:@S@G@F@G#@Sa@F@operator void (*)(int)#1

// As both lambda exprs have the same prefix, if the CTU index parser uses
// the first space character as the delimiter between USR and file path, a
// "multiple definitions are found for the same key in index" error will
// be reported.
}
};

void importee() {}
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
c:@F@inlineAsm ctu-other.c.ast
c:@F@g ctu-other.c.ast
c:@F@f ctu-other.c.ast
c:@F@enumCheck ctu-other.c.ast
c:@F@identImplicit ctu-other.c.ast
c:@F@structInProto ctu-other.c.ast
c:@F@switchWithoutCases ctu-other.c.ast
14:c:@F@inlineAsm ctu-other.c.ast
6:c:@F@g ctu-other.c.ast
6:c:@F@f ctu-other.c.ast
14:c:@F@enumCheck ctu-other.c.ast
18:c:@F@identImplicit ctu-other.c.ast
18:c:@F@structInProto ctu-other.c.ast
23:c:@F@switchWithoutCases ctu-other.c.ast
Original file line number Diff line number Diff line change
@@ -1,30 +1,30 @@
c:@N@chns@F@chf1#I# ctu-other.cpp.ast
c:@N@myns@N@embed_ns@F@fens#I# ctu-other.cpp.ast
c:@F@g#I# ctu-other.cpp.ast
c:@S@mycls@F@fscl#I#S ctu-other.cpp.ast
c:@S@mycls@F@fcl#I# ctu-other.cpp.ast
c:@S@mycls@F@fvcl#I# ctu-other.cpp.ast
c:@N@myns@S@embed_cls@F@fecl#I# ctu-other.cpp.ast
c:@S@mycls@S@embed_cls2@F@fecl2#I# ctu-other.cpp.ast
c:@S@derived@F@fvcl#I# ctu-other.cpp.ast
c:@F@f#I# ctu-other.cpp.ast
c:@N@myns@F@fns#I# ctu-other.cpp.ast
c:@F@h#I# ctu-other.cpp.ast
c:@F@h_chain#I# ctu-chain.cpp.ast
c:@N@chns@S@chcls@F@chf4#I# ctu-chain.cpp.ast
c:@N@chns@F@chf2#I# ctu-chain.cpp.ast
c:@F@fun_using_anon_struct#I# ctu-other.cpp.ast
c:@F@other_macro_diag#I# ctu-other.cpp.ast
c:@extInt ctu-other.cpp.ast
c:@N@intns@extInt ctu-other.cpp.ast
c:@extS ctu-other.cpp.ast
c:@S@A@a ctu-other.cpp.ast
c:@extSC ctu-other.cpp.ast
c:@S@ST@sc ctu-other.cpp.ast
c:@extSCN ctu-other.cpp.ast
c:@extSubSCN ctu-other.cpp.ast
c:@extSCC ctu-other.cpp.ast
c:@extU ctu-other.cpp.ast
c:@S@TestAnonUnionUSR@Test ctu-other.cpp.ast
c:@F@testImportOfIncompleteDefaultParmDuringImport#I# ctu-other.cpp.ast
c:@F@testImportOfDelegateConstructor#I# ctu-other.cpp.ast
19:c:@N@chns@F@chf1#I# ctu-other.cpp.ast
30:c:@N@myns@N@embed_ns@F@fens#I# ctu-other.cpp.ast
9:c:@F@g#I# ctu-other.cpp.ast
21:c:@S@mycls@F@fscl#I#S ctu-other.cpp.ast
19:c:@S@mycls@F@fcl#I# ctu-other.cpp.ast
20:c:@S@mycls@F@fvcl#I# ctu-other.cpp.ast
31:c:@N@myns@S@embed_cls@F@fecl#I# ctu-other.cpp.ast
34:c:@S@mycls@S@embed_cls2@F@fecl2#I# ctu-other.cpp.ast
22:c:@S@derived@F@fvcl#I# ctu-other.cpp.ast
9:c:@F@f#I# ctu-other.cpp.ast
18:c:@N@myns@F@fns#I# ctu-other.cpp.ast
9:c:@F@h#I# ctu-other.cpp.ast
15:c:@F@h_chain#I# ctu-chain.cpp.ast
27:c:@N@chns@S@chcls@F@chf4#I# ctu-chain.cpp.ast
19:c:@N@chns@F@chf2#I# ctu-chain.cpp.ast
29:c:@F@fun_using_anon_struct#I# ctu-other.cpp.ast
24:c:@F@other_macro_diag#I# ctu-other.cpp.ast
9:c:@extInt ctu-other.cpp.ast
17:c:@N@intns@extInt ctu-other.cpp.ast
7:c:@extS ctu-other.cpp.ast
8:c:@S@A@a ctu-other.cpp.ast
8:c:@extSC ctu-other.cpp.ast
10:c:@S@ST@sc ctu-other.cpp.ast
9:c:@extSCN ctu-other.cpp.ast
12:c:@extSubSCN ctu-other.cpp.ast
9:c:@extSCC ctu-other.cpp.ast
7:c:@extU ctu-other.cpp.ast
26:c:@S@TestAnonUnionUSR@Test ctu-other.cpp.ast
53:c:@F@testImportOfIncompleteDefaultParmDuringImport#I# ctu-other.cpp.ast
39:c:@F@testImportOfDelegateConstructor#I# ctu-other.cpp.ast
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
c:@F@F1 plist-macros-ctu.c.ast
c:@F@F2 plist-macros-ctu.c.ast
c:@F@F3 plist-macros-ctu.c.ast
c:@F@F_H plist-macros-ctu.c.ast
7:c:@F@F1 plist-macros-ctu.c.ast
7:c:@F@F2 plist-macros-ctu.c.ast
7:c:@F@F3 plist-macros-ctu.c.ast
8:c:@F@F_H plist-macros-ctu.c.ast
2 changes: 1 addition & 1 deletion clang/test/Analysis/ctu-inherited-default-ctor.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@
// RUN: %clang_cc1 -std=c++14 -triple x86_64-pc-linux-gnu \
// RUN: -emit-pch -o %t/ctudir/ctu-inherited-default-ctor-other.cpp.ast \
// RUN: %S/Inputs/ctu-inherited-default-ctor-other.cpp
// RUN: echo "c:@N@clang@S@DeclContextLookupResult@SingleElementDummyList ctu-inherited-default-ctor-other.cpp.ast" \
// RUN: echo "59:c:@N@clang@S@DeclContextLookupResult@SingleElementDummyList ctu-inherited-default-ctor-other.cpp.ast" \
// RUN: > %t/ctudir/externalDefMap.txt
//
// RUN: %clang_analyze_cc1 -std=c++14 -triple x86_64-pc-linux-gnu \
Expand Down
41 changes: 41 additions & 0 deletions clang/test/Analysis/ctu-lookup-name-with-space.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
// RUN: rm -rf %t
// RUN: mkdir %t

// RUN: echo '41:c:@S@G@F@G#@Sa@F@operator void (*)(int)#1 %/t/importee.ast' >> %t/externalDefMap.txt
// RUN: echo '38:c:@S@G@F@G#@Sa@F@operator void (*)()#1 %/t/importee.ast' >> %t/externalDefMap.txt
// RUN: echo '14:c:@F@importee# %/t/importee.ast' >> %t/externalDefMap.txt
// RUN: %clang_cc1 -emit-pch %/S/Inputs/ctu-lookup-name-with-space.cpp -o %t/importee.ast

// RUN: cd %t
// RUN: %clang_cc1 -fsyntax-only -analyze \
// RUN: -analyzer-checker=core \
// RUN: -analyzer-config experimental-enable-naive-ctu-analysis=true \
// RUN: -analyzer-config ctu-dir=. \
// RUN: -analyzer-config display-ctu-progress=true \
// RUN: -verify %s 2>&1 | FileCheck %s

// CHECK: CTU loaded AST file

// FIXME: In this test case, we cannot use the on-demand-parsing approach to
// load the external TU.
//
// In the Darwin system, the target triple is determined by the driver,
// rather than using the default one like other systems. However, when
// using bare `clang -cc1`, the adjustment is not done, which cannot
// match the one loaded with on-demand-parsing (adjusted triple).
// We bypass this problem by loading AST files, whose target triple is
// also unadjusted when generated via `clang -cc1 -emit-pch`.
//
// Refer to: https://discourse.llvm.org/t/60762
//
// This is also the reason why the test case of D75665 (introducing
// the on-demand-parsing feature) is enabled only on Linux.

void importee();

void trigger() {
// Call an external function to trigger the parsing process of CTU index.
// Refer to file Inputs/ctu-lookup-name-with-space.cpp for more details.

importee(); // expected-no-diagnostics
}
26 changes: 19 additions & 7 deletions clang/test/Analysis/func-mapping-test.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3,10 +3,10 @@
int f(int) {
return 0;
}
// CHECK-DAG: c:@F@f#I#
// CHECK-DAG: 9:c:@F@f#I#

extern const int x = 5;
// CHECK-DAG: c:@x
// CHECK-DAG: 4:c:@x

// Non-const variables should not be collected.
int y = 5;
Expand All @@ -18,33 +18,45 @@ struct S {
int a;
};
extern S const s = {.a = 2};
// CHECK-DAG: c:@s
// CHECK-DAG: 4:c:@s

struct SF {
const int a;
};
SF sf = {.a = 2};
// CHECK-DAG: c:@sf
// CHECK-DAG: 5:c:@sf

struct SStatic {
static const int a = 4;
};
const int SStatic::a;
// CHECK-DAG: c:@S@SStatic@a
// CHECK-DAG: 14:c:@S@SStatic@a

extern int const arr[5] = { 0, 1 };
// CHECK-DAG: c:@arr
// CHECK-DAG: 6:c:@arr

union U {
const int a;
const unsigned int b;
};
U u = {.a = 6};
// CHECK-DAG: c:@u
// CHECK-DAG: 4:c:@u

// No USR can be generated for this.
// Check for no crash in this case.
static union {
float uf;
const int ui;
};

void f(int (*)(char));
void f(bool (*)(char));

struct G {
G() {
f([](char) -> int { return 42; });
// CHECK-DAG: 41:c:@S@G@F@G#@Sa@F@operator int (*)(char)#1
f([](char) -> bool { return true; });
// CHECK-DAG: 42:c:@S@G@F@G#@Sa@F@operator bool (*)(char)#1
}
};
2 changes: 1 addition & 1 deletion clang/unittests/CrossTU/CrossTranslationUnitTest.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -61,7 +61,7 @@ class CTUASTConsumer : public clang::ASTConsumer {
ASSERT_FALSE(llvm::sys::fs::createTemporaryFile("index", "txt", IndexFD,
IndexFileName));
llvm::ToolOutputFile IndexFile(IndexFileName, IndexFD);
IndexFile.os() << "c:@F@f#I# " << ASTFileName << "\n";
IndexFile.os() << "9:c:@F@f#I# " << ASTFileName << "\n";
IndexFile.os().flush();
EXPECT_TRUE(llvm::sys::fs::exists(IndexFileName));

Expand Down

0 comments on commit 9f90254

Please sign in to comment.