-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[analyzer] New optin.taint.TaintedAlloc checker for catching unbounded memory allocation calls #92420
Conversation
@llvm/pr-subscribers-clang @llvm/pr-subscribers-clang-static-analyzer-1 Author: Daniel Krupp (dkrupp) Changesunix.Malloc checker will warn if a memory allocation function (malloc, calloc, realloc, alloca) is called with a tainted (attacker controlled) size parameter. The warning will only be emitted, if the analyzer cannot prove that the size is below reasonable bounds (<SIZE_MAX/4). There were no new reports with this extension on the following open source projects.
Full diff: https://github.com/llvm/llvm-project/pull/92420.diff 3 Files Affected:
diff --git a/clang/docs/analyzer/checkers.rst b/clang/docs/analyzer/checkers.rst
index eb8b58323da4d..0cdf6dcab2875 100644
--- a/clang/docs/analyzer/checkers.rst
+++ b/clang/docs/analyzer/checkers.rst
@@ -1273,6 +1273,41 @@ Check for memory leaks, double free, and use-after-free problems. Traces memory
.. literalinclude:: checkers/unix_malloc_example.c
:language: c
+If the ``alpha.security.taint.TaintPropagation`` checker is enabled, the checker
+warns for cases when the ``size`` parameter of the ``malloc`` , ``calloc``,
+``realloc``, ``alloca`` is tainted (potentially attacker controlled). If an
+attacker can inject a large value as the size parameter, memory exhaustion
+denial of service attack can be carried out.
+
+The analyzer emits warning only if it cannot prove that the size parameter is
+within reasonable bounds (``<= SIZE_MAX/4``). This functionality partially
+covers the SEI Cert coding standard rule `INT04-C
+<https://wiki.sei.cmu.edu/confluence/display/c/INT04-C.+Enforce+limits+on+integer+values+originating+from+tainted+sources>`_.
+
+You can silence this warning either by bound checking the ``size`` parameter, or
+by explicitly marking the ``size`` parameter as sanitized. See the
+:ref:`alpha-security-taint-TaintPropagation` checker for more details.
+
+.. code-block:: c
+
+ void t1(void) {
+ size_t size;
+ scanf("%zu", &size);
+ int *p = malloc(size); // warn: malloc is called with a tainted (potentially attacker controlled) value
+ free(p);
+ }
+
+ void t3(void) {
+ size_t size;
+ scanf("%zu", &size);
+ if (1024<size)
+ return;
+ int *p = malloc(size); // No warning expected as the the user input is bound
+ free(p);
+ }
+
+.. _unix-MismatchedDeallocator:
+
.. _unix-MallocSizeof:
unix.MallocSizeof (C)
diff --git a/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
index dd204b62dcc04..2cc9205c07814 100644
--- a/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
@@ -60,6 +60,7 @@
#include "clang/Basic/TargetInfo.h"
#include "clang/Lex/Lexer.h"
#include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h"
+#include "clang/StaticAnalyzer/Checkers/Taint.h"
#include "clang/StaticAnalyzer/Core/BugReporter/BugType.h"
#include "clang/StaticAnalyzer/Core/BugReporter/CommonBugCategories.h"
#include "clang/StaticAnalyzer/Core/Checker.h"
@@ -365,6 +366,7 @@ class MallocChecker
mutable std::unique_ptr<BugType> BT_MismatchedDealloc;
mutable std::unique_ptr<BugType> BT_OffsetFree[CK_NumCheckKinds];
mutable std::unique_ptr<BugType> BT_UseZerroAllocated[CK_NumCheckKinds];
+ mutable std::unique_ptr<BugType> BT_TaintedAlloc[CK_NumCheckKinds];
#define CHECK_FN(NAME) \
void NAME(const CallEvent &Call, CheckerContext &C) const;
@@ -462,6 +464,13 @@ class MallocChecker
};
bool isMemCall(const CallEvent &Call) const;
+ void reportTaintBug(StringRef Msg, ProgramStateRef State, CheckerContext &C,
+ llvm::ArrayRef<SymbolRef> TaintedSyms,
+ AllocationFamily Family, const Expr *SizeEx) const;
+
+ void CheckTaintedness(CheckerContext &C, const CallEvent &Call,
+ const SVal SizeSVal, ProgramStateRef State,
+ AllocationFamily Family) const;
// TODO: Remove mutable by moving the initializtaion to the registry function.
mutable std::optional<uint64_t> KernelZeroFlagVal;
@@ -521,9 +530,9 @@ class MallocChecker
/// malloc leaves it undefined.
/// \param [in] State The \c ProgramState right before allocation.
/// \returns The ProgramState right after allocation.
- [[nodiscard]] static ProgramStateRef
+ [[nodiscard]] ProgramStateRef
MallocMemAux(CheckerContext &C, const CallEvent &Call, const Expr *SizeEx,
- SVal Init, ProgramStateRef State, AllocationFamily Family);
+ SVal Init, ProgramStateRef State, AllocationFamily Family) const;
/// Models memory allocation.
///
@@ -534,9 +543,10 @@ class MallocChecker
/// malloc leaves it undefined.
/// \param [in] State The \c ProgramState right before allocation.
/// \returns The ProgramState right after allocation.
- [[nodiscard]] static ProgramStateRef
- MallocMemAux(CheckerContext &C, const CallEvent &Call, SVal Size, SVal Init,
- ProgramStateRef State, AllocationFamily Family);
+ [[nodiscard]] ProgramStateRef MallocMemAux(CheckerContext &C,
+ const CallEvent &Call, SVal Size,
+ SVal Init, ProgramStateRef State,
+ AllocationFamily Family) const;
// Check if this malloc() for special flags. At present that means M_ZERO or
// __GFP_ZERO (in which case, treat it like calloc).
@@ -649,8 +659,9 @@ class MallocChecker
/// \param [in] Call The expression that reallocated memory
/// \param [in] State The \c ProgramState right before reallocation.
/// \returns The ProgramState right after allocation.
- [[nodiscard]] static ProgramStateRef
- CallocMem(CheckerContext &C, const CallEvent &Call, ProgramStateRef State);
+ [[nodiscard]] ProgramStateRef CallocMem(CheckerContext &C,
+ const CallEvent &Call,
+ ProgramStateRef State) const;
/// See if deallocation happens in a suspicious context. If so, escape the
/// pointers that otherwise would have been deallocated and return true.
@@ -1779,7 +1790,7 @@ ProgramStateRef MallocChecker::MallocMemAux(CheckerContext &C,
const CallEvent &Call,
const Expr *SizeEx, SVal Init,
ProgramStateRef State,
- AllocationFamily Family) {
+ AllocationFamily Family) const {
if (!State)
return nullptr;
@@ -1787,10 +1798,71 @@ ProgramStateRef MallocChecker::MallocMemAux(CheckerContext &C,
return MallocMemAux(C, Call, C.getSVal(SizeEx), Init, State, Family);
}
+void MallocChecker::reportTaintBug(StringRef Msg, ProgramStateRef State,
+ CheckerContext &C,
+ llvm::ArrayRef<SymbolRef> TaintedSyms,
+ AllocationFamily Family,
+ const Expr *SizeEx) const {
+ if (ExplodedNode *N = C.generateErrorNode(State)) {
+
+ std::optional<MallocChecker::CheckKind> CheckKind =
+ getCheckIfTracked(Family);
+ if (!CheckKind)
+ return;
+ if (!BT_TaintedAlloc[*CheckKind])
+ BT_TaintedAlloc[*CheckKind].reset(new BugType(CheckNames[*CheckKind],
+ "Tainted Memory Allocation",
+ categories::MemoryError));
+ auto R = std::make_unique<PathSensitiveBugReport>(
+ *BT_TaintedAlloc[*CheckKind], Msg, N);
+
+ bugreporter::trackExpressionValue(N, SizeEx, *R);
+ for (auto Sym : TaintedSyms)
+ R->markInteresting(Sym);
+ C.emitReport(std::move(R));
+ }
+}
+
+void MallocChecker::CheckTaintedness(CheckerContext &C, const CallEvent &Call,
+ const SVal SizeSVal, ProgramStateRef State,
+ AllocationFamily Family) const {
+ std::vector<SymbolRef> TaintedSyms =
+ clang::ento::taint::getTaintedSymbols(State, SizeSVal);
+ if (!TaintedSyms.empty()) {
+ SValBuilder &SVB = C.getSValBuilder();
+ QualType SizeTy = SVB.getContext().getSizeType();
+ QualType CmpTy = SVB.getConditionType();
+ // In case the symbol is tainted, we give a warning if the
+ // size is larger than SIZE_MAX/4
+ BasicValueFactory &BVF = SVB.getBasicValueFactory();
+ const llvm::APSInt MaxValInt = BVF.getMaxValue(SizeTy);
+ NonLoc MaxLength =
+ SVB.makeIntVal(MaxValInt / APSIntType(MaxValInt).getValue(4));
+ std::optional<NonLoc> SizeNL = SizeSVal.getAs<NonLoc>();
+ auto Cmp = SVB.evalBinOpNN(State, BO_GE, *SizeNL, MaxLength, CmpTy)
+ .getAs<DefinedOrUnknownSVal>();
+ if (!Cmp)
+ return;
+ auto [StateTooLarge, StateNotTooLarge] = State->assume(*Cmp);
+ if (!StateTooLarge && StateNotTooLarge) {
+ // we can prove that size is not too large so ok.
+ return;
+ }
+
+ std::string Callee = "Memory allocation function";
+ if (Call.getCalleeIdentifier())
+ Callee = Call.getCalleeIdentifier()->getName().str();
+ reportTaintBug(
+ Callee + " is called with a tainted (potentially attacker controlled) "
+ "value. Make sure the value is bound checked.",
+ State, C, TaintedSyms, Family, Call.getArgExpr(0));
+ }
+}
+
ProgramStateRef MallocChecker::MallocMemAux(CheckerContext &C,
const CallEvent &Call, SVal Size,
SVal Init, ProgramStateRef State,
- AllocationFamily Family) {
+ AllocationFamily Family) const {
if (!State)
return nullptr;
@@ -1819,9 +1891,7 @@ ProgramStateRef MallocChecker::MallocMemAux(CheckerContext &C,
if (Size.isUndef())
Size = UnknownVal();
- // TODO: If Size is tainted and we cannot prove that it is within
- // reasonable bounds, emit a warning that an attacker may
- // provoke a memory exhaustion error.
+ CheckTaintedness(C, Call, Size, State, AF_Malloc);
// Set the region's extent.
State = setDynamicExtent(State, RetVal.getAsRegion(),
@@ -2761,7 +2831,7 @@ MallocChecker::ReallocMemAux(CheckerContext &C, const CallEvent &Call,
ProgramStateRef MallocChecker::CallocMem(CheckerContext &C,
const CallEvent &Call,
- ProgramStateRef State) {
+ ProgramStateRef State) const {
if (!State)
return nullptr;
diff --git a/clang/test/Analysis/malloc.c b/clang/test/Analysis/malloc.c
index e5cb45ba73352..6dba76a57d83f 100644
--- a/clang/test/Analysis/malloc.c
+++ b/clang/test/Analysis/malloc.c
@@ -3,7 +3,8 @@
// RUN: -analyzer-checker=alpha.deadcode.UnreachableCode \
// RUN: -analyzer-checker=alpha.core.CastSize \
// RUN: -analyzer-checker=unix \
-// RUN: -analyzer-checker=debug.ExprInspection
+// RUN: -analyzer-checker=debug.ExprInspection \
+// RUN: -analyzer-checker=alpha.security.taint.TaintPropagation
#include "Inputs/system-header-simulator.h"
@@ -48,6 +49,45 @@ void myfoo(int *p);
void myfooint(int p);
char *fooRetPtr(void);
+void t1(void) {
+ size_t size;
+ scanf("%zu", &size);
+ int *p = malloc(size); // expected-warning{{malloc is called with a tainted (potentially attacker controlled) value}}
+ free(p);
+}
+
+void t2(void) {
+ size_t size;
+ scanf("%zu", &size);
+ int *p = calloc(size,2); // expected-warning{{calloc is called with a tainted (potentially attacker controlled) value}}
+ free(p);
+}
+
+void t3(void) {
+ size_t size;
+ scanf("%zu", &size);
+ if (1024<size)
+ return;
+ int *p = malloc(size); // No warning expected as the the user input is bound
+ free(p);
+}
+
+void t4(void) {
+ size_t size;
+ int *p = malloc(sizeof(int));
+ scanf("%zu", &size);
+ p = (int*) realloc((void*) p, size); // // expected-warning{{realloc is called with a tainted (potentially attacker controlled) value}}
+ free(p);
+}
+
+void t5(void) {
+ size_t size;
+ int *p = alloca(sizeof(int));
+ scanf("%zu", &size);
+ p = (int*) alloca(size); // // expected-warning{{alloca is called with a tainted (potentially attacker controlled) value}}
+}
+
+
void f1(void) {
int *p = malloc(12);
return; // expected-warning{{Potential leak of memory pointed to by 'p'}}
|
I think there should be a way to enable/disable this check separately because memory exhaustion / denial of service isn't necessarily something you care about when you enable taint analysis. It's essential for web servers when the attacker is interested in interrupting their operation but not necessarily for personal devices where the attacker is interested only in gaining control of the device. For these applications it's more important to catch cases where malloc size and index used for access are coming from "different sources", eg. one is tainted and another isn't, doesn't matter which one. For the same reason, I think the error node needs to be non-fatal. If you make it fatal, you lose the opportunity to catch this other type of bugs on the same path, which are 50% likely to be found on the same path, and are arguably much more severe. Of course when your new check is disabled there won't be a sink so we'll still catch the other bug. But we don't actually want it to work this way; we'd rather have the user control the warnings they want to see with simple on-off flags without weird interactions between those flags. So sink generation should be mostly a thing that "modeling" checkers do, when they're confident that we simply cannot continue analysis further. In this case I don't really see any problems with continuing the analysis so it should probably be a non-fatal error. |
Good point, I completely agree.
I agree that we should enable the "pessimistic" handling (= report an error if we can't prove that the index is in bounds) for cases when the size is tainted, and I'm planning to create a commit that does this in I would not emphasize "different sources" in this context: a tainted index deserves careful handling even if the offset is also tainted. I see that this could produce false positives in situations like
where (AFAIK) the analyzer cannot conclude that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The patch makes sense to me.
I'll not repeat the existing comments, they raise relevant concerns.
It would be nice to extend some test case with a tainted malloc to see how those note tags play out from the generic taint checker in this context. For this, I'd suggest you to have a look at some taint tests where we enable the text
diagnostic output.
✅ With the latest revision this PR passed the C/C++ code formatter. |
Thanks for the reviews. I updated the patch.
-Minor changes addressed as requested. Could you please check again? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The patch makes sense to me.
Have you considered applying the same heuristic to C++ array new allocations?
I'll port this patch downstream to see how this would behave on the Juliet C++ benchmark or on some real-world code.
Ah nvm. llvm/main diverged quite a bit since 18.1.6. I can't just pick this one. Given this, I won't backport and test this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, I noticed that I had some minor review commits that were left in a "Pending" state for at least a week. As the commit is not yet merged, I'm publishing them now.
From the comment of @steakhal
Actually, I'm almost certain that this patch already affects the C++ array new allocations, because @dkrupp Please add a C++ test file with a few testcases which show the behavior of your commit when an unconstrained tainted size is passed to |
I will check C++
I will consider adding the heuristic for C++ array new allocations in a follow-up patch. |
@NagyDonat , @steakhal I fixed the additional remarks. |
I see now, that there is still one unaddressed remark from @NagyDonat regarding a new testcase for array new allocations. I will be adding it tomorrow... |
unix.Malloc checker will warn if a memory allocation function (malloc, calloc, realloc, alloca) is called with a tainted (attacker controlled) size parameter. A large, maliciously set size value can trigger memory exhaustion. To get this warning, the alpha.security.taint.TaintPropagation checker also needs to be switched on. The warning will only be emitted, if the analyzer cannot prove that the size is below reasonable bounds (<SIZE_MAX/4).
- Create a new optional checker optin.taint.TaintMalloc - Add test case for testing taint diagnostic notes
- Handling of C++ operator new[] allocation was added to the checker with test cases - The checker is renamed to optin.taint.TaintAlloc, as besides malloc it handles the c++ new array allocations too - Test cases and documentation was updated
@NagyDonat , @steakhal please check if any more update is needed. thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say that the commit is acceptable as it is now, but I added several inline comments for minor prettification issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for the update!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I join Donát, and I agree that this looks good as it is.
I had a handful of final remarks but I have no strong opinion on any of the raised points.
Merge this, once you considered them and took action if you agreed.
In the latest commit I fixed all remaining review comments. GenericTaintchecker should be a dependency as mentioned in the FIXME, but it cannot be one until the checker is not a modeling checker. This separation will be done in a later follow-up patch. Until then, the documentation indicates the that alpha.security.taint.TaintPropagation checker should be switched on for this checker to work. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change LGTM. I agree with @steakhal that TaintedAlloc
would be a slightly better name, but the current one is also acceptable.
Now the checker is renamed to optin.taint.TaintedAlloc as requested by the reviewers. |
A new optional checker (optin.taint.TaintedAlloc) will warn if a memory allocation function (malloc, calloc, realloc, alloca, operator new[]) is called with a tainted (attacker controlled) size parameter.
A large, maliciously set size value can trigger memory exhaustion. To get this warning, the alpha.security.taint.TaintPropagation checker also needs to be switched on.
The warning will only be emitted, if the analyzer cannot prove that the size is below reasonable bounds (<SIZE_MAX/4).
There were no new reports with this extension on the following open source projects.