Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[lldb][DataFormatter] Simplify std::map formatter #97579

Merged

Conversation

Michael137
Copy link
Member

@Michael137 Michael137 commented Jul 3, 2024

Depends on:

This patch tries to simplify the way in which the std::map formatter goes from the root __tree pointer to a specific key/value pair.

Previously we would:

  1. synthesize a structure that mimicked what __iter_pointer looked like in memory
  2. call GetChildCompilerTypeAtIndex on it to find the byte offset at which the pair was located in the synthesized structure
  3. finally, use that offset through a call to GetSyntheticChildAtOffset to retrieve the key/value pair

Not only was this logic hard to follow, and encoded the libc++ layout in non-obvious ways, it was also fragile to alignment miscalculations (#97443); this would break once the new layout of std::map landed as part of #93069.

Instead, this patch simply casts the __iter_pointer to the __node_pointer and uses a straightforward GetChildMemberWithName("__value_") to get to the key/value we care about. This allows us to get rid of some support infrastructure/class state.

Ideally we would fix the underlying alignment issue, but this unblocks the libc++ refactor in the interim, while also benefitting the formatter in terms of readability (in my opinion).

@llvmbot
Copy link
Member

llvmbot commented Jul 3, 2024

@llvm/pr-subscribers-lldb

Author: Michael Buch (Michael137)

Changes

Depends on:

This patch tries to simplify the way in which the
std::map formatter goes from the root __tree
pointer to a specific key/value pair.

Previously we would synthesize a structure that
mimicked what __iter_pointer looked like
in memory, then called GetChildCompilerTypeAtIndex
on it to find the byte offset into that structure
at which the pair was located at, and finally
use that offset through a call to GetSyntheticChildAtOffset
to retrieve that pair. Not only was this logic hard to follow,
and encoded the libc++ layout in non-obvious ways, it was also
fragile to alignment miscalculations
(#97443); this would
break after the new layout of std::map landed as part of
inhttps://github.com/#93069.

Instead, this patch simply casts the __iter_pointer to
the __node_pointer and uses a straightforward
GetChildMemberWithName("__value_") to get to the key/value
we care about. This allows us to get rid of some support
infrastructure/class state.

Ideally we would fix the underlying alignment issue, but
this unblocks the libc++ refactor in the interim,
while also benefitting the formatter in terms of readability (in my opinion).


Full diff: https://github.com/llvm/llvm-project/pull/97579.diff

1 Files Affected:

  • (modified) lldb/source/Plugins/Language/CPlusPlus/LibCxxMap.cpp (+93-140)
diff --git a/lldb/source/Plugins/Language/CPlusPlus/LibCxxMap.cpp b/lldb/source/Plugins/Language/CPlusPlus/LibCxxMap.cpp
index 44fe294ced722..120f3ac131b34 100644
--- a/lldb/source/Plugins/Language/CPlusPlus/LibCxxMap.cpp
+++ b/lldb/source/Plugins/Language/CPlusPlus/LibCxxMap.cpp
@@ -17,11 +17,32 @@
 #include "lldb/Utility/Endian.h"
 #include "lldb/Utility/Status.h"
 #include "lldb/Utility/Stream.h"
+#include "lldb/lldb-forward.h"
+#include <cstdint>
+#include <locale>
+#include <optional>
 
 using namespace lldb;
 using namespace lldb_private;
 using namespace lldb_private::formatters;
 
+// The flattened layout of the std::__tree_iterator::__ptr_ looks
+// as follows:
+//
+// The following shows the contiguous block of memory:
+//
+//        +-----------------------------+ class __tree_end_node
+// __ptr_ | pointer __left_;            |
+//        +-----------------------------+ class __tree_node_base
+//        | pointer __right_;           |
+//        | __parent_pointer __parent_; |
+//        | bool __is_black_;           |
+//        +-----------------------------+ class __tree_node
+//        | __node_value_type __value_; | <<< our key/value pair
+//        +-----------------------------+
+//
+// where __ptr_ has type __iter_pointer.
+
 class MapEntry {
 public:
   MapEntry() = default;
@@ -180,14 +201,25 @@ class LibcxxStdMapSyntheticFrontEnd : public SyntheticChildrenFrontEnd {
   size_t GetIndexOfChildWithName(ConstString name) override;
 
 private:
-  bool GetDataType();
-
-  void GetValueOffset(const lldb::ValueObjectSP &node);
+  /// Returns the ValueObject for the __tree_node type that
+  /// holds the key/value pair of the node at index \ref idx.
+  ///
+  /// \param[in] idx The child index that we're looking to get
+  ///                the key/value pair for.
+  ///
+  /// \param[in] max_depth The maximum search depth after which
+  ///                      we stop trying to find the key/value
+  ///                      pair for.
+  ///
+  /// \returns On success, returns the ValueObjectSP corresponding
+  ///          to the __tree_node's __value_ member (which holds
+  ///          the key/value pair the formatter wants to display).
+  ///          On failure, will return nullptr.
+  ValueObjectSP GetKeyValuePair(size_t idx, size_t max_depth);
 
   ValueObject *m_tree = nullptr;
   ValueObject *m_root_node = nullptr;
-  CompilerType m_element_type;
-  uint32_t m_skip_size = UINT32_MAX;
+  CompilerType m_node_ptr_type;
   size_t m_count = UINT32_MAX;
   std::map<size_t, MapIterator> m_iterators;
 };
@@ -196,7 +228,7 @@ class LibcxxStdMapSyntheticFrontEnd : public SyntheticChildrenFrontEnd {
 
 lldb_private::formatters::LibcxxStdMapSyntheticFrontEnd::
     LibcxxStdMapSyntheticFrontEnd(lldb::ValueObjectSP valobj_sp)
-    : SyntheticChildrenFrontEnd(*valobj_sp), m_element_type(), m_iterators() {
+    : SyntheticChildrenFrontEnd(*valobj_sp) {
   if (valobj_sp)
     Update();
 }
@@ -222,81 +254,52 @@ llvm::Expected<uint32_t> lldb_private::formatters::
   return m_count;
 }
 
-bool lldb_private::formatters::LibcxxStdMapSyntheticFrontEnd::GetDataType() {
-  if (m_element_type.IsValid())
-    return true;
-  m_element_type.Clear();
-  ValueObjectSP deref;
-  Status error;
-  deref = m_root_node->Dereference(error);
-  if (!deref || error.Fail())
-    return false;
-  deref = deref->GetChildMemberWithName("__value_");
-  if (deref) {
-    m_element_type = deref->GetCompilerType();
-    return true;
-  }
-  deref = m_backend.GetChildAtNamePath({"__tree_", "__pair3_"});
-  if (!deref)
-    return false;
-  m_element_type = deref->GetCompilerType()
-                       .GetTypeTemplateArgument(1)
-                       .GetTypeTemplateArgument(1);
-  if (m_element_type) {
-    std::string name;
-    uint64_t bit_offset_ptr;
-    uint32_t bitfield_bit_size_ptr;
-    bool is_bitfield_ptr;
-    m_element_type = m_element_type.GetFieldAtIndex(
-        0, name, &bit_offset_ptr, &bitfield_bit_size_ptr, &is_bitfield_ptr);
-    m_element_type = m_element_type.GetTypedefedType();
-    return m_element_type.IsValid();
-  } else {
-    m_element_type = m_backend.GetCompilerType().GetTypeTemplateArgument(0);
-    return m_element_type.IsValid();
-  }
-}
+ValueObjectSP
+lldb_private::formatters::LibcxxStdMapSyntheticFrontEnd::GetKeyValuePair(
+    size_t idx, size_t max_depth) {
+  MapIterator iterator(m_root_node, max_depth);
 
-void lldb_private::formatters::LibcxxStdMapSyntheticFrontEnd::GetValueOffset(
-    const lldb::ValueObjectSP &node) {
-  if (m_skip_size != UINT32_MAX)
-    return;
-  if (!node)
-    return;
-  CompilerType node_type(node->GetCompilerType());
-  uint64_t bit_offset;
-  if (node_type.GetIndexOfFieldWithName("__value_", nullptr, &bit_offset) !=
-      UINT32_MAX) {
-    // Old layout (pre 089a7cc5dea)
-    m_skip_size = bit_offset / 8u;
-  } else {
-    auto ast_ctx = node_type.GetTypeSystem().dyn_cast_or_null<TypeSystemClang>();
-    if (!ast_ctx)
-      return;
-    CompilerType tree_node_type = ast_ctx->CreateStructForIdentifier(
-        llvm::StringRef(),
-        {{"ptr0", ast_ctx->GetBasicType(lldb::eBasicTypeVoid).GetPointerType()},
-         {"ptr1", ast_ctx->GetBasicType(lldb::eBasicTypeVoid).GetPointerType()},
-         {"ptr2", ast_ctx->GetBasicType(lldb::eBasicTypeVoid).GetPointerType()},
-         {"cw", ast_ctx->GetBasicType(lldb::eBasicTypeBool)},
-         {"payload", (m_element_type.GetCompleteType(), m_element_type)}});
-    std::string child_name;
-    uint32_t child_byte_size;
-    int32_t child_byte_offset = 0;
-    uint32_t child_bitfield_bit_size;
-    uint32_t child_bitfield_bit_offset;
-    bool child_is_base_class;
-    bool child_is_deref_of_parent;
-    uint64_t language_flags;
-    auto child_type =
-        llvm::expectedToStdOptional(tree_node_type.GetChildCompilerTypeAtIndex(
-            nullptr, 4, true, true, true, child_name, child_byte_size,
-            child_byte_offset, child_bitfield_bit_size,
-            child_bitfield_bit_offset, child_is_base_class,
-            child_is_deref_of_parent, nullptr, language_flags));
-    if (child_type && child_type->IsValid())
-      m_skip_size = (uint32_t)child_byte_offset;
+  size_t advance_by = idx;
+  if (idx > 0) {
+    // If we have already created the iterator for the previous
+    // index, we can start from there and advance by 1.
+    auto cached_iterator = m_iterators.find(idx - 1);
+    if (cached_iterator != m_iterators.end()) {
+      iterator = cached_iterator->second;
+      advance_by = 1;
+    }
   }
+
+  ValueObjectSP iterated_sp(iterator.advance(advance_by));
+  if (!iterated_sp)
+    // this tree is garbage - stop
+    return nullptr;
+
+  if (!m_node_ptr_type.IsValid())
+    return nullptr;
+
+  // iterated_sp is a __iter_pointer at this point.
+  // We can cast it to a __node_pointer (which is what libc++ does).
+  auto value_type_sp = iterated_sp->Cast(m_node_ptr_type);
+  if (!value_type_sp)
+    return nullptr;
+
+  // After dereferencing the __node_pointer, we will have a handle to
+  // a std::__1::__tree_node<void *>, which has the __value_ member
+  // we are looking for.
+  Status s;
+  value_type_sp = value_type_sp->Dereference(s);
+  if (!value_type_sp || s.Fail())
+    return nullptr;
+
+  // Finally, get the key/value pair.
+  value_type_sp = value_type_sp->GetChildMemberWithName("__value_");
+  if (!value_type_sp)
+    return nullptr;
+
+  m_iterators[idx] = iterator;
+
+  return value_type_sp;
 }
 
 lldb::ValueObjectSP
@@ -306,68 +309,16 @@ lldb_private::formatters::LibcxxStdMapSyntheticFrontEnd::GetChildAtIndex(
   static ConstString g_nc("__nc");
   uint32_t num_children = CalculateNumChildrenIgnoringErrors();
   if (idx >= num_children)
-    return lldb::ValueObjectSP();
-  if (m_tree == nullptr || m_root_node == nullptr)
-    return lldb::ValueObjectSP();
+    return nullptr;
 
-  MapIterator iterator(m_root_node, num_children);
-
-  const bool need_to_skip = (idx > 0);
-  size_t actual_advancde = idx;
-  if (need_to_skip) {
-    auto cached_iterator = m_iterators.find(idx - 1);
-    if (cached_iterator != m_iterators.end()) {
-      iterator = cached_iterator->second;
-      actual_advancde = 1;
-    }
-  }
-
-  ValueObjectSP iterated_sp(iterator.advance(actual_advancde));
-  if (!iterated_sp) {
-    // this tree is garbage - stop
-    m_tree =
-        nullptr; // this will stop all future searches until an Update() happens
-    return iterated_sp;
-  }
+  if (m_tree == nullptr || m_root_node == nullptr)
+    return nullptr;
 
-  if (!GetDataType()) {
+  ValueObjectSP key_val_sp = GetKeyValuePair(idx, /*max_depth=*/num_children);
+  if (!key_val_sp) {
+    // this will stop all future searches until an Update() happens
     m_tree = nullptr;
-    return lldb::ValueObjectSP();
-  }
-
-  if (!need_to_skip) {
-    Status error;
-    iterated_sp = iterated_sp->Dereference(error);
-    if (!iterated_sp || error.Fail()) {
-      m_tree = nullptr;
-      return lldb::ValueObjectSP();
-    }
-    GetValueOffset(iterated_sp);
-    auto child_sp = iterated_sp->GetChildMemberWithName("__value_");
-    if (child_sp)
-      iterated_sp = child_sp;
-    else
-      iterated_sp = iterated_sp->GetSyntheticChildAtOffset(
-          m_skip_size, m_element_type, true);
-    if (!iterated_sp) {
-      m_tree = nullptr;
-      return lldb::ValueObjectSP();
-    }
-  } else {
-    // because of the way our debug info is made, we need to read item 0
-    // first so that we can cache information used to generate other elements
-    if (m_skip_size == UINT32_MAX)
-      GetChildAtIndex(0);
-    if (m_skip_size == UINT32_MAX) {
-      m_tree = nullptr;
-      return lldb::ValueObjectSP();
-    }
-    iterated_sp = iterated_sp->GetSyntheticChildAtOffset(m_skip_size,
-                                                         m_element_type, true);
-    if (!iterated_sp) {
-      m_tree = nullptr;
-      return lldb::ValueObjectSP();
-    }
+    return nullptr;
   }
 
   // at this point we have a valid
@@ -375,7 +326,7 @@ lldb_private::formatters::LibcxxStdMapSyntheticFrontEnd::GetChildAtIndex(
   // all items named __value_
   StreamString name;
   name.Printf("[%" PRIu64 "]", (uint64_t)idx);
-  auto potential_child_sp = iterated_sp->Clone(ConstString(name.GetString()));
+  auto potential_child_sp = key_val_sp->Clone(ConstString(name.GetString()));
   if (potential_child_sp) {
     switch (potential_child_sp->GetNumChildrenIgnoringErrors()) {
     case 1: {
@@ -396,7 +347,6 @@ lldb_private::formatters::LibcxxStdMapSyntheticFrontEnd::GetChildAtIndex(
     }
     }
   }
-  m_iterators[idx] = iterator;
   return potential_child_sp;
 }
 
@@ -409,6 +359,9 @@ lldb_private::formatters::LibcxxStdMapSyntheticFrontEnd::Update() {
   if (!m_tree)
     return lldb::ChildCacheState::eRefetch;
   m_root_node = m_tree->GetChildMemberWithName("__begin_node_").get();
+  m_node_ptr_type =
+      m_tree->GetCompilerType().GetDirectNestedTypeWithName("__node_pointer");
+
   return lldb::ChildCacheState::eRefetch;
 }
 

@Michael137
Copy link
Member Author

Michael137 commented Jul 3, 2024

This is currently a bit hard to review because it's based off of other branches that are in review. So feel free to hold off on reviewing until the dependencies landed (though the meat of this patch is in the last commit)

Michael137 added a commit to Michael137/llvm-project that referenced this pull request Jul 4, 2024
…xxMap.cpp

The two formatters follow very similar techniques to retrieve data out of the map. We're changing this for `std::map` in llvm#97579 and plan to change it in the same way for the iterator formatter. Having them in the same place will allow us to re-use some of the logic (and we won't have to repeat some of the clarification comments).
Michael137 added a commit that referenced this pull request Jul 4, 2024
…xxMap.cpp (#97687)

The two formatters follow very similar techniques to retrieve data out
of the map. We're changing this for `std::map` in
#97579 and plan to change it in
the same way for the iterator formatter. Having them in the same place
will allow us to re-use some of the logic (and we won't have to repeat
some of the clarification comments).
kbluck pushed a commit to kbluck/llvm-project that referenced this pull request Jul 6, 2024
…xxMap.cpp (llvm#97687)

The two formatters follow very similar techniques to retrieve data out
of the map. We're changing this for `std::map` in
llvm#97579 and plan to change it in
the same way for the iterator formatter. Having them in the same place
will allow us to re-use some of the logic (and we won't have to repeat
some of the clarification comments).
@Michael137 Michael137 force-pushed the lldb/map-formatter-alternative-impl branch from 6e76c18 to 1107b88 Compare July 8, 2024 09:54
Depends on:
* llvm#97544
* llvm#97549
* llvm#97551

This patch tries to simplify the way in which the
`std::map` formatter goes from the root `__tree`
pointer to a specific key/value pair.

Previously we would synthesize a structure that
mimicked what `__iter_pointer` looked like
in memory, then called `GetChildCompilerTypeAtIndex`
on it to find the byte offset into that structure
at which the pair was located at, and finally
use that offset through a call to `GetSyntheticChildAtOffset`
to retrieve that pair. Not only was this logic hard to follow,
and encoded the libc++ layout in non-obvious ways, it was also
fragile to alignment miscalculations
(llvm#97443); this would
break after the new layout of std::map landed as part of
inhttps://github.com/llvm/issues/93069.

Instead, this patch simply casts the `__iter_pointer` to
the `__node_pointer` and uses a straightforward
`GetChildMemberWithName("__value_")` to get to the key/value
we care about. This allows us to get rid of some support
infrastructure/class state.

Ideally we would fix the underlying alignment issue, but
this unblocks the libc++ refactor in the interim,
while also benefitting the formatter in terms of readability (in my opinion).
@Michael137 Michael137 force-pushed the lldb/map-formatter-alternative-impl branch from 1107b88 to 223606f Compare July 8, 2024 10:08
Michael137 added a commit that referenced this pull request Jul 8, 2024
…o LibCxxUnorderedMap.cpp (#97752)

Similar to how we moved the `std::map::iterator` formatter in
#97687, do the same for
`std::unordered_map::iterator`.

Again the `unordered_map` and `unordered_map::iterator` formatters try
to do very similar things: retrieve data out of the map. The iterator
formatter does this in a fragile way (similar to how `std::map` does it,
see #97579). Thus we will be
refactoring the `std::unordered_map::iterator` in upcoming patches.
Having it in `LibCxxUnorderedMap` will allow us to re-use some of the
logic (and we won't have to repeat some of the clarification comments).
Michael137 added a commit that referenced this pull request Jul 8, 2024
…97713)

Depends on #97687

Similar to #97579, this patch
simplifies the way in which we retrieve the key/value pair of a
`std::map` (in this case of the `std::map::iterator`).

We do this for the same reason: not only was the old logic hard to
follow, and encoded the libc++ layout in non-obvious ways, it was also
fragile to alignment miscalculations
(#97443); this would break once
the new layout of std::map landed as part of
#93069.

Instead, this patch simply casts the `__iter_pointer` to the
`__node_pointer` and uses a straightforward
`GetChildMemberWithName("__value_")` to get to the key/value we care
about.

We can eventually re-use the core-part of the `std::map` and
`std::map::iterator` formatters. But it will be an easier to change to
review once both simplifications landed.
@Michael137 Michael137 merged commit fe8933b into llvm:main Jul 8, 2024
6 checks passed
@Michael137 Michael137 deleted the lldb/map-formatter-alternative-impl branch July 8, 2024 16:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants