Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Issue #3897] Get the diff of two nested dict #3968

Open
wants to merge 15 commits into
base: main
Choose a base branch
from
Open

Conversation

babebe
Copy link
Collaborator

@babebe babebe commented Feb 21, 2025

Summary

Fixes #{3897}

Time to review: 5 mins

Changes proposed

Util Function: Given two complex nested dictionaries can provide a set of changes between the two
Test Added

v_b = flatt_dict2.get(k, None)

# Store diff if values are different
if v_a != v_b:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want to treat lists a bit differently and not worry about the order.

For example, the opportunity assistance listing number might trip this logic up a bit. For example, let's assume an opportunity looks like:

{
    "opportunity_assistance_listings": [
      {
        "assistance_listing_number": "43.012",
        "program_title": "Space Technology"
      },
      {
        "assistance_listing_number": "43.001",
        "program_title": "Science"
      }
    ],
    ... The rest of the opp
}

We need to be careful that this doesn't cause any issues. Imagine if when we fetch an opportunity from the DB, these assistance listing numbers come in a different order (but still the same two) - we wouldn't actually want that to be flagged as different.

One way we could do that is something like:

if isinstance(v_a, list):
    if set(v_a) != set(v_b):
         diff = ...
         diffs.append(diff)
elif v_a != v_b:
     # as you already have it

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! Yes that would have been flagged as such. Added the check for lists

@babebe babebe requested a review from freddieyebra as a code owner February 25, 2025 21:35

# Store diff if values are different
if isinstance(v_a, list):
if set(v_a) != set(v_b):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One scenario to account for - if v_b is None this would error since set(None) isn't valid.

I think the safest way to deal with that is making a utility function to handle if that errors and make it an empty set

Copy link
Collaborator Author

@babebe babebe Feb 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking if iterable first would also do the trick

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

created helper func to convert to set before comparing

@babebe babebe requested a review from chouinar February 27, 2025 16:52
Comment on lines 66 to 67
v_a = _convert_to_set(values[0])
v_b = _convert_to_set(values[1])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry if when I quickly said this earlier, I'd still want to check if these are iterables before converting to a set (ie. if they're both integers, leave them as integers). Otherwise we have to make sure everything we ever feed through this is hashable into a set.

How about this - adjust/rename the _convert_to_set method to have an implementation like:

if isinstance(data, (list, tuple)): # not iterable - see note
   return set(data)

return data

While the iterable check seems decent, you found that strings already have a problem that you have to build in an exception for. But bytes, dicts, strings, and a few other types end up needing special handling - it's probably easier to go without it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants