Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Postman workspace enumeration #3925

Merged
merged 8 commits into from
Feb 21, 2025
Merged

Conversation

casey-tran
Copy link
Contributor

@casey-tran casey-tran commented Feb 20, 2025

Description:

When not given any specific Postman workspaces in the command line, the scanner will make a call to EnumerateWorkspaces() but the request in that function to "https://api.getpostman.com/workspaces" would only return the workspaces' info but not go deeper into the workspaces' tree to get their collections or environments. I added a loop to go through the list of workspaces and make individual requests to thoroughly populate the Workspace struct with the appropriate data.

Checklist:

  • Tests passing (make test-community)?
  • Lint passing (make lint this requires golangci-lint)?

@casey-tran casey-tran marked this pull request as ready for review February 20, 2025 23:27
@casey-tran casey-tran requested review from a team as code owners February 20, 2025 23:27
Comment on lines 280 to 283
workspacesObj.Workspaces[i], err = c.GetWorkspace(workspace.ID)
if err != nil {
err = fmt.Errorf("could not get workspace during enumeration: %s (%s)", workspace.Name, workspace.ID)
return workspaces, err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(just discussing here; no need to change anything)

You'll learn this about me, but I assiduously shave off cognitive load, so prepare to be roped into my compulsions 😅.

First, these two things:

workspacesObj.Workspaces[i], err = c.GetWorkspace(workspace.ID)

and

return workspaces, err

I worry a little about the 1st, because we're mutating workspacesObj.Workspaces before checking if GetWorkspace returned an error. If we assign it to a--basically temporary--local variable then in an error condition workspacesObj.Workspaces remains untouched, and there's no way wacky behavior in GetWorkspace can corrupt our state.

The 2nd is a pattern throughout this source, where there's a zero value defined effectively only to avoid initialization? But it's really not so bad to just do return nil, err or return Workspace{}, err, and I like it better because it's not possible for something else to have accidentally modified the zero error value.

So back to reducing cognitive load, I read workspacesObj.Workspaces[i], err = c.GetWorkspace(workspace.ID) and thought, "huh, will GetWorkspace ever return something funky?", then I read that function and saw var workspace Workspace up top but didn't realize it was a zero value for error conditions, and then got real confused when it was never mutated, then realized it was a zero value for error conditions, then realized I needed to really check it was never (accidentally or not) mutated. Whew!

And like, it's not a big deal. We do an error check right after this assignment so who cares if it gets corrupted. Mostly all I'm saying is that some small changes let you be sure something is fine, vs having to have faith that everyone who's worked on this code followed a pattern 100% and made no mistakes.


This is kind of what I was talking about the other day when I was going on about learning to work fast without stuff like tests or code review or whatever. The way we'd be sure all this worked like we thought is to write unit tests, inject errors, review the code lots, think hard about it, etc. But if you adopt patterns that make weird issues impossible, you can skip all that. Of course we won't, but it still lets us move a lot faster and get more for our brain cycles.


Finally; check w/ @rosecodym but do we want to error out here? Is it fine if we can't get a workspace, or should we in fact bail?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good question. The benefit of bailing early is that problems are immediately apparent (which means that secrets don't get inadvertently missed, and debugging is easier). The downside is that sometimes users want the program to keep trucking along when non-fatal errors are encountered. When scans take a long time, the second consideration dominates (you don't want a ten-hour GitHub scan to terminate early because of a network glitch), but Postman scans run so fast that it's less clear-cut. I don't have a strong opinion, so I'll defer to the author here :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I fully agree with @camgunz's other thoughts about the unused workspaces local. It created extra, unnecessary cognitive load for me, too - I was expecting simple nil returns, which are fine. (I am wondering if this was an artifact of some previous change that just never got cleaned up.)

@@ -185,6 +185,7 @@ func (s *Source) Chunks(ctx context.Context, chunksChan chan *sources.Chunk, _ .
return fmt.Errorf("error getting workspace %s: %w", workspaceID, err)
}
s.SetProgressOngoing(fmt.Sprintf("Scanning workspace %s", workspaceID), "")
ctx.Logger().V(2).Info("scanning workspace from workspaces given", "workspace", workspaceID)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey great logging 👍🏻

Copy link
Collaborator

@rosecodym rosecodym left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great find! I'm requesting changes because of the use of the background logger, but that's just to ensure it doesn't get overlooked because this PR was already approved.

@@ -307,7 +309,7 @@ func (s *Source) scanWorkspace(ctx context.Context, chunksChan chan *sources.Chu
// scanCollection scans a collection and all its items, folders, and requests.
// locally scoped Metadata is updated as we drill down into the collection.
func (s *Source) scanCollection(ctx context.Context, chunksChan chan *sources.Chunk, metadata Metadata, collection Collection) {
ctx.Logger().V(2).Info("starting scanning collection", collection.Info.Name, "uuid", collection.Info.UID)
ctx.Logger().V(2).Info("starting to scan collection", "collection name", collection.Info.Name, "collection uuid", collection.Info.UID)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We snake_case our logging keys by convention.

@@ -250,6 +252,7 @@ func (c *Client) getPostmanReq(url string, headers map[string]string) (*http.Res
// EnumerateWorkspaces returns the workspaces for a given user (both private, public, team and personal).
// Consider adding additional flags to support filtering.
func (c *Client) EnumerateWorkspaces() ([]Workspace, error) {
context.Background().Logger().V(2).Info("enumerating workspaces")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's great to log from this function - but that means it needs a "real" logger, not the background context logger. The solution is to add a context parameter to this function. (I know that using the background logger works in practice here, but that's because of a fail-safe elsewhere. It's safer to not rely on that fail-safe, because it might change someday.)

Comment on lines 280 to 283
workspacesObj.Workspaces[i], err = c.GetWorkspace(workspace.ID)
if err != nil {
err = fmt.Errorf("could not get workspace during enumeration: %s (%s)", workspace.Name, workspace.ID)
return workspaces, err
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good question. The benefit of bailing early is that problems are immediately apparent (which means that secrets don't get inadvertently missed, and debugging is easier). The downside is that sometimes users want the program to keep trucking along when non-fatal errors are encountered. When scans take a long time, the second consideration dominates (you don't want a ten-hour GitHub scan to terminate early because of a network glitch), but Postman scans run so fast that it's less clear-cut. I don't have a strong opinion, so I'll defer to the author here :)

Comment on lines 280 to 283
workspacesObj.Workspaces[i], err = c.GetWorkspace(workspace.ID)
if err != nil {
err = fmt.Errorf("could not get workspace during enumeration: %s (%s)", workspace.Name, workspace.ID)
return workspaces, err
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I fully agree with @camgunz's other thoughts about the unused workspaces local. It created extra, unnecessary cognitive load for me, too - I was expecting simple nil returns, which are fine. (I am wondering if this was an artifact of some previous change that just never got cleaned up.)

@casey-tran
Copy link
Contributor Author

Thank you for the thorough review @camgunz and @rosecodym. I agree with your sentiments and made the return value on errors explicit. I'd say the Postman scans are fast enough that bailing early is fine.

Copy link
Collaborator

@rosecodym rosecodym left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great work!

} else {
workspacesObj.Workspaces[i] = tempWorkspace
}
ctx.Logger().V(3).Info("individual workspace getting added to the array", "workspace", workspacesObj.Workspaces[i])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very minor point: the thing being modified is a slice, not an array. golang arrays are pretty rare in our code.

for i, workspace := range workspacesObj.Workspaces {
tempWorkspace, err := c.GetWorkspace(ctx, workspace.ID)
if err != nil {
err = fmt.Errorf("could not get workspace during enumeration: %s (%s)", workspace.Name, workspace.ID)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this new error doesn't actually contain the original error, which might be useful for debugging. here's an alternative way to write this that would wrap it:

if err != nil {
    return nil, fmt.Errorf("could not get workspace %q (%s) during enumeration: %w", workspace.Name, workspace.ID, err)
}

Comment on lines 283 to 285
} else {
workspacesObj.Workspaces[i] = tempWorkspace
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the if case returns, so you don't need else:

if err != nil {
    return nil, fmt.Errorf("whatever")
}
workspacesObj.Workspaces[i] = tempWorkspace

@casey-tran casey-tran merged commit 6f1e918 into main Feb 21, 2025
13 checks passed
@casey-tran casey-tran deleted the postman-workspace-enumeration branch February 21, 2025 19:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants