Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update pseudo-code to latest design #28

Merged
merged 7 commits into from
Feb 4, 2025

Conversation

nguerrera
Copy link
Contributor

Also move the pseudo-code from devdocs to docs.

@nguerrera nguerrera added the no-merge Block PR from merging label Jan 23, 2025
@nguerrera
Copy link
Contributor Author

nguerrera commented Jan 23, 2025

I will wait for @michaelcfanning to review before merging and also for the matching implementation in #27 to be approved.

1. Let N = number of bytes in the provider data of the CASK secret. (N may be 0 if there is no provider data.)
1. Allocate storage for the CASK hash:
- 33 bytes for padded HMACSHA256 or 48 bytes for HMACSHA384
- N bytes for provider data
Copy link
Member

@michaelcfanning michaelcfanning Feb 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

data

maybe add a comment here (because the CASK primary key has been determined to be valid, 'N' will comprise a number that is divisible by 3 #Closed

# GenerateKey Pseudo-Code

1. Validate input. Return an error if any of the following are NOT true:
- Provider signature is exactly 4 characters long.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

signature

if we preserve HIS v2 conventions, the leading character of every provider id must be an alpha character. Furthermore, this alpha character establishes a casing for the provider id that must be consistent with any other alpha chars in the signature. A signature with a leading upper-case character comprises a user-managed key (and all other chars in the sig must be upper-case). A signature with a leading lower-case char comprises a service-managed key (and all other chars in the sig must also be lowercase).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, should we preserve these? I found the upper and lower case thing confusing when I touched something related for the first time in HISv2, FWIW.

The current description matches the current code. And the current code was chosen by me when I discovered that the initial draft did no validation.

If we want to change this, I think it would be best to do it in its own PR that updates the code, tests, and pseudo-code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. So, this convention is important/useful for providers but it doesn't have to be a requirement in the standard.

The core problem is ownership/remediation accountability. When a user-managed key is spilled, we want the user to be informed/to remediate. When a service-managed key is spilled, we obviously want to alert the service provider and the provider will benefit from time/air cover if possible to drive remediation.

Imagine a secret spilled into public OSS. For a user-managed key, you would reasonably inform the repo owner of the exposure. For a server managed secret, you might not want that kind of alerting, instead, you'd want a direct notification sent to the platform only (GitHub sends these notices to service owners today).

So let's check in what you have, and perhaps we have some supplemental content in our repo for advanced/nuanced extensions to the core requirements. How to encode a key lifetime is another example where some suggested techniques might make sense without having a hard standard.

1. Validate input. Return an error if any of the following are NOT true:
- Provider signature is exactly 4 characters long.
- Provider signature consists entirely of characters that are valid in base64url encoding.
- Provider data (if any) has a length that is a multiple of 3 characters and no more than 32 characters.
Copy link
Member

@michaelcfanning michaelcfanning Feb 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

characters

are we working in bytes or characters for this description, I'm getting a little confused.

if the provider data is chars, the length is a multiple of 4, not more than 20 encoded chars. if it is bytes, the length is a multiple of 3, no more than 15 bytes. #Resolved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a bug here with 3 and 4 inverted, but the limits in the code are currently, 32 chars / 24 bytes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe this is correct. Validate my thinking.

33 random + 6 signatures + 3 timestamp + 6 (reserved + checksum)
That's 48 bytes. Now, we have a natural limit on this key of 64 bytes, otherwise the key risks triggering HMAC hashing on use as an initializer. Because of our 3 byte alignment, our natural limit is 63 bytes. 63 - 48 bytes == 15 byte limit on optional data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am avoiding changing the code in this change. I believe what I said is correct with respect to the current implementation. I will change the limit in code and pseudo-code in a follow-up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense, let's get this in and keep iterating.

@nguerrera nguerrera force-pushed the pseudo-code-to-docs branch from d05a243 to 340e26a Compare February 4, 2025 17:10
Copy link
Member

@michaelcfanning michaelcfanning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@nguerrera nguerrera removed the no-merge Block PR from merging label Feb 4, 2025
@@ -0,0 +1,46 @@
# GenerateKey Pseudo-Code
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These keys/algorithm pack a lot in there. That's not necessarily a bad thing, but one thing missing from the documentation is the why behind a lot of these choices.

Depending on required properties of the keys, the simplest algorithm is:

<provider_id>_<crypt rand hex>_<checksum>

That's easy for someone to implement (efficiently), verify implementation, and find keys all within a few minutes compared with this pseudocode.

If I'm a stranger first encountering this spec and I look and see the "complicated" bits about timestamp embedding, I might go: "do I really need that? why is that useful? what scenarios do I use that in? Why is that required? If I'm a provider can and I need that, can I just put it in the reserved provider bits?"

It's out of scope for this PR, but another document to supplement the pseudocode would be one that rationalizes each component of the key and any major choices made along the way.

This is not suggesting to remove anything, just encouraging documenting the choices and scenarios before they are lost to time.

@nguerrera nguerrera merged commit 91c3102 into microsoft:main Feb 4, 2025
8 of 9 checks passed
@nguerrera nguerrera deleted the pseudo-code-to-docs branch February 4, 2025 17:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants