-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace raw config dicts
with TypedDicts
#18
Conversation
by introducing TypedDicts for K8s, Docker, and Ray ResourceOptions. The typedDicts are iteratively contstructed based on available config, thereby eliminating the need for removing None values. Also introduce a ResourceKind enum for the K8sResourceOptions.
src/jobs/job.py
Outdated
def to_docker(self) -> DockerResourceOptions: | ||
options: DockerResourceOptions = DockerResourceOptions() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this means:
"Instead of filtering everything for "not None
", we initialize the typed dict empty (!) and populate it with all not-Nones"?
If so, this is unexpected - TypedDicts are explicitly there for schema-like guarantees on dict objects (which includes key presence), and this approach subverts that crucial expectation.
Please clarify if I misunderstood.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the previous code, there were cond if cond else None
conditionals for all sections of the dict as well, followed by a util function that deleted all keys with None
values.
That is the reason i went with this implementation. @AdrianoKF , what was the initial rationale for having these conditionals?
The TypedDict
implementation is not dependent on it, and we can refactor it to use the fitting keys.
Currently all keys are optional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get it, but the point of TypedDict is that its keys are not optional (as in, they can be typed T | None
but MUST be present).
Which means that for an interface expecting this schema:
class Options(TypedDict):
cpu: str | None
memory: str | None
image: str | None
passing in a value of {"cpu": "arm64"}
is an error (all three keys must be present).
If we want to go the TypedDict route, we need to remove the None values just before the API call that these structs are passed to.
... to be compatible with Python 3.9. The NotRequired was only in- troduced in Python 3.11.
but the filling value can be `None`. Then, we remove the `None`s by reintroducing the `remove_none_values` function (and removing the associated commit as it was on the branch tip), but add generic typing to it.
src/jobs/job.py
Outdated
"cpu": self.cpu if self.cpu else None, | ||
"memory": self.memory if self.memory else None, | ||
"nvidia.com/gpu": self.gpu if self.gpu else None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you're not conditionally accessing things, you can relax these statements to e.g. self.cpu or None
.
src/jobs/job.py
Outdated
options: RayResourceOptions = { | ||
"entrypoint_memory": int(to_rational(self.memory)) if self.memory else None, | ||
"entrypoint_num_cpus": int(to_rational(self.cpu)) if self.cpu else None, | ||
"entrypoint_num_gpus": self.gpu if self.gpu else None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, self.gpu or None
. Not for the two above though, because to_rational
might not like None values.
as the evaluation of the functional form is not deferred and using future imports has no effect. Thus use Union to be compatible with Python 3.9
src/jobs/job.py
Outdated
K8sRequestResourceOptions = TypedDict( | ||
"K8sRequestResourceOptions", | ||
{ | ||
"cpu": Union[str, None], | ||
"memory": Union[str, None], | ||
"nvidia.com/gpu": Union[int, None], | ||
}, | ||
) | ||
|
||
K8sLimitResourceOptions = TypedDict( | ||
"K8sLimitResourceOptions", | ||
{ | ||
"nvidia.com/gpu": Union[int, None], | ||
}, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are actually the same types, there's no fundamental distinction between resource requests and limits.
Also, we should pick one style of type hint for unions and stick to it (at least in the same file), so I would replace Union
with |
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the functional form the future import substitution does not work, that is why i went with Union
. We need the functional form for the keys which contain special characters. So if we want consistency, i would change the rest to Union
as well.
I implemented the two instances based on @nicholasjng remarks above that in TypedDicts
keys should not be optional. We could achieve this and have only a single K8sResourceOptions
using the total=False
kwarg, (NotRequired
is incompatible with Python 3.9).
Should I merge it again into one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Meh. I'm ready to toss 3.9 into the abyss, to be honest. There's so many accommodations just for its shortcomings, and it's gonna be EOL next year. @nicholasjng @janwillemkl , opinions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will open a new PR to discuss the migration to Python 3.10 as oldest supported Python shortly.
Necessary changes (maybe not exhaustive) would be:
- Reintroduce Slots
- Use pipes instead of Union
- Use Argtypes instead of Union types
- Update test on oldest supported Python
Edit: Adressed in #21
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Documenting our huddle discussion results here:
- It does not reflect the reality of the Resource Options that all keys must be set. So I am making them optional where applicable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with throwing out 3.9 as well. Slots are imo unnecessary for reasons already given (we are operating with O(10^1) dataclasses at a time and not O(10^6).
as there is no conceptual difference within the two. As they may use different resources (in the limit and request) we do make the parameters optional. This also reflects the code logic which filters `None` values. We now also handle LIMIT and REQUEST types the same. This commit leaves a TODO that we have to decide if separate logic is warrented or not. Until then we do not remove the logical branches from the code, but right as of this commit both branches act the same.
Closes #17