Reflector should not request the whole meta #1389

Closed
SOF3 opened this issue Jan 15, 2024 · 8 comments · Fixed by #1393

Comments

@SOF3
Contributor

SOF3 commented Jan 15, 2024

Would you like to work on this feature?

yes

What problem are you trying to solve?

I want to reduce controller memory usage by only storing the raw JSON of dynamic objects in memory instead of allocating the entire serde_json::Value. Currently the reflector store expects a type parameter with bound K: Resource, where Resource has to provide (mutable) access to an ObjectMeta it caches internally. This makes it impossible to implement Resource on a JSON string wrapper.

I do not want to extract the entire metadata, because unused fields like annotations and labels may allocate a lot of heap memory and waste even more on pointers.

Describe the solution you'd like

Not all fields in ObjectMeta are required by the reflector. In fact, a quick git grep 'meta(' kube-runtime/src/reflector reveals that the only fields used from the ObjectMeta are namespace, resourceVersion and uid (all read-only).

I am proposing that we change the bound of K: 'static + Resource in reflector::Store to this:

pub trait ReflectorResource: 'static {
    type Object: Send + Sync + 'static;
    fn namespace(object: &Self::Object) -> &str;
    fn name(object: &Self::Object) -> &str;
    fn resource_version(object: &Self::Object) -> Option<&str>;
    fn uid(object: &Self::Object) -> Option<&str>;
}

impl<K: Resource + 'static> ReflectorResource for K {
    type Object = K::DynamicObject;
    // other functions delegate to K::meta()
}
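To see how the proposed trait decouples the store from ObjectMeta, here is a self-contained sketch (no kube dependency): the RawObject wrapper, its hard-coded byte ranges, and the RawJsonResource type are all hypothetical illustrations, not part of the proposal itself.

```rust
use std::ops::Range;

// Stand-in for the proposed trait, following the sketch above.
trait ReflectorResource: 'static {
    type Object: Send + Sync + 'static;
    fn namespace(object: &Self::Object) -> &str;
    fn name(object: &Self::Object) -> &str;
    fn resource_version(object: &Self::Object) -> Option<&str>;
    fn uid(object: &Self::Object) -> Option<&str>;
}

// A raw-JSON wrapper: keeps the serialized object plus byte ranges of the
// few identifying fields, instead of materializing a full ObjectMeta.
struct RawObject {
    json: String,
    name: Range<usize>,
    namespace: Range<usize>,
}

struct RawJsonResource;

impl ReflectorResource for RawJsonResource {
    type Object = RawObject;
    fn namespace(o: &RawObject) -> &str { &o.json[o.namespace.clone()] }
    fn name(o: &RawObject) -> &str { &o.json[o.name.clone()] }
    fn resource_version(_: &RawObject) -> Option<&str> { None }
    fn uid(_: &RawObject) -> Option<&str> { None }
}

fn main() {
    let json = r#"{"metadata":{"name":"demo","namespace":"default"}}"#.to_string();
    // The ranges would normally be recorded during deserialization; hard-coded here.
    let obj = RawObject { json, name: 21..25, namespace: 40..47 };
    assert_eq!(RawJsonResource::name(&obj), "demo");
    assert_eq!(RawJsonResource::namespace(&obj), "default");
}
```

The point of the blanket impl is that any existing `Resource` type satisfies the new bound automatically, while exotic storage strategies like the one above become possible.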

Describe alternatives you've considered

In the equivalent Go informer implementation, I implemented the metav1.Object methods by returning zero values from unused getters and making unused setters no-ops (which is obviously an undocumented hack), but the same trick cannot be implemented soundly here because of meta_mut().

Documentation, Adoption, Migration Strategy

This is possibly a BC break, since the bounds of reflector::Store are widened, but it should have no impact on most users because all existing Resource implementors are covered by the blanket impl.

Target crate for feature

kube-runtime

@clux
Member

clux commented Jan 15, 2024

There might be some good points in here about making the bounds of the reflector's operation explicit, but from a purely memory-optimisation POV it feels a little unnecessary at the moment because of the existing optimisations laid out in: https://kube.rs/controllers/optimization/#pruning-fields - have you seen this?

The default behaviour of the reflector is to store everything it sees, and personally I think that is a sensible default, because everything you see is ultimately going over the wire anyway.
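The pruning approach from the linked kube.rs page amounts to mutating each object as it comes off the watch stream, before it reaches the store. A std-only sketch of the idea (MiniPod and its fields are hypothetical stand-ins; with kube one would mutate the real object inside the watcher stream):

```rust
// Std-only sketch of the field-pruning idea from the linked kube.rs page.
#[derive(Debug)]
struct MiniPod {
    name: String,
    managed_fields: Vec<String>,
    annotations: Vec<(String, String)>,
}

// Drop bulky server-side bookkeeping the controller never reads.
fn prune(pod: &mut MiniPod) {
    pod.managed_fields.clear();
    pod.annotations
        .retain(|(k, _)| k != "kubectl.kubernetes.io/last-applied-configuration");
}

fn main() {
    let mut pod = MiniPod {
        name: "demo".into(),
        managed_fields: vec!["manager: kubectl".into()],
        annotations: vec![(
            "kubectl.kubernetes.io/last-applied-configuration".into(),
            "{...}".into(),
        )],
    };
    prune(&mut pod);
    assert!(pod.managed_fields.is_empty());
    assert!(pod.annotations.is_empty());
    println!("pruned: {:?}", pod.name);
}
```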

@clux
Member

clux commented Jan 15, 2024

(this might not work for you if you absolutely need to implement Resource yourself on a JSON wrapper - but maybe it's worth asking first whether that is necessary)

@SOF3
Contributor Author

SOF3 commented Jan 15, 2024

Yep, pruning managedFields would probably cut the memory used by about half, but I actually wanted to cache the entire generic object to execute some user-defined query operations, which still requires the entire object. Not sure how much this costs in Rust, but my experience in Go was that the unstructured object memory footprint is around four times as large as the raw JSON list.

@clux
Member

clux commented Jan 15, 2024

If you still need to cache the entire object, then the memory cost should be the cost of storing the full thing + the cost of getting a (possibly smaller) copy, no? In fact, if you are using Store::get you're getting a shared reference to the same object through an Arc so you don't even have to pay for it twice.

We don't really do unstructured objects the same way as Go, so you should not see any particular memory gains by working on serialised data - Rust is pretty efficient with its memory layout in general.

The things we generally use instead of unstructured are DynamicObject (if we care at all about the spec) or PartialObjectMeta (if we only care about labels/annotations/name/kind, so we can use the metadata watch API).
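The Arc-sharing point above can be demonstrated without kube: a store that hands out clones of an Arc shares one allocation per object, so "getting" an object does not copy it. This toy HashMap store is a hypothetical stand-in for what Store::get does.

```rust
use std::collections::HashMap;
use std::sync::Arc;

fn main() {
    // Toy stand-in for a reflector store: one Arc'd object per key.
    let mut store: HashMap<String, Arc<String>> = HashMap::new();
    store.insert("pod-a".into(), Arc::new("...full object...".into()));

    // Two independent "gets" of the same object.
    let a = store.get("pod-a").cloned().unwrap();
    let b = store.get("pod-a").cloned().unwrap();

    // Both handles point at the same heap allocation; no copy was made.
    assert!(Arc::ptr_eq(&a, &b));
    // One allocation, three strong handles (store + a + b).
    assert_eq!(Arc::strong_count(&a), 3);
}
```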

@SOF3
Contributor Author

SOF3 commented Jan 15, 2024

Why would there be the cost of storing a copy? I assume I could deserialize the JSON into something like a struct { name: Cow<'data, str> }, which could be converted into an Either<ArcStr, Range<usize>> (or Range<NonZeroU32> if we want to optimize further, since etcd cannot store objects larger than a few megabytes anyway, and string literals never start at the first byte). And since the required extra fields are typically alphanumeric/hyphen, this generally means I just need an additional 8*4 bytes for each object, plus one pointer-sized field for the ArcStr that the object is cloned into.
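The range-into-raw-JSON layout described above can be sketched in plain std Rust. The FieldRef and CachedObject names are hypothetical; a field is either a byte range into the shared raw JSON buffer, or (when escaping forced an owned copy) a separately shared string.

```rust
use std::ops::Range;
use std::sync::Arc;

// Either a range into the shared raw JSON, or an owned shared string.
enum FieldRef {
    Borrowed(Range<u32>), // 8 bytes
    Owned(Arc<str>),
}

struct CachedObject {
    raw: Arc<str>, // the raw JSON of the whole object, shared
    name: FieldRef,
}

impl CachedObject {
    fn name(&self) -> &str {
        match &self.name {
            FieldRef::Borrowed(r) => &self.raw[r.start as usize..r.end as usize],
            FieldRef::Owned(s) => s,
        }
    }
}

fn main() {
    let raw: Arc<str> = Arc::from(r#"{"metadata":{"name":"demo"}}"#);
    // The range would be recorded during deserialization; hard-coded here.
    let obj = CachedObject { raw, name: FieldRef::Borrowed(21..25) };
    assert_eq!(obj.name(), "demo");
}
```

Each extra identifying field then costs a Range<u32> (or one Arc pointer) instead of an owned String.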

As for the part of getting the object, most objects are only retrieved once or twice in practice for many applications, so we're just deferring the cost of deserialization from list/watch time to processing time.

Could you elaborate further on "not doing unstructured objects the same way"? In the particular case I am working on right now, I need to pass the JSON to another library across an FFI boundary anyway, so getting the raw JSON is actually my intention - not just for memory optimization, but also to avoid having to reserialize it before passing it across the FFI. But I admit that FFI is a bit niche in the kube case, so let's not focus on that.

@nightkr
Member

nightkr commented Jan 15, 2024

FWIW, the overhead cost of a blank ObjectMeta is 344 bytes, compared to 96 bytes for a bare-minimum impl of your ReflectorResource:

use std::mem::size_of;

use kube::core::ObjectMeta;

struct ObjectIdentity {
    namespace: Option<String>,
    name: String,
    resource_version: String,
    uid: String,
}

fn main() {
    dbg!(size_of::<ObjectMeta>()); // => 344
    dbg!(size_of::<ObjectIdentity>()); // => 96
}

That is... surprisingly non-negligible, actually.

@SOF3

This comment was marked as outdated.

@nightkr
Member

nightkr commented Jan 15, 2024

We can always add the stricter bound to ::erase().

SOF3 added a commit to SOF3/kube-rs that referenced this issue Jan 15, 2024
SOF3 added a commit to SOF3/kube-rs that referenced this issue Jan 15, 2024
SOF3 added a commit to SOF3/kube-rs that referenced this issue Feb 4, 2024
SOF3 added a commit to SOF3/kube-rs that referenced this issue Feb 11, 2024
Motivation explained in kube-rs#1389

Depends on kube-rs#1393 to ease reflector bounds
SOF3 added a commit to SOF3/kube-rs that referenced this issue Feb 11, 2024
SOF3 mentioned this issue Feb 11, 2024
SOF3 added a commit to SOF3/kube-rs that referenced this issue Feb 12, 2024
clux closed this as completed in #1393 Mar 5, 2024
clux added a commit that referenced this issue Mar 5, 2024
* Ease the bound for `reflector` to only request identifying metadata

Fixes #1389

Signed-off-by: SOFe <sofe2038@gmail.com>

* rename ToObjectRef to reflector::Lookup

Signed-off-by: SOFe <sofe2038@gmail.com>

* Apply suggestions from code review

Co-authored-by: Eirik A <sszynrae@gmail.com>
Signed-off-by: Jonathan Chan Kwan Yin <sofe2038@gmail.com>

* Remove obsolete clippy warning suppression

Signed-off-by: Jonathan Chan Kwan Yin <sofe2038@gmail.com>

---------

Signed-off-by: SOFe <sofe2038@gmail.com>
Signed-off-by: Jonathan Chan Kwan Yin <sofe2038@gmail.com>
Co-authored-by: Eirik A <sszynrae@gmail.com>
SOF3 added a commit to SOF3/kube-rs that referenced this issue Mar 9, 2024
SOF3 added a commit to SOF3/kube-rs that referenced this issue May 8, 2024