
Kubernetes api client doesn't pass configuration object through to the models it creates, causing excessive logging calls #2284

Closed
gibsondan opened this issue Sep 20, 2024 · 5 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@gibsondan

What happened (please include outputs or screenshots):
Making an API call with the kubernetes python client in an environment with heavy multithreading and logging becomes very slow. Profiling reveals that the bottleneck is each deserialized python model waiting to acquire the global python logging lock just to construct the model object:

[profiler screenshot: threads blocked waiting on the logging lock during model deserialization]

It appears that every time a model object is created here: https://github.com/kubernetes-client/python/blob/master/kubernetes/client/api_client.py#L620-L641

the model is constructed without the "local_vars_configuration" argument set (picking a model at random, but they all appear to have this parameter): https://github.com/kubernetes-client/python/blob/master/kubernetes/client/models/events_v1_event.py#L75-L78

which causes the model to create a fresh Configuration object, whose constructor makes several logger.setLevel calls: https://github.com/kubernetes-client/python/blob/master/kubernetes/client/configuration.py#L257-L277

With enough threads doing this at once, serious contention and slowdown can result.

I believe an easy fix for this would be to pass the Configuration object that the api client is already using through to each model object constructor in __deserialize_model, so that each model object does not need to create a new Configuration object.
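
To illustrate the idea, here is a minimal sketch of the difference (V1ObjectMeta is picked arbitrarily here; any generated model that accepts local_vars_configuration behaves the same way):

from kubernetes.client import Configuration, V1ObjectMeta

shared_config = Configuration()  # build one Configuration up front and reuse it

# Without local_vars_configuration, the generated model constructor builds a brand-new
# Configuration, and Configuration.__init__ calls logger.setLevel under the global logging lock.
meta_fresh = V1ObjectMeta(name="example")

# Passing an already-built Configuration avoids that work entirely.
meta_shared = V1ObjectMeta(name="example", local_vars_configuration=shared_config)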

What you expected to happen:
No logging locks (or at least far fewer logging locks) needing to be acquired just to deserialize a k8s API response

How to reproduce it (as minimally and precisely as possible):

On a cluster with at least one k8s deployment, this script demonstrates that logger.setLevel calls are being made within __deserialize_model:

import logging
import traceback

import kubernetes.client as client
from kubernetes import config

# Load kubeconfig with a specific context
context_name = "your-context-here"

original_setLevel = logging.Logger.setLevel

call_count = {"count": 0}


def setLevel_with_stack_trace(self, level):
    # Print the stack trace
    print(f"Setting log level to: {logging.getLevelName(level)}")

    traceback.print_stack()

    tb_string = "".join(traceback.format_stack())

    if "__deserialize_model" in tb_string:
        call_count["count"] = call_count["count"] + 1

    # Call the original setLevel method
    original_setLevel(self, level)


logging.Logger.setLevel = setLevel_with_stack_trace


# Load the specified context from the kubeconfig file
config.load_kube_config(context=context_name)

with client.ApiClient() as api_client:
    api_instance = client.AppsV1Api(api_client)

    deployments = api_instance.list_namespaced_deployment(
        "your-namespace-here",
    ).items

    print(len(deployments))


print(f"Made {call_count['count']} setLogLevel calls inside __deserialize_model")

I ran this on a cluster with 33 deployments and it told me that there were 2948 logger.setLevel calls while deserializing the API response.

Anything else we need to know?:

Environment:

  • Kubernetes version (kubectl version):
    Client Version: v1.29.1
    Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
    Server Version: v1.29.7-eks-a18cd3a

  • OS (e.g., MacOS 10.13.6):

  • Python version (python --version)
    python 3.11.7

  • Python client version (pip list | grep kubernetes): 31.0.0

@gibsondan added the kind/bug label Sep 20, 2024
@gibsondan
Author

Ah this may be a dupe of #1867

gibsondan added a commit to dagster-io/dagster that referenced this issue Sep 20, 2024
Unclear whether the benefits are worth the risk of breaking changes here, but this is a workaround for the significant lock contention issues with the k8s client surfaced in #23933 (reply in thread). This appears to be a known issue with the k8s python client - see kubernetes-client/python#2284.

Test Plan: BK
@gibsondan
Author

A workaround for this appears to be to patch the ApiClient as follows and use the patched class everywhere you would otherwise use an ApiClient (which makes me think the fix for this should be pretty straightforward; I'm just having some trouble ascertaining where exactly __deserialize_model is defined, since it's generated via OpenAPI):



import six
from kubernetes.client import ApiClient


class PatchedApiClient(ApiClient):
    # Forked from the ApiClient implementation to pass the configuration object down into created
    # model objects, avoiding lock contention issues. See https://github.com/kubernetes-client/python/issues/2284
    #
    # The override is defined under the name-mangled form of the base class's private method
    # (_ApiClient__deserialize_model) so that the call to self.__deserialize_model inside
    # ApiClient actually resolves to this implementation.
    def _ApiClient__deserialize_model(self, data, klass):
        """Deserializes list or dict to model.

        :param data: dict, list.
        :param klass: class literal.
        :return: model object.
        """
        if not klass.openapi_types and not hasattr(klass, "get_real_child_model"):
            return data

        # Below is the only change from the base ApiClient implementation - pass through the
        # Configuration object to each newly created model so that each one does not have to
        # create a fresh Configuration and acquire the logging lock
        kwargs = {"local_vars_configuration": self.configuration}

        if data is not None and klass.openapi_types is not None and isinstance(data, (list, dict)):
            for attr, attr_type in six.iteritems(klass.openapi_types):
                if klass.attribute_map[attr] in data:
                    value = data[klass.attribute_map[attr]]
                    # The base class's __deserialize is private, so call it via its mangled name
                    kwargs[attr] = self._ApiClient__deserialize(value, attr_type)

        instance = klass(**kwargs)

        if hasattr(instance, "get_real_child_model"):
            klass_name = instance.get_real_child_model(data)
            if klass_name:
                instance = self._ApiClient__deserialize(data, klass_name)
        return instance
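
For completeness, a usage sketch (the context and namespace names are placeholders): the patched client drops in anywhere an ApiClient would otherwise be constructed.

from kubernetes import client, config

config.load_kube_config(context="your-context-here")

# Use the patched client in place of client.ApiClient()
with PatchedApiClient() as api_client:
    apps_api = client.AppsV1Api(api_client)
    deployments = apps_api.list_namespaced_deployment("your-namespace-here").items
    print(len(deployments))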

@gibsondan
Author

Closing since this is a duplicate.

@gibsondan
Author

#1921 also appears to be relevant and has a similar root cause (an old version of the openapi generator being used)
