KubedirectorCluster Connections #283

Merged: 82 commits, Apr 22, 2020

Conversation

@kmathur2 (Member) commented Mar 11, 2020

What are Connections?

Currently, there are two types of connections:

Cluster connections
Configmap connections

Cluster connections - When we create a kubedirectorcluster, we generate "configmeta" as part of cluster creation; this holds all of the cluster-related metadata. Tools like configcli and bdvcli can query this metadata from within a kubedirectorcluster pod. This connection feature covers the case where a pod in one cluster wants to query metadata from another cluster. The kubedirectorapp can define the allowed attachment conditions (e.g. cluster category, distro_id, etc.) that determine which running clusters may be connected to the current cluster. When creating a new kubedirectorcluster, the user can list the kubedirectorclusters to be connected in this cluster's spec. The kdcluster reconciler will then dump each attached cluster's configmeta into the new cluster, where it can be queried using configcli from within the current cluster's pods.

Configmap connections - A list of configmaps (model, external auth, Kerberos, etc.) can also be connected as part of the current cluster spec. A connected configmap is propagated into the running pods of the parent kdcluster and can then be queried using bdvcli or similar tools.

Example of connections at play, as part of the kdcluster spec:

spec:
  connections:
    configmaps:
    - "model-prediction"
    - "noise-recognition"
    clusters:
    - "spark-logs-nightly"
    - "kafka-streaming"

Change Summary:
In the KubeDirectorAppSpec struct, added one new field:

  • ConnectableTo: This enumerates a list of allowed categories for connectable clusters

In the KubeDirectorClusterSpec struct, added two new fields:

  • ConfigMetaGenerator: bumped every time a connection is added or deleted
  • Connections: a new struct carrying the lists of config maps and clusters to be connected

In the KubeDirectorClusterStatus struct, added one new field:

  • LastConfigMetaGenerator: after every reconfig, this captures the last handled value of
    ConfigMetaGenerator (a sketch of these new fields follows below)
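
To make the shape of these additions concrete, here is a hedged sketch of the new fields in Go. The field names come from the summary above, but the package name, exact types, JSON tags, and the layout of the Connections struct are assumptions, not the merged API:

package v1beta2 // hypothetical package name for this sketch

// Connections lists the config maps and clusters to connect to a kdcluster.
type Connections struct {
    Clusters   []string `json:"clusters,omitempty"`
    ConfigMaps []string `json:"configmaps,omitempty"`
}

// KubeDirectorAppSpec (excerpt): ConnectableTo enumerates the allowed
// categories for connectable clusters.
type KubeDirectorAppSpec struct {
    // ... existing fields ...
    ConnectableTo []string `json:"connectableTo,omitempty"`
}

// KubeDirectorClusterSpec (excerpt): ConfigMetaGenerator is bumped every
// time a connection is added or deleted.
type KubeDirectorClusterSpec struct {
    // ... existing fields ...
    ConfigMetaGenerator int64       `json:"configMetaGenerator,omitempty"`
    Connections         Connections `json:"connections,omitempty"`
}

// KubeDirectorClusterStatus (excerpt): LastConfigMetaGenerator captures the
// generator value handled by the most recent reconfig.
type KubeDirectorClusterStatus struct {
    // ... existing fields ...
    LastConfigMetaGenerator int64 `json:"lastConfigMetaGenerator,omitempty"`
}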

Added a reconciler for config maps that watches for changes to any connected config map in the cluster namespace.
Every time a connected configmap or cluster is updated in the spec, configMetaGenerator is incremented; in other words, configMetaGenerator is bumped on every connection add or remove.
The cluster reconciler periodically compares the current value of configMetaGenerator against lastConfigMetaGenerator; if the value has changed, it triggers regeneration of configmeta and recreates the configmeta.json file for all pods in the running cluster, as sketched below.
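
A minimal sketch of that comparison, assuming the field sketch above, a KubeDirectorCluster wrapper type, and hypothetical helper names (syncConnections, updateConfigMetaForPods); the actual handler structure in the PR may differ:

// Sketch: regenerate configmeta only when a connection change has been observed.
func syncConnections(cr *KubeDirectorCluster) error {
    if cr.Spec.ConfigMetaGenerator == cr.Status.LastConfigMetaGenerator {
        // No connection change since the last configmeta regeneration.
        return nil
    }
    // Regenerate configmeta and rewrite configmeta.json in every running pod.
    if err := updateConfigMetaForPods(cr); err != nil {
        return err
    }
    // Record that this generation has now been handled.
    cr.Status.LastConfigMetaGenerator = cr.Spec.ConfigMetaGenerator
    return nil
}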

Follow-up tasks

  • Need to add guestconfig hooks for configmap updates
  • Finalize the connectable_to property
  • Reconciler refactoring using an Enqueue function
  • Hash-based approach instead of the configMetaGenerator counter
  • Settle whether a configmap should even be a connection, or whether it could instead be mounted like a secret (or vice versa)
  • Finalize the v1Beta2 API

@riteshja (Member) left a comment

None of the example app CRs are using the attachable_to property yet in this PR. Right?

@joel-bluedata (Member)

The kdcluster CRD needs to be updated with the new spec and status properties.

//Notify cluster by incrementing configmetaGenerator
wait := time.Second
maxWait := 4096 * time.Second
for {

The process of

  • read cluster object
  • increment the observed gen number in spec
  • write cluster object

needs to be inside this for loop. That's because the most likely reason for this update to fail is a conflict with some other write... so we will need to read the cluster object again to get a copy that includes the new updated resource version.

Also, if we want to guarantee that we actually are bumping the spec gen by 1, that's another reason to have the whole read-modify-write process inside the loop here.

As a side note: you COULD sidestep this looping by using PATCH, now that we have moved to a new version of the SDK that supports PATCH. However, if you don't want to dig into that right now I don't blame you. I think it's more straightforward at the moment just to move the read-modify-write inside the for loop.
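
For illustration, a sketch of the suggested shape: the read moves inside the loop so a retry after a conflict picks up the object's latest resource version. The read helper (shared.Get here), the kdv1 type path, the clusterName argument, and the error handling are assumptions layered on the snippet quoted above, not the code as merged:

// Sketch: read-modify-write retried on write conflict, with capped backoff.
wait := time.Second
maxWait := 4096 * time.Second
for {
    // Re-read the connected cluster so the update carries its latest
    // resource version (a conflict means someone else wrote it first).
    updateMetaGenerator := &kdv1.KubeDirectorCluster{}
    if err := shared.Get(context.TODO(), clusterName, updateMetaGenerator); err != nil {
        return err
    }
    // Increment inside the loop so the spec gen is bumped by exactly 1
    // relative to whatever was just read.
    updateMetaGenerator.Spec.ConfigMetaGenerator++
    if shared.Update(context.TODO(), updateMetaGenerator) == nil {
        break
    }
    if wait > maxWait {
        return fmt.Errorf("unable to notify cluster of connection change")
    }
    time.Sleep(wait)
    wait *= 2
}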

if shared.Update(context.TODO(), updateMetaGenerator) == nil {
break
}
if wait > maxWait {

Similar comment about the for loop here and read-modify-write.

@joel-bluedata (Member)

Done w/ review pass. We're in the home stretch!

@kmathur2 (comment minimized)

@joel-bluedata (Member)

Ah I see. Sorry! Still, it might be good to have it say which connection changed, which might be easier to do in the other spot.

BTW typo "chamge" there.

@kmathur2 (comment minimized)

@kmathur2 (Member, Author)

The kdcluster CRD needs to be updated with the new spec and status properties.

Won't this go in v1Beta2 CRD when we finalize that?

@joel-bluedata (Member)

Eh, yeah. IMO it's nice to express it here too though, just so this whole change is self-consistent. Not a big deal if you'd rather wait (assuming we get the v1beta2 change in soon).

@joel-bluedata (Member)

We are doing in both places

We're posting an event against foo that says foo is affecting bar, and we're posting an event against bar that says "I'm being affected by... something". What I'm suggesting is that the event on bar should ideally say that foo caused the change.
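
For example, a hedged sketch of posting such an event with the client-go EventRecorder interface (the recorder variable, the affectedCluster/connectedCluster objects, and the wording are assumptions, not the helper the PR actually uses):

// Sketch: post the event against the affected cluster ("bar") and name the
// connected cluster ("foo") whose change triggered the reconfig.
eventRecorder.Eventf(
    affectedCluster,
    corev1.EventTypeNormal,
    "Connections",
    "connected cluster {%s} changed, regenerating configmeta",
    connectedCluster.Name,
)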

@kmathur2 (comment minimized)

@joel-bluedata (Member)

That's why I suggested posting that event in the other code.

@kmathur2 (Member, Author)

That's why I suggested posting that event in the other code.

At the other place (foo) we already have "configmap {%s} is a connection to cluster {%s}, updating its configmeta" - does this not suffice?! I am so confused

@joel-bluedata (Member)

Taking event-posting-chat to Slack for a bit. :-)

@joel-bluedata (Member)

Once those pesky events are sorted I think we're good to go. I'll do a test build before merging to master... let me know if you have some example cluster CRs I can play with.

@joel-bluedata (Member)

The comment above about event posting in configmap.go is still unresolved FYI.

@joel-bluedata joel-bluedata merged commit 0ff14df into bluek8s:master Apr 22, 2020