This repository was archived by the owner on Jan 11, 2023. It is now read-only.

Error created Azure master nodes using azure CNI #1916

Closed
cloudopsguy opened this issue Dec 11, 2017 · 5 comments · Fixed by #1966

Comments

@cloudopsguy

cloudopsguy commented Dec 11, 2017

Is this a request for help?:
Yes

Is this an ISSUE or FEATURE REQUEST? (choose one):
Issue

What version of acs-engine?:
0.10.0

Orchestrator and version (e.g. Kubernetes, DC/OS, Swarm)
Kubernetes 1.8

What happened:
When trying to deploy a small Azure-based Docker configuration, something strange happens with the master nodes' network interfaces.
The errors I am encountering are (subscription data stripped):

IP configuration /subscriptions/xxx/resourceGroups/rg-test/providers/Microsoft.Network/networkInterfaces/k8s-master-15426332-nic-2/ipConfigurations/ipconfig1 is using the private IP address 10.50.1.6 which is already allocated to resource /subscriptions/xxx/resourceGroups/rg-test/providers/Microsoft.Network/networkInterfaces/k8s-master-15426332-nic-1.
and
IP configuration /subscriptions/xxx/resourceGroups/rg-test/providers/Microsoft.Network/networkInterfaces/k8s-master-15426332-nic-0/ipConfigurations/ipconfig1 is using the private IP address 10.50.1.4 which is already allocated to resource /subscriptions/xxx/resourceGroups/rg-test/providers/Microsoft.Network/networkInterfaces/k8s-master-15426332-nic-1.

What you expected to happen:
I have no idea why multiple interfaces are attempting to take the same IP. I would expect each interface to take a unique IP.

How to reproduce it (as minimally and precisely as possible):
I'm not sure; I have attached part of my model for reference.
{
  "apiVersion": "vlabs",
  "properties": {
    "orchestratorProfile": {
      "orchestratorType": "Kubernetes",
      "orchestratorRelease": "1.8",
      "kubernetesConfig": {
        "networkPolicy": "azure"
      }
    },
    "masterProfile": {
      "count": 3,
      "dnsPrefix": "syn-prod",
      "vmSize": "Standard_D2_v2",
      "vnetSubnetId": "/subscriptions/xxx/resourceGroups/Labs-Prod-k8s/providers/Microsoft.Network/virtualNetworks/Labs-Prod/subnets/k8s-Master-Net",
      "firstConsecutiveStaticIP": "10.50.1.4",
      "ipAddressCount": 5,
      "distro": "coreos"
    },
    "agentPoolProfiles": [
      {
        "name": "publicpool",
        "count": 3,
        "vmSize": "Standard_D2_v2",
        "OSDiskSizeGB": 200,
        "storageProfile": "ManagedDisks",
        "availabilityProfile": "AvailabilitySet",
        "vnetSubnetId": "/subscriptions/xxx/resourceGroups/Labs-Prod-k8s/providers/Microsoft.Network/virtualNetworks/Labs-Prod/subnets/PublicAgentPool",
        "ipAddressCount": 7,
        "distro": "coreos"
      },
      {
        "name": "privatepool",
        "count": 10,
        "vmSize": "Standard_D2_v2",
        "OSDiskSizeGB": 200,
        "storageProfile": "ManagedDisks",
        "availabilityProfile": "AvailabilitySet",
        "vnetSubnetId": "/subscriptions/xxx/resourceGroups/Labs-Prod-k8s/providers/Microsoft.Network/virtualNetworks/Labs-Prod/subnets/PrivateAgentPool",
        "ipAddressCount": 10,
        "distro": "coreos"
      }
    ]
  }
}

Anything else we need to know:

@jackfrancis
Member

@sharmasushant Does this look familiar?

@cloudopsguy CoreOS capabilities in acs-engine are pretty alpha at this point. Are you able to reproduce this with Ubuntu?

@sharmasushant
Contributor

Most likely the subnets do not have enough IPs available.
@cloudopsguy Can you share how large the subnet is?
@tamilmani1989

@cloudopsguy
Author

The subnets are small and varied, so that may also be an issue; however, I do not believe it is related to this one. In this case my master subnet is a /27 and the agent subnets are /25.
However, after I fixed the error on the agent subnet by specifying "ipAddressCount": 7, the errors for the agent subnets vanished and the interfaces were created with the correct number of IPs on those nodes.
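
As a rough sanity check on the subnet-size question, the sketch below compares the addresses each pool needs (`count` times `ipAddressCount`, taken from the model in the issue) with what a subnet of the stated prefix length can offer. It assumes Azure's reservation of five addresses per subnet and that each pool has its subnet to itself; `usable_addresses` and `capacity_report` are illustrative names, not part of acs-engine.

```python
# Assumption: Azure reserves five addresses in every subnet
# (the first four and the last one).
AZURE_RESERVED_PER_SUBNET = 5


def usable_addresses(prefix_len: int) -> int:
    """Addresses left for NICs in an IPv4 subnet of the given prefix length."""
    return 2 ** (32 - prefix_len) - AZURE_RESERVED_PER_SUBNET


def capacity_report(pools: dict) -> dict:
    """Map pool name -> (needed, available, fits)."""
    report = {}
    for name, (prefix_len, count, ip_address_count) in pools.items():
        needed = count * ip_address_count
        available = usable_addresses(prefix_len)
        report[name] = (needed, available, needed <= available)
    return report


# (prefix length, node count, ipAddressCount) as given in this thread.
pools = {
    "master": (27, 3, 5),
    "publicpool": (25, 3, 7),
    "privatepool": (25, 10, 10),
}

for name, (needed, available, fits) in capacity_report(pools).items():
    print(f"{name}: needs {needed}, has {available}, fits={fits}")
```

Under these assumptions all three pools fit (the masters need 15 of 27 usable addresses), which is consistent with the view that subnet size alone does not explain the master NIC errors.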

The error reported is that the master nodes' nic-0 and nic-2 are trying to use IPs that are already allocated to nic-1:
networkInterfaces/k8s-master-15426332-nic-0/ipConfigurations/ipconfig1 is using the private IP address 10.50.1.4 which is already allocated to resource networkInterfaces/k8s-master-15426332-nic-1.

The IP address in question is also the one set in "firstConsecutiveStaticIP": "10.50.1.4". So either no first IP address should be set, or the deployment is somehow specifying reuse of the same IP address.

@tamilmani1989
Member

The issue here is that the master that is processed first (in this example master-1) gets 10.50.1.5 as its primary IP and then takes 30 available IPs as secondary IPs to be used with pods (1.4, 1.6, 1.7, and so on). It does not know that 1.4 and 1.6 are meant for the other two masters.

As a result, when master-0 is processed, it has 10.50.1.4 statically assigned as its first IP, which cannot succeed because that address is already taken.
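
The ordering problem described above can be reproduced with a toy model. This is only a sketch, not acs-engine or Azure code: it assumes the master subnet is 10.50.1.0/27 (inferred from the /27 size and the 10.50.1.4 starting address), that `ipAddressCount` includes the primary IP, and that secondary IPs are taken from the lowest free addresses. The function name `simulate_master_nics` and the processing order are hypothetical.

```python
import ipaddress


def simulate_master_nics(first_static_ip, master_count, ip_address_count,
                         subnet, processing_order):
    """Return a list of (nic, requested_ip, current_owner) conflicts."""
    addresses = list(ipaddress.ip_network(subnet))
    # Assumption: Azure reserves the first four addresses and the last one.
    assignable = addresses[4:-1]

    first = ipaddress.ip_address(first_static_ip)
    # Each master's primary IP is static and consecutive: nic-0, nic-1, ...
    static_ips = {i: first + i for i in range(master_count)}

    allocated = {}  # ip -> owning NIC name
    conflicts = []
    for index in processing_order:
        nic = f"nic-{index}"
        primary = static_ips[index]
        if primary in allocated:
            # Another NIC already grabbed this address as a secondary IP.
            conflicts.append((nic, str(primary), allocated[primary]))
            continue
        allocated[primary] = nic

        # Secondary IPs come from the lowest free addresses, with no
        # knowledge of the static IPs planned for the other masters.
        remaining = ip_address_count - 1
        for ip in assignable:
            if remaining == 0:
                break
            if ip not in allocated:
                allocated[ip] = nic
                remaining -= 1
    return conflicts


# master-1 is processed first, as in the explanation above.
print(simulate_master_nics("10.50.1.4", 3, 5, "10.50.1.0/27", [1, 0, 2]))
# [('nic-0', '10.50.1.4', 'nic-1'), ('nic-2', '10.50.1.6', 'nic-1')]
```

In this model the two conflicts match the two reported errors: nic-1 takes 10.50.1.4 and 10.50.1.6 as secondary IPs before nic-0 and nic-2 can claim them as primaries.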

@cloudopsguy To unblock, can you specify the firstConsecutiveStaticIP from the last few addresses of the subnet?

We will have to think about what changes to make in acs-engine to change the above behavior.

@cloudopsguy
Author

Moving to the upper end of the network seemed to fix the problem. It seems that the pools tend to start from the bottom of the subnet and work their way up, so moving up 17 from the bottom fixed the problem, since 3 nodes with 5 IPs each only needed 15.
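
One way to pick such a starting address is to reserve the last few assignable addresses of the subnet for the masters' static IPs. The sketch below does that under stated assumptions: the subnet 10.50.1.0/27 is inferred rather than given, Azure is assumed to reserve the first four addresses and the last one, and `suggest_first_static_ip` is a hypothetical helper, not an acs-engine feature.

```python
import ipaddress

AZURE_RESERVED_LOW = 4   # assumption: first four addresses are reserved
AZURE_RESERVED_HIGH = 1  # assumption: last address is reserved


def suggest_first_static_ip(subnet: str, master_count: int,
                            ip_address_count: int) -> str:
    """Return a firstConsecutiveStaticIP candidate at the top of the subnet."""
    addresses = list(ipaddress.ip_network(subnet))
    assignable = addresses[AZURE_RESERVED_LOW:len(addresses) - AZURE_RESERVED_HIGH]

    # Total demand: one primary plus (ip_address_count - 1) secondaries per node.
    total_needed = master_count * ip_address_count
    if total_needed > len(assignable):
        raise ValueError("subnet too small for the requested masters")

    # The static primaries occupy the last `master_count` assignable addresses,
    # leaving the low end free for secondary IPs.
    return str(assignable[-master_count])


print(suggest_first_static_ip("10.50.1.0/27", 3, 5))  # 10.50.1.28
```

With 3 masters at 5 IPs each, 15 of the 27 assignable addresses are used, so the secondary IPs allocated from the bottom never reach the static block at the top.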


4 participants