Error created Azure master nodes using azure CNI #1916

cloudopsguy · 2017-12-11T23:47:13Z

Is this a request for help?:
Yes

Is this an ISSUE or FEATURE REQUEST? (choose one):
Issue

What version of acs-engine?:
0.10.0

Orchestrator and version (e.g. Kubernetes, DC/OS, Swarm)
Kubernetes 1.8

What happened:
When deploying a small Azure-based Docker configuration, something strange happens with the master nodes' network interfaces.
The errors I am encountering are (subscription data stripped):

IP configuration /subscriptions/xxx/resourceGroups/rg-test/providers/Microsoft.Network/networkInterfaces/k8s-master-15426332-nic-2/ipConfigurations/ipconfig1 is using the private IP address 10.50.1.6 which is already allocated to resource /subscriptions/xxx/resourceGroups/rg-test/providers/Microsoft.Network/networkInterfaces/k8s-master-15426332-nic-1.
and
IP configuration /subscriptions/xxx/resourceGroups/rg-test/providers/Microsoft.Network/networkInterfaces/k8s-master-15426332-nic-0/ipConfigurations/ipconfig1 is using the private IP address 10.50.1.4 which is already allocated to resource /subscriptions/xxx/resourceGroups/rg-test/providers/Microsoft.Network/networkInterfaces/k8s-master-15426332-nic-1.

What you expected to happen:
I have no idea why multiple interfaces are attempting to take the same IP. I would expect each interface to take a unique IP.

How to reproduce it (as minimally and precisely as possible):
I'm not sure. I have attached part of my model for reference.
{
  "apiVersion": "vlabs",
  "properties": {
    "orchestratorProfile": {
      "orchestratorType": "Kubernetes",
      "orchestratorRelease": "1.8",
      "kubernetesConfig": {
        "networkPolicy": "azure"
      }
    },
    "masterProfile": {
      "count": 3,
      "dnsPrefix": "syn-prod",
      "vmSize": "Standard_D2_v2",
      "vnetSubnetId": "/subscriptions/xxx/resourceGroups/Labs-Prod-k8s/providers/Microsoft.Network/virtualNetworks/Labs-Prod/subnets/k8s-Master-Net",
      "firstConsecutiveStaticIP": "10.50.1.4",
      "ipAddressCount": 5,
      "distro": "coreos"
    },
    "agentPoolProfiles": [
      {
        "name": "publicpool",
        "count": 3,
        "vmSize": "Standard_D2_v2",
        "OSDiskSizeGB": 200,
        "storageProfile": "ManagedDisks",
        "availabilityProfile": "AvailabilitySet",
        "vnetSubnetId": "/subscriptions/xxx/resourceGroups/Labs-Prod-k8s/providers/Microsoft.Network/virtualNetworks/Labs-Prod/subnets/PublicAgentPool",
        "ipAddressCount": 7,
        "distro": "coreos"
      },
      {
        "name": "privatepool",
        "count": 10,
        "vmSize": "Standard_D2_v2",
        "OSDiskSizeGB": 200,
        "storageProfile": "ManagedDisks",
        "availabilityProfile": "AvailabilitySet",
        "vnetSubnetId": "/subscriptions/xxx/resourceGroups/Labs-Prod-k8s/providers/Microsoft.Network/virtualNetworks/Labs-Prod/subnets/PrivateAgentPool",
        "ipAddressCount": 10,
        "distro": "coreos"
      }
    ]
  }
}

Anything else we need to know:

jackfrancis · 2017-12-12T23:55:06Z

@sharmasushant Does this look familiar?

@cloudopsguy CoreOS capabilities in acs-engine are pretty alpha at this point. Are you able to reproduce this with Ubuntu?

sharmasushant · 2017-12-12T23:59:21Z

Most likely the subnets do not have enough IPs available.
@cloudopsguy Can you share how large the subnet is?
@tamilmani1989

cloudopsguy · 2017-12-13T00:14:45Z

The subnets are small and varied, so that may also be a problem, but I do not believe it is related to this issue. In this case my master subnet is a /27 and the agent subnets are /25.
However, after I fixed the error on the agent subnet by specifying "ipAddressCount": 7, the errors for the agent subnets vanished and the interfaces were created with the correct number of IPs on those nodes.
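As a rough sanity check on the subnet sizes mentioned above, here is a minimal Python sketch. It assumes that every node consumes exactly ipAddressCount addresses and relies on Azure reserving five addresses per subnet (the first four and the last one). The helper names are made up for illustration and are not part of acs-engine.

```python
# Azure reserves 5 addresses in every subnet: the first four and the last one.
AZURE_RESERVED = 5


def usable_addresses(prefix_len: int) -> int:
    """Assignable IPv4 addresses in an Azure subnet with this prefix length."""
    return 2 ** (32 - prefix_len) - AZURE_RESERVED


def required_addresses(node_count: int, ip_address_count: int) -> int:
    """Addresses a pool consumes if each node's NIC holds ip_address_count IPs."""
    return node_count * ip_address_count


# (pool name, subnet prefix length, node count, ipAddressCount) from this issue
pools = [
    ("master", 27, 3, 5),
    ("publicpool", 25, 3, 7),
    ("privatepool", 25, 10, 10),
]

for name, prefix, count, per_node in pools:
    need = required_addresses(count, per_node)
    have = usable_addresses(prefix)
    print(f"{name}: needs {need}, subnet offers {have}, fits={need <= have}")
```

Under those assumptions all three pools fit (15 of 27, 21 of 123, and 100 of 123 addresses), which is consistent with subnet size not being the cause of the master NIC errors.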

The error reported is that the master nodes' nic-0 and nic-2 are trying to use IPs that are already allocated to nic-1:
networkInterfaces/k8s-master-15426332-nic-0/ipConfigurations/ipconfig1 is using the private IP address 10.50.1.4 which is already allocated to resource networkInterfaces/k8s-master-15426332-nic-1.

The IP address in question is also the "firstConsecutiveStaticIP": "10.50.1.4". So either there needs to be no first IP address set or somehow the deployment is specifying reuse of the same IP address.

tamilmani1989 · 2017-12-13T00:47:18Z

The issue here is that the master that is processed first (in this example master-1) will get 10.50.1.5 as the primary IP, and then 30 available IPs as the secondary IPs to be used with pods (1.4, 1.6, 1.7, and so on). It does not know that 1.4 and 1.6 are to be used by the other two masters.

As a result, when master-0 is processed, it has 10.50.1.4 statically assigned as its primary IP, which cannot be honored because that address is already taken.

@cloudopsguy To unblock, can you specify the firstConsecutiveStaticIP from the last few addresses of the subnet?

We will have to think about what changes to make in acs-engine to change the above behavior.
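The conflict described above can be reproduced with a small simulation. This is only a sketch under stated assumptions: the master subnet is taken to be 10.50.1.0/27 (the thread gives only the /27 size), secondary IPs are assumed to be handed out lowest-free-first, and the function name and processing order are illustrative rather than taken from acs-engine or Azure.

```python
import ipaddress


def find_conflicts(first_static_ip, master_order, ip_address_count, subnet_cidr):
    """Return (master index, wanted primary IP, current owner) for each clash.

    Assumption: each master takes its static primary IP
    (first_static_ip + index), then fills its remaining IP slots with the
    lowest free addresses in the subnet.
    """
    subnet = ipaddress.ip_network(subnet_cidr)
    # Skip the addresses Azure reserves: the first four and the last one.
    usable = list(subnet)[4:-1]
    first = ipaddress.ip_address(first_static_ip)
    owner = {}  # address -> index of the master that holds it
    conflicts = []
    for idx in master_order:
        primary = first + idx
        if primary in owner:
            conflicts.append((idx, str(primary), owner[primary]))
            continue
        owner[primary] = idx
        needed = ip_address_count - 1  # secondary IPs for pods
        for addr in usable:
            if needed == 0:
                break
            if addr not in owner:
                owner[addr] = idx
                needed -= 1
    return conflicts


# master-1 happens to be processed first, as in the example above.
print(find_conflicts("10.50.1.4", [1, 0, 2], 5, "10.50.1.0/27"))
# Suggested workaround: static IPs taken from the last few addresses.
print(find_conflicts("10.50.1.28", [1, 0, 2], 5, "10.50.1.0/27"))
```

With the first call, master-0 and master-2 both find their primary IPs (10.50.1.4 and 10.50.1.6) already owned by master-1, matching the two reported errors. With the static range moved to the top of the subnet, the simulation reports no conflicts.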

cloudopsguy · 2017-12-14T04:14:38Z

Moving to the upper end of the network seemed to fix the problem. It seems that the pools tend to start from the bottom of the subnet and work their way up, so moving up 17 from the bottom fixed the problem, since 3 nodes with 5 IPs each only needed 15.
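The arithmetic behind this workaround can be sketched in Python. The sketch assumes the master subnet is 10.50.1.0/27 and that secondary IPs fill the subnet from the bottom up, as observed above. The function name is hypothetical, and the bound it computes is deliberately conservative: it is safe whichever master is processed first.

```python
import ipaddress


def safe_first_static_range(subnet_cidr, master_count, ip_address_count):
    """Conservative (lowest, highest) values for firstConsecutiveStaticIP.

    Assumption: secondary IPs are taken from the bottom of the subnet, so the
    static primaries must sit above every address the secondaries could use.
    """
    subnet = ipaddress.ip_network(subnet_cidr)
    # Skip the addresses Azure reserves: the first four and the last one.
    usable = list(subnet)[4:-1]
    secondary_total = master_count * (ip_address_count - 1)
    if secondary_total + master_count > len(usable):
        raise ValueError("subnet too small for the requested masters")
    lowest = usable[secondary_total]  # first address above all secondaries
    highest = usable[-master_count]   # leaves room for the last primary
    return lowest, highest


low, high = safe_first_static_range("10.50.1.0/27", 3, 5)
print(low, high)

# "Moving up 17 from the bottom" of the usable range (10.50.1.4):
moved = ipaddress.ip_address("10.50.1.4") + 17
print(moved, low <= moved <= high)
```

Under these assumptions the 12 secondary addresses occupy 10.50.1.4 through 10.50.1.15, so any firstConsecutiveStaticIP from 10.50.1.16 to 10.50.1.28 keeps the three primaries clear of them. An offset of 17 lands on 10.50.1.21, inside that window.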

cloudopsguy closed this as completed Dec 14, 2017

tamilmani1989 mentioned this issue Dec 23, 2017

Improving IP address assignment for master nodes with Azure CNI. #1966

Merged
