[Bugs][kubelet-eks 1.15.9 Rev 25] Ubuntu-EKS node series 20200318.1 (EKS 1.15) cannot join EKS cluster

I don’t know how to report bugs on kubelet-eks, so I put the bug report here.

[Bug Report] - Ubuntu-EKS node series 20200318.1 (EKS 1.15) cannot join EKS cluster

Reproduction procedure

Create an EKS worker node following the official instructions

  1. Create an EKS cluster (version 1.15)
  2. In the AWS CloudFormation console, click "Create stack (with new resources (standard))"
  3. Upload the official CloudFormation template file
  4. Enter the parameters for the VPC / security group / AMI ID (ap-southeast-1: ami-0b3fe54daa57fd53f)
  5. Log in to the worker node remotely to gather its status (see the commands after this list)
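
For step 5, a minimal set of commands for gathering the kubelet status on the worker node is shown below; the snap and service names are taken from the log output further down:

snap list kubelet-eks
systemctl status snap.kubelet-eks.daemon.service
journalctl -u snap.kubelet-eks.daemon.service --no-pager | tail -n 50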

Worker node status
Kubelet error log (in brief)

Mar 27 14:43:08 XXXXXXXXXXXXXX kubelet-eks.daemon[9673]: F0327 14:43:08.194679 9673 server.go:156] unknown flag: --allow-privileged
Mar 27 14:43:08 XXXXXXXXXXXXXX systemd[1]: snap.kubelet-eks.daemon.service: Main process exited, code=exited, status=255/n/a
Mar 27 14:43:08 XXXXXXXXXXXXXX systemd[1]: snap.kubelet-eks.daemon.service: Failed with result ‘exit-code’.
Mar 27 14:43:08 XXXXXXXXXXXXXX systemd[1]: snap.kubelet-eks.daemon.service: Service hold-off time over, scheduling restart.
Mar 27 14:43:08 XXXXXXXXXXXXXX systemd[1]: snap.kubelet-eks.daemon.service: Scheduled restart job, restart counter is at 1.

Root cause
--allow-privileged is no longer a supported kubelet parameter as of kubelet 1.15. Passing --allow-privileged to kubelet stops the kubelet on the worker node.
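
To confirm the root cause on an affected node, one quick check (assuming the bootstrap script lives at the path used in the template below) is:

grep -n 'allow-privileged' /etc/eks/bootstrap.sh
# On broken images this prints the line that injects the flag into the kubelet arguments.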

Temporary mitigation
Remove --allow-privileged from the kubelet startup parameters
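
On a worker node that has already been launched from the broken image, a rough equivalent of this mitigation is sketched below. The sed expression is the same one used in the template that follows; note that re-running bootstrap.sh on an already-bootstrapped node may not be fully idempotent:

sudo sed -i '/allow-privileged\=true\ \\/d' /etc/eks/bootstrap.sh
sudo /etc/eks/bootstrap.sh <cluster name> <bootstrap arguments>
sudo systemctl restart snap.kubelet-eks.daemon.service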

Modification of the AWS CloudFormation template to allow the worker node to join the EKS cluster

NodeLaunchConfig:
  Type: AWS::AutoScaling::LaunchConfiguration
  Properties:
    AssociatePublicIpAddress: 'false'
    BlockDeviceMappings:
      - DeviceName: /dev/xvda
        Ebs:
          DeleteOnTermination: true
          VolumeSize: !Ref NodeVolumeSize
          VolumeType: gp2
    IamInstanceProfile: !Ref NodeInstanceProfile
    ImageId: !Ref NodeImageId
    InstanceType: !Ref NodeInstanceType
    KeyName: !Ref KeyName
    SecurityGroups:
      - !Ref NodeSecurityGroup
      - !Ref WorkerNodeSecurityGroup
    UserData:
      Fn::Base64: !Sub |
        #cloud-config
        write_files:
        - path: /bin/startup.sh
          permissions: "0755"
          content: |
            #!/bin/bash
            set -o xtrace
            # Key change: drop the unsupported --allow-privileged flag before bootstrapping
            sed -i '/allow-privileged\=true\ \\/d' /etc/eks/bootstrap.sh
            /etc/eks/bootstrap.sh ${ClusterName} ${BootstrapArguments}
            /opt/aws/bin/cfn-signal --exit-code $? \
                --stack ${AWS::StackName} \
                --resource NodeGroup \
                --region ${AWS::Region}

        runcmd:
        - [ /bin/bash, /bin/startup.sh ]
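
As a usage sketch, the modified template can then be deployed from the CLI much like the original worker-node stack. The template file name below is hypothetical and the parameter names are assumed to match the official template, so adjust them to yours:

aws cloudformation create-stack \
  --stack-name <node group stack name> \
  --template-body file://amazon-eks-nodegroup-modified.yaml \
  --capabilities CAPABILITY_IAM \
  --parameters \
      ParameterKey=ClusterName,ParameterValue=<cluster name> \
      ParameterKey=NodeImageId,ParameterValue=ami-0b3fe54daa57fd53f \
      ParameterKey=KeyName,ParameterValue=<key pair name> \
      ParameterKey=NodeVolumeSize,ParameterValue=20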

Jean Ng
https://www.linkedin.com/in/jeanbaptisteng/

@joedborg perhaps you can take a look at this?

Also, this seems AWS- and/or Kubernetes-specific, so maybe try posting it on their respective forums as well?

Hello, this is a bug in the image: the bootstrap.sh it ships erroneously sets this flag. A bug for this issue has been filed at https://bugs.launchpad.net/cloud-images/+bug/1869562 and you can track progress on new images there. Thank you.

Another workaround is to launch your cluster using the eksctl tool. For example:

eksctl create cluster \
  --name <cluster name> \
  --version 1.15 \
  --nodegroup-name <node group name> \
  --node-type <instance shape> \
  --nodes <n> \
  --nodes-min <n> \
  --nodes-max <n> \
  --node-ami <latest ami for region> \
  --node-ami-family Ubuntu1804 \
  --region <desired region> \
  --ssh-access \
  --ssh-public-key <my key>

The latest EKS-optimized Ubuntu AMI IDs can be found on the Ubuntu EKS page.
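
If you prefer the CLI to the web page, something along these lines should also work; the owner ID is Canonical's, but the exact image name pattern is an assumption and may need adjusting:

aws ec2 describe-images \
  --owners 099720109477 \
  --filters "Name=name,Values=ubuntu-eks/k8s_1.15/images/*" \
  --query 'sort_by(Images, &CreationDate)[-1].[ImageId,Name]' \
  --region ap-southeast-1 \
  --output text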

Belated reply: the fix has been released with the latest EKS 1.15 images, serial 20200406.1 and later.
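
For anyone verifying, on a node built from one of those fixed images the earlier check should come back empty:

grep -n 'allow-privileged' /etc/eks/bootstrap.sh   # no output expected on serial 20200406.1 and later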