Replace godep with dep

Manuel de Brito Fontes 2017-10-06 17:26:14 -03:00
parent 1e7489927c
commit bf5616c65b
14883 changed files with 3937406 additions and 361781 deletions


@ -0,0 +1,25 @@
# Kubernetes Worker
### Building from the layer
You can clone the kubernetes-worker layer with git and build locally if you
have the charm package/snap installed.
```shell
# Install the snap
sudo snap install charm --channel=edge
# Set the build environment
export JUJU_REPOSITORY=$HOME
# Clone the layer and build it to our JUJU_REPOSITORY
git clone https://github.com/juju-solutions/kubernetes
cd kubernetes/cluster/juju/layers/kubernetes-worker
charm build -r
```
### Contributing
TBD


@ -0,0 +1,100 @@
# Kubernetes Worker
## Usage
This charm deploys a container runtime and stands up the Kubernetes
worker applications: kubelet and kube-proxy.
In order for this charm to be useful, it should be deployed with its companion
charm [kubernetes-master](https://jujucharms.com/u/containers/kubernetes-master)
and linked with an SDN-Plugin.
This charm has also been bundled up for your convenience so you can skip the
above steps, and deploy it with a single command:
```shell
juju deploy canonical-kubernetes
```
For more information about [Canonical Kubernetes](https://jujucharms.com/canonical-kubernetes)
consult the bundle `README.md` file.
## Scale out
To add compute capacity to your Kubernetes workers, scale the application
with `juju add-unit`. New units automatically join any related
kubernetes-master and enlist themselves as ready once the deployment is
complete.
## Operational actions
The kubernetes-worker charm supports the following Operational Actions:
#### Pause
Pausing the workload enables administrators to both [drain](http://kubernetes.io/docs/user-guide/kubectl/kubectl_drain/) and [cordon](http://kubernetes.io/docs/user-guide/kubectl/kubectl_cordon/)
a unit for maintenance.
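As a rough illustration of how the action's parameters translate into a drain, the following Python sketch mirrors the flag handling of the pause action script. The helper name `drain_command` is hypothetical; the kubeconfig path and the `--delete-local-data=true` and `--force` flags are the ones the script itself passes. The sketch only builds the argument list and does not call kubectl.

```python
import socket


def drain_command(delete_local_data=False, force=False, node=None):
    """Build the kubectl drain command the pause action would run.

    Hypothetical helper for illustration only.
    """
    node = node or socket.gethostname()
    cmd = ['kubectl', '--kubeconfig=/root/cdk/kubeconfig', 'drain', node]
    if delete_local_data:
        # Corresponds to the delete-local-data action parameter.
        cmd.append('--delete-local-data=true')
    if force:
        # Corresponds to the force action parameter.
        cmd.append('--force')
    return cmd


print(' '.join(drain_command(force=True, node='worker-0')))
```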
#### Resume
Resuming the workload will [uncordon](http://kubernetes.io/docs/user-guide/kubectl/kubectl_uncordon/) a paused unit. Workloads will automatically migrate unless otherwise directed via their application declaration.
## Private registry
With the "registry" action that is part of the kubernetes-worker charm, you can easily create a private Docker registry with authentication, available over TLS. Please note that the registry deployed with the action is not HA, and uses storage tied to the Kubernetes node where the pod is running. If the registry pod is migrated from one node to another for whatever reason, you will need to re-publish the images.
### Example usage
Create the relevant authentication files. For example, to let user `userA` authenticate with the password `passwordA`, run:

    echo -n "userA:passwordA" > htpasswd-plain
    htpasswd -c -b -B htpasswd userA passwordA

(The `htpasswd` program comes with the `apache2-utils` package.)

Supposing your registry will be reachable at `myregistry.company.com`, and that you already have your TLS key in the `registry.key` file and your TLS certificate (with `myregistry.company.com` as Common Name) in the `registry.crt` file, you would then run:

    juju run-action kubernetes-worker/0 registry domain=myregistry.company.com htpasswd="$(base64 -w0 htpasswd)" htpasswd-plain="$(base64 -w0 htpasswd-plain)" tlscert="$(base64 -w0 registry.crt)" tlskey="$(base64 -w0 registry.key)" ingress=true

If you later decide to delete the registry, run:

    juju run-action kubernetes-worker/0 registry delete=true ingress=true
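If you prefer to assemble the creation command from a script, a minimal Python sketch could look like the following. The helper names `encode_file` and `registry_action_args` are hypothetical and not part of the charm; the parameter names and the `kubernetes-worker/0` unit come from the example above. The sketch only builds the argument list (suitable for `subprocess.call`) and does not run juju.

```python
import base64


def encode_file(path):
    """Return a file's contents base64-encoded on one line, like `base64 -w0`."""
    with open(path, 'rb') as handle:
        return base64.b64encode(handle.read()).decode('ascii')


def registry_action_args(domain, files, ingress=True):
    """Build the argument list for the registry action (hypothetical helper).

    `files` maps the action parameter names (htpasswd, htpasswd-plain,
    tlscert, tlskey) to local file paths.
    """
    args = ['juju', 'run-action', 'kubernetes-worker/0', 'registry',
            'domain={}'.format(domain)]
    for param in ('htpasswd', 'htpasswd-plain', 'tlscert', 'tlskey'):
        args.append('{}={}'.format(param, encode_file(files[param])))
    args.append('ingress={}'.format(str(ingress).lower()))
    return args
```

Passing the arguments as a list avoids shell quoting problems with the long base64 strings.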
## Known Limitations
Kubernetes workers currently only support 'phaux' HA scenarios. Even when configured with an HA cluster string, they will only ever contact the first unit in the cluster map. To enable a proper HA story, kubernetes-worker units are encouraged to proxy through a [kubeapi-load-balancer](https://jujucharms.com/kubeapi-load-balancer)
application. This enables a HA deployment without the need to
re-render configuration and disrupt the worker services.
External access to pods must be performed through a [Kubernetes
Ingress Resource](http://kubernetes.io/docs/user-guide/ingress/).
When using NodePort type networking, there is no automation in exposing the
ports selected by kubernetes or chosen by the user. They will need to be
opened manually and can be performed across an entire worker pool.
If your NodePort service port selected is `30510` you can open this across all
members of a worker pool named `kubernetes-worker` like so:
```
juju run --application kubernetes-worker open-port 30510/tcp
```
Don't forget to expose the kubernetes-worker application if it's not already
exposed, as this can cause confusion once the port has been opened and the
service is not reachable.
Note: When debugging connection issues with NodePort services, it's important
to first check the kube-proxy service on the worker units. If kube-proxy is not
running, the associated port-mapping will not be configured in the iptables
rulechains.
If you need to close the NodePort once a workload has been terminated, you can
follow the same steps inversely.
```
juju run --application kubernetes-worker close-port 30510/tcp
```
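When several NodePort services are involved, the commands above can be generated rather than typed by hand. The following Python sketch is one way to do that; the helper name `port_commands` and its defaults are assumptions made for illustration. It only produces the command strings and leaves running them to you.

```python
def port_commands(ports, action='open-port',
                  application='kubernetes-worker', protocol='tcp'):
    """Return one `juju run` command string per port (hypothetical helper)."""
    if action not in ('open-port', 'close-port'):
        raise ValueError('action must be open-port or close-port')
    template = 'juju run --application {} {} {}/{}'
    return [template.format(application, action, port, protocol)
            for port in ports]


for command in port_commands([30510, 30511]):
    print(command)
```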


@ -0,0 +1,56 @@
pause:
description: |
Cordon the unit, draining all active workloads.
params:
delete-local-data:
type: boolean
description: Force deletion of local storage to enable a drain
default: False
force:
type: boolean
description: |
Continue even if there are pods not managed by a RC, RS, Job, DS or SS
default: False
resume:
description: |
Uncordon the unit, enabling workload scheduling.
microbot:
description: Launch microbot containers
params:
replicas:
type: integer
default: 3
description: Number of microbots to launch in Kubernetes.
delete:
type: boolean
default: False
description: Remove a microbots deployment, service, and ingress if True.
upgrade:
description: Upgrade the kubernetes snaps
registry:
description: Create a private Docker registry
params:
htpasswd:
type: string
description: base64 encoded htpasswd file used for authentication.
htpasswd-plain:
type: string
description: base64 encoded plaintext version of the htpasswd file, needed by docker daemons to authenticate to the registry.
tlscert:
type: string
description: base64 encoded TLS certificate for the registry. Common Name must match the domain name of the registry.
tlskey:
type: string
description: base64 encoded TLS key for the registry.
domain:
type: string
description: The domain name for the registry. Must match the Common Name of the certificate.
ingress:
type: boolean
default: false
description: Create an Ingress resource for the registry (or delete resource object if "delete" is True)
delete:
type: boolean
default: false
description: Remove a registry replication controller, service, and ingress if True.


@ -0,0 +1,73 @@
#!/usr/bin/env python3
# Copyright 2015 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import sys
from charmhelpers.core.hookenv import action_get
from charmhelpers.core.hookenv import action_set
from charmhelpers.core.hookenv import unit_public_ip
from charms.templating.jinja2 import render
from subprocess import call
os.environ['PATH'] += os.pathsep + os.path.join(os.sep, 'snap', 'bin')
context = {}
context['replicas'] = action_get('replicas')
context['delete'] = action_get('delete')
context['public_address'] = unit_public_ip()
if not context['replicas']:
context['replicas'] = 3
# Declare a kubectl template when invoking kubectl
kubectl = ['kubectl', '--kubeconfig=/root/cdk/kubeconfig']
# Remove deployment if requested
if context['delete']:
service_del = kubectl + ['delete', 'svc', 'microbot']
service_response = call(service_del)
deploy_del = kubectl + ['delete', 'deployment', 'microbot']
deploy_response = call(deploy_del)
ingress_del = kubectl + ['delete', 'ing', 'microbot-ingress']
ingress_response = call(ingress_del)
if ingress_response != 0:
action_set({'microbot-ing':
'Failed removal of microbot ingress resource.'})
if deploy_response != 0:
action_set({'microbot-deployment':
'Failed removal of microbot deployment resource.'})
if service_response != 0:
action_set({'microbot-service':
'Failed removal of microbot service resource.'})
sys.exit(0)
# Creation request
render('microbot-example.yaml', '/root/cdk/addons/microbot.yaml',
context)
create_command = kubectl + ['create', '-f',
'/root/cdk/addons/microbot.yaml']
create_response = call(create_command)
if create_response == 0:
action_set({'address':
'microbot.{}.xip.io'.format(context['public_address'])})
else:
action_set({'microbot-create': 'Failed microbot creation.'})


@ -0,0 +1,28 @@
#!/bin/bash
set -ex
export PATH=$PATH:/snap/bin
DELETE_LOCAL_DATA=$(action-get delete-local-data)
FORCE=$(action-get force)
# placeholder for additional flags to the command
export EXTRA_FLAGS=""
# Determine if we have extra flags
if [[ "${DELETE_LOCAL_DATA}" == "True" || "${DELETE_LOCAL_DATA}" == "true" ]]; then
EXTRA_FLAGS="${EXTRA_FLAGS} --delete-local-data=true"
fi
if [[ "${FORCE}" == "True" || "${FORCE}" == "true" ]]; then
EXTRA_FLAGS="${EXTRA_FLAGS} --force"
fi
# Cordon and drain the unit
kubectl --kubeconfig=/root/cdk/kubeconfig cordon $(hostname)
kubectl --kubeconfig=/root/cdk/kubeconfig drain $(hostname) ${EXTRA_FLAGS}
# Set status to indicate the unit is paused and under maintenance.
status-set 'waiting' 'Kubernetes unit paused'


@ -0,0 +1,136 @@
#!/usr/bin/python3
#
# For usage examples, see README.md
#
# TODO
#
# - make the action idempotent (i.e. if you run it multiple times, the first
# run will create/delete the registry, and the rest will be a no-op and won't
# error out)
#
# - take only a plain authentication file, and create the encrypted version in
# the action
#
# - validate the parameters (make sure tlscert is a certificate, that tlskey is a
# proper key, etc)
#
# - when https://bugs.launchpad.net/juju/+bug/1661015 is fixed, handle the
# base64 encoding of the parameters in the action itself
import os
import sys
from base64 import b64encode
from charmhelpers.core.hookenv import action_get
from charmhelpers.core.hookenv import action_set
from charms.templating.jinja2 import render
from subprocess import call
os.environ['PATH'] += os.pathsep + os.path.join(os.sep, 'snap', 'bin')
deletion = action_get('delete')
context = {}
# These config options must be defined in the case of a creation
param_error = False
for param in ('tlscert', 'tlskey', 'domain', 'htpasswd', 'htpasswd-plain'):
value = action_get(param)
if not value and not deletion:
key = "registry-create-parameter-{}".format(param)
error = "failure, parameter {} is required".format(param)
action_set({key: error})
param_error = True
context[param] = value
# Create the dockercfg template variable
dockercfg = '{"%s": {"auth": "%s", "email": "root@localhost"}}' % \
(context['domain'], context['htpasswd-plain'])
context['dockercfg'] = b64encode(dockercfg.encode()).decode('ASCII')
if param_error:
sys.exit(0)
# This one is either true or false, no need to check if it has a "good" value.
context['ingress'] = action_get('ingress')
# Declare a kubectl template when invoking kubectl
kubectl = ['kubectl', '--kubeconfig=/root/cdk/kubeconfig']
# Remove deployment if requested
if deletion:
resources = ['svc/kube-registry', 'rc/kube-registry-v0', 'secrets/registry-tls-data',
'secrets/registry-auth-data', 'secrets/registry-access']
if action_get('ingress'):
resources.append('ing/registry-ing')
delete_command = kubectl + ['delete', '--ignore-not-found=true'] + resources
delete_response = call(delete_command)
if delete_response == 0:
action_set({'registry-delete': 'success'})
else:
action_set({'registry-delete': 'failure'})
sys.exit(0)
# Creation request
render('registry.yaml', '/root/cdk/addons/registry.yaml',
context)
create_command = kubectl + ['create', '-f',
'/root/cdk/addons/registry.yaml']
create_response = call(create_command)
if create_response == 0:
action_set({'registry-create': 'success'})
# Create a ConfigMap if it doesn't exist yet, else patch it.
# A ConfigMap is needed to change the default value for nginx' client_max_body_size.
# The default is 1MB, and this is the maximum size of images that can be
# pushed on the registry. 1MB images aren't useful, so we bump this value to 1024MB.
cm_name = 'nginx-load-balancer-conf'
check_cm_command = kubectl + ['get', 'cm', cm_name]
check_cm_response = call(check_cm_command)
if check_cm_response == 0:
# There is an existing ConfigMap, patch it
patch = '{"data":{"body-size":"1024m"}}'
patch_cm_command = kubectl + ['patch', 'cm', cm_name, '-p', patch]
patch_cm_response = call(patch_cm_command)
if patch_cm_response == 0:
action_set({'configmap-patch': 'success'})
else:
action_set({'configmap-patch': 'failure'})
else:
# No existing ConfigMap, create it
render('registry-configmap.yaml', '/root/cdk/addons/registry-configmap.yaml',
context)
create_cm_command = kubectl + ['create', '-f', '/root/cdk/addons/registry-configmap.yaml']
create_cm_response = call(create_cm_command)
if create_cm_response == 0:
action_set({'configmap-create': 'success'})
else:
action_set({'configmap-create': 'failure'})
# Patch the "default" serviceaccount with an imagePullSecret.
# This will allow the docker daemons to authenticate to our private
# registry automatically
patch = '{"imagePullSecrets":[{"name":"registry-access"}]}'
patch_sa_command = kubectl + ['patch', 'sa', 'default', '-p', patch]
patch_sa_response = call(patch_sa_command)
if patch_sa_response == 0:
action_set({'serviceaccount-patch': 'success'})
else:
action_set({'serviceaccount-patch': 'failure'})
else:
action_set({'registry-create': 'failure'})


@ -0,0 +1,8 @@
#!/bin/bash
set -ex
export PATH=$PATH:/snap/bin
kubectl --kubeconfig=/root/cdk/kubeconfig uncordon $(hostname)
status-set 'active' 'Kubernetes unit resumed'


@ -0,0 +1,5 @@
#!/bin/sh
set -eux
charms.reactive set_state kubernetes-worker.snaps.upgrade-specified
exec hooks/config-changed


@ -0,0 +1,33 @@
options:
ingress:
type: boolean
default: true
description: |
Deploy the default http backend and ingress controller to handle
ingress requests.
labels:
type: string
default: ""
description: |
Labels can be used to organize and to select subsets of nodes in the
cluster. Declare node labels in key=value format, separated by spaces.
allow-privileged:
type: string
default: "auto"
description: |
Allow privileged containers to run on worker nodes. Supported values are
"true", "false", and "auto". If "true", kubelet will run in privileged
mode by default. If "false", kubelet will never run in privileged mode.
If "auto", kubelet will not run in privileged mode by default, but will
switch to privileged mode if gpu hardware is detected.
channel:
type: string
default: "1.7/stable"
description: |
Snap channel to install Kubernetes worker services from
require-manual-upgrade:
type: boolean
default: true
description: |
When true, worker services will not be upgraded until the user triggers
it manually by running the upgrade action.
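The three values of `allow-privileged` described above amount to a small decision table. The following Python sketch spells it out; the function name `kubelet_privileged` and the `gpu_detected` argument are hypothetical and only illustrate the documented behavior, not the charm's actual implementation.

```python
def kubelet_privileged(allow_privileged, gpu_detected):
    """Decide whether kubelet should run in privileged mode.

    Hypothetical helper mirroring the option description:
    "true" -> always, "false" -> never, "auto" -> only with GPU hardware.
    """
    value = allow_privileged.lower()
    if value == 'true':
        return True
    if value == 'false':
        return False
    if value == 'auto':
        return bool(gpu_detected)
    raise ValueError('unsupported allow-privileged value: {}'.format(value))


print(kubelet_privileged('auto', gpu_detected=True))
```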


@ -0,0 +1,13 @@
Copyright 2016 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.


@ -0,0 +1,8 @@
#!/bin/sh
set -ux
# We had to bump inotify limits once in the past, hence why this oddly specific
# script lives here in kubernetes-worker.
sysctl fs.inotify > $DEBUG_SCRIPT_DIR/sysctl-limits
ls -l /proc/*/fd/* | grep inotify > $DEBUG_SCRIPT_DIR/inotify-instances


@ -0,0 +1,15 @@
#!/bin/sh
set -ux
export PATH=$PATH:/snap/bin
alias kubectl="kubectl --kubeconfig=/root/cdk/kubeconfig"
kubectl cluster-info > $DEBUG_SCRIPT_DIR/cluster-info
kubectl cluster-info dump > $DEBUG_SCRIPT_DIR/cluster-info-dump
for obj in pods svc ingress secrets pv pvc rc; do
kubectl describe $obj --all-namespaces > $DEBUG_SCRIPT_DIR/describe-$obj
done
for obj in nodes; do
kubectl describe $obj > $DEBUG_SCRIPT_DIR/describe-$obj
done


@ -0,0 +1,9 @@
#!/bin/sh
set -ux
for service in kubelet kube-proxy; do
systemctl status snap.$service.daemon > $DEBUG_SCRIPT_DIR/$service-systemctl-status
journalctl -u snap.$service.daemon > $DEBUG_SCRIPT_DIR/$service-journal
done
# FIXME: get the snap config or something


@ -0,0 +1,2 @@
# This stubs out charm-pre-install coming from layer-docker as a workaround for
# offline installs until https://github.com/juju/charm-tools/issues/301 is fixed.


@ -0,0 +1,17 @@
#!/bin/bash
MY_HOSTNAME=$(hostname)
: ${JUJU_UNIT_NAME:=`uuidgen`}
if [ "${MY_HOSTNAME}" == "ubuntuguest" ]; then
juju-log "Detected broken vsphere integration. Applying hostname override"
FRIENDLY_HOSTNAME=$(echo $JUJU_UNIT_NAME | tr / -)
juju-log "Setting hostname to $FRIENDLY_HOSTNAME"
if [ ! -f /etc/hostname.orig ]; then
mv /etc/hostname /etc/hostname.orig
fi
echo "${FRIENDLY_HOSTNAME}" > /etc/hostname
hostname $FRIENDLY_HOSTNAME
fi



@ -0,0 +1,31 @@
repo: https://github.com/kubernetes/kubernetes.git
includes:
- 'layer:basic'
- 'layer:debug'
- 'layer:snap'
- 'layer:docker'
- 'layer:metrics'
- 'layer:nagios'
- 'layer:tls-client'
- 'layer:nvidia-cuda'
- 'interface:http'
- 'interface:kubernetes-cni'
- 'interface:kube-dns'
- 'interface:kube-control'
config:
deletes:
- install_from_upstream
options:
basic:
packages:
- 'cifs-utils'
- 'ceph-common'
- 'nfs-common'
- 'socat'
- 'virt-what'
tls-client:
ca_certificate_path: '/root/cdk/ca.crt'
server_certificate_path: '/root/cdk/server.crt'
server_key_path: '/root/cdk/server.key'
client_certificate_path: '/root/cdk/client.crt'
client_key_path: '/root/cdk/client.key'


@ -0,0 +1,35 @@
#!/usr/bin/env python
# Copyright 2015 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import re
import subprocess
def get_version(bin_name):
"""Get the version of an installed Kubernetes binary.
:param str bin_name: Name of binary
:return: 3-tuple version (maj, min, patch)
Example::
>>> get_version('kubelet')
(1, 6, 0)
"""
cmd = '{} --version'.format(bin_name).split()
version_string = subprocess.check_output(cmd).decode('utf-8')
return tuple(int(q) for q in re.findall("[0-9]+", version_string)[:3])
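The parsing step can be tried without an installed binary. The sketch below applies the same regular expression to a literal string; the helper name `parse_version` is hypothetical, and the sample string assumes the `Kubernetes vX.Y.Z` form that `kubelet --version` prints.

```python
import re


def parse_version(version_string):
    """Extract a (major, minor, patch) tuple from a version string.

    Hypothetical helper: same regex as get_version, minus the subprocess call.
    """
    return tuple(int(q) for q in re.findall("[0-9]+", version_string)[:3])


print(parse_version('Kubernetes v1.6.0'))  # (1, 6, 0)
```

Because only the first three runs of digits are kept, any digits in a trailing build or pre-release suffix are ignored.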


@ -0,0 +1,149 @@
#!/usr/bin/env python
# Copyright 2015 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from charmhelpers.core import unitdata
class FlagManager:
'''
FlagManager - A Python class for managing the flags to pass to an
application without remembering what's been set previously.
This is a blind class assuming the operator knows what they are doing.
Each instance of this class should be initialized with the intended
application to manage flags. Flags are then appended to a data-structure
and cached in unitdata for later recall.
The underlying data-provider is backed by a SQLITE database on each unit,
tracking the dictionary, provided from the 'charmhelpers' python package.
Summary:
opts = FlagManager('docker')
opts.add('bip', '192.168.22.2')
opts.to_s()
'''
def __init__(self, daemon, opts_path=None):
self.db = unitdata.kv()
self.daemon = daemon
if not self.db.get(daemon):
self.data = {}
else:
self.data = self.db.get(daemon)
def __save(self):
self.db.set(self.daemon, self.data)
def add(self, key, value, strict=False):
'''
Adds data to the map of values for the DockerOpts file.
Supports single values, or "multiopt variables". If you
have a flag only option, like --tlsverify, set the value
to None. To preserve the exact value, pass strict=True
eg:
opts.add('label', 'foo')
opts.add('label', 'foo, bar, baz')
opts.add('flagonly', None)
opts.add('cluster-store', 'consul://a:4001,b:4001,c:4001/swarm',
strict=True)
'''
if strict:
self.data['{}-strict'.format(key)] = value
self.__save()
return
if value:
values = [x.strip() for x in value.split(',')]
# handle updates
if key in self.data and self.data[key] is not None:
item_data = self.data[key]
for c in values:
c = c.strip()
if c not in item_data:
item_data.append(c)
self.data[key] = item_data
else:
# handle new
self.data[key] = values
else:
# handle flagonly
self.data[key] = None
self.__save()
def remove(self, key, value):
'''
Remove a flag value from the DockerOpts manager
Assuming the data is currently {'foo': ['bar', 'baz']}
d.remove('foo', 'bar')
> {'foo': ['baz']}
:params key:
:params value:
'''
self.data[key].remove(value)
self.__save()
def destroy(self, key, strict=False):
'''
Destructively remove all values and key from the FlagManager
Assuming the data is currently {'foo': ['bar', 'baz']}
d.destroy('foo')
>{}
:params key:
:params strict:
'''
try:
if strict:
self.data.pop('{}-strict'.format(key))
else:
self.data.pop(key)
self.__save()
except KeyError:
pass
def get(self, key, default=None):
"""Return the value for ``key``, or the default if ``key`` doesn't exist.
"""
return self.data.get(key, default)
def destroy_all(self):
'''
Destructively removes all data from the FlagManager.
'''
self.data.clear()
self.__save()
def to_s(self):
'''
Render the flags to a single string, prepared for the Docker
Defaults file. Typically in /etc/default/docker
d.to_s()
> "--foo=bar --foo=baz"
'''
flags = []
for key in self.data:
if self.data[key] is None:
# handle flagonly
flags.append("{}".format(key))
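elif key.endswith('-strict'):
# handle strict values by dropping the '-strict' suffix. Slice the
# key rather than calling rstrip('-strict'), which strips a set of
# characters and would also eat a trailing 's' from the real key.
proper_key = key[:-len('-strict')]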
flags.append("{}={}".format(proper_key, self.data[key]))
else:
# handle multiopt and typical flags
for item in self.data[key]:
flags.append("{}={}".format(key, item))
return ' '.join(flags)
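To see the rendering rules in isolation, the following sketch reimplements the logic of `to_s` as a plain function over a dictionary, without the `unitdata` storage. The function name `render_flags` and the sample data are made up for illustration. It detects strict keys by their `-strict` suffix.

```python
def render_flags(data):
    """Render a flag dictionary to a single string (hypothetical helper)."""
    flags = []
    for key, value in data.items():
        if value is None:
            # Flag-only option, no value.
            flags.append(key)
        elif key.endswith('-strict'):
            # Strict value: emit it verbatim under the key without the suffix.
            flags.append('{}={}'.format(key[:-len('-strict')], value))
        else:
            # Multi-valued option: one key=value pair per item.
            flags.extend('{}={}'.format(key, item) for item in value)
    return ' '.join(flags)


sample = {
    'foo': ['bar', 'baz'],
    'flagonly': None,
    'cluster-store-strict': 'consul://a:4001,b:4001/swarm',
}
print(render_flags(sample))
```

Note that no leading dashes are added, so a key must already include them if the target command expects `--foo=bar`.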


@ -0,0 +1,51 @@
name: kubernetes-worker
summary: The workload bearing units of a kubernetes cluster
maintainers:
- Tim Van Steenburgh <tim.van.steenburgh@canonical.com>
- George Kraft <george.kraft@canonical.com>
- Rye Terrell <rye.terrell@canonical.com>
- Konstantinos Tsakalozos <kos.tsakalozos@canonical.com>
- Charles Butler <Chuck@dasroot.net>
- Matthew Bruzek <mbruzek@ubuntu.com>
description: |
Kubernetes is an open-source platform for deploying, scaling, and operations
of application containers across a cluster of hosts. Kubernetes is portable
in that it works with public, private, and hybrid clouds. Extensible through
a pluggable infrastructure. Self healing in that it will automatically
restart and place containers on healthy nodes if a node ever goes away.
tags:
- misc
series:
- xenial
subordinate: false
requires:
kube-api-endpoint:
interface: http
kube-dns:
# kube-dns is deprecated. Its functionality has been rolled into the
# kube-control interface. The kube-dns relation will be removed in
# a future release.
interface: kube-dns
kube-control:
interface: kube-control
provides:
cni:
interface: kubernetes-cni
scope: container
resources:
cni:
type: file
filename: cni.tgz
description: CNI plugins
kubectl:
type: file
filename: kubectl.snap
description: kubectl snap
kubelet:
type: file
filename: kubelet.snap
description: kubelet snap
kube-proxy:
type: file
filename: kube-proxy.snap
description: kube-proxy snap


@ -0,0 +1,2 @@
metrics:
juju-units: {}


@ -0,0 +1,889 @@
#!/usr/bin/env python
# Copyright 2015 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import random
import shutil
import subprocess
import time
from shlex import split
from subprocess import check_call, check_output
from subprocess import CalledProcessError
from socket import gethostname
from charms import layer
from charms.layer import snap
from charms.reactive import hook
from charms.reactive import set_state, remove_state, is_state
from charms.reactive import when, when_any, when_not
from charms.kubernetes.common import get_version
from charms.kubernetes.flagmanager import FlagManager
from charms.reactive.helpers import data_changed, any_file_changed
from charms.templating.jinja2 import render
from charmhelpers.core import hookenv, unitdata
from charmhelpers.core.host import service_stop, service_restart
from charmhelpers.contrib.charmsupport import nrpe
# Override the default nagios shortname regex to allow periods, which we
# need because our bin names contain them (e.g. 'snap.foo.daemon'). The
# default regex in charmhelpers doesn't allow periods, but nagios itself does.
nrpe.Check.shortname_re = r'[\.A-Za-z0-9-_]+$'
kubeconfig_path = '/root/cdk/kubeconfig'
os.environ['PATH'] += os.pathsep + os.path.join(os.sep, 'snap', 'bin')
db = unitdata.kv()
@hook('upgrade-charm')
def upgrade_charm():
# Trigger removal of PPA docker installation if it was previously set.
set_state('config.changed.install_from_upstream')
hookenv.atexit(remove_state, 'config.changed.install_from_upstream')
cleanup_pre_snap_services()
check_resources_for_upgrade_needed()
# Remove gpu.enabled state so we can reconfigure gpu-related kubelet flags,
# since they can differ between k8s versions
remove_state('kubernetes-worker.gpu.enabled')
kubelet_opts = FlagManager('kubelet')
kubelet_opts.destroy('feature-gates')
kubelet_opts.destroy('experimental-nvidia-gpus')
remove_state('kubernetes-worker.cni-plugins.installed')
remove_state('kubernetes-worker.config.created')
remove_state('kubernetes-worker.ingress.available')
set_state('kubernetes-worker.restart-needed')
def check_resources_for_upgrade_needed():
hookenv.status_set('maintenance', 'Checking resources')
resources = ['kubectl', 'kubelet', 'kube-proxy']
paths = [hookenv.resource_get(resource) for resource in resources]
if any_file_changed(paths):
set_upgrade_needed()
def set_upgrade_needed():
set_state('kubernetes-worker.snaps.upgrade-needed')
config = hookenv.config()
previous_channel = config.previous('channel')
require_manual = config.get('require-manual-upgrade')
if previous_channel is None or not require_manual:
set_state('kubernetes-worker.snaps.upgrade-specified')
def cleanup_pre_snap_services():
# remove old states
remove_state('kubernetes-worker.components.installed')
# disable old services
services = ['kubelet', 'kube-proxy']
for service in services:
hookenv.log('Stopping {0} service.'.format(service))
service_stop(service)
# cleanup old files
files = [
"/lib/systemd/system/kubelet.service",
"/lib/systemd/system/kube-proxy.service",
"/etc/default/kube-default",
"/etc/default/kubelet",
"/etc/default/kube-proxy",
"/srv/kubernetes",
"/usr/local/bin/kubectl",
"/usr/local/bin/kubelet",
"/usr/local/bin/kube-proxy",
"/etc/kubernetes"
]
for file in files:
if os.path.isdir(file):
hookenv.log("Removing directory: " + file)
shutil.rmtree(file)
elif os.path.isfile(file):
hookenv.log("Removing file: " + file)
os.remove(file)
# cleanup old flagmanagers
FlagManager('kubelet').destroy_all()
FlagManager('kube-proxy').destroy_all()
@when('config.changed.channel')
def channel_changed():
set_upgrade_needed()
@when('kubernetes-worker.snaps.upgrade-needed')
@when_not('kubernetes-worker.snaps.upgrade-specified')
def upgrade_needed_status():
msg = 'Needs manual upgrade, run the upgrade action'
hookenv.status_set('blocked', msg)
@when('kubernetes-worker.snaps.upgrade-specified')
def install_snaps():
check_resources_for_upgrade_needed()
channel = hookenv.config('channel')
hookenv.status_set('maintenance', 'Installing kubectl snap')
snap.install('kubectl', channel=channel, classic=True)
hookenv.status_set('maintenance', 'Installing kubelet snap')
snap.install('kubelet', channel=channel, classic=True)
hookenv.status_set('maintenance', 'Installing kube-proxy snap')
snap.install('kube-proxy', channel=channel, classic=True)
set_state('kubernetes-worker.snaps.installed')
set_state('kubernetes-worker.restart-needed')
remove_state('kubernetes-worker.snaps.upgrade-needed')
remove_state('kubernetes-worker.snaps.upgrade-specified')
@hook('stop')
def shutdown():
''' When this unit is destroyed:
- delete the current node
- stop the worker services
'''
try:
if os.path.isfile(kubeconfig_path):
kubectl('delete', 'node', gethostname())
except CalledProcessError:
hookenv.log('Failed to unregister node.')
service_stop('snap.kubelet.daemon')
service_stop('snap.kube-proxy.daemon')
@when('docker.available')
@when_not('kubernetes-worker.cni-plugins.installed')
def install_cni_plugins():
''' Unpack the cni-plugins resource '''
charm_dir = os.getenv('CHARM_DIR')
# Get the resource via resource_get
try:
archive = hookenv.resource_get('cni')
except Exception:
message = 'Error fetching the cni resource.'
hookenv.log(message)
hookenv.status_set('blocked', message)
return
if not archive:
hookenv.log('Missing cni resource.')
hookenv.status_set('blocked', 'Missing cni resource.')
return
# Handle null resource publication, we check if filesize < 1mb
filesize = os.stat(archive).st_size
if filesize < 1000000:
hookenv.status_set('blocked', 'Incomplete cni resource.')
return
hookenv.status_set('maintenance', 'Unpacking cni resource.')
unpack_path = '{}/files/cni'.format(charm_dir)
os.makedirs(unpack_path, exist_ok=True)
cmd = ['tar', 'xfvz', archive, '-C', unpack_path]
hookenv.log(cmd)
check_call(cmd)
apps = [
{'name': 'loopback', 'path': '/opt/cni/bin'}
]
for app in apps:
unpacked = '{}/{}'.format(unpack_path, app['name'])
app_path = os.path.join(app['path'], app['name'])
install = ['install', '-v', '-D', unpacked, app_path]
hookenv.log(install)
check_call(install)
# Used by the "registry" action. The action is run on a single worker, but
# the registry pod can end up on any worker, so we need this directory on
# all the workers.
os.makedirs('/srv/registry', exist_ok=True)
set_state('kubernetes-worker.cni-plugins.installed')
@when('kubernetes-worker.snaps.installed')
def set_app_version():
''' Declare the application version to juju '''
cmd = ['kubelet', '--version']
version = check_output(cmd)
hookenv.application_version_set(version.split(b' v')[-1].rstrip())
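The version parsing in `set_app_version` can be tried in isolation. This is a minimal sketch, assuming `kubelet --version` prints something like `Kubernetes v1.7.4`; the sample bytes and the helper names `parse_version_string` and `version_tuple` are illustrative and not part of the charm.

```python
import re


def parse_version_string(raw):
    """Extract the version text from bytes such as b'Kubernetes v1.7.4\\n'."""
    # Same approach as set_app_version: keep whatever follows the last ' v'.
    return raw.split(b' v')[-1].rstrip().decode('utf-8')


def version_tuple(version):
    """Turn '1.7.4' into (1, 7, 4) so versions compare numerically."""
    return tuple(int(part) for part in re.findall(r'\d+', version)[:3])


# Assumed sample output, not captured from a real kubelet run.
sample = b'Kubernetes v1.7.4\n'
text = parse_version_string(sample)
print(text)                          # 1.7.4
print(version_tuple(text) < (1, 6))  # False
```

Comparing tuples rather than strings keeps a check such as `< (1, 6)` correct for versions like 1.10, which would sort before 1.6 as text.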
@when('kubernetes-worker.snaps.installed')
@when_not('kube-control.dns.available')
def notify_user_transient_status():
''' Notify the user that we are in a transient state and the application
is still converging, potentially remotely, or that we may be in a detached
loop wait state. '''
# During deployment the worker has to start kubelet without cluster dns
# configured. This may be the first unit online in a service pool, waiting
# to self host the dns pod and then configure itself to query the dns
# service declared in the kube-system namespace.
hookenv.status_set('waiting', 'Waiting for cluster DNS.')
@when('kubernetes-worker.snaps.installed',
'kube-control.dns.available')
@when_not('kubernetes-worker.snaps.upgrade-needed')
def charm_status(kube_control):
'''Update the status message with the current status of kubelet.'''
update_kubelet_status()
def update_kubelet_status():
''' There are different states that the kubelet can be in, where we are
waiting for dns, waiting for cluster turnup, or ready to serve
applications.'''
services = [
'kubelet',
'kube-proxy'
]
failing_services = []
for service in services:
daemon = 'snap.{}.daemon'.format(service)
if not _systemctl_is_active(daemon):
failing_services.append(service)
if len(failing_services) == 0:
hookenv.status_set('active', 'Kubernetes worker running.')
else:
msg = 'Waiting for {} to start.'.format(','.join(failing_services))
hookenv.status_set('waiting', msg)
@when('certificates.available')
def send_data(tls):
'''Send the data that is required to create a server certificate for
this server.'''
# Use the public ip of this unit as the Common Name for the certificate.
common_name = hookenv.unit_public_ip()
# Create SANs that the tls layer will add to the server cert.
sans = [
hookenv.unit_public_ip(),
hookenv.unit_private_ip(),
gethostname()
]
# Create a path safe name by removing path characters from the unit name.
certificate_name = hookenv.local_unit().replace('/', '_')
# Request a server cert with this information.
tls.request_server_cert(common_name, sans, certificate_name)
@when('kube-api-endpoint.available', 'kube-control.dns.available',
'cni.available')
def watch_for_changes(kube_api, kube_control, cni):
''' Watch for configuration changes and signal if we need to restart the
worker services '''
servers = get_kube_api_servers(kube_api)
dns = kube_control.get_dns()
cluster_cidr = cni.get_config()['cidr']
if (data_changed('kube-api-servers', servers) or
data_changed('kube-dns', dns) or
data_changed('cluster-cidr', cluster_cidr)):
set_state('kubernetes-worker.restart-needed')
@when('kubernetes-worker.snaps.installed', 'kube-api-endpoint.available',
'tls_client.ca.saved', 'tls_client.client.certificate.saved',
'tls_client.client.key.saved', 'tls_client.server.certificate.saved',
'tls_client.server.key.saved',
'kube-control.dns.available', 'kube-control.auth.available',
'cni.available', 'kubernetes-worker.restart-needed')
def start_worker(kube_api, kube_control, auth_control, cni):
''' Start kubelet using the provided API and DNS info.'''
servers = get_kube_api_servers(kube_api)
# Note that the DNS server doesn't necessarily exist at this point. We know
# what its IP will eventually be, though, so we can go ahead and configure
# kubelet with that info. This ensures that early pods are configured with
# the correct DNS even though the server isn't ready yet.
dns = kube_control.get_dns()
cluster_cidr = cni.get_config()['cidr']
if cluster_cidr is None:
hookenv.log('Waiting for cluster cidr.')
return
creds = kube_control.get_auth_credentials()
data_changed('kube-control.creds', creds)
# set --allow-privileged flag for kubelet
set_privileged()
create_config(random.choice(servers), creds)
configure_worker_services(servers, dns, cluster_cidr)
set_state('kubernetes-worker.config.created')
restart_unit_services()
update_kubelet_status()
apply_node_labels()
remove_state('kubernetes-worker.restart-needed')
@when('cni.connected')
@when_not('cni.configured')
def configure_cni(cni):
''' Set worker configuration on the CNI relation. This lets the CNI
subordinate know that we're the worker so it can respond accordingly. '''
cni.set_config(is_master=False, kubeconfig_path=kubeconfig_path)
@when('config.changed.ingress')
def toggle_ingress_state():
''' Ingress is a toggled state. Remove ingress.available if set when
toggled '''
remove_state('kubernetes-worker.ingress.available')
@when('docker.sdn.configured')
def sdn_changed():
'''The Software Defined Network changed on the container so restart the
kubernetes services.'''
restart_unit_services()
update_kubelet_status()
remove_state('docker.sdn.configured')
@when('kubernetes-worker.config.created')
@when_not('kubernetes-worker.ingress.available')
def render_and_launch_ingress():
''' If configuration has ingress RC enabled, launch the ingress load
balancer and default http backend. Otherwise attempt deletion. '''
config = hookenv.config()
# If ingress is enabled, launch the ingress controller
if config.get('ingress'):
launch_default_ingress_controller()
else:
hookenv.log('Deleting the http backend and ingress.')
kubectl_manifest('delete',
'/root/cdk/addons/default-http-backend.yaml')
kubectl_manifest('delete',
'/root/cdk/addons/ingress-replication-controller.yaml') # noqa
hookenv.close_port(80)
hookenv.close_port(443)
@when('kubernetes-worker.ingress.available')
def scale_ingress_controller():
''' Scale the number of ingress controller replicas to match the number of
nodes. '''
try:
output = kubectl('get', 'nodes', '-o', 'name')
count = len(output.splitlines())
kubectl('scale', '--replicas=%d' % count, 'rc/nginx-ingress-controller') # noqa
except CalledProcessError:
hookenv.log('Failed to scale ingress controllers. Will attempt again next update.') # noqa
@when('config.changed.labels', 'kubernetes-worker.config.created')
def apply_node_labels():
''' Parse the labels configuration option and apply the labels to the node.
'''
# scrub and try to format an array from the configuration option
config = hookenv.config()
user_labels = _parse_labels(config.get('labels'))
# For diffing's sake, parse the previous label set
if config.previous('labels'):
previous_labels = _parse_labels(config.previous('labels'))
hookenv.log('previous labels: {}'.format(previous_labels))
else:
# this handles first time run if there is no previous labels config
previous_labels = _parse_labels("")
# Calculate label removal
for label in previous_labels:
if label not in user_labels:
hookenv.log('Deleting node label {}'.format(label))
_apply_node_label(label, delete=True)
# if the label is in user labels we do nothing here, it will get set
# during the atomic update below.
# Atomically set a label
for label in user_labels:
_apply_node_label(label, overwrite=True)
def arch():
'''Return the package architecture as a string, as reported by dpkg. A
CalledProcessError is raised if the dpkg call itself fails.'''
# Get the package architecture for this system.
architecture = check_output(['dpkg', '--print-architecture']).rstrip()
# Convert the binary result into a string.
architecture = architecture.decode('utf-8')
return architecture
def create_config(server, creds):
'''Create a kubernetes configuration for the worker unit.'''
# Get the options from the tls-client layer.
layer_options = layer.options('tls-client')
# Get all the paths to the tls information required for kubeconfig.
ca = layer_options.get('ca_certificate_path')
# Create kubernetes configuration in the default location for ubuntu.
create_kubeconfig('/home/ubuntu/.kube/config', server, ca,
token=creds['client_token'], user='ubuntu')
# Make the config dir readable by the ubuntu user so juju scp works.
cmd = ['chown', '-R', 'ubuntu:ubuntu', '/home/ubuntu/.kube']
check_call(cmd)
# Create kubernetes configuration in the default location for root.
create_kubeconfig('/root/.kube/config', server, ca,
token=creds['client_token'], user='root')
# Create kubernetes configuration for kubelet, and kube-proxy services.
create_kubeconfig(kubeconfig_path, server, ca,
token=creds['kubelet_token'], user='kubelet')
def configure_worker_services(api_servers, dns, cluster_cidr):
''' Add remaining flags for the worker services and configure snaps to use
them '''
layer_options = layer.options('tls-client')
ca_cert_path = layer_options.get('ca_certificate_path')
server_cert_path = layer_options.get('server_certificate_path')
server_key_path = layer_options.get('server_key_path')
kubelet_opts = FlagManager('kubelet')
kubelet_opts.add('require-kubeconfig', 'true')
kubelet_opts.add('kubeconfig', kubeconfig_path)
kubelet_opts.add('network-plugin', 'cni')
kubelet_opts.add('v', '0')
kubelet_opts.add('address', '0.0.0.0')
kubelet_opts.add('port', '10250')
kubelet_opts.add('cluster-dns', dns['sdn-ip'])
kubelet_opts.add('cluster-domain', dns['domain'])
kubelet_opts.add('anonymous-auth', 'false')
kubelet_opts.add('client-ca-file', ca_cert_path)
kubelet_opts.add('tls-cert-file', server_cert_path)
kubelet_opts.add('tls-private-key-file', server_key_path)
kubelet_opts.add('logtostderr', 'true')
kube_proxy_opts = FlagManager('kube-proxy')
kube_proxy_opts.add('cluster-cidr', cluster_cidr)
kube_proxy_opts.add('kubeconfig', kubeconfig_path)
kube_proxy_opts.add('logtostderr', 'true')
kube_proxy_opts.add('v', '0')
kube_proxy_opts.add('master', random.choice(api_servers), strict=True)
if b'lxc' in check_output('virt-what', shell=True):
kube_proxy_opts.add('conntrack-max-per-core', '0')
cmd = ['snap', 'set', 'kubelet'] + kubelet_opts.to_s().split(' ')
check_call(cmd)
cmd = ['snap', 'set', 'kube-proxy'] + kube_proxy_opts.to_s().split(' ')
check_call(cmd)
def create_kubeconfig(kubeconfig, server, ca, key=None, certificate=None,
user='ubuntu', context='juju-context',
cluster='juju-cluster', password=None, token=None):
'''Create a configuration for Kubernetes at the given path using the
supplied values for the Kubernetes server, CA, key, certificate, user,
context and cluster.'''
if not key and not certificate and not password and not token:
raise ValueError('Missing authentication mechanism.')
# token and password are mutually exclusive. Error early if both are
# present. The developer has requested an impossible situation.
# see: kubectl config set-credentials --help
if token and password:
raise ValueError('Token and Password are mutually exclusive.')
# Create the config file with the address of the master server.
cmd = 'kubectl config --kubeconfig={0} set-cluster {1} ' \
'--server={2} --certificate-authority={3} --embed-certs=true'
check_call(split(cmd.format(kubeconfig, cluster, server, ca)))
# Delete old users
cmd = 'kubectl config --kubeconfig={0} unset users'
check_call(split(cmd.format(kubeconfig)))
# Create the credentials using the client flags.
cmd = 'kubectl config --kubeconfig={0} ' \
'set-credentials {1} '.format(kubeconfig, user)
if key and certificate:
cmd = '{0} --client-key={1} --client-certificate={2} '\
'--embed-certs=true'.format(cmd, key, certificate)
if password:
cmd = "{0} --username={1} --password={2}".format(cmd, user, password)
# This is mutually exclusive with password. They will not work together.
if token:
cmd = "{0} --token={1}".format(cmd, token)
check_call(split(cmd))
# Create a default context with the cluster.
cmd = 'kubectl config --kubeconfig={0} set-context {1} ' \
'--cluster={2} --user={3}'
check_call(split(cmd.format(kubeconfig, context, cluster, user)))
# Make the config use this new context.
cmd = 'kubectl config --kubeconfig={0} use-context {1}'
check_call(split(cmd.format(kubeconfig, context)))
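The credential handling in `create_kubeconfig` can be illustrated on its own. The sketch below is not the charm's code: it assumes the same validation rules and builds the `kubectl config set-credentials` argument list directly instead of formatting a string and splitting it. The function name `credentials_command` and the sample paths are hypothetical.

```python
def credentials_command(kubeconfig, user, key=None, certificate=None,
                        password=None, token=None):
    """Return the argv for `kubectl config set-credentials`."""
    if not key and not certificate and not password and not token:
        raise ValueError('Missing authentication mechanism.')
    # kubectl rejects a token combined with basic auth, so fail early.
    if token and password:
        raise ValueError('Token and Password are mutually exclusive.')
    cmd = ['kubectl', 'config', '--kubeconfig={}'.format(kubeconfig),
           'set-credentials', user]
    if key and certificate:
        cmd += ['--client-key={}'.format(key),
                '--client-certificate={}'.format(certificate),
                '--embed-certs=true']
    if password:
        cmd += ['--username={}'.format(user),
                '--password={}'.format(password)]
    if token:
        cmd.append('--token={}'.format(token))
    return cmd


# Hypothetical values, for illustration only.
print(credentials_command('/tmp/kubeconfig', 'kubelet', token='abc123'))
```

Passing a list to `check_call` avoids a second round of shell-style splitting, so a value containing spaces or quotes stays a single argument.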
def launch_default_ingress_controller():
''' Launch the Kubernetes ingress controller & default backend (404) '''
context = {}
context['arch'] = arch()
addon_path = '/root/cdk/addons/{}'
# Render the default http backend (404) replicationcontroller manifest
manifest = addon_path.format('default-http-backend.yaml')
render('default-http-backend.yaml', manifest, context)
hookenv.log('Creating the default http backend.')
try:
kubectl('apply', '-f', manifest)
except CalledProcessError as e:
hookenv.log(e)
hookenv.log('Failed to create default-http-backend. Will attempt again next update.') # noqa
hookenv.close_port(80)
hookenv.close_port(443)
return
# Render the ingress replication controller manifest
manifest = addon_path.format('ingress-replication-controller.yaml')
render('ingress-replication-controller.yaml', manifest, context)
hookenv.log('Creating the ingress replication controller.')
try:
kubectl('apply', '-f', manifest)
except CalledProcessError as e:
hookenv.log(e)
hookenv.log('Failed to create ingress controller. Will attempt again next update.') # noqa
hookenv.close_port(80)
hookenv.close_port(443)
return
set_state('kubernetes-worker.ingress.available')
hookenv.open_port(80)
hookenv.open_port(443)
def restart_unit_services():
'''Restart worker services.'''
hookenv.log('Restarting kubelet and kube-proxy.')
services = ['kube-proxy', 'kubelet']
for service in services:
service_restart('snap.%s.daemon' % service)
def get_kube_api_servers(kube_api):
'''Return the kubernetes api server address and port for this
relationship.'''
hosts = []
# Iterate over every service from the relation object.
for service in kube_api.services():
for unit in service['hosts']:
hosts.append('https://{0}:{1}'.format(unit['hostname'],
unit['port']))
return hosts
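To make the data shape concrete, here is a small sketch of the same URL-building loop run against a hand-written structure. The dictionary layout is an assumption inferred from the keys that `get_kube_api_servers` reads (`hosts`, `hostname`, `port`); the addresses and the `service_name` key are made up.

```python
def api_server_urls(services):
    """Build https://host:port URLs from a list of service dicts."""
    urls = []
    for service in services:
        for unit in service['hosts']:
            urls.append('https://{0}:{1}'.format(unit['hostname'],
                                                 unit['port']))
    return urls


# Assumed shape of the relation data, with invented addresses.
services = [
    {'service_name': 'kubernetes-master',
     'hosts': [{'hostname': '10.0.0.5', 'port': '6443'},
               {'hostname': '10.0.0.6', 'port': '6443'}]},
]
print(api_server_urls(services))
# ['https://10.0.0.5:6443', 'https://10.0.0.6:6443']
```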
def kubectl(*args):
''' Run a kubectl cli command with a config file. Returns stdout and throws
an error if the command fails. '''
command = ['kubectl', '--kubeconfig=' + kubeconfig_path] + list(args)
hookenv.log('Executing {}'.format(command))
return check_output(command)
def kubectl_success(*args):
''' Runs kubectl with the given args. Returns True if successful, False if
not. '''
try:
kubectl(*args)
return True
except CalledProcessError:
return False
def kubectl_manifest(operation, manifest):
''' Wrap the kubectl creation command when using filepath resources
:param operation - one of get, create, delete, replace
:param manifest - filepath to the manifest
'''
# Deletions are a special case
if operation == 'delete':
# Ensure we immediately remove requested resources with --now
return kubectl_success(operation, '-f', manifest, '--now')
else:
# Guard against an error re-creating the same manifest multiple times
if operation == 'create':
# If we already have the definition, it's probably safe to assume
# creation succeeded.
if kubectl_success('get', '-f', manifest):
hookenv.log('Skipping definition for {}'.format(manifest))
return True
# Execute the requested command that did not match any of the special
# cases above
return kubectl_success(operation, '-f', manifest)
@when('nrpe-external-master.available')
@when_not('nrpe-external-master.initial-config')
def initial_nrpe_config(nagios=None):
set_state('nrpe-external-master.initial-config')
update_nrpe_config(nagios)
@when('kubernetes-worker.config.created')
@when('nrpe-external-master.available')
@when_any('config.changed.nagios_context',
'config.changed.nagios_servicegroups')
def update_nrpe_config(unused=None):
services = ('snap.kubelet.daemon', 'snap.kube-proxy.daemon')
hostname = nrpe.get_nagios_hostname()
current_unit = nrpe.get_nagios_unit_name()
nrpe_setup = nrpe.NRPE(hostname=hostname)
nrpe.add_init_service_checks(nrpe_setup, services, current_unit)
nrpe_setup.write()
@when_not('nrpe-external-master.available')
@when('nrpe-external-master.initial-config')
def remove_nrpe_config(nagios=None):
remove_state('nrpe-external-master.initial-config')
# List of systemd services for which the checks will be removed
services = ('snap.kubelet.daemon', 'snap.kube-proxy.daemon')
# The current nrpe-external-master interface doesn't handle a lot of logic,
# use the charm-helpers code for now.
hostname = nrpe.get_nagios_hostname()
nrpe_setup = nrpe.NRPE(hostname=hostname)
for service in services:
nrpe_setup.remove_check(shortname=service)
def set_privileged():
"""Update the allow-privileged flag for kubelet.
"""
privileged = hookenv.config('allow-privileged')
if privileged == 'auto':
gpu_enabled = is_state('kubernetes-worker.gpu.enabled')
privileged = 'true' if gpu_enabled else 'false'
flag = 'allow-privileged'
hookenv.log('Setting {}={}'.format(flag, privileged))
kubelet_opts = FlagManager('kubelet')
kubelet_opts.add(flag, privileged)
if privileged == 'true':
set_state('kubernetes-worker.privileged')
else:
remove_state('kubernetes-worker.privileged')
@when('config.changed.allow-privileged')
@when('kubernetes-worker.config.created')
def on_config_allow_privileged_change():
"""React to changed 'allow-privileged' config value.
"""
set_state('kubernetes-worker.restart-needed')
remove_state('config.changed.allow-privileged')
@when('cuda.installed')
@when('kubernetes-worker.config.created')
@when_not('kubernetes-worker.gpu.enabled')
def enable_gpu():
"""Enable GPU usage on this node.
"""
config = hookenv.config()
if config['allow-privileged'] == "false":
hookenv.status_set(
'active',
'GPUs available. Set allow-privileged="auto" to enable.'
)
return
hookenv.log('Enabling gpu mode')
try:
# Not sure why this is necessary, but if you don't run this, k8s will
# think that the node has 0 gpus (as shown by the output of
# `kubectl get nodes -o yaml`).
check_call(['nvidia-smi'])
except CalledProcessError as cpe:
hookenv.log('Unable to communicate with the NVIDIA driver.')
hookenv.log(cpe)
return
kubelet_opts = FlagManager('kubelet')
if get_version('kubelet') < (1, 6):
hookenv.log('Adding --experimental-nvidia-gpus=1 to kubelet')
kubelet_opts.add('experimental-nvidia-gpus', '1')
else:
hookenv.log('Adding --feature-gates=Accelerators=true to kubelet')
kubelet_opts.add('feature-gates', 'Accelerators=true')
# Apply node labels
_apply_node_label('gpu=true', overwrite=True)
_apply_node_label('cuda=true', overwrite=True)
set_state('kubernetes-worker.gpu.enabled')
set_state('kubernetes-worker.restart-needed')
@when('kubernetes-worker.gpu.enabled')
@when_not('kubernetes-worker.privileged')
@when_not('kubernetes-worker.restart-needed')
def disable_gpu():
"""Disable GPU usage on this node.
This handler fires when we're running in gpu mode, and then the operator
sets allow-privileged="false". Since we can no longer run privileged
containers, we need to disable gpu mode.
"""
hookenv.log('Disabling gpu mode')
kubelet_opts = FlagManager('kubelet')
if get_version('kubelet') < (1, 6):
kubelet_opts.destroy('experimental-nvidia-gpus')
else:
kubelet_opts.remove('feature-gates', 'Accelerators=true')
# Remove node labels
_apply_node_label('gpu', delete=True)
_apply_node_label('cuda', delete=True)
remove_state('kubernetes-worker.gpu.enabled')
set_state('kubernetes-worker.restart-needed')
@when('kubernetes-worker.gpu.enabled')
@when('kube-control.connected')
def notify_master_gpu_enabled(kube_control):
"""Notify kubernetes-master that we're gpu-enabled.
"""
kube_control.set_gpu(True)
@when_not('kubernetes-worker.gpu.enabled')
@when('kube-control.connected')
def notify_master_gpu_not_enabled(kube_control):
"""Notify kubernetes-master that we're not gpu-enabled.
"""
kube_control.set_gpu(False)
@when('kube-control.connected')
def request_kubelet_and_proxy_credentials(kube_control):
""" Request kubelet node authorization with a well formed kubelet user.
This also implies that we are requesting kube-proxy auth. """
# The kube-control interface is created to support RBAC.
# At this point we might as well do the right thing and return the hostname
# even if it will only be used when we enable RBAC
nodeuser = 'system:node:{}'.format(gethostname())
kube_control.set_auth_request(nodeuser)
@when('kube-control.auth.available')
def catch_change_in_creds(kube_control):
"""Request a service restart in case credential updates were detected."""
creds = kube_control.get_auth_credentials()
if data_changed('kube-control.creds', creds):
set_state('kubernetes-worker.restart-needed')
@when_not('kube-control.connected')
def missing_kube_control():
"""Inform the operator they need to add the kube-control relation.
If deploying via bundle this won't happen, but if the operator is upgrading
a charm in a deployment that pre-dates the kube-control relation, it'll be
missing.
"""
hookenv.status_set(
'blocked',
'Relate {}:kube-control kubernetes-master:kube-control'.format(
hookenv.service_name()))
def _systemctl_is_active(application):
''' Poll systemctl to determine if the application is running '''
cmd = ['systemctl', 'is-active', application]
try:
raw = check_output(cmd)
return b'active' in raw
except Exception:
return False
class ApplyNodeLabelFailed(Exception):
pass
def _apply_node_label(label, delete=False, overwrite=False):
''' Invoke kubectl to apply node label changes '''
hostname = gethostname()
# TODO: Make this part of the kubectl calls instead of a special string
cmd_base = 'kubectl --kubeconfig={0} label node {1} {2}'
if delete is True:
label_key = label.split('=')[0]
cmd = cmd_base.format(kubeconfig_path, hostname, label_key)
cmd = cmd + '-'
else:
cmd = cmd_base.format(kubeconfig_path, hostname, label)
if overwrite:
cmd = '{} --overwrite'.format(cmd)
cmd = cmd.split()
deadline = time.time() + 60
while time.time() < deadline:
code = subprocess.call(cmd)
if code == 0:
break
hookenv.log('Failed to apply label %s, exit code %d. Will retry.' % (
label, code))
time.sleep(1)
else:
msg = 'Failed to apply label %s' % label
raise ApplyNodeLabelFailed(msg)
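The retry loop in `_apply_node_label` relies on Python's `while ... else`: the `else` branch runs only when the loop condition becomes false without a `break`. A minimal sketch of that pattern, with a hypothetical `retry_until` helper and an injectable `sleep` so the example runs instantly, might look like this:

```python
import time


class RetryFailed(Exception):
    pass


def retry_until(action, timeout=60, interval=1, sleep=time.sleep):
    """Call action() until it returns 0 or the deadline passes.

    Returns the number of attempts made; raises RetryFailed on timeout.
    """
    deadline = time.time() + timeout
    attempts = 0
    while time.time() < deadline:
        attempts += 1
        if action() == 0:
            break  # success: skips the else branch below
        sleep(interval)
    else:
        # Reached only when the deadline expired without a break.
        raise RetryFailed('gave up after {} attempts'.format(attempts))
    return attempts


# Simulated exit codes: two failures followed by a success.
codes = iter([1, 1, 0])
print(retry_until(lambda: next(codes), sleep=lambda _: None))  # 3
```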
def _parse_labels(labels):
''' Parse labels from a key=value string separated by space.'''
label_array = labels.split(' ')
sanitized_labels = []
for item in label_array:
if '=' in item:
sanitized_labels.append(item)
else:
hookenv.log('Skipping malformed option: {}'.format(item))
return sanitized_labels
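The label handling in `apply_node_labels` and `_parse_labels` amounts to a small diff between the previous and current config values. The sketch below separates that planning step from the kubectl calls. It is an illustration under assumptions, not the charm's code: `plan_label_changes` is a hypothetical name, and unlike the original it uses `split()` and tolerates `None`.

```python
def parse_labels(labels):
    """Split a space separated 'key=value' string, dropping malformed items."""
    return [item for item in (labels or '').split() if '=' in item]


def plan_label_changes(previous, current):
    """Return (keys_to_delete, labels_to_set) for a config change."""
    old = parse_labels(previous)
    new = parse_labels(current)
    # As in the charm, whole 'key=value' strings are compared, so a changed
    # value is deleted first and then set again with the new value.
    to_delete = [label.split('=')[0] for label in old if label not in new]
    return to_delete, new


to_delete, to_set = plan_label_changes('env=dev tier=web bogus', 'env=prod')
print(to_delete)  # ['env', 'tier']
print(to_set)     # ['env=prod']
```

Deletions have to be applied before the new labels are set; otherwise the `env` key in this example would be removed right after being updated.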

@ -0,0 +1,6 @@
apiVersion: v1
data:
body-size: 1024m
kind: ConfigMap
metadata:
name: nginx-load-balancer-conf

@ -0,0 +1,43 @@
apiVersion: v1
kind: ReplicationController
metadata:
name: default-http-backend
spec:
replicas: 1
selector:
app: default-http-backend
template:
metadata:
labels:
app: default-http-backend
spec:
terminationGracePeriodSeconds: 60
containers:
- name: default-http-backend
# Any image is permissible as long as:
# 1. It serves a 404 page at /
# 2. It serves 200 on a /healthz endpoint
image: gcr.io/google_containers/defaultbackend:1.0
livenessProbe:
httpGet:
path: /healthz
port: 8080
scheme: HTTP
initialDelaySeconds: 30
timeoutSeconds: 5
ports:
- containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
name: default-http-backend
labels:
app: default-http-backend
spec:
ports:
- port: 80
protocol: TCP
targetPort: 8080
selector:
app: default-http-backend

@ -0,0 +1,53 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: nginx-load-balancer-conf
---
apiVersion: v1
kind: ReplicationController
metadata:
name: nginx-ingress-controller
labels:
k8s-app: nginx-ingress-lb
spec:
replicas: 1
selector:
k8s-app: nginx-ingress-lb
template:
metadata:
labels:
k8s-app: nginx-ingress-lb
name: nginx-ingress-lb
spec:
terminationGracePeriodSeconds: 60
# hostPort doesn't work with CNI, so we have to use hostNetwork instead
# see https://github.com/kubernetes/kubernetes/issues/23920
hostNetwork: true
containers:
- image: gcr.io/google_containers/nginx-ingress-controller:0.8.3
name: nginx-ingress-lb
imagePullPolicy: Always
livenessProbe:
httpGet:
path: /healthz
port: 10254
scheme: HTTP
initialDelaySeconds: 30
timeoutSeconds: 5
# use downward API
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
ports:
- containerPort: 80
- containerPort: 443
args:
- /nginx-ingress-controller
- --default-backend-service=$(POD_NAMESPACE)/default-http-backend
- --nginx-configmap=$(POD_NAMESPACE)/nginx-load-balancer-conf

@ -0,0 +1,63 @@
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
creationTimestamp: null
labels:
app: microbot
name: microbot
spec:
replicas: {{ replicas }}
selector:
matchLabels:
app: microbot
strategy: {}
template:
metadata:
creationTimestamp: null
labels:
app: microbot
spec:
containers:
- image: dontrebootme/microbot:v1
imagePullPolicy: ""
name: microbot
ports:
- containerPort: 80
livenessProbe:
httpGet:
path: /
port: 80
initialDelaySeconds: 5
timeoutSeconds: 30
resources: {}
restartPolicy: Always
serviceAccountName: ""
status: {}
---
apiVersion: v1
kind: Service
metadata:
name: microbot
labels:
app: microbot
spec:
ports:
- port: 80
protocol: TCP
targetPort: 80
selector:
app: microbot
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: microbot-ingress
spec:
rules:
- host: microbot.{{ public_address }}.xip.io
http:
paths:
- path: /
backend:
serviceName: microbot
servicePort: 80

@ -0,0 +1,118 @@
apiVersion: v1
kind: Secret
metadata:
name: registry-tls-data
type: Opaque
data:
tls.crt: {{ tlscert }}
tls.key: {{ tlskey }}
---
apiVersion: v1
kind: Secret
metadata:
name: registry-auth-data
type: Opaque
data:
htpasswd: {{ htpasswd }}
---
apiVersion: v1
kind: ReplicationController
metadata:
name: kube-registry-v0
labels:
k8s-app: kube-registry
version: v0
kubernetes.io/cluster-service: "true"
spec:
replicas: 1
selector:
k8s-app: kube-registry
version: v0
template:
metadata:
labels:
k8s-app: kube-registry
version: v0
kubernetes.io/cluster-service: "true"
spec:
containers:
- name: registry
image: registry:2
resources:
# keep request = limit to keep this container in guaranteed class
limits:
cpu: 100m
memory: 100Mi
requests:
cpu: 100m
memory: 100Mi
env:
- name: REGISTRY_HTTP_ADDR
value: :5000
- name: REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY
value: /var/lib/registry
- name: REGISTRY_AUTH_HTPASSWD_REALM
value: basic_realm
- name: REGISTRY_AUTH_HTPASSWD_PATH
value: /auth/htpasswd
volumeMounts:
- name: image-store
mountPath: /var/lib/registry
- name: auth-dir
mountPath: /auth
ports:
- containerPort: 5000
name: registry
protocol: TCP
volumes:
- name: image-store
hostPath:
path: /srv/registry
- name: auth-dir
secret:
secretName: registry-auth-data
---
apiVersion: v1
kind: Service
metadata:
name: kube-registry
labels:
k8s-app: kube-registry
kubernetes.io/cluster-service: "true"
kubernetes.io/name: "KubeRegistry"
spec:
selector:
k8s-app: kube-registry
type: LoadBalancer
ports:
- name: registry
port: 5000
protocol: TCP
---
apiVersion: v1
kind: Secret
metadata:
name: registry-access
data:
.dockercfg: {{ dockercfg }}
type: kubernetes.io/dockercfg
{%- if ingress %}
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: registry-ing
spec:
tls:
- hosts:
- {{ domain }}
secretName: registry-tls-data
rules:
- host: {{ domain }}
http:
paths:
- backend:
serviceName: kube-registry
servicePort: 5000
path: /
{% endif %}

@ -0,0 +1 @@
charms.templating.jinja2>=0.0.1,<2.0.0