Get Kubeflow up and running on a private cloud

Today, more and more companies use artificial intelligence (AI) to improve the user experience of their products. These enterprises have the following goals:

  • To provide an easily configurable and scalable environment for their data scientists.
  • To protect their proprietary data by running AI workloads in their own data centers.
  • To have full control over, and access to, the environment.

IBM Cloud Private and Kubeflow are the tools that can fulfill those requirements.

Kubeflow is an open source project that is designed to make deployments of machine learning workloads easy, portable, and scalable on Kubernetes environments.

IBM Cloud Private is an integrated environment for managing containers that includes the container orchestrator Kubernetes, a private image registry, a management console, and monitoring frameworks – all running within your data center. IBM Cloud Private-Community Edition provides a limited offering that is available at no charge and ideal for test environments.

Because IBM Cloud Private is Kubernetes-based and is designed to be deployed in enterprise data centers, running Kubeflow on IBM Cloud Private is a good match for enterprises that need a portable and scalable on-premises solution.

This tutorial shows how to set up Kubeflow and IBM Cloud Private-Community Edition to work together in a private cloud environment where your data is protected in your own data center.

Learning objectives

You will learn how to deploy Kubeflow on IBM Cloud Private-Community Edition version 2.1.0.3 (with Kubernetes v1.10) using Ksonnet.

Prerequisites

One or more Ubuntu 16.04.4 servers with GPUs, running on bare metal or VMs.

Estimated time

Total time: approximately 2 hours

  • Set up IBM Cloud Private-Community Edition with GPU support: approximately 1 hour

  • Set up and verify Kubeflow on IBM Cloud Private-Community Edition: approximately 1 hour

Steps

Complete the following steps to set up IBM Cloud Private and Kubeflow to work together.

1. Set up IBM Cloud Private-Community Edition

To set up IBM Cloud Private from scratch, first follow the steps in "Preparing your cluster for installation" to prepare the nodes.

Then, follow the steps in "Installing IBM Cloud Private-CE" to set up IBM Cloud Private with either a single worker node or multiple worker nodes.

2. Set up Kubernetes CLI client for your IBM Cloud Private cluster

To install kubectl on the master node, run the following commands:

sudo curl -L https://storage.googleapis.com/kubernetes-release/release/v1.10.0/bin/linux/amd64/kubectl -o /usr/local/bin/kubectl

sudo chmod +x /usr/local/bin/kubectl

Next, configure kubectl on the master node to access your IBM Cloud Private cluster. Complete the following steps:

  1. Click the user icon in the upper right corner of the management console, and select Configure client.

  2. Copy the configuration commands.

  3. Open a terminal window on the master node, then paste and run the configuration commands that you copied in the previous step. kubectl on the master node is now set up to access your IBM Cloud Private cluster, but this configuration expires in 12 hours. To avoid having to reconfigure it, set up kubectl to use a service account token by completing the following steps.

  4. Get the existing service account secret name.

     $ kubectl get secret
     NAME                  TYPE                                  DATA      AGE
     calico-etcd-secrets   Opaque                                3         19h
     default-token-b9pfk   kubernetes.io/service-account-token   3         19h
    

    Note the name of the secret whose type is kubernetes.io/service-account-token (default-token-b9pfk in the output above). You use it in the next step.

  5. Use the service account token as your access credentials.

    Run the following command with the service account token secret name you got from the previous step:

     $ kubectl config set-credentials admin --token=$(kubectl get secret <your-token-secret-name> -o jsonpath='{.data.token}' | base64 -d)
     User "admin" set.
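
    The pipeline in the previous command works because Kubernetes stores secret values base64-encoded under the secret's data field. The following Python sketch performs the same decoding, assuming the JSON shape returned by kubectl get secret <your-token-secret-name> -o json. The sample secret and its token value are made up for illustration, and decode_secret_token is a hypothetical helper name.

```python
import base64
import json

# Hypothetical stand-in for the output of
#   kubectl get secret <your-token-secret-name> -o json
# The token value below is fabricated for illustration only.
sample_secret_json = json.dumps({
    "kind": "Secret",
    "type": "kubernetes.io/service-account-token",
    "data": {
        "token": base64.b64encode(b"example-service-account-token").decode("ascii"),
    },
})


def decode_secret_token(secret_json: str) -> str:
    """Return the decoded bearer token from a service-account-token secret."""
    secret = json.loads(secret_json)
    # Values under "data" are base64-encoded; decode to get the usable token.
    return base64.b64decode(secret["data"]["token"]).decode("utf-8")


token = decode_secret_token(sample_secret_json)
print(token)
```

    The decoded string is what kubectl config set-credentials receives through the --token option.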
    

3. Set up IBM Cloud Private to enable GPU support

  1. First, install the NVIDIA GPU driver on each worker node.

    Either use the already modified driver-installer.yaml file, or get the driver installer file from the Google Cloud GitHub repository and then remove the affinity section from the YAML file:

     wget https://raw.githubusercontent.com/GoogleCloudPlatform/container-engine-accelerators/master/nvidia-driver-installer/ubuntu/daemonset.yaml -O driver-installer.yaml
    

    Deploying the daemon set from the master node installs the driver on each worker node.

     # Launch the daemonset
     kubectl create -f driver-installer.yaml
    

    To verify the driver is installed properly, do the following.

     # Verify the driver is installed
     kubectl describe ds nvidia-driver-installer -n kube-system
    
     # ssh to your worker node(s) and run the following command on each node
     /home/kubernetes/bin/nvidia/bin/nvidia-smi
    
  2. Deploy the Kubernetes device plugin for NVIDIA GPUs on your IBM Cloud Private cluster.

    Either use the already modified device-plugin.yaml file, or get the device plugin file from the Kubernetes GitHub repository and then remove the affinity section from the YAML file:

     wget https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/device-plugins/nvidia-gpu/daemonset.yaml -O device-plugin.yaml
    

    Deploying the daemon set from the master node installs the device plugin on each worker node.

     # Launch the daemonset
     kubectl create -f device-plugin.yaml
     kubectl describe ds nvidia-gpu-device-plugin -n kube-system
    

    To verify that GPU support is enabled on each worker node, run the following commands:

     kubectl get nodes
     kubectl describe node <your-node-name> | grep Capacity -A 15
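
Both steps above ask you to remove the affinity section from a downloaded DaemonSet manifest. The following Python sketch shows one way that edit could be expressed, assuming the manifest has already been parsed into a dictionary (for example, with a YAML library, which is not used here so that the example stays self-contained). The sample manifest, its label key, and the remove_affinity helper are illustrative assumptions, not the contents of the actual files.

```python
import copy


def remove_affinity(daemonset: dict) -> dict:
    """Return a copy of a DaemonSet manifest without its pod-level affinity."""
    result = copy.deepcopy(daemonset)
    # In a DaemonSet, the pod spec lives under spec.template.spec.
    pod_spec = result.get("spec", {}).get("template", {}).get("spec", {})
    pod_spec.pop("affinity", None)  # no error if the section is absent
    return result


# Hypothetical, heavily trimmed manifest used only to exercise the helper.
sample_manifest = {
    "apiVersion": "apps/v1",
    "kind": "DaemonSet",
    "metadata": {"name": "nvidia-driver-installer", "namespace": "kube-system"},
    "spec": {
        "template": {
            "spec": {
                "affinity": {
                    "nodeAffinity": {
                        "requiredDuringSchedulingIgnoredDuringExecution": {
                            "nodeSelectorTerms": [
                                {"matchExpressions": [
                                    {"key": "example.com/accelerator", "operator": "Exists"}
                                ]}
                            ]
                        }
                    }
                },
                "containers": [{"name": "installer", "image": "example/installer"}],
            }
        }
    },
}

cleaned = remove_affinity(sample_manifest)
print("affinity" in cleaned["spec"]["template"]["spec"])
```

With a required node affinity in place, the daemon set's pods are scheduled only on nodes that carry the matching label, so removing the section lets them run on every worker node.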
    

At this point, your IBM Cloud Private environment is running and ready to use, with GPU support.
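
If you prefer to check GPU capacity programmatically rather than reading the kubectl describe output, the following Python sketch summarizes it per node. It assumes the JSON shape returned by kubectl get nodes -o json and assumes that the device plugin advertises GPUs under the resource name nvidia.com/gpu. The node names and counts in the sample are made up, and gpu_capacity_by_node is a hypothetical helper name.

```python
import json

# Hypothetical, trimmed stand-in for the output of `kubectl get nodes -o json`.
sample_nodes_json = json.dumps({
    "items": [
        {
            "metadata": {"name": "worker-1"},
            "status": {"capacity": {"cpu": "8", "nvidia.com/gpu": "2"}},
        },
        {
            "metadata": {"name": "worker-2"},
            "status": {"capacity": {"cpu": "8"}},
        },
    ]
})


def gpu_capacity_by_node(nodes_json: str) -> dict:
    """Map each node name to the number of GPUs listed in its capacity."""
    nodes = json.loads(nodes_json)["items"]
    return {
        node["metadata"]["name"]: int(
            node["status"]["capacity"].get("nvidia.com/gpu", "0")
        )
        for node in nodes
    }


for name, gpus in gpu_capacity_by_node(sample_nodes_json).items():
    print(f"{name}: {gpus} GPU(s)")
```

A worker node that reports zero GPUs here likely means that the driver or the device plugin is not running on that node.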

4. Install Ksonnet

Download Ksonnet (v0.12.0 or later) from the Ksonnet GitHub repository and install it on the master node.

5. Install and set up Kubeflow

To install Kubeflow, run the following commands on the master node:

# Select a version of Kubeflow to use.
# Refer to https://github.com/kubeflow/kubeflow/releases
# for the complete list of available versions

export KUBEFLOW_VERSION=0.2.2

export KUBEFLOW_KS_DIR=</path/to/store/your/ksonnet/application>

export KUBEFLOW_DEPLOY=false

# Create a ksonnet application in ${KUBEFLOW_KS_DIR}
curl https://raw.githubusercontent.com/kubeflow/kubeflow/v${KUBEFLOW_VERSION}/scripts/deploy.sh | bash

# ks commands must be run from inside the ksonnet application directory
cd "${KUBEFLOW_KS_DIR}"

# By default, the script enables Spartakus (the Kubernetes reporting tool) to
# collect anonymous usage data. To disable it, run the following command
ks param set kubeflow-core reportUsage false

6. Create a Ksonnet environment for your IBM Cloud Private cluster

Run the following commands on the master node:

# Get your IBM Cloud Private context name
kubectl config current-context

# Create an environment for your IBM Cloud Private cluster
# (set ${NAMESPACE} to the Kubernetes namespace where you want Kubeflow deployed)
ks env add <your-context-name> --namespace ${NAMESPACE}

# Apply kubeflow-core component to your IBM Cloud Private cluster
ks apply <your-context-name> -c kubeflow-core

# Change the JupyterHub service to use NodePort instead of the default ClusterIP
ks param set kubeflow-core jupyterHubServiceType NodePort
ks apply <your-context-name>
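
With the service type set to NodePort, Kubernetes exposes JupyterHub on a port of each node, and that port is recorded in the service object. The following Python sketch extracts it, assuming the JSON shape returned by kubectl get svc tf-hub-lb -n ${NAMESPACE} -o json. The port numbers and IP address are made up for illustration, and jupyterhub_address is a hypothetical helper name.

```python
import json

# Hypothetical, trimmed stand-in for the output of
#   kubectl get svc tf-hub-lb -n ${NAMESPACE} -o json
sample_service_json = json.dumps({
    "metadata": {"name": "tf-hub-lb"},
    "spec": {
        "type": "NodePort",
        "ports": [{"port": 80, "targetPort": 8000, "nodePort": 31234}],
    },
})


def jupyterhub_address(service_json: str, master_ip: str) -> str:
    """Build the master-ip:node-port address for a NodePort service."""
    spec = json.loads(service_json)["spec"]
    if spec.get("type") != "NodePort":
        raise ValueError("service is not of type NodePort")
    # The first port entry carries the externally reachable nodePort.
    node_port = spec["ports"][0]["nodePort"]
    return f"{master_ip}:{node_port}"


print(jupyterhub_address(sample_service_json, "192.0.2.10"))
```

If the helper raises ValueError, the service is still of type ClusterIP and the parameter change has not been applied yet.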

7. Connect to the Jupyter notebook

Kubeflow includes JupyterHub for end-users to create and manage interactive Jupyter notebooks.

Complete the following steps to connect to the Jupyter notebook:

  1. Get the port number for the tf-hub-lb service:

     kubectl get svc tf-hub-lb -n ${NAMESPACE}
    
  2. Open your-master-ip:your-tf-hub-lb-port in a browser.

  3. Sign in using any user name and password.

  4. Click the Start My Server button.

  5. Select an image.

  6. Allocate memory, CPU, GPU, and other resources accordingly.

  7. Click Spawn.

  8. Check the status of the Jupyter notebook pod spawn:

     USERNAME=<username-used-to-sign-in-in-step-3>
     kubectl -n ${NAMESPACE} describe pods jupyter-${USERNAME}
    
  9. After the Jupyter notebook is up and running, the Jupyter notebook interface opens in the browser. Create a notebook by clicking New.

  10. Verify the installation of Kubeflow by running the following statements in your Jupyter notebook:

     import tensorflow as tf
    
     # Creates a graph.
     with tf.device('/device:GPU:0'):
       a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
       b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
       c = tf.matmul(a, b)
    
     # Creates a session.
     sess = tf.InteractiveSession()
    
     # Runs the op.
     print(sess.run(c))
    

    If the code does not produce any error, your environment is working properly.
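
To confirm that the printed result is also correct, you can compare it with the same matrix product computed without TensorFlow or a GPU. The following is a minimal plain-Python sketch. The matmul helper is written here only for this check, and the formatting of the printed values differs from TensorFlow's output.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
        for row in a
    ]


# Same values as the tf.constant calls, reshaped to 2x3 and 3x2.
a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
b = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

print(matmul(a, b))  # [[22.0, 28.0], [49.0, 64.0]]
```

The TensorFlow snippet should print the same four values: 22 and 28 in the first row, 49 and 64 in the second.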

Summary

This tutorial described the steps to deploy Kubeflow on an IBM Cloud Private cluster with GPU support. You can now try this in your own environment for a portable and scalable on-premises solution that protects your enterprise data in your own data center.