A Brief Analysis of How Kubernetes NetworkPolicy Works

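Kubernetes connects Pods running on different Nodes in a cluster, and by default every Pod can reach every other Pod. In some scenarios, however, certain Pods should not be able to talk to each other, and access control is needed. How can this be achieved?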

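Introduction

Kubernetes provides the NetworkPolicy feature, which supports network access control at both the Namespace level and the Pod level. It uses labels to select namespaces or pods, and with Calico the rules are enforced through iptables. This article briefly describes how Kubernetes NetworkPolicy works on Calico.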

Control-plane data flow

A Network Policy is a Kubernetes resource that takes effect after it has been defined, stored, and configured. The flow, in brief:

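  • Create the network policy resource with the kubectl client;
  • Calico's policy-controller watches network policy resources and, on receiving one, writes it to Calico's etcd datastore;
  • calico-felix on each node reads the policy from etcd and calls iptables to configure the corresponding rules.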
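The three steps above can be pictured with a toy model. This is only a sketch: a plain dict stands in for Calico's etcd, the function names are hypothetical, and the real policy-controller converts the resource into Calico's own format rather than storing the spec verbatim. The key prefix and the `cali-pi-` chain prefix are taken from the etcdctl and iptables output shown later in this article.

```python
# Toy model of the control-plane flow; a dict stands in for Calico's etcd.
POLICY_PREFIX = "/calico/v1/policy/tier/default/policy/"

def policy_controller_sync(etcd, k8s_policy):
    """Mimic the policy-controller: store a watched NetworkPolicy in etcd."""
    meta = k8s_policy["metadata"]
    name = f'{meta.get("namespace", "default")}.{meta["name"]}'
    etcd[POLICY_PREFIX + name] = k8s_policy["spec"]  # real controller converts the spec
    return name

def felix_sync(etcd):
    """Mimic calico-felix: list stored policies and name their inbound chains."""
    names = sorted(k[len(POLICY_PREFIX):] for k in etcd if k.startswith(POLICY_PREFIX))
    return [f"cali-pi-{n}" for n in names]

etcd = {}
policy_controller_sync(etcd, {
    "metadata": {"name": "web-deny-all", "namespace": "default"},
    "spec": {"podSelector": {"matchLabels": {"app": "web"}}},
})
chains = felix_sync(etcd)
print(chains)
```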

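Resource configuration templates

Network Policy supports access control at the Pod and Namespace levels. The templates below can serve as references when defining the resource.

Access by pod label

We want to control access to all pods in namespace myns that carry the label "role: backend": only traffic from Pods labeled "role: frontend" to TCP port 6379 is allowed in, and all other traffic is denied.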

kind: NetworkPolicy
apiVersion: extensions/v1beta1 
metadata:
  name: allow-frontend
  namespace: myns
spec:
  podSelector:            
    matchLabels:
      role: backend
  ingress:                
    - from:              
        - podSelector:
            matchLabels:
              role: frontend
      ports:
        - protocol: TCP
          port: 6379
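To make the intended semantics of this template concrete, here is a minimal sketch of how a single policy could be evaluated against one connection. It is an illustration only, not how Kubernetes or Calico evaluate policies: it handles only `matchLabels` and `podSelector` peers, assumes both pods live in the policy's namespace, and the function names are hypothetical.

```python
def labels_match(selector, labels):
    """True if every matchLabels pair of the selector appears in labels."""
    return all(labels.get(k) == v for k, v in selector.get("matchLabels", {}).items())

def ingress_allowed(policy, src_labels, dst_labels, protocol, port):
    """Apply one policy to traffic between two pods in the policy's namespace."""
    spec = policy["spec"]
    if not labels_match(spec["podSelector"], dst_labels):
        return True  # this policy does not select the destination pod
    for rule in spec.get("ingress", []):
        peers, ports = rule.get("from"), rule.get("ports")
        from_ok = peers is None or any(
            "podSelector" in p and labels_match(p["podSelector"], src_labels)
            for p in peers
        )
        port_ok = ports is None or any(
            p.get("protocol", "TCP") == protocol and p.get("port", port) == port
            for p in ports
        )
        if from_ok and port_ok:
            return True
    return False  # pod is selected and no rule admitted the traffic

allow_frontend = {"spec": {
    "podSelector": {"matchLabels": {"role": "backend"}},
    "ingress": [{
        "from": [{"podSelector": {"matchLabels": {"role": "frontend"}}}],
        "ports": [{"protocol": "TCP", "port": 6379}],
    }],
}}

backend, frontend = {"role": "backend"}, {"role": "frontend"}
print(ingress_allowed(allow_frontend, frontend, backend, "TCP", 6379))
print(ingress_allowed(allow_frontend, frontend, backend, "TCP", 80))
```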

Access by namespace label

We want to control access to all Pods labeled "role: frontend": only traffic from Pods in namespaces labeled "user: bob" to TCP port 443 is allowed in, and all other traffic is denied.

kind: NetworkPolicy
apiVersion: extensions/v1beta1 
metadata:
  name: allow-tcp-443
spec:
  podSelector:            
    matchLabels:
      role: frontend 
  ingress:
    - ports:
        - protocol: TCP
          port: 443 
      from:
        - namespaceSelector:
            matchLabels:
              user: bob 
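The difference from the pod-label case is that the peer is matched against the labels of the source pod's namespace rather than the labels of the pod itself. A small sketch of that check follows; the namespace names and the `peer_admits` helper are hypothetical and only `matchLabels` is handled.

```python
def labels_match(selector, labels):
    """True if every matchLabels pair of the selector appears in labels."""
    return all(labels.get(k) == v for k, v in selector.get("matchLabels", {}).items())

def peer_admits(peer, src_pod_labels, src_ns_labels):
    """Check one entry of ingress.from against a source pod and its namespace."""
    if "namespaceSelector" in peer:
        # Namespace peers look only at the labels of the source namespace.
        return labels_match(peer["namespaceSelector"], src_ns_labels)
    if "podSelector" in peer:
        # Pod peers look at the source pod's own labels
        # (namespace scoping is ignored in this sketch).
        return labels_match(peer["podSelector"], src_pod_labels)
    return False

# Hypothetical namespaces and their labels.
namespace_labels = {"bob-ns": {"user": "bob"}, "alice-ns": {"user": "alice"}}
peer = {"namespaceSelector": {"matchLabels": {"user": "bob"}}}

print(peer_admits(peer, {"app": "client"}, namespace_labels["bob-ns"]))
print(peer_admits(peer, {"app": "client"}, namespace_labels["alice-ns"]))
```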

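NetworkPolicy data structure definition

Having seen the examples above, you should have a basic picture of the NetworkPolicy resource object. Next, let's look at how Kubernetes defines this API: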

type NetworkPolicy struct {
    TypeMeta
    ObjectMeta
    Spec NetworkPolicySpec 
}

type NetworkPolicySpec struct {
    PodSelector unversioned.LabelSelector `json:"podSelector"`
    Ingress []NetworkPolicyIngressRule `json:"ingress,omitempty"`
}

type NetworkPolicyIngressRule struct {
    Ports *[]NetworkPolicyPort `json:"ports,omitempty"`
    From *[]NetworkPolicyPeer `json:"from,omitempty"`
}

type NetworkPolicyPort struct {
    Protocol *api.Protocol `json:"protocol,omitempty"`
    Port *intstr.IntOrString `json:"port,omitempty"`
}

type NetworkPolicyPeer struct {
    PodSelector *unversioned.LabelSelector `json:"podSelector,omitempty"`
    NamespaceSelector *unversioned.LabelSelector `json:"namespaceSelector,omitempty"`
}

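In short, the resource identifies two groups of Pods: the Pods whose access is being controlled and the Pods that are admitted. The former are selected by spec.podSelector, the latter by the selectors under ingress.from.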
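For readers more comfortable with Python than Go, the same shape can be mirrored with dataclasses. This is a loose sketch for illustration, not a client library: the field names are Python renderings of the Go fields above, optional pointers become `Optional` values, and the `selectors` helper is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class LabelSelector:
    match_labels: Dict[str, str] = field(default_factory=dict)

@dataclass
class NetworkPolicyPort:
    protocol: Optional[str] = None
    port: Optional[int] = None

@dataclass
class NetworkPolicyPeer:
    pod_selector: Optional[LabelSelector] = None
    namespace_selector: Optional[LabelSelector] = None

@dataclass
class NetworkPolicyIngressRule:
    ports: Optional[List[NetworkPolicyPort]] = None
    from_: Optional[List[NetworkPolicyPeer]] = None  # "from" is a Python keyword

@dataclass
class NetworkPolicySpec:
    pod_selector: LabelSelector
    ingress: List[NetworkPolicyIngressRule] = field(default_factory=list)

def selectors(spec):
    """Return (controlled-pod selector, list of admitted peers)."""
    admitted = [peer for rule in spec.ingress for peer in (rule.from_ or [])]
    return spec.pod_selector, admitted

# The allow-frontend template from earlier, expressed with these classes.
spec = NetworkPolicySpec(
    pod_selector=LabelSelector({"role": "backend"}),
    ingress=[NetworkPolicyIngressRule(
        ports=[NetworkPolicyPort("TCP", 6379)],
        from_=[NetworkPolicyPeer(pod_selector=LabelSelector({"role": "frontend"}))],
    )],
)
controlled, admitted = selectors(spec)
print(controlled.match_labels, admitted[0].pod_selector.match_labels)
```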

Next, let's look at the implementation details of Network Policy on Kubernetes + Calico.

Test versions

The component versions used in the test:

  • kubernetes:
    • master: v1.9.0
    • node: v1.9.0
  • calico:
    • v2.5.0
    • calico-policy-controller
      • quay.io/calico/kube-policy-controller:v0.7.0

Runtime configuration

  • On the Calico side, resources created in addition to the basic configuration:
    • service-account: calico-policy-controller
    • rbac:
      • ClusterRole: calico-policy-controller
      • ClusterRoleBinding: calico-policy-controller
    • deployment: calico-policy-controller
  • On the Kubernetes side, create the network policy resource.

Runtime state

On the existing, working Kubernetes cluster we added the calico-policy-controller container, which mainly runs the controller process:

  • calico-policy-controller:
  • Processes:

    / # ps aux
     PID   USER     TIME   COMMAND
      1   root       0:00 /pause
      7   root       0:00 /dist/controller
     13   root       0:12 /dist/controller
  • Ports:

      / # netstat -apn | grep contr
       tcp        0      0 10.138.102.219:45488    10.138.76.26:2379       ESTABLISHED 13/controller
       tcp        0      0 10.138.102.219:44538    101.199.110.26:6443     ESTABLISHED 13/controller

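As shown, the controller process is running and holds two established connections: port 6443 is the Kubernetes api-server, and port 2379 is Calico's etcd.

How calico-felix configures policy

Packet path

The diagram below, taken from an external source, shows Calico's traffic-processing flow. On each Node, calico-felix pulls the policy information from etcd and implements it with iptables. The most important chain is cali-pi-[POLICY]@filter.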

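Mark bit used while packets are processed for Network Policy:

0x2000000: whether the packet has already passed the policy rule check; a value of 1 means it has

Legend:

from-XXX: packets sent by XXX;
tw: short for "to workload endpoint";
to-XXX: packets sent to XXX;
po: short for "policy outbound";
cali-: prefix of Calico's rule chains;
pi: short for "policy inbound";
wl: short for "workload endpoint";
pro: short for "profile outbound";
fw: short for "from workload endpoint";
pri: short for "profile inbound".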
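The abbreviations make chain names easier to read once you know how to split them. Below is a small, hypothetical helper that decodes names of the form `cali-<abbr>-<target>` using only the abbreviations listed above. Chains such as cali-FORWARD or cali-from-wl-dispatch do not follow that pattern, so the helper returns None for them.

```python
ABBREVIATIONS = {
    "tw": "to workload endpoint",
    "fw": "from workload endpoint",
    "pi": "policy inbound",
    "po": "policy outbound",
    "pri": "profile inbound",
    "pro": "profile outbound",
}

def describe_chain(chain):
    """Split 'cali-<abbr>-<target>' into (meaning, target); None if it does not fit."""
    if not chain.startswith("cali-"):
        return None
    abbr, sep, target = chain[len("cali-"):].partition("-")
    if not sep or abbr not in ABBREVIATIONS:
        return None
    return ABBREVIATIONS[abbr], target

print(describe_chain("cali-tw-cali1f79f9e08f2"))
print(describe_chain("cali-pi-default.web-deny-all"))
print(describe_chain("cali-FORWARD"))
```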

(receive pkt)
cali-PREROUTING@raw -> cali-from-host-endpoint@raw -> cali-PREROUTING@nat
                   |                                 ^        |
                   |          (-i cali+)             |        |
                   +--- (from workload endpoint) ----+        |
                                                              |
            (dest  may be container's floating ip)   cali-fip-dnat@nat
                                                              |
                                                     (router decision)
                                                              |
                     +--------------------------------------------+
                     |                                            |
            cali-INPUT@filter                             cali-FORWARD@filter
         (-i cali+)  |                               (-i cali+)   |    (-o cali+)
         +----------------------------+              +------------+-------------+
         |                            |              |            |             |
 cali-wl-to-host           cali-from-host-endpoint   |  cali-from-host-endpoint |
     @filter                       @filter           |         @filter          |
         |                         < END >           |            |             |
         |                                           |   cali-to-host-endpoint  |
         |                                           |         @filter          |
         |                     will return to nat's  |         < END >          |
         |                       cali-POSTROUTING    |                          |
 cali-from-wl-dispatch@filter  <---------------------+   cali-to-wl-dispatch@filter
                      |         \--------------+                       |
          +-----------------------+            |           +----------------------+
          |                       |            |           |                      |
 cali-fw-cali0ef24b1     cali-fw-cali0ef24b2   |  cali-tw-cali03f24b1   cali-tw-cali03f24b2
      @filter                 @filter          |      @filter                  @filter
  (-i cali0ef24b1)          (-i cali0ef24b2)   |   (-o cali0ef24b1)        (-o cali0ef24b2)
          |                       |            |           |                      |
          +-----------------------+            |           +----------------------+
                      |                        |                       |
           cali-po-[POLICY]@filter             |            cali-pi-[POLICY]@filter
                      |                        |                       |
          cali-pro-[PROFILE]@filter            |           cali-pri-[PROFILE]@filter
                      |                        |                       |
                   < END >                     +------------> cali-POSTROUTING@nat
                                               +---------->/           |
                                               |                cali-fip-snat@nat
                                               |                       |
                                               |              cali-nat-outgoing@nat
                                               |                       |
                                               |       (if dip is local: send to lookup)
                                     +---------+--------+   (else: send to nic's qdisc)
                                     |                  |           < END >    
                     cali-to-host-endpoint@filter       | 
                                     |                  | 
                                     +------------------+ 
                                               ^ (-o cali+)
                                               | 
                                       cali-OUTPUT@filter
                                               ^    
(send pkt)                                     | 
(router decision) -> cali-OUTPUT@nat -> cali-fip-dnat@nat

Next, we access a Pod covered by a "deny all traffic" policy and observe how iptables handles it:

Before the traffic arrives

[root@host31 ~]# iptables -nxvL cali-tw-cali1f79f9e08f2 -t filter
Chain cali-tw-cali1f79f9e08f2 (1 references)
    pkts      bytes target     prot opt in     out     source               destination
       0        0 MARK       all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* cali:fthBuDq5I1oklYOL */ /* Start of policies */ MARK and 0xfdffffff
       0        0 cali-pi-default.web-deny-all  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* cali:Kp-Liqb4hWavW9dD */ mark match 0x0/0x2000000
       0        0 DROP       all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* cali:Qe6UBTrru3RfK2MB */ /* Drop if no policies passed packet */ mark match 0x0/0x2000000

After the traffic arrives

    [root@host31 ~]# iptables -nxvL cali-tw-cali1f79f9e08f2 -t filter
    Chain cali-tw-cali1f79f9e08f2 (1 references)
    pkts      bytes target     prot opt in     out     source               destination
       3      180 MARK       all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* cali:fthBuDq5I1oklYOL */ /* Start of policies */ MARK and 0xfdffffff
       3      180 cali-pi-default.web-deny-all  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* cali:Kp-Liqb4hWavW9dD */ mark match 0x0/0x2000000
       3      180 DROP       all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* cali:Qe6UBTrru3RfK2MB */ /* Drop if no policies passed packet */ mark match 0x0/0x2000000

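As shown, the pkts counter of the DROP rule rose from 0 to 3. The packets passed through the MARK and cali-pi-default.web-deny-all targets, and the policy chain left the 0x2000000 bit unset, so they still matched "mark match 0x0/0x2000000" when they reached the DROP rule and were discarded.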
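The three rules in the listings above can be replayed with plain bit arithmetic. The sketch below models only what the listing shows: the MARK rule clears bit 0x2000000, and both the jump and the DROP rule test that the bit is still zero. The function names are hypothetical, and `allowing_policy` is a purely illustrative stand-in, because the listing does not show how a real Calico policy chain marks packets it admits.

```python
POLICY_BIT = 0x2000000         # mark bit tested by "mark match 0x0/0x2000000"
CLEAR_POLICY_BIT = 0xfdffffff  # mask applied by the "Start of policies" MARK rule

def mark_match(mark, value, mask):
    """iptables 'mark match value/mask': compare only the masked bits."""
    return (mark & mask) == value

def traverse_tw_chain(mark, policy_chain):
    """Walk the three listed rules; return the verdict and the final mark."""
    mark &= CLEAR_POLICY_BIT                # rule 1: MARK and 0xfdffffff
    if mark_match(mark, 0x0, POLICY_BIT):   # rule 2: jump to cali-pi-[POLICY]
        mark = policy_chain(mark)
    if mark_match(mark, 0x0, POLICY_BIT):   # rule 3: DROP if the bit is still unset
        return "DROP", mark
    return "CONTINUE", mark

def deny_all(mark):
    """Policy chain with no inbound rules: the mark is left untouched."""
    return mark

def allowing_policy(mark):
    """Hypothetical policy chain that sets the bit for admitted packets."""
    return mark | POLICY_BIT

print(traverse_tw_chain(0x2000000, deny_all))
print(traverse_tw_chain(0x0, allowing_policy))
```

Note that a packet arriving with the bit already set is still dropped by the deny-all case, because the first rule clears the bit before any policy is consulted.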

Walkthrough of an example

The following test case, "deny all incoming traffic", illustrates the overall flow.

Model

  • DENY all traffic to an application

Check the labels of app-web

A service named web was created in the default namespace. Its IP and labels are as follows:

[root@host02 /home/test]# kubectl get service --all-namespaces | grep web
default       web                       ClusterIP   192.168.82.141    <none>        80/TCP              1d

[root@host02 /home/test/]# kubectl get pod --all-namespaces -o wide --show-labels | grep web
default        web-667bdcb4d8-cpvbb                        1/1       Running            0          1d        10.139.54.158    host30.add.bjdt.qihoo.net   app=web,pod-template-hash=2236876084

Configure the policy

First, view the k8s resource with kubectl:

[root@host02 /home/test]# kubectl get networkpolicy web-deny-all -o yaml
- apiVersion: extensions/v1beta1
  kind: NetworkPolicy
  metadata:
    name: web-deny-all
    namespace: default
  spec:
    podSelector:
      matchLabels:
        app: web
    policyTypes:
    - Ingress

Next, view the Calico resource with calicoctl and etcdctl:

[root@host02 /home/test]# calicoctl get policy default.web-deny-all -o yaml
- apiVersion: v1
  kind: policy
  metadata:
    name: default.web-deny-all
  spec:
    egress:
    - action: allow
      destination: {}
      source: {}
    order: 1000
    selector: calico/k8s_ns == 'default' && app == 'web'
    
[root@host02 /home/test]# /home/test/etcdctl-wrapper-v2.sh get /calico/v1/policy/tier/default/policy/default.web-deny-all
{"outbound_rules": [{"action": "allow"}], "order": 1000, "inbound_rules": [], "selector": "calico/k8s_ns == 'default' && app == 'web'"}
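Comparing the two views suggests how the Kubernetes resource maps onto the Calico one: the namespace and the podSelector labels are folded into a single selector expression, and a policy with no ingress rules becomes an empty inbound_rules list. The sketch below reproduces that mapping for this one case only. It is inferred from the output above, not taken from the policy-controller's source, and the function names are hypothetical.

```python
def to_calico_selector(namespace, match_labels):
    """Build a selector string like the one shown in the calicoctl output."""
    parts = [f"calico/k8s_ns == '{namespace}'"]
    parts += [f"{k} == '{v}'" for k, v in sorted(match_labels.items())]
    return " && ".join(parts)

def to_calico_policy(k8s_policy, order=1000):
    """Convert a NetworkPolicy without ingress rules into the stored form."""
    meta, spec = k8s_policy["metadata"], k8s_policy["spec"]
    if spec.get("ingress"):
        raise NotImplementedError("only the no-ingress-rules case is sketched here")
    return {
        "name": f'{meta["namespace"]}.{meta["name"]}',
        "value": {
            "outbound_rules": [{"action": "allow"}],
            "order": order,
            "inbound_rules": [],  # nothing admitted, so all ingress is denied
            "selector": to_calico_selector(
                meta["namespace"], spec["podSelector"].get("matchLabels", {})
            ),
        },
    }

web_deny_all = {
    "metadata": {"name": "web-deny-all", "namespace": "default"},
    "spec": {"podSelector": {"matchLabels": {"app": "web"}}, "policyTypes": ["Ingress"]},
}
result = to_calico_policy(web_deny_all)
print(result["name"], result["value"]["selector"])
```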

Felix logs for Network Policy configuration

Adding and deleting a policy

2018-02-11 11:13:22.029 [INFO][257] label_inheritance_index.go 203: Updating selector selID=Policy(name=default.api-allow)
2018-02-11 09:39:35.642 [INFO][257] label_inheritance_index.go 209: Deleting selector Policy(name=default.api-allow)

iptables rules on the node

[root@host30 ~]# iptables -nxvL cali-tw-cali96bc57f337a
Chain cali-tw-cali96bc57f337a (1 references)
    pkts      bytes target     prot opt in     out     source               destination
       0        0 ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* cali:oSVcrqJ8U46FxQEJ */ ctstate RELATED,ESTABLISHED
       0        0 DROP       all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* cali:nudTdCphcvic4flm */ ctstate INVALID
       2      120 MARK       all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* cali:QWGVPDFBXrYgBHjv */ MARK and 0xfeffffff
       2      120 MARK       all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* cali:fnpcHeCllWo_kg1u */ /* Start of policies */ MARK and 0xfdffffff
       2      120 cali-pi-default.web-deny-all  all  --  *  *   0.0.0.0/0            0.0.0.0/0            /* cali:ibEcyP2JurQBR2JS */ mark match 0x0/0x2000000
       0        0 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* cali:dIb1kwxUZz8DgRje */ /* Return if policy accepted */ mark match 0x1000000/0x1000000
       2      120 DROP       all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* cali:1O4PxUpswz0ZqJnr */ /* Drop if no policies passed packet */ mark match 0x0/0x2000000
       0        0 cali-pri-k8s-pod-network  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* cali:rb9GDlntQSXL3Sen */
       0        0 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* cali:s2lDMKnLGp_JSpKk */ /* Return if profile accepted */ mark match 0x1000000/0x1000000
       0        0 DROP       all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* cali:q8OkJmM7E9TcFsQr */ /* Drop if no profiles matched */
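When a node carries many such chains, it can help to pick out the DROP rules with non-zero counters programmatically. The sketch below assumes the column layout of `iptables -nxvL` shown above, with each rule on a single line. The function name is hypothetical, and the sample rows are copied from the listing.

```python
import re

def dropped_packets(listing):
    """Return (pkts, comments) for every DROP rule with a non-zero counter."""
    hits = []
    for line in listing.splitlines():
        fields = line.split()
        # Rule lines start with two integer counters followed by the target.
        if len(fields) < 3 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        if fields[2] == "DROP" and int(fields[0]) > 0:
            comments = re.findall(r"/\* (.*?) \*/", line)
            hits.append((int(fields[0]), comments))
    return hits

sample = """\
Chain cali-tw-cali96bc57f337a (1 references)
    pkts      bytes target     prot opt in     out     source               destination
       0        0 DROP       all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* cali:nudTdCphcvic4flm */ ctstate INVALID
       2      120 DROP       all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* cali:1O4PxUpswz0ZqJnr */ /* Drop if no policies passed packet */ mark match 0x0/0x2000000
"""
hits = dropped_packets(sample)
print(hits)
```

In this listing the only DROP rule with hits is the "Drop if no policies passed packet" rule, which is consistent with the deny-all policy being the reason the traffic is discarded.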

Access the service from another pod

[root@host02 /home/test]# kubectl run --rm -i -t --image=alpine test-$RANDOM -- sh
If you don't see a command prompt, try pressing enter.
/ # wget -qO- --timeout=3 http://192.168.82.141:80
wget: download timed out
/ #

As shown, access to port 80 of the service fails. Now try pinging the corresponding Pod:

[root@web-test-74b4dbb994-5zcvq /]# ping 10.139.54.158
PING 10.139.54.158 (10.139.54.158) 56(84) bytes of data.

^C
--- 10.139.54.158 ping statistics ---
45 packets transmitted, 0 received, 100% packet loss, time 44000ms

Pinging the Pod fails as well, which matches the expected result of "deny all incoming traffic".
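The wget check can also be scripted, which is handy when several policies need to be verified repeatedly. The following is a minimal sketch using only the Python standard library. The function name is hypothetical, and it checks TCP reachability only (it does not cover the ICMP ping test).

```python
import socket

def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers timeouts, refused connections, unreachable hosts
        return False
```

Run from a test pod, `tcp_reachable("192.168.82.141", 80)` would be expected to return False while the web-deny-all policy is in place, mirroring the wget timeout above.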

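Summary

Kubernetes NetworkPolicy provides access control and solves part of the network security problem. As of this writing, however, support in Kubernetes and Calico is still incomplete, and some features (such as egress) are still in progress. In addition, Calico configures a large number of iptables rules on every Node, and as more dimensions of control are added, operations and troubleshooting become harder. Users who need network access control should weigh these factors before adopting it.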

References:

  • Securing Kubernetes Cluster Networking
  • GitHub: ahmetb/kubernetes-network-policy-recipes
  • NetworkPolicy API
  • Calico網絡的原理、組網方式與使用 (Calico networking: principles, networking modes, and usage)
