[kubernetes/k8s source code analysis] kube-proxy source code analysis

This article was re-edited on November 15, 2018; it is based on Kubernetes 1.12 and covers IPVS.

Preface

  • kube-proxy manages a Service's Endpoints. A Service exposes a virtual IP (Cluster IP); inside the cluster, Cluster IP:Port reaches the Pods backing that Service. A Service is a service abstraction over a set of Pods chosen by a Selector.

  • kube-proxy's main job is to implement Services. Another purpose of a Service: the Pods behind it may come and go and their IPs change; the Service gives them one fixed IP, regardless of how the backend Endpoints change.

  • kube-proxy effectively watches etcd (by reading it through the apiserver's API) and updates the iptables rules on the node in real time

In iptables mode, kube-proxy relies entirely on kernel iptables to implement Service proxying and load balancing. Forwarding is done with iptables NAT, which has a performance cost: with tens of thousands of Services/Endpoints in a cluster, the iptables rules on every Node become very large.

Today most clusters do not use kube-proxy directly as the client-facing service proxy; they use a self-developed proxy or an Ingress Controller instead.

Process:

/usr/sbin/kube-proxy

--bind-address=10.12.51.186

--hostname-override=10.12.51.186

--kubeconfig=/etc/kubernetes/kube-proxy.kubeconfig

--logtostderr=false --log-dir=/var/log/kubernetes/kube-proxy --v=4

iptables

    The chains iptables provides

  • INPUT: packets destined for the local host
  • OUTPUT: packets leaving the local host
  • PREROUTING: before routing
  • FORWARD: packets forwarded by the host
  • POSTROUTING: after routing

   The tables iptables provides

  • filter: packet filtering; kernel module iptable_filter
  • nat: network address translation; kernel module iptable_nat
  • mangle: unpacks packets, modifies them and repacks them; kernel module iptable_mangle
  • raw: switches off the connection tracking that the nat table relies on; kernel module iptable_raw
  • security: a table added in CentOS 7, not covered here

    Targets (actions); examples follow the list

  • iptables calls these targets
  • ACCEPT: let the packet through.
  • DROP: silently discard the packet; no response is sent, and the client only notices once the connection times out.
  • REJECT: refuse the packet and send a response back to the client.
  • SNAT: source NAT; lets private-network hosts share one public IP when reaching out.
  • MASQUERADE: a special form of SNAT for dynamic, changing IP addresses.
  • DNAT: destination NAT; lets a server on a private network receive requests from the public network.
  • REDIRECT: port mapping on the local machine.
  • LOG: record the packet in /var/log/messages without otherwise acting on it.
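
A few illustrative examples of how these targets are typically used (the subnets, addresses and ports below are made up, not taken from any cluster):

# SNAT: rewrite the source address of outbound traffic from a private subnet
iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -o eth0 -j SNAT --to-source 203.0.113.10

# MASQUERADE: like SNAT, but uses whatever address eth0 currently has
iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -o eth0 -j MASQUERADE

# DNAT: forward incoming traffic on port 80 to an internal backend
iptables -t nat -A PREROUTING -p tcp --dport 80 -j DNAT --to-destination 10.0.0.5:8080

# REDIRECT: map port 80 to local port 8080 on this machine
iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-ports 8080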

Packets destined for a local process: PREROUTING --> INPUT

Packets forwarded by this host: PREROUTING --> FORWARD --> POSTROUTING

Packets emitted by a local process (usually replies): OUTPUT --> POSTROUTING

connection tracking

    Many iptables features rely on connection tracking, so after a packet has been processed by the raw table and before it reaches the mangle table, connection tracking is performed. /proc/net/ip_conntrack (or /proc/net/nf_conntrack on newer kernels) stores all tracked connections. From iptables' connection-tracking mechanism we can tell the state of the connection the current packet belongs to:

  • NEW: the packet requests a new connection; no existing connection matches it
  • ESTABLISHED: the packet belongs to an existing connection
  • RELATED: the packet belongs to an existing connection but also requests a new one
  • INVALID: the packet does not belong to any connection
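
A typical (illustrative, not kube-proxy specific) use of these states in filter rules: accept packets that belong to tracked connections, drop packets conntrack cannot classify:

iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -m conntrack --ctstate INVALID -j DROP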

1. The main function

  • NewProxyCommand is built on cobra.Command
  • initializes logging; the default flush interval is 30 seconds
func main() {
   command := app.NewProxyCommand()

   // TODO: once we switch everything over to Cobra commands, we can go back to calling
   // utilflag.InitFlags() (by removing its pflag.Parse() call). For now, we have to set the
   // normalize func and add the go flag set by hand.
   pflag.CommandLine.SetNormalizeFunc(utilflag.WordSepNormalizeFunc)
   pflag.CommandLine.AddGoFlagSet(goflag.CommandLine)
   // utilflag.InitFlags()
   logs.InitLogs()
   defer logs.FlushLogs()

   if err := command.Execute(); err != nil {
      fmt.Fprintf(os.Stderr, "error: %v\n", err)
      os.Exit(1)
   }
}

2. The NewProxyCommand function

  • completes the initialization of the command-line options
  • calls opts.Run(), covered in section 2.1
// NewProxyCommand creates a *cobra.Command object with default parameters
func NewProxyCommand() *cobra.Command {
	opts := NewOptions()

	cmd := &cobra.Command{
		Use: "kube-proxy",
		Long: `The Kubernetes network proxy runs on each node. This
reflects services as defined in the Kubernetes API on each node and can do simple
TCP, UDP, and SCTP stream forwarding or round robin TCP, UDP, and SCTP forwarding across a set of backends.
Service cluster IPs and ports are currently found through Docker-links-compatible
environment variables specifying ports opened by the service proxy. There is an optional
addon that provides cluster DNS for these cluster IPs. The user must create a service
with the apiserver API to configure the proxy.`,
		Run: func(cmd *cobra.Command, args []string) {
			verflag.PrintAndExitIfRequested()
			utilflag.PrintFlags(cmd.Flags())

			if err := initForOS(opts.WindowsService); err != nil {
				glog.Fatalf("failed OS init: %v", err)
			}

			if err := opts.Complete(); err != nil {
				glog.Fatalf("failed complete: %v", err)
			}
			if err := opts.Validate(args); err != nil {
				glog.Fatalf("failed validate: %v", err)
			}
			glog.Fatal(opts.Run())
		},
	}

	var err error
	opts.config, err = opts.ApplyDefaults(opts.config)
	if err != nil {
		glog.Fatalf("unable to create flag defaults: %v", err)
	}

	opts.AddFlags(cmd.Flags())

	cmd.MarkFlagFilename("config", "yaml", "yml", "json")

	return cmd
}

  2.1 The Run function

  • the NewProxyServer function calls newProxyServer (covered in section 3)
  • the Run function does the main work (covered in section 4)
func (o *Options) Run() error {
	if len(o.WriteConfigTo) > 0 {
		return o.writeConfigFile()
	}

	proxyServer, err := NewProxyServer(o)
	if err != nil {
		return err
	}

	return proxyServer.Run()
}

3. newProxyServer

  Its main job is to initialize the proxy server: the configuration and the various interfaces it will call later.

  3.1 createClients

    Creates the clients used to talk to the Kubernetes apiserver (a client-go sketch follows the code below).

	client, eventClient, err := createClients(config.ClientConnection, master)
	if err != nil {
		return nil, err
	}
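
createClients is built on client-go. A minimal sketch of how such a clientset is usually constructed, assuming a kubeconfig path; the QPS/Burst values below are illustrative, whereas the real createClients takes them from the ClientConnection configuration:

package example

import (
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

// buildClient loads a kubeconfig, tunes client-side rate limits, and returns a
// typed clientset for talking to the apiserver.
func buildClient(kubeconfig string) (*kubernetes.Clientset, error) {
	config, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		return nil, err
	}
	config.QPS = 5    // illustrative values; kube-proxy reads these from
	config.Burst = 10 // its ClientConnection configuration
	return kubernetes.NewForConfig(config)
}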

  3.2 Create the event broadcaster and event recorder

	// Create event recorder
	hostname, err := utilnode.GetHostname(config.HostnameOverride)
	if err != nil {
		return nil, err
	}
	eventBroadcaster := record.NewBroadcaster()
	recorder := eventBroadcaster.NewRecorder(scheme, v1.EventSource{Component: "kube-proxy", Host: hostname})
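
The broadcaster only becomes useful once it is started and given a sink, which kube-proxy does later in Run. A hedged sketch of the usual client-go event wiring (the function below is illustrative, not the exact kube-proxy code):

package example

import (
	v1 "k8s.io/api/core/v1"
	"k8s.io/client-go/kubernetes/scheme"
	v1core "k8s.io/client-go/kubernetes/typed/core/v1"
	"k8s.io/client-go/tools/record"
)

// wireEvents creates a broadcaster, derives a recorder for the kube-proxy
// component, and starts shipping recorded events to the apiserver.
func wireEvents(eventClient v1core.EventsGetter, hostname string) record.EventRecorder {
	broadcaster := record.NewBroadcaster()
	recorder := broadcaster.NewRecorder(scheme.Scheme,
		v1.EventSource{Component: "kube-proxy", Host: hostname})
	broadcaster.StartRecordingToSink(&v1core.EventSinkImpl{
		Interface: eventClient.Events(""),
	})
	return recorder
}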

  3.3 iptables mode: builds the iptables Proxier and wires it up as the service and endpoints event handler

	if proxyMode == proxyModeIPTables {
		glog.V(0).Info("Using iptables Proxier.")
		if config.IPTables.MasqueradeBit == nil {
			// MasqueradeBit must be specified or defaulted.
			return nil, fmt.Errorf("unable to read IPTables MasqueradeBit from config")
		}

		// TODO this has side effects that should only happen when Run() is invoked.
		proxierIPTables, err := iptables.NewProxier(
			iptInterface,
			utilsysctl.New(),
			execer,
			config.IPTables.SyncPeriod.Duration,
			config.IPTables.MinSyncPeriod.Duration,
			config.IPTables.MasqueradeAll,
			int(*config.IPTables.MasqueradeBit),
			config.ClusterCIDR,
			hostname,
			nodeIP,
			recorder,
			healthzUpdater,
			config.NodePortAddresses,
		)
		if err != nil {
			return nil, fmt.Errorf("unable to create proxier: %v", err)
		}
		metrics.RegisterMetrics()
		proxier = proxierIPTables
		serviceEventHandler = proxierIPTables
		endpointsEventHandler = proxierIPTables
		// No turning back. Remove artifacts that might still exist from the userspace Proxier.
		glog.V(0).Info("Tearing down inactive rules.")
		// TODO this has side effects that should only happen when Run() is invoked.
		userspace.CleanupLeftovers(iptInterface)
		// IPVS Proxier will generate some iptables rules, need to clean them before switching to other proxy mode.
		// Besides, ipvs proxier will create some ipvs rules as well.  Because there is no way to tell if a given
		// ipvs rule is created by IPVS proxier or not.  Users should explicitly specify `--clean-ipvs=true` to flush
		// all ipvs rules when kube-proxy start up.  Users do this operation should be with caution.
		if canUseIPVS {
			ipvs.CleanupLeftovers(ipvsInterface, iptInterface, ipsetInterface, cleanupIPVS)
		}
	}

  3.4 IPVS mode

else if proxyMode == proxyModeIPVS {
		glog.V(0).Info("Using ipvs Proxier.")
		proxierIPVS, err := ipvs.NewProxier(
			iptInterface,
			ipvsInterface,
			ipsetInterface,
			utilsysctl.New(),
			execer,
			config.IPVS.SyncPeriod.Duration,
			config.IPVS.MinSyncPeriod.Duration,
			config.IPVS.ExcludeCIDRs,
			config.IPTables.MasqueradeAll,
			int(*config.IPTables.MasqueradeBit),
			config.ClusterCIDR,
			hostname,
			nodeIP,
			recorder,
			healthzServer,
			config.IPVS.Scheduler,
			config.NodePortAddresses,
		)
		if err != nil {
			return nil, fmt.Errorf("unable to create proxier: %v", err)
		}
		metrics.RegisterMetrics()
		proxier = proxierIPVS
		serviceEventHandler = proxierIPVS
		endpointsEventHandler = proxierIPVS
		glog.V(0).Info("Tearing down inactive rules.")
		// TODO this has side effects that should only happen when Run() is invoked.
		userspace.CleanupLeftovers(iptInterface)
		iptables.CleanupLeftovers(iptInterface)
	}

  3.5 userspace mode

else {
		glog.V(0).Info("Using userspace Proxier.")
		// This is a proxy.LoadBalancer which NewProxier needs but has methods we don't need for
		// our config.EndpointsConfigHandler.
		loadBalancer := userspace.NewLoadBalancerRR()
		// set EndpointsConfigHandler to our loadBalancer
		endpointsEventHandler = loadBalancer

		// TODO this has side effects that should only happen when Run() is invoked.
		proxierUserspace, err := userspace.NewProxier(
			loadBalancer,
			net.ParseIP(config.BindAddress),
			iptInterface,
			execer,
			*utilnet.ParsePortRangeOrDie(config.PortRange),
			config.IPTables.SyncPeriod.Duration,
			config.IPTables.MinSyncPeriod.Duration,
			config.UDPIdleTimeout.Duration,
			config.NodePortAddresses,
		)
		if err != nil {
			return nil, fmt.Errorf("unable to create proxier: %v", err)
		}
		serviceEventHandler = proxierUserspace
		proxier = proxierUserspace

		// Remove artifacts from the iptables and ipvs Proxier, if not on Windows.
		glog.V(0).Info("Tearing down inactive rules.")
		// TODO this has side effects that should only happen when Run() is invoked.
		iptables.CleanupLeftovers(iptInterface)
		// IPVS Proxier will generate some iptables rules, need to clean them before switching to other proxy mode.
		// Besides, ipvs proxier will create some ipvs rules as well.  Because there is no way to tell if a given
		// ipvs rule is created by IPVS proxier or not.  Users should explicitly specify `--clean-ipvs=true` to flush
		// all ipvs rules when kube-proxy start up.  Users do this operation should be with caution.
		if canUseIPVS {
			ipvs.CleanupLeftovers(ipvsInterface, iptInterface, ipsetInterface, cleanupIPVS)
		}
	}

4. The Run function

  Path: cmd/kube-proxy/app/server.go

  4.1 Set the OOM score

    Sets /proc/self/oom_score_adj to -999 (a minimal sketch of the underlying write follows the code below).

	// TODO(vmarmol): Use container config for this.
	var oomAdjuster *oom.OOMAdjuster
	if s.OOMScoreAdj != nil {
		oomAdjuster = oom.NewOOMAdjuster()
		if err := oomAdjuster.ApplyOOMScoreAdj(0, int(*s.OOMScoreAdj)); err != nil {
			glog.V(2).Info(err)
		}
	}
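
ApplyOOMScoreAdj ultimately just writes the value into procfs. A minimal hedged sketch of that idea (the real oom package also handles fallbacks and error cases that are omitted here):

package example

import (
	"fmt"
	"io/ioutil"
	"os"
)

// applyOOMScoreAdj writes an OOM score adjustment for a process; pid 0 means
// "this process". kube-proxy sets its own score to -999 so the kernel is very
// unlikely to kill it under memory pressure.
func applyOOMScoreAdj(pid, value int) error {
	if pid == 0 {
		pid = os.Getpid()
	}
	path := fmt.Sprintf("/proc/%d/oom_score_adj", pid)
	return ioutil.WriteFile(path, []byte(fmt.Sprintf("%d", value)), 0644)
}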

  4.2 Tune connection tracking

  • sets sysctl 'net/netfilter/nf_conntrack_max' to 524288 (see the sketch after the code below)
  • sets sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400
  • sets sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600
	// Tune conntrack, if requested
	// Conntracker is always nil for windows
	if s.Conntracker != nil {
		max, err := getConntrackMax(s.ConntrackConfiguration)
		if err != nil {
			return err
		}
		if max > 0 {
			err := s.Conntracker.SetMax(max)
			if err != nil {
				if err != readOnlySysFSError {
					return err
				}
				// readOnlySysFSError is caused by a known docker issue (https://github.com/docker/docker/issues/24000),
				// the only remediation we know is to restart the docker daemon.
				// Here we'll send an node event with specific reason and message, the
				// administrator should decide whether and how to handle this issue,
				// whether to drain the node and restart docker.
				// TODO(random-liu): Remove this when the docker bug is fixed.
				const message = "DOCKER RESTART NEEDED (docker issue #24000): /sys is read-only: " +
					"cannot modify conntrack limits, problems may arise later."
				s.Recorder.Eventf(s.NodeRef, api.EventTypeWarning, err.Error(), message)
			}
		}

		if s.ConntrackConfiguration.TCPEstablishedTimeout != nil && s.ConntrackConfiguration.TCPEstablishedTimeout.Duration > 0 {
			timeout := int(s.ConntrackConfiguration.TCPEstablishedTimeout.Duration / time.Second)
			if err := s.Conntracker.SetTCPEstablishedTimeout(timeout); err != nil {
				return err
			}
		}

		if s.ConntrackConfiguration.TCPCloseWaitTimeout != nil && s.ConntrackConfiguration.TCPCloseWaitTimeout.Duration > 0 {
			timeout := int(s.ConntrackConfiguration.TCPCloseWaitTimeout.Duration / time.Second)
			if err := s.Conntracker.SetTCPCloseWaitTimeout(timeout); err != nil {
				return err
			}
		}
	}
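
The max value passed to SetMax comes from getConntrackMax, which roughly scales a per-core limit by the CPU count and applies a floor; the 524288 mentioned above matches the default of 32768 per core on a 16-core node. A simplified sketch of that logic (nil-pointer handling omitted):

package example

import "runtime"

// conntrackMax mirrors the rough logic of getConntrackMax: scale the per-core
// limit by the number of CPUs, but never return less than the configured
// minimum. A result of 0 means "leave the kernel setting alone".
func conntrackMax(maxPerCore, min int32) int {
	if maxPerCore > 0 {
		scaled := int(maxPerCore) * runtime.NumCPU()
		if scaled > int(min) {
			return scaled
		}
		return int(min)
	}
	return 0
}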

  4.3 The core of the proxy

    Sets up the Service and Endpoints configs, which indirectly watch etcd (through the apiserver) and keep the Service-to-Endpoints mapping up to date.

    Section 5 walks through the Service side; Endpoints work the same way and are omitted.

	// Create configs (i.e. Watches for Services and Endpoints)
	// Note: RegisterHandler() calls need to happen before creation of Sources because sources
	// only notify on changes, and the initial update (on process start) may be lost if no handlers
	// are registered yet.
	serviceConfig := config.NewServiceConfig(informerFactory.Core().V1().Services(), s.ConfigSyncPeriod)
	serviceConfig.RegisterEventHandler(s.ServiceEventHandler)
	go serviceConfig.Run(wait.NeverStop)

	endpointsConfig := config.NewEndpointsConfig(informerFactory.Core().V1().Endpoints(), s.ConfigSyncPeriod)
	endpointsConfig.RegisterEventHandler(s.EndpointsEventHandler)
	go endpointsConfig.Run(wait.NeverStop)

  4.4 The SyncLoop function

    A familiar pattern: an endless loop that keeps doing the work. There is one implementation per mode (iptables, ipvs, userspace); section 6 covers it.

	// Just loop forever for now...
	s.Proxier.SyncLoop()

5. Create the ServiceConfig struct and register the informer callbacks

  • handleAddService
  • handleUpdateService
  • handleDeleteService
  These watch Service create, update and delete events; a sketch of handleAddService follows the code below.
// NewServiceConfig creates a new ServiceConfig.
func NewServiceConfig(serviceInformer coreinformers.ServiceInformer, resyncPeriod time.Duration) *ServiceConfig {
	result := &ServiceConfig{
		lister:       serviceInformer.Lister(),
		listerSynced: serviceInformer.Informer().HasSynced,
	}

	serviceInformer.Informer().AddEventHandlerWithResyncPeriod(
		cache.ResourceEventHandlerFuncs{
			AddFunc:    result.handleAddService,
			UpdateFunc: result.handleUpdateService,
			DeleteFunc: result.handleDeleteService,
		},
		resyncPeriod,
	)

	return result
}
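
The registered handlers are simply notified from these callbacks. A hedged sketch of what handleAddService roughly does (error handling abbreviated; the real code lives in pkg/proxy/config):

// handleAddService (sketch): convert the informer object to a *v1.Service and
// notify every registered event handler, i.e. the proxier, via OnServiceAdd.
func (c *ServiceConfig) handleAddService(obj interface{}) {
	service, ok := obj.(*v1.Service)
	if !ok {
		utilruntime.HandleError(fmt.Errorf("unexpected object type: %v", obj))
		return
	}
	for i := range c.eventHandlers {
		c.eventHandlers[i].OnServiceAdd(service)
	}
}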

  5.1  The Run function

    Mainly waits for the informer caches to sync and then notifies the registered handlers.

// Run starts the goroutine responsible for calling
// registered handlers.
func (c *ServiceConfig) Run(stopCh <-chan struct{}) {
	defer utilruntime.HandleCrash()

	glog.Info("Starting service config controller")
	defer glog.Info("Shutting down service config controller")

	if !controller.WaitForCacheSync("service config", stopCh, c.listerSynced) {
		return
	}

	for i := range c.eventHandlers {
		glog.V(3).Infof("Calling handler.OnServiceSynced()")
		c.eventHandlers[i].OnServiceSynced()
	}

	<-stopCh
}

6. The SyncLoop function

  Path: pkg/proxy/iptables/proxier.go

  The loop drives the runner registered earlier as proxier.syncRunner = async.NewBoundedFrequencyRunner("sync-runner", proxier.syncProxyRules, minSyncPeriod, syncPeriod, burstSyncs); following it leads into the syncProxyRules function. A conceptual sketch of the runner follows the code below.

// SyncLoop runs periodic work.  This is expected to run as a goroutine or as the main loop of the app.  It does not return.
func (proxier *Proxier) SyncLoop() {
	// Update healthz timestamp at beginning in case Sync() never succeeds.
	if proxier.healthzServer != nil {
		proxier.healthzServer.UpdateTimestamp()
	}
	proxier.syncRunner.Loop(wait.NeverStop)
}
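
BoundedFrequencyRunner guarantees that syncProxyRules runs no more often than minSyncPeriod and at least once every syncPeriod, even if nothing changed. A toy sketch of that behaviour (the real implementation in pkg/util/async additionally supports bursts and on-demand Run() triggers):

package example

import "time"

// boundedLoop is a simplified bounded-frequency runner: fn runs after each
// trigger (or after maxInterval as a periodic resync), but never sooner than
// minInterval after the previous run.
func boundedLoop(fn func(), minInterval, maxInterval time.Duration, trigger <-chan struct{}) {
	for {
		start := time.Now()
		fn()
		select {
		case <-trigger: // something changed, resync soon
		case <-time.After(maxInterval): // periodic resync even without changes
		}
		if wait := minInterval - time.Since(start); wait > 0 {
			time.Sleep(wait) // enforce the minimum spacing between runs
		}
	}
}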

7. The syncProxyRules function

   It is a very long function, but its job boils down to rebuilding the iptables rules: the tables, the chains and the rules within them.

   Kubernetes uses iptables for per-Service routing and load balancing; the core logic is implemented by the function syncProxyRules in kubernetes/pkg/proxy/iptables/proxier.go.

  7.1 Update the service and endpoints maps

	// We assume that if this was called, we really want to sync them,
	// even if nothing changed in the meantime. In other words, callers are
	// responsible for detecting no-op changes and not calling this function.
	serviceUpdateResult := proxy.UpdateServiceMap(proxier.serviceMap, proxier.serviceChanges)
	endpointUpdateResult := proxy.UpdateEndpointsMap(proxier.endpointsMap, proxier.endpointsChanges)

  7.2 Create the top-level chains and jump rules

	// Create and link the kube chains.
	for _, chain := range iptablesJumpChains {
		if _, err := proxier.iptables.EnsureChain(chain.table, chain.chain); err != nil {
			glog.Errorf("Failed to ensure that %s chain %s exists: %v", chain.table, kubeServicesChain, err)
			return
		}
		args := append(chain.extraArgs,
			"-m", "comment", "--comment", chain.comment,
			"-j", string(chain.chain),
		)
		if _, err := proxier.iptables.EnsureRule(utiliptables.Prepend, chain.table, chain.sourceChain, args...); err != nil {
			glog.Errorf("Failed to ensure that %s chain %s jumps to %s: %v", chain.table, chain.sourceChain, chain.chain, err)
			return
		}
	}

The chains ensured here, and the tables they live in:

  • KUBE-EXTERNAL-SERVICES   -t filter  (jumped to from INPUT)
  • KUBE-SERVICES            -t filter  (jumped to from OUTPUT)
  • KUBE-FORWARD             -t filter  (jumped to from FORWARD)
  • KUBE-SERVICES            -t nat     (jumped to from PREROUTING and OUTPUT)
  • KUBE-POSTROUTING         -t nat     (jumped to from POSTROUTING)
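
On a node these jump rules show up in iptables-save roughly as follows (comments as emitted by kube-proxy 1.12; exact ordering and matches may differ slightly). The first three live in the filter table, the last three in the nat table:

-A INPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes externally-visible service portals" -j KUBE-EXTERNAL-SERVICES
-A OUTPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A FORWARD -m comment --comment "kubernetes forwarding rules" -j KUBE-FORWARD
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING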

  7.3 Run iptables-save

  • iptables-save [-t filter]
  • iptables-save [-t nat]
	//
	// Below this point we will not return until we try to write the iptables rules.
	//

	// Get iptables-save output so we can check for existing chains and rules.
	// This will be a map of chain name to chain with rules as stored in iptables-save/iptables-restore
	existingFilterChains := make(map[utiliptables.Chain][]byte)
	proxier.existingFilterChainsData.Reset()
	err := proxier.iptables.SaveInto(utiliptables.TableFilter, proxier.existingFilterChainsData)
	if err != nil { // if we failed to get any rules
		glog.Errorf("Failed to execute iptables-save, syncing all rules: %v", err)
	} else { // otherwise parse the output
		existingFilterChains = utiliptables.GetChainLines(utiliptables.TableFilter, proxier.existingFilterChainsData.Bytes())
	}

	// IMPORTANT: existingNATChains may share memory with proxier.iptablesData.
	existingNATChains := make(map[utiliptables.Chain][]byte)
	proxier.iptablesData.Reset()
	err = proxier.iptables.SaveInto(utiliptables.TableNAT, proxier.iptablesData)
	if err != nil { // if we failed to get any rules
		glog.Errorf("Failed to execute iptables-save, syncing all rules: %v", err)
	} else { // otherwise parse the output
		existingNATChains = utiliptables.GetChainLines(utiliptables.TableNAT, proxier.iptablesData.Bytes())
	}

# iptables-save | grep pool4-sc-site

-A KUBE-NODEPORTS -p tcp -m comment --comment "ns-dfbc21ef/pool4-sc-site:pool4-sc-site" -m tcp --dport 64264 -j KUBE-MARK-MASQ

-A KUBE-NODEPORTS -p tcp -m comment --comment "ns-dfbc21ef/pool4-sc-site:pool4-sc-site" -m tcp --dport 64264 -j KUBE-SVC-SATU2HUKOORIDRPW

-A KUBE-SEP-RRSJPMBBE7GG2XAM -s 192.168.75.114/32 -m comment --comment "ns-dfbc21ef/pool4-sc-site:pool4-sc-site" -j KUBE-MARK-MASQ

-A KUBE-SEP-RRSJPMBBE7GG2XAM -p tcp -m comment --comment "ns-dfbc21ef/pool4-sc-site:pool4-sc-site" -m tcp -j DNAT --to-destination 192.168.75.114:20881

-A KUBE-SEP-WVYWW757ZWPZ7HKT -s 192.168.181.42/32 -m comment --comment "ns-dfbc21ef/pool4-sc-site:pool4-sc-site" -j KUBE-MARK-MASQ

-A KUBE-SEP-WVYWW757ZWPZ7HKT -p tcp -m comment --comment "ns-dfbc21ef/pool4-sc-site:pool4-sc-site" -m tcp -j DNAT --to-destination 192.168.181.42:20881

-A KUBE-SERVICES -d 10.254.59.218/32 -p tcp -m comment --comment "ns-dfbc21ef/pool4-sc-site:pool4-sc-site cluster IP" -m tcp --dport 20881 -j KUBE-SVC-SATU2HUKOORIDRPW

-A KUBE-SVC-SATU2HUKOORIDRPW -m comment --comment "ns-dfbc21ef/pool4-sc-site:pool4-sc-site" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-WVYWW757ZWPZ7HKT

-A KUBE-SVC-SATU2HUKOORIDRPW -m comment --comment "ns-dfbc21ef/pool4-sc-site:pool4-sc-site" -j KUBE-SEP-RRSJPMBBE7GG2XAM

1. For every Service, a custom chain named "KUBE-SVC-XXXXXXXXXXXXXXXX" is created in the nat table

    All packets of the Service "ns-dfbc21ef/pool4-sc-site" that traverse the custom chain KUBE-SERVICES jump to the custom chain KUBE-SVC-SATU2HUKOORIDRPW

2. Check whether the Service has NodePorts enabled

    Packets of the Service "ns-dfbc21ef/pool4-sc-site" that traverse the custom chain KUBE-NODEPORTS jump to the custom chain KUBE-MARK-MASQ, i.e. Kubernetes marks these packets with 0x4000/0x4000

    Packets of the Service "ns-dfbc21ef/pool4-sc-site" that traverse the custom chain KUBE-NODEPORTS also jump to the custom chain KUBE-SVC-SATU2HUKOORIDRPW

3. For every Service that has Endpoints, custom chains named "KUBE-SEP-XXXXXXXXXXXXXXXX" are created in the nat table, one per endpoint

     Packets of a Service (here a second example, "ym/echo-app") that traverse its custom chain KUBE-SVC-VX5XTMYNLWGXYEL4 may jump either to the custom chain KUBE-SEP-27OZWHQEIJ47W5ZW or to another KUBE-SEP chain. In the pool4-sc-site output above, the rule with -m statistic --probability 0.5 sends half of the traffic to KUBE-SEP-WVYWW757ZWPZ7HKT and the remaining rule sends the rest to KUBE-SEP-RRSJPMBBE7GG2XAM.
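
For a Service with n endpoints, the KUBE-SVC chain contains one rule per endpoint: the first matches with probability 1/n, the next with 1/(n-1) of the remaining traffic, and so on, with the last rule taking whatever is left, so each endpoint receives roughly 1/n of new connections. An illustrative sketch for three hypothetical endpoints (chain names are made up and the probabilities are rounded; the real rules print them with more digits):

-A KUBE-SVC-EXAMPLE -m statistic --mode random --probability 0.333 -j KUBE-SEP-AAAAAAAAAAAAAAAA
-A KUBE-SVC-EXAMPLE -m statistic --mode random --probability 0.5 -j KUBE-SEP-BBBBBBBBBBBBBBBB
-A KUBE-SVC-EXAMPLE -j KUBE-SEP-CCCCCCCCCCCCCCCC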
