Namespaces and Go Part 3

In Part 2 we ran a shell with a modified hostname using the UTS namespace.

In this article, we will look at how to use the PID and Mount namespaces.

By isolating the mount and process namespaces, we can give the container the impression that it has exclusive access to /proc.
As a result, process-related commands such as ps read their information from the isolated /proc and see only the container's own processes.

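Let’s see how we can implement this in Go.

We can modify the ‘startContainer’ function discussed in Part 2 to mount the proc file system inside the MNT namespace.
In the snippet below, root stands for the path of the container's root directory; the complete program later in this article mounts directly on /proc, because the root file system is not changed yet.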

if err := syscall.Mount("proc", filepath.Join(root, "/proc"), "proc", uintptr(syscall.MS_NOEXEC|syscall.MS_NOSUID|syscall.MS_NODEV), ""); err != nil {
        fmt.Println("Proc mount failed")
}

According to the mount man page, the flags passed to syscall.Mount have the following values and meanings:

http://man7.org/linux/man-pages/man2/mount.2.html

MS_NOEXEC    8    /* Disallow program execution */
MS_NOSUID    2    /* Ignore suid and sgid bits */
MS_NODEV     4    /* Disallow access to device special files */

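With these flags, nothing under the container's /proc can be executed, honoured as a set-user-ID or set-group-ID program, or opened as a device file, so the container won’t get any special access via /proc.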

For the mount to work, we need to add syscall.CLONE_NEWNS to Cloneflags in the ‘setNameSpaces’ function.
We also add syscall.CLONE_NEWPID so that the container gets its own PID numbering; with /proc mounted inside the new namespaces, process-related commands then report only the isolated processes.

So let’s combine all of the above and build the modified program.

package main

import (
        "fmt"
        "os"
        "os/exec"
        "syscall"
)

//Set Args[0] to "fork", configure the new namespaces,
//and finally re-execute this binary itself
func setNameSpaces(shell string) {
        cmd := &exec.Cmd{
                Path: os.Args[0],
                Args: append([]string{"fork"}, shell),
        }
        cmd.Stdin = os.Stdin
        cmd.Stdout = os.Stdout
        cmd.Stderr = os.Stderr
        cmd.SysProcAttr = &syscall.SysProcAttr{
                Cloneflags: syscall.CLONE_NEWUSER |
                        syscall.CLONE_NEWPID |
                        syscall.CLONE_NEWNS |
                        syscall.CLONE_NEWUTS,
                UidMappings: []syscall.SysProcIDMap{
                        {
                                ContainerID: 0,
                                HostID:      os.Getuid(),
                                Size:        1,
                        },
                },
                GidMappings: []syscall.SysProcIDMap{
                        {
                                ContainerID: 0,
                                HostID:      os.Getgid(),
                                Size:        1,
                        },
                },
        }
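        // Path = this executable, Args[0] = "fork", Args[1] = path to bash
        if err := cmd.Run(); err != nil {
                fmt.Println("Running the container process failed:", err)
        }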
}

//Set new hostname in already initialized namespace and start a shell
func startContainer(shell string) {
        fmt.Println("Starting Container")
        //Set a new hostname for our container
        if err := syscall.Sethostname([]byte("container")); err != nil {
                fmt.Println("Setting hostname failed")
        }
        //Mount a fresh proc file system on /proc inside the new MNT namespace
        if err := syscall.Mount("proc", "/proc", "proc", uintptr(syscall.MS_NOEXEC|syscall.MS_NOSUID|syscall.MS_NODEV), ""); err != nil {
                fmt.Println("Proc mount failed")
        }

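        //Replace this process with the shell. argv[0] is passed as an empty string,
        //which is likely why ps shows an empty CMD for PID 1 in the session below.
        if err := syscall.Exec(shell, []string{""}, os.Environ()); err != nil {
                fmt.Println("Exec failed")
        }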
}

func main() {
        // Get absolute path to bash
        shell, err := exec.LookPath("bash")
        if err != nil {
                fmt.Printf("Bash not found\n")
                os.Exit(1)
        }
        //This condition is false on the first run, because Args[0] is the program name.
        //It becomes true when the program re-executes itself
        //with Args[0] = "fork" from setNameSpaces()
        if os.Args[0] == "fork" {
                startContainer(shell)
                os.Exit(0)
        }
        //Starting point
        setNameSpaces(shell)
}

Build and run the program:

[email protected]:~/.../uts_demo> go build uts_mnt_demo.go
[email protected]:~/.../uts_demo> ./uts_mnt_demo
Starting Container
container:~/.../uts_demo # ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  1 23:47 pts/1    00:00:00
root        23     1  0 23:47 pts/1    00:00:00 ps -ef
container:~/.../uts_demo # cd /proc/
container:/proc # df .
Filesystem     1K-blocks  Used Available Use% Mounted on
proc                   0     0         0    - /proc
container:/proc # echo $$
1
container:/proc # exit
exit

In the next article, we will change the root file system of the container so that the process is isolated more completely.
This will again be done with the mount namespace, but we will use busybox and the pivot_root operation to achieve more isolation and a more ‘container-ish’ feel.