Python subprocess.Popen: Analyzing and Fixing Pipe Blocking

A problem with the subprocess Popen library that had troubled me for a month is finally solved.

In one sentence: to wait for a command to finish, use communicate() instead of wait(), but watch your memory usage and send large output to a file.

An example of incorrect usage

My earlier code looked like this.


# Problematic code: wait() never reads the pipes, so a chatty child can block forever
def run_it(self, cmd):
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True,
                         stderr=subprocess.PIPE, close_fds=True)
    log.debug('running:%s' % cmd)
    p.wait()
    if p.returncode != 0:
        log.critical("Non zero exit code:%s executing: %s" % (p.returncode, cmd))
    return p.stdout

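This code had always worked, and then for no obvious reason it stopped working. (I found out later: the amount of output had grown, the output text became too long, and it clogged the pipe.)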

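The same code was also used in an earlier project, where it was called hundreds of millions of times without trouble. It did hang once back then, but an expert tracked the cause down: on Python versions below 2.7.6, parts of the close_fds handling were not implemented well, so a pipe was never released and the call hung indefinitely. The fix there was to set close_fds=True. That did not solve my problem, though.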

What solved my problem

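My thinking at the time was: since it hangs, let me see what output it produces right before hanging. The existing code could not support that, so I changed the code, and to my surprise it stopped hanging.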


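import logging
import subprocess

log = logging.getLogger(__name__)

def run_it(cmd):
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True,
                         stderr=subprocess.PIPE)  # , close_fds=True)

    log.debug('running:%s' % cmd)
    # communicate() drains stdout and stderr while it waits, so the child
    # never blocks on a full pipe
    out, err = p.communicate()
    log.debug(out)
    if p.returncode != 0:
        log.critical("Non zero exit code:%s executing: %s" % (p.returncode, cmd))
    return out  # p.stdout has already been consumed by communicate()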

What the Python documentation says

Warning

Use communicate() rather than .stdin.write, .stdout.read or .stderr.read to avoid deadlocks due to any of the other OS pipe buffers filling up and blocking the child process.

Popen.wait()
    Wait for child process to terminate. Set and return returncode attribute.

    Warning This will deadlock when using stdout=PIPE and/or stderr=PIPE and the child process generates enough output to a pipe such that it blocks waiting for the OS pipe buffer to accept more data. Use communicate() to avoid that.

Popen.communicate(input=None)
    Interact with process: Send data to stdin. Read data from stdout and stderr, until end-of-file is reached. Wait for process to terminate. The optional input argument should be a string to be sent to the child process, or None, if no data should be sent to the child.

    communicate() returns a tuple (stdoutdata, stderrdata).

    Note that if you want to send data to the process’s stdin, you need to create the Popen object with stdin=PIPE. Similarly, to get anything other than None in the result tuple, you need to give stdout=PIPE and/or stderr=PIPE too.

    Note The data read is buffered in memory, so do not use this method if the data size is large or unlimited.
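
As a small illustration of the documented behavior, the sketch below passes input to a child through communicate() and reads back both output streams. It is written for Python 3, and the child command (a Python one-liner that upper-cases its stdin) and the helper name to_upper are stand-ins invented for this example, not anything from the original project.

```python
import subprocess
import sys

def to_upper(text):
    """Send `text` to a child process on stdin and return what it prints."""
    # Hypothetical child: a one-liner that upper-cases whatever it reads.
    child = [sys.executable, "-c",
             "import sys; sys.stdout.write(sys.stdin.read().upper())"]
    # stdin=PIPE is required to send input; stdout/stderr=PIPE are required
    # to get anything other than None back from communicate().
    p = subprocess.Popen(child, stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = p.communicate(input=text.encode("ascii"))
    return out.decode("ascii"), err.decode("ascii"), p.returncode

if __name__ == "__main__":
    print(to_upper("hello"))
```

Note that communicate() also waits for the child, so returncode is set by the time it returns.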

I had not paid attention to this before. Reading the documentation again carefully, everything suddenly made sense.

Linux pipe limits: why does it block?

Let's look at the answer to "Can someone explain pipe buffer deadlock?".

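The child process produces data, which is buffered first. When that buffer fills up, the data is written to the child's standard output and standard error, which reach the parent through pipes. Once a pipe is full, the child's write blocks. Meanwhile the parent is sitting in wait() and never reads from the pipe, so neither side can make progress and the program hangs.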
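
To make the "pipe is full" condition concrete, here is a minimal sketch (Python 3 on a Unix-like system, an assumption of this example) that fills a pipe nobody reads from and counts how many bytes fit. The function name pipe_capacity is made up for illustration. The write end is switched to non-blocking mode so that a full pipe raises an error instead of hanging the way the child process does.

```python
import os

def pipe_capacity():
    """Count how many bytes fit in a fresh pipe before a write would block."""
    read_fd, write_fd = os.pipe()
    os.set_blocking(write_fd, False)   # a full pipe now raises instead of hanging
    written = 0
    try:
        while True:
            written += os.write(write_fd, b"x" * 1024)
    except BlockingIOError:
        pass                           # the pipe is full and nobody is reading
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return written

if __name__ == "__main__":
    print("pipe capacity in bytes:", pipe_capacity())
```

On Linux the default capacity is documented in pipe(7) as 16 pages, which is 65,536 bytes with 4 KB pages; other systems may report a different number.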

Draining the pipe promptly also avoids the problem


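# Drain data from the pipe as soon as it arrives
def run_it(self, cmd):
    # stderr is merged into stdout so that no second, unread pipe can fill up
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True,
                         stderr=subprocess.STDOUT, close_fds=True)
    log.debug('running:%s' % cmd)
    for line in iter(p.stdout.readline, b''):
        print line,          # print to stdout immediately
    p.stdout.close()
    p.wait()                 # safe: the pipe was already read to end-of-file
    if p.returncode != 0:
        log.critical("Non zero exit code:%s executing: %s" % (p.returncode, cmd))
    return p.returncode      # p.stdout is closed by now, so return the exit code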

Looking at the Python source, communicate() internally reads stdout and stderr into list variables and returns the collected data when the function finishes.
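
The idea of reading both streams into lists can be sketched as follows. This is a simplified Python 3 illustration using one reader thread per pipe, not the standard library's actual code, and the names drain and run_and_collect are invented for the example.

```python
import subprocess
import sys
import threading

def drain(stream, chunks):
    """Read `stream` to end-of-file, appending each chunk to `chunks`."""
    for chunk in iter(lambda: stream.read(4096), b""):
        chunks.append(chunk)
    stream.close()

def run_and_collect(args):
    """Run `args`, draining stdout and stderr concurrently into lists."""
    p = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out_chunks, err_chunks = [], []
    readers = [threading.Thread(target=drain, args=(p.stdout, out_chunks)),
               threading.Thread(target=drain, args=(p.stderr, err_chunks))]
    for t in readers:
        t.start()
    for t in readers:
        t.join()
    p.wait()   # safe here: both pipes have already been read to end-of-file
    return b"".join(out_chunks), b"".join(err_chunks), p.returncode

if __name__ == "__main__":
    # Hypothetical child that writes far more than one pipe buffer to each stream.
    child = [sys.executable, "-c",
             "import sys; sys.stdout.write('o' * 200000); "
             "sys.stderr.write('e' * 200000)"]
    out, err, rc = run_and_collect(child)
    print(len(out), len(err), rc)
```

Both pipes have to be read at the same time. Reading stdout to the end before touching stderr would reintroduce the deadlock whenever the child fills stderr first.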

Testing pipe blocking on Linux

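While reading other people's examples I kept wondering how to produce exactly 64 KB of output for a test. Using dd turns out to be a great idea, and it is the most elegant example I have seen because it controls the output length precisely. The other examples all pull in a large file from somewhere.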


#!/usr/bin/env python
# coding: utf-8
# yc@2013/04/28

import subprocess

def test(size):
    print 'start'

    cmd = 'dd if=/dev/urandom bs=1 count=%d 2>/dev/null' % size
    p = subprocess.Popen(args=cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, close_fds=True)
    #p.communicate()
    p.wait()  # once the output exceeds the pipe limit, the child blocks here

    print 'end'

# 64KB
test(64 * 1024)

# 64KB + 1B
test(64 * 1024 + 1)

# output :
start
end
start   # ...and then it blocks.

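The first test produces exactly 64 KB of output. dd writes precisely 64 KB to standard output, it is launched through subprocess.Popen, and wait() waits for dd to finish. Both start and end are printed as expected. The second test produces slightly more than 64 KB. This time only start is printed, which means execution is stuck in p.wait(): the program is deadlocked.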
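
The same experiment can be reproduced without hanging the test script forever. The sketch below assumes Python 3, where wait() accepts a timeout, and uses a Python one-liner as a stand-in for dd; the function name wait_hangs is invented for this example. It waits briefly, records whether the child was still stuck, and then drains the pipe with communicate() so the child can exit.

```python
import subprocess
import sys

def wait_hangs(size, timeout=1.0):
    """Return (hung, bytes_received) for a child that writes `size` bytes."""
    child = [sys.executable, "-c",
             "import sys; sys.stdout.write('x' * %d)" % size]
    p = subprocess.Popen(child, stdout=subprocess.PIPE)
    try:
        p.wait(timeout=timeout)    # nobody reads the pipe while we wait
        hung = False
    except subprocess.TimeoutExpired:
        hung = True                # the child had not exited, presumably blocked on the pipe
    out, _ = p.communicate()       # drain the pipe so the child can finish
    return hung, len(out)

if __name__ == "__main__":
    # 1 MiB is well above the usual pipe capacity, so wait() is expected to time out.
    print(wait_hangs(1024 * 1024))
```

A small size such as 1024 bytes should normally finish inside the timeout, although that depends on how quickly the child starts.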

Summary

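So how do you avoid the deadlock? The official documentation recommends Popen.communicate(). This method keeps the output in memory rather than leaving it in the pipe, so the upper limit now depends on available memory, which is usually not a problem. If you need the program's exit code, read Popen.returncode after calling Popen.communicate().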

If the output really could exceed memory, stop using a pipe and redirect to a file instead, for example stdout=open("process.out", "w").
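
A minimal sketch of the file-based approach, assuming Python 3. It uses a temporary file in place of process.out, and the helper name run_to_file and the child command are hypothetical. Because no pipe is involved, wait() cannot deadlock on output here.

```python
import os
import subprocess
import sys
import tempfile

def run_to_file(args):
    """Run `args` with output sent to a temporary file; return (rc, size)."""
    with tempfile.TemporaryFile() as out_file:
        # The child writes straight into the file, so there is no pipe to fill.
        p = subprocess.Popen(args, stdout=out_file, stderr=subprocess.STDOUT)
        p.wait()
        out_file.seek(0, os.SEEK_END)   # jump to the end to measure the output
        return p.returncode, out_file.tell()

if __name__ == "__main__":
    # Hypothetical child producing 1 MiB of output.
    child = [sys.executable, "-c",
             "import sys; sys.stdout.write('x' * (1024 * 1024))"]
    print(run_to_file(child))
```

In real use you would seek back to the start and process the file in chunks, which keeps memory use flat regardless of how much the child prints.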

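One more point: be clear about what each pipe is for, and do not use pipes indiscriminately. For example, if the command takes no input, do not make stdin a pipe.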
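
As a sketch of that advice, the example below requests a pipe only for stdout. Connecting stdin to subprocess.DEVNULL (available in Python 3.3 and later) is an addition of this example for the case where the child should see no input at all; simply leaving stdin unset, so that the child inherits the parent's stdin, also avoids an unnecessary pipe. The helper name and child command are hypothetical.

```python
import subprocess
import sys

def run_without_input(args):
    """Run `args` with an empty stdin and capture only stdout."""
    p = subprocess.Popen(args, stdin=subprocess.DEVNULL,
                         stdout=subprocess.PIPE)
    out, _ = p.communicate()   # stderr was not piped, so the second item is None
    return out, p.returncode

if __name__ == "__main__":
    # Hypothetical child that reports how many characters it read from stdin.
    child = [sys.executable, "-c",
             "import sys; sys.stdout.write(str(len(sys.stdin.read())))"]
    print(run_without_input(child))
```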

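Also, do not overcomplicate simple things. Take modifying a file with echo 1 > /sys/linux/xxx: something this simple does not need a call out to the Linux shell. Use Python's built-in open('file', 'w').write('1') instead, and keep the code Pythonic.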
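
A short sketch of the pure-Python version, using a with block so the file is closed (and the data flushed) deterministically. The helper name write_value is made up, and the demo writes to a temporary file rather than a real /sys path.

```python
import os
import tempfile

def write_value(path, value):
    """Write `value` to `path`, replacing any previous content."""
    with open(path, "w") as f:
        f.write(value)

if __name__ == "__main__":
    # Hypothetical target: a temporary file standing in for the /sys entry.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    write_value(path, "1")
    with open(path) as f:
        print(f.read())
    os.remove(path)
```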

References

https://thraxil.org/users/anders/posts/2008/03/13/Subprocess-Hanging-PIPE-is-your-enemy/
https://docs.python.org/2/library/subprocess.html#subprocess.Popen.communicate
http://blog.csdn.net/carolzhang8406/article/details/22286913
http://www.cnblogs.com/icejoywoo/p/3627397.html

Original article. If you repost it, please credit the source: reposted from 東東東陳煜東's blog.

Permalink: https://www.chenyudong.com/archives/python-subprocess-popen-block.html
