[Original] Analyzing IO Streams from the Source Code (3): Buffered Streams -- please credit the source when reposting

阿新 • Published: 2018-12-24

1. BufferedInputStream

To understand BufferedInputStream, first look at the official Javadoc comment on the class:

/**
 * A <code>BufferedInputStream</code> adds
 * functionality to another input stream-namely,
 * the ability to buffer the input and to
 * support the <code>mark</code> and <code>reset</code>
 * methods. When  the <code>BufferedInputStream</code>
 * is created, an internal buffer array is
 * created. As bytes  from the stream are read
 * or skipped, the internal buffer is refilled
 * as necessary  from the contained input stream,
 * many bytes at a time. The <code>mark</code>
 * operation  remembers a point in the input
 * stream and the <code>reset</code> operation
 * causes all the  bytes read since the most
 * recent <code>mark</code> operation to be
 * reread before new bytes are  taken from
 * the contained input stream.
 *
 * @author  Arthur van Hoff
 * @since   JDK1.0
 */

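In plain terms, the comment says:

A BufferedInputStream adds two capabilities to another input stream: buffering the input, and support for the mark and reset methods. When the BufferedInputStream is created, an internal buffer array is created. As bytes are read from or skipped in the stream, the internal buffer is refilled from the contained input stream as needed, many bytes at a time. The mark operation remembers a point in the input stream, and the reset operation causes all bytes read since the most recent mark to be read again before any new bytes are taken from the contained input stream.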
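The mark and reset behaviour described in the comment can be tried with a minimal sketch. The class name MarkResetDemo, the helper readTwice, and the read-ahead limit of 16 are assumptions made for illustration; only the BufferedInputStream calls come from the JDK.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

class MarkResetDemo {
    // Reads two bytes, rewinds to the mark, then reads the same two bytes again.
    static int[] readTwice(byte[] data) {
        try (InputStream in = new BufferedInputStream(new ByteArrayInputStream(data))) {
            in.mark(16);        // remember this point; allow up to 16 bytes of read-ahead
            int a = in.read();
            int b = in.read();
            in.reset();         // go back to the marked point
            int c = in.read();  // re-reads the first byte
            int d = in.read();  // re-reads the second byte
            return new int[] {a, b, c, d};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(readTwice(new byte[] {10, 20, 30})));
        // prints [10, 20, 10, 20]
    }
}
```

The bytes read after reset come from the internal buffer, not from the wrapped stream.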

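The point to note in that comment is the internal buffer. The main difference between BufferedInputStream and a plain InputStream is that BufferedInputStream allocates a byte-array buffer when it is created, and every read is served from that buffer. The buffered input stream has the following member fields: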

    // Default size of the buffer
    private static int DEFAULT_BUFFER_SIZE = 8192;

    /**
     * The maximum size of array to allocate.
     * Why subtract 8?
     * Some VMs reserve header words in an array, so attempting to allocate
     * a larger array may exceed the VM's limit and end in an OutOfMemoryError.
     */
    private static int MAX_BUFFER_SIZE = Integer.MAX_VALUE - 8;

    /**
     * The internal buffer array where the data is stored.
     * When necessary, it may be replaced by another array of a different size.
     */
    protected volatile byte buf[];

    /**
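     * Atomic updater that provides compareAndSet for buf.
     * This is necessary because close() can be asynchronous. A null buf is used as the indicator that the stream is closed.
     * Together with the volatile modifier on buf, the updater lets buf be replaced atomically and makes the change visible to other threads when a BufferedInputStream is used from several threads.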
     */
    private static final
        AtomicReferenceFieldUpdater<BufferedInputStream, byte[]> bufUpdater =
        AtomicReferenceFieldUpdater.newUpdater
        (BufferedInputStream.class,  byte[].class, "buf");

    /**
     * The index one past the last valid byte in the buffer, that is, the number of valid bytes in buf.
     */
    protected int count;

    /**
     * The index of the current position in the buffer, that is, the next byte to be read from buf.
     */
    protected int pos;

    /**
     * The value of the pos field at the time the mark method was last called.
     */
    protected int markpos = -1;

    /**
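     * The maximum read-ahead allowed after a call to mark before subsequent calls to reset fail.
     * In other words, the upper bound on pos - markpos while the mark remains valid.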
     */
    protected int marklimit;

Next, look at the constructors of the buffered input stream:

    public BufferedInputStream(InputStream in) {
        this(in, DEFAULT_BUFFER_SIZE);
    }

    public BufferedInputStream(InputStream in, int size) {
        super(in);
        if (size <= 0) {
            throw new IllegalArgumentException("Buffer size <= 0");
        }
        buf = new byte[size];
    }
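The size check in the second constructor can be exercised with a short sketch. BufferSizeDemo and accepts are hypothetical names used only for this illustration.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.InputStream;

class BufferSizeDemo {
    // Returns true if BufferedInputStream accepts the given buffer size.
    static boolean accepts(int size) {
        InputStream source = new ByteArrayInputStream(new byte[0]);
        try {
            new BufferedInputStream(source, size);
            return true;
        } catch (IllegalArgumentException e) {
            return false;   // the constructor rejects size <= 0
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts(8192)); // true
        System.out.println(accepts(0));    // false
        System.out.println(accepts(-1));   // false
    }
}
```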

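During initialization, the buffer byte array is sized with either the default length or a caller-supplied one. The constructor only allocates the array, though; it does not read any data into it. To see where the buffer is filled, we need to look at three methods of BufferedInputStream: read, read1, and fill: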

    public synchronized int read(byte b[], int off, int len) throws IOException
    {
        getBufIfOpen(); // Check for closed stream
        if ((off | len | (off + len) | (b.length - (off + len))) < 0) {
            throw new IndexOutOfBoundsException();
        } else if (len == 0) {
            return 0;
        }

        int n = 0;
        for (;;) {
            int nread = read1(b, off + n, len - n);
            if (nread <= 0)
                return (n == 0) ? nread : n;
            n += nread;
            if (n >= len)
                return n;
            // if not closed but no bytes available, return
            InputStream input = in;
            if (input != null && input.available() <= 0)
                return n;
        }
    }

    private int read1(byte[] b, int off, int len) throws IOException {
        int avail = count - pos;
        if (avail <= 0) {
            /* If the requested length is at least as large as the buffer, and
               if there is no mark/reset activity, do not bother to copy the
               bytes into the local buffer.  In this way buffered streams will
               cascade harmlessly. */
            if (len >= getBufIfOpen().length && markpos < 0) {
                return getInIfOpen().read(b, off, len);
            }
            fill();
            avail = count - pos;
            if (avail <= 0) return -1;
        }
        int cnt = (avail < len) ? avail : len;
        System.arraycopy(getBufIfOpen(), pos, b, off, cnt);
        pos += cnt;
        return cnt;
    }

    private void fill() throws IOException {
        byte[] buffer = getBufIfOpen();
        if (markpos < 0)
            pos = 0;            /* no mark: throw away the buffer */
        else if (pos >= buffer.length)  /* no room left in buffer */
            if (markpos > 0) {  /* can throw away early part of the buffer */
                int sz = pos - markpos;
                System.arraycopy(buffer, markpos, buffer, 0, sz);
                pos = sz;
                markpos = 0;
            } else if (buffer.length >= marklimit) {
                markpos = -1;   /* buffer got too big, invalidate mark */
                pos = 0;        /* drop buffer contents */
            } else if (buffer.length >= MAX_BUFFER_SIZE) {
                throw new OutOfMemoryError("Required array size too large");
            } else {            /* grow buffer */
                int nsz = (pos <= MAX_BUFFER_SIZE - pos) ?
                        pos * 2 : MAX_BUFFER_SIZE;
                if (nsz > marklimit)
                    nsz = marklimit;
                byte nbuf[] = new byte[nsz];
                System.arraycopy(buffer, 0, nbuf, 0, pos);
                if (!bufUpdater.compareAndSet(this, buffer, nbuf)) {
                    // Can't replace buf if there was an async close.
                    // Note: This would need to be changed if fill()
                    // is ever made accessible to multiple threads.
                    // But for now, the only way CAS can fail is via close.
                    // assert buf == null;
                    throw new IOException("Stream closed");
                }
                buffer = nbuf;
            }
        count = pos;
        int n = getInIfOpen().read(buffer, pos, buffer.length - pos);
        if (n > 0)
            count = n + pos;
    }

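The three methods do the following:

read: copies the requested bytes into the parameter b. If fewer bytes than requested have been read and the underlying stream still reports available data, it calls read1 again.

read1: copies bytes from the buffer array into the parameter b. If all data in the buffer has already been consumed, it calls fill to load new content into the buffer. The exception is a request at least as large as the buffer with no active mark, which is read directly from the underlying stream without going through the buffer.

fill: reads from the underlying stream into the buffer, up to the free space remaining in the buffer. If a mark is active, it first keeps the marked bytes by shifting them to the front of the buffer or by growing the buffer, and it drops the mark once the buffer has reached marklimit.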
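To make the effect of fill visible, the following sketch counts how often the underlying stream is actually read while the caller pulls one byte at a time. CountingInputStream and underlyingReads are hypothetical helpers written for this illustration; they are not part of the JDK.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class CountingReadsDemo {
    // Hypothetical wrapper: counts bulk read calls that reach the wrapped stream.
    static class CountingInputStream extends FilterInputStream {
        int bulkReads = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            bulkReads++;
            return super.read(b, off, len);
        }
    }

    // Reads dataSize bytes one at a time through a buffer of bufferSize bytes
    // and returns how many bulk reads hit the underlying stream.
    static int underlyingReads(int dataSize, int bufferSize) {
        CountingInputStream counter =
                new CountingInputStream(new ByteArrayInputStream(new byte[dataSize]));
        try (InputStream in = new BufferedInputStream(counter, bufferSize)) {
            while (in.read() != -1) {
                // each call is served from the buffer; fill() runs only when it is empty
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counter.bulkReads;
    }

    public static void main(String[] args) {
        // 1000 single-byte reads by the caller, buffer of 100 bytes:
        // 10 refills plus one final call that reports end of stream.
        System.out.println(underlyingReads(1000, 100)); // prints 11
    }
}
```

The caller issues 1000 single-byte reads, but only a handful of calls reach the wrapped stream.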

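The resulting call flow is: read calls read1 in a loop, read1 calls fill when the buffer has no unread bytes, and fill reads the next chunk from the underlying stream.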

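Finally, look at what the close method does:

The close method first sets buf to null with a compare-and-set, which both releases the internal buffer and marks the stream as closed. It then clears the in reference and closes the InputStream it was holding.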

    public void close() throws IOException {
        byte[] buffer;
        while ( (buffer = buf) != null) {
            if (bufUpdater.compareAndSet(this, buffer, null)) {
                InputStream input = in;
                in = null;
                if (input != null)
                    input.close();
                return;
            }
            // Else retry in case a new buf was CASed in fill()
        }
    }
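Because a null buf marks the stream as closed, any read after close fails in getBufIfOpen. A minimal sketch of that behaviour, with ClosedStreamDemo and readAfterClose as hypothetical names:

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;

class ClosedStreamDemo {
    // Closes the stream, then tries to read from it.
    // Returns the exception message, or "no exception" if the read succeeded.
    static String readAfterClose() {
        BufferedInputStream in =
                new BufferedInputStream(new ByteArrayInputStream(new byte[] {1, 2, 3}));
        try {
            in.close();   // sets buf to null and closes the wrapped stream
            in.read();    // the buffer check now fails with an IOException
            return "no exception";
        } catch (IOException e) {
            return String.valueOf(e.getMessage());
        }
    }

    public static void main(String[] args) {
        System.out.println(readAfterClose());
    }
}
```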

2. BufferedInputStream summary:

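A buffered input stream uses a byte array as its cache: content from the input stream is first read into the array and then handed from the array to the caller. The design exploits the fact that reading many bytes in one call is more efficient than reading a few bytes in many calls. Buffering reduces how often the underlying stream is read, so a long input can be consumed quickly even when the caller asks for only a few bytes at a time. The following test measures that underlying effect by reading the same file with FileInputStream in 1000-byte and 1-byte chunks: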

    public static void main(String[] args) throws Exception {
        {
            // Read the whole file 100 times, up to 1000 bytes per read call
            long startTime = System.currentTimeMillis();
            for (int i = 1; i <= 100; i++) {
                FileInputStream fis = new FileInputStream("E:/testFile/test.txt");
                for (byte[] b = new byte[1000]; fis.read(b, 0, 1000) > 0; ) {
                }
                fis.close();
            }
            long endTime = System.currentTimeMillis();
            System.out.println("1000 bytes per read, elapsed: " + (endTime - startTime));
        }
        {
            // Read the whole file 100 times, 1 byte per read call
            long startTime = System.currentTimeMillis();
            for (int i = 1; i <= 100; i++) {
                FileInputStream fis = new FileInputStream("E:/testFile/test.txt");
                for (byte[] b = new byte[1000]; fis.read(b, 0, 1) > 0; ) {
                }
                fis.close();
            }
            long endTime = System.currentTimeMillis();
            System.out.println("1 byte per read, elapsed: " + (endTime - startTime));
        }
    }

The output (in milliseconds) was:

1000 bytes per read, elapsed: 4
1 byte per read, elapsed: 50
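The test above uses FileInputStream in both runs. A variant that puts BufferedInputStream itself into the comparison might look like the sketch below. It is a rough timing harness, not a rigorous benchmark: the temporary file, its 1,000,000-byte size, and the names BufferedReadTiming and countBytes are all assumptions made for illustration, and the timings will differ from machine to machine.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class BufferedReadTiming {
    // Reads the stream one byte at a time until end of stream, then closes it.
    static long countBytes(InputStream in) {
        long total = 0;
        try (InputStream s = in) {
            while (s.read() != -1) {
                total++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("buffered-demo", ".bin");
        try {
            Files.write(file, new byte[1_000_000]);

            long t0 = System.nanoTime();
            long raw = countBytes(new FileInputStream(file.toFile()));
            long t1 = System.nanoTime();
            long buffered = countBytes(
                    new BufferedInputStream(new FileInputStream(file.toFile())));
            long t2 = System.nanoTime();

            System.out.println("unbuffered: " + raw + " bytes in "
                    + (t1 - t0) / 1_000_000 + " ms");
            System.out.println("buffered:   " + buffered + " bytes in "
                    + (t2 - t1) / 1_000_000 + " ms");
        } finally {
            Files.delete(file);
        }
    }
}
```

Both runs issue the same number of single-byte read calls; only the buffered run serves most of them from memory.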

3. BufferedOutputStream

Having studied BufferedInputStream, BufferedOutputStream is easy to understand.

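First, look at the flushBuffer method of BufferedOutputStream. It passes all bytes buffered so far to the underlying output stream and resets the count. It is called from write and flush, and therefore also runs on close(), which calls flush.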

    private void flushBuffer() throws IOException {
        if (count > 0) {
            out.write(buf, 0, count);
            count = 0;
        }
    }

After that, we only need to look at the write method:

    public synchronized void write(byte b[], int off, int len) throws IOException {
        if (len >= buf.length) {
            /* If the request length exceeds the size of the output buffer,
               flush the output buffer and then write the data directly.
               In this way buffered streams will cascade harmlessly. */
            flushBuffer();
            out.write(b, off, len);
            return;
        }
        if (len > buf.length - count) {
            flushBuffer();
        }
        System.arraycopy(b, off, buf, count, len);
        count += len;
    }

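The write method mirrors the read method of BufferedInputStream. read first fills the buffer array from the underlying stream and then hands data to the caller, while write first collects the caller's data in the buffer and passes it on to the file or other destination only when the buffer cannot hold the next chunk. A request at least as large as the buffer skips the buffer and is written directly.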
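The batching can be observed by counting the writes that reach the underlying stream. CountingOutputStream and underlyingWrites below are hypothetical helpers written for this sketch, not JDK classes.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

class CountingWritesDemo {
    // Hypothetical wrapper: counts bulk write calls that reach the wrapped stream.
    static class CountingOutputStream extends FilterOutputStream {
        int bulkWrites = 0;

        CountingOutputStream(OutputStream out) {
            super(out);
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            bulkWrites++;
            out.write(b, off, len);
        }
    }

    // Writes totalBytes single bytes through a buffer of bufferSize bytes
    // and returns how many bulk writes hit the underlying stream.
    static int underlyingWrites(int totalBytes, int bufferSize) {
        CountingOutputStream counter =
                new CountingOutputStream(new ByteArrayOutputStream());
        try (OutputStream out = new BufferedOutputStream(counter, bufferSize)) {
            for (int i = 0; i < totalBytes; i++) {
                out.write(i);   // stored in the buffer; flushed only when it is full
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counter.bulkWrites;
    }

    public static void main(String[] args) {
        // 1000 single-byte writes, buffer of 100 bytes:
        // nine flushes while writing plus one on close.
        System.out.println(underlyingWrites(1000, 100)); // prints 10
    }
}
```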

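BufferedOutputStream does not define its own close() method; it inherits close from its parent class FilterOutputStream. That method first calls flush, which writes all buffered content to the underlying output stream, and then closes the out object it holds.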

    public synchronized void flush() throws IOException {
        flushBuffer();
        out.flush();
    }
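A small sketch shows that written bytes stay in the buffer until flush is called. FlushDemo, sizesBeforeAndAfterFlush, and the 64-byte buffer size are illustrative assumptions.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class FlushDemo {
    // Returns the size of the sink before and after flush() is called.
    static int[] sizesBeforeAndAfterFlush() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (BufferedOutputStream out = new BufferedOutputStream(sink, 64)) {
            out.write(new byte[] {1, 2, 3});
            int before = sink.size();   // the 3 bytes are still in the buffer
            out.flush();                // flushBuffer() pushes them to the sink
            int after = sink.size();
            return new int[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] sizes = sizesBeforeAndAfterFlush();
        System.out.println(sizes[0] + " -> " + sizes[1]); // prints 0 -> 3
    }
}
```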

4. BufferedOutputStream summary:

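A buffered input stream reads from the underlying stream in bulk and hands the content to the caller piece by piece. A buffered output stream does the reverse: it collects the caller's piecewise output in its buffer array and writes it to the underlying output stream in bulk.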
