A Deep Dive into Java Serialization and Deserialization

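Preface

Serialization and deserialization are basic Java topics, yet when I thought about them carefully I realized I had no clear idea how they are actually implemented under the hood, so I set aside some time to put this article together.
If you use serialization but have never looked past the surface, this article should help.
It is organized around the following points, each building on the one before:

    1. The concepts of serialization and deserialization
    2. How to serialize and deserialize, and how it works
    3. Why serialize and deserialize, and where it is used
    4. How serialization and deserialization are implemented under the hood
    5. Reading the serialization and deserialization source code

Enough talk, let's roll up our sleeves and get started.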


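1. The concepts of serialization and deserialization

Serialization: the process of converting a Java **object** into a **byte sequence**.
Deserialization: the process of restoring a **byte sequence** back into a Java **object**.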
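
To make the two definitions concrete, here is a minimal in-memory sketch. The class name ConceptDemo and its helper methods are made up for this illustration; only the java.io stream classes come from the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper class, not part of the JDK.
class ConceptDemo {
    // Serialization: object -> byte sequence.
    static byte[] toBytes(Object obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Deserialization: byte sequence -> object.
    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = toBytes("hello");
        Object restored = fromBytes(bytes);
        System.out.println(bytes.length + " bytes -> " + restored);
    }
}
```

Working against a byte array instead of a file keeps the example free of any path assumptions.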

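2. How to serialize and deserialize, and how it works

Conditions for an object to be serializable:
For an object of a class to serialize successfully, two conditions must hold:
The class must implement the java.io.Serializable interface.
Every field of the class must be serializable. If a field is not serializable, it must be marked transient.
To find out whether a standard Java class is serializable, check its documentation. Checking whether instances of a class can be serialized is simple: just see whether the class implements the java.io.Serializable interface. Here is the serialization code: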

// Serialization code

// Employee.java
import java.io.Serializable;

class Employee implements Serializable {
    public String name;
    public String address;
    // Declared transient, so this field is skipped during serialization
    public transient int SSN;
    public Number number;
}

// Test class (DeserializeTest.java)
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

public class DeserializeTest
{

    public static void main(String[] args) {
        Employee e = new Employee();
        e.name = "MikeHuang";
        e.address = "XXXXXXXXXXXX";
        e.SSN = 12345678;
        e.number = 110;
        try
        {
            FileOutputStream fileOut =
                    new FileOutputStream("/tmp/employee.ser");
            ObjectOutputStream out = new ObjectOutputStream(fileOut);
            // Convert the object into a byte sequence and write it to the file
            out.writeObject(e);
            out.close();
            fileOut.close();
            System.out.printf("Serialized data is saved in /tmp/employee.ser");
        }catch(IOException i)
        {
            i.printStackTrace();
        }
    }
}
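After running the test code above, a new file employee.ser appears under /tmp. Out of curiosity I opened it to see what was inside. Some information is still readable, such as the class name, the names of the serializable fields, and their types and values. Feel free to open it yourself.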
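
If you would rather not open the file in an editor, the sketch below inspects the serialized bytes in memory instead. HexPerson and HexDumpDemo are hypothetical names invented for this illustration. It prints the first four bytes (the stream's magic number and version) and checks which field names appear in the bytes as readable text.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical class used only for this illustration.
class HexPerson implements Serializable {
    private static final long serialVersionUID = 1L;
    String name = "Mike";
    transient int ssn = 12345678; // transient: not written to the stream
}

class HexDumpDemo {
    static byte[] serialize(Object obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Formats the first `count` bytes as space-separated hex pairs.
    static String toHex(byte[] bytes, int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count && i < bytes.length; i++) {
            sb.append(String.format("%02X ", bytes[i]));
        }
        return sb.toString().trim();
    }

    // Maps every byte to one char so the ASCII parts stay searchable.
    static String asText(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] bytes = serialize(new HexPerson());
        System.out.println(toHex(bytes, 4));                 // AC ED 00 05
        System.out.println(asText(bytes).contains("name"));  // true
        System.out.println(asText(bytes).contains("ssn"));   // false
    }
}
```

The transient field's name does not appear anywhere in the output, which matches what you see when searching the .ser file by hand.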

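So, to sum up from the code above, here is how to serialize an object:
    a. The class must implement java.io.Serializable, and every field must be serializable; any field that is not must be marked transient.
    b. Call the writeObject method of an ObjectOutputStream to convert the object into a byte sequence.
The source of writeObject is listed in section 5.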

// Deserialization code

    // Deserialize the Employee saved in /tmp/employee.ser
    public static void main(String[] args) {
        Employee e = null;
        try
        {
            FileInputStream fileIn = new FileInputStream("/tmp/employee.ser");
            ObjectInputStream in = new ObjectInputStream(fileIn);
            e = (Employee) in.readObject();
            in.close();
            fileIn.close();
        }catch(IOException i)
        {
            i.printStackTrace();
            return;
        }catch(ClassNotFoundException c)
        {
            System.out.println("Employee class not found");
            c.printStackTrace();
            return;
        }
        System.out.println("Deserialized Employee...");
        System.out.println("Name: " + e.name);
        System.out.println("Address: " + e.address);
        System.out.println("SSN: " + e.SSN);
        System.out.println("Number: " + e.number);
    }
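Running the code above prints 0 for SSN. The field is declared transient, so it is excluded from serialization
and was never saved to employee.ser; you can confirm this by opening the file and searching for SSN. On deserialization
the original value 12345678 is therefore gone, and the field gets the int default, 0.
The code above also shows that
**converting a byte sequence back into an object requires the readObject method of ObjectInputStream**. The readObject source is listed in section 5.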
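
The same effect applies to transient fields of any type, not just int. The sketch below is a variation on the example above, written with try-with-resources and a temporary file. Badge and TransientDefaultsDemo are hypothetical names. Each transient field comes back with the default value for its type.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical class for illustration only.
class Badge implements Serializable {
    private static final long serialVersionUID = 1L;
    String owner;
    transient int pin;        // skipped by serialization
    transient String note;    // skipped by serialization
    transient boolean active; // skipped by serialization
}

class TransientDefaultsDemo {
    // Writes the badge to a temporary file and reads it back.
    static Badge roundTrip(Badge badge) {
        try {
            Path file = Files.createTempFile("badge", ".ser");
            try {
                try (ObjectOutputStream out =
                         new ObjectOutputStream(Files.newOutputStream(file))) {
                    out.writeObject(badge);
                }
                try (ObjectInputStream in =
                         new ObjectInputStream(Files.newInputStream(file))) {
                    return (Badge) in.readObject();
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Badge badge = new Badge();
        badge.owner = "MikeHuang";
        badge.pin = 1234;
        badge.note = "temporary";
        badge.active = true;

        Badge copy = roundTrip(badge);
        System.out.println(copy.owner);  // MikeHuang
        System.out.println(copy.pin);    // 0
        System.out.println(copy.note);   // null
        System.out.println(copy.active); // false
    }
}
```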

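3. Why serialize and deserialize: use cases and caveats
Put plainly, what is this good for, and what convenience does it give us? It comes down to the following:
(1) Persisting objects: saving an object's byte sequence to a local file or a database;
(2) Sending and receiving objects over the network as a byte stream;
(3) Passing objects between processes.

Caveats:
1. Serialization saves only an object's state, not its methods;
2. When a parent class implements Serializable, its subclasses are serializable automatically and need not implement the interface explicitly;
3. When an instance variable of an object references another object, the referenced object is serialized along with it;
4. Not every object can be serialized. There are many possible reasons, for example:
Security: an object may have private, public, and other fields, and once it is serialized for transfer (written to a file, sent over RMI, and so on), its private fields are no longer protected;
Resource allocation: for classes such as Socket and Thread, even if they could be serialized and then transferred or saved, their underlying resources could not be re-allocated on the other side, so there is no point in supporting it;
5. Members declared static or transient are not serialized, because static fields hold the state of the class rather than of the object, and transient marks temporary data of the object.
6. The serialization runtime associates a version number called serialVersionUID with each serializable class. During deserialization it is used to verify that the sender and the receiver of a serialized object have loaded classes for that object that are compatible with respect to serialization. It is best to give it an explicit value. Defining serialVersionUID explicitly serves two purposes:
In some cases you want different versions of a class to stay serialization-compatible, so you make sure they share the same serialVersionUID;
In other cases you do not want different versions of a class to be compatible, so you make sure they have different serialVersionUIDs.
7. Many core Java classes already implement Serializable, such as String and Vector, but some do not;
8. If a member variable of an object is itself an object, that object's data members are saved too. This is the key reason serialization can be used to make deep copies.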
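
Caveat 8 can be sketched as a small deep-copy helper. Team and DeepCopyDemo are hypothetical names, and the approach assumes every object reachable from the original is serializable. Because the whole object graph is written and read back, the copy gets its own list rather than sharing the original's.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class for illustration only.
class Team implements Serializable {
    private static final long serialVersionUID = 1L;
    String name;
    List<String> members = new ArrayList<>();
}

class DeepCopyDemo {
    // Serializes the object graph to memory and reads it back as a new graph.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T deepCopy(T original) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(original);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                 new ByteArrayInputStream(buffer.toByteArray()))) {
            return (T) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Team team = new Team();
        team.name = "core";
        team.members.add("alice");

        Team copy = deepCopy(team);
        copy.members.add("bob"); // does not touch the original's list

        System.out.println(team.members); // [alice]
        System.out.println(copy.members); // [alice, bob]
    }
}
```

This is convenient but not fast; for hot paths a hand-written copy is usually the better choice.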

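4. How serialization and deserialization are implemented under the hood

In this section we work through a series of questions one by one and look at what the code really does, to deepen our understanding of serialization and deserialization.

a. Why must a class implement java.io.Serializable to be serialized?
If we remove the Serializable implementation from Employee, running the program throws an exception,
as follows: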

java.io.NotSerializableException: Employee
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1180)
    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:346)
    at DeserializeDemo.saveSer(DeserializeDemo.java:25)
    at DeserializeDemo.main(DeserializeDemo.java:64)

So let's look at the relevant snippet of writeObject0:

            // remaining cases
            if (obj instanceof String) {
                writeString((String) obj, unshared);
            } else if (cl.isArray()) {
                writeArray(obj, desc, unshared);
            } else if (obj instanceof Enum) {
                writeEnum((Enum) obj, desc, unshared);
            } else if (obj instanceof Serializable) {
                writeOrdinaryObject(obj, desc, unshared);
            } else {
                if (extendedDebugInfo) {
                    throw new NotSerializableException(
                        cl.getName() + "\n" + debugInfoStack.toString());
                } else {
                    throw new NotSerializableException(cl.getName());
                }
            }

It checks whether the object being serialized is a String, an array, an Enum, or a Serializable; if it is none of these, it throws NotSerializableException right away.
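
The final else branch is easy to trigger on purpose. In the sketch below, PlainPoint and NotSerializableDemo are hypothetical names; PlainPoint deliberately does not implement Serializable, so writeObject ends up in that branch.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical class that deliberately does NOT implement Serializable.
class PlainPoint {
    int x = 1;
}

class NotSerializableDemo {
    // Returns "ok" on success, or a short description of the failure.
    static String trySerialize(Object obj) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return "ok";
        } catch (NotSerializableException e) {
            // The message is the class name, as passed in writeObject0.
            return "NotSerializableException: " + e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(trySerialize("a string"));       // String branch
        System.out.println(trySerialize(new int[] {1, 2})); // array branch
        System.out.println(trySerialize(new PlainPoint())); // final else branch
    }
}
```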

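b. String, Enum, and arrays all implement Serializable, so why are they handled separately?
The answer is simple: ObjectOutputStream has dedicated serialization methods tailored to the data layout of String, Enum, and array objects.
For anything other than these three, the only way to become serializable is to implement Serializable.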
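
The separate code paths are visible in the stream itself: the byte right after the 4-byte stream header is a type code, and it differs for strings, arrays, enums, and ordinary objects. The sketch below compares that byte with the constants in java.io.ObjectStreamConstants. Signal and TypeCodeDemo are hypothetical names. It also shows one consequence of the dedicated enum path: a deserialized enum constant is the very same instance.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.UncheckedIOException;

// Hypothetical enum for illustration only.
enum Signal { RED, GREEN }

class TypeCodeDemo {
    static byte[] serialize(Object obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    static Object deserialize(byte[] bytes) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Bytes 0..3 are the stream header; byte 4 is the first type code.
    static byte firstTypeCode(Object obj) {
        return serialize(obj)[4];
    }

    public static void main(String[] args) {
        System.out.println(firstTypeCode("hi") == ObjectStreamConstants.TC_STRING);
        System.out.println(firstTypeCode(new int[] {1}) == ObjectStreamConstants.TC_ARRAY);
        System.out.println(firstTypeCode(Signal.RED) == ObjectStreamConstants.TC_ENUM);
        System.out.println(firstTypeCode(Integer.valueOf(7)) == ObjectStreamConstants.TC_OBJECT);
        // Enum constants are resolved by name, so identity is preserved.
        System.out.println(deserialize(serialize(Signal.RED)) == Signal.RED);
    }
}
```

All five lines print true.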

c. Since Serializable is mandatory, we should look at what is actually inside it.

/**
 * The descriptive text has been removed here; read the source if you want to see it.
 * That text also explains how to implement serialization.
 * @author  unascribed
 * @see java.io.ObjectOutputStream
 * @see java.io.ObjectInputStream
 * @see java.io.ObjectOutput
 * @see java.io.ObjectInput
 * @see java.io.Externalizable
 * @since   JDK1.1
 */
public interface Serializable {
}

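It is just an empty interface; implementing it merely marks a class as serializable. Every class that implements it has a serialVersionUID, an identifier used to determine whether the serialized data and the class used for deserialization match. The details are documented in the Serializable interface;
I have pasted the relevant part below. To see the whole thing, read the source yourself: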

/**
 * This readResolve method follows the same invocation rules and
 * accessibility rules as writeReplace.<p>
 *
 * The serialization runtime associates with each serializable class a version
 * number, called a serialVersionUID, which is used during deserialization to
 * verify that the sender and receiver of a serialized object have loaded
 * classes for that object that are compatible with respect to serialization.
 * If the receiver has loaded a class for the object that has a different
 * serialVersionUID than that of the corresponding sender's class, then
 * deserialization will result in an {@link InvalidClassException}.  A
 * serializable class can declare its own serialVersionUID explicitly by
 * declaring a field named <code>"serialVersionUID"</code> that must be static,
 * final, and of type <code>long</code>:<p>
 *
 * <PRE>
 * ANY-ACCESS-MODIFIER static final long serialVersionUID = 42L;
 * </PRE>
 *
 * If a serializable class does not explicitly declare a serialVersionUID, then
 * the serialization runtime will calculate a default serialVersionUID value
 * for that class based on various aspects of the class, as described in the
 * Java(TM) Object Serialization Specification.  However, it is <em>strongly
 * recommended</em> that all serializable classes explicitly declare
 * serialVersionUID values, since the default serialVersionUID computation is
 * highly sensitive to class details that may vary depending on compiler
 * implementations, and can thus result in unexpected
 * <code>InvalidClassException</code>s during deserialization.  Therefore, to
 * guarantee a consistent serialVersionUID value across different java compiler
 * implementations, a serializable class must declare an explicit
 * serialVersionUID value.  It is also strongly advised that explicit
 * serialVersionUID declarations use the <code>private</code> modifier where
 * possible, since such declarations apply only to the immediately declaring
 * class--serialVersionUID fields are not useful as inherited members. Array
 * classes cannot declare an explicit serialVersionUID, so they always have
 * the default computed value, but the requirement for matching
 * serialVersionUID values is waived for array classes.
 * */

Example: String
private static final long serialVersionUID = -6849794470754667710L;

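d. Employee does not declare one, so when does it get its version number?
When a class that implements java.io.Serializable does not explicitly define a serialVersionUID field, the Java serialization mechanism generates one automatically from the compiled class, for use in version comparison. In that case, as long as the class itself (class name, method names, and so on) does not change, the serialVersionUID stays the same no matter how many times you recompile; adding spaces, line breaks, or comments does not affect it. But if you do not write the version number explicitly, a change to the class (an added method, a renamed method, and so on) may produce a different version number, and byte sequences serialized earlier can then no longer be converted into objects of the new version, because the versions do not match. So always set the version number explicitly.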
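
You can ask the runtime which version number it associates with a class through java.io.ObjectStreamClass. In the sketch below, Versioned, Unversioned, NotMarked, and SerialVersionDemo are hypothetical names. It is a minimal illustration, not a replacement for the JDK's serialver tool.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical classes for illustration only.
class Versioned implements Serializable {
    private static final long serialVersionUID = 42L; // explicit
}

class Unversioned implements Serializable {
    int value; // no explicit UID: the runtime computes a default
}

class NotMarked {
    // not Serializable at all
}

class SerialVersionDemo {
    // Returns the UID the runtime uses for the class,
    // or null if the class is not serializable.
    static Long uidOf(Class<?> type) {
        ObjectStreamClass desc = ObjectStreamClass.lookup(type);
        if (desc == null) {
            return null;
        }
        return desc.getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(uidOf(Versioned.class));   // 42
        System.out.println(uidOf(String.class));      // -6849794470754667710
        System.out.println(uidOf(Unversioned.class)); // computed default
        System.out.println(uidOf(NotMarked.class));   // null
    }
}
```

The computed default for Unversioned depends on the class's shape, which is exactly why it can change when the class does.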

e. How do you customize the serialization strategy?
Before answering, let's look at the ArrayList class.

public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable
{
    private static final long serialVersionUID = 8683452581122892189L;

    /**
     * The array buffer into which the elements of the ArrayList are stored.
     * The capacity of the ArrayList is the length of this array buffer.
     */
    private transient Object[] elementData;

    /**
     * The size of the ArrayList (the number of elements it contains).
     *
     * @serial
     */
    private int size;

}

private transient Object[] elementData; tells us this data is transient and is not serialized by default, yet in practice an ArrayList serializes with its elements intact. Why?

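During serialization, if the class being serialized defines writeObject and readObject methods, the serialization runtime calls them reflectively to perform user-defined serialization and deserialization. ArrayList does exactly this: its private writeObject and readObject write and read the stored elements themselves, which is why the data survives even though elementData is transient.
If no such methods exist, the default behavior applies, the same as calling defaultWriteObject of ObjectOutputStream and defaultReadObject of ObjectInputStream.
User-defined writeObject and readObject methods let you control the serialization process, for example by changing the serialized values on the fly.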
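
Here is a minimal sketch of the same idea, loosely modeled on what ArrayList does but much simplified. CompactStack and CustomSerializationDemo are hypothetical names. The backing array is transient, and the private writeObject and readObject methods write and read only the slots that are actually in use.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical class: a fixed-capacity stack with a transient backing array.
class CompactStack implements Serializable {
    private static final long serialVersionUID = 1L;
    private static final int CAPACITY = 16;

    private transient String[] slots = new String[CAPACITY];
    private int size;

    void push(String s) { slots[size++] = s; } // sketch: no bounds handling
    String get(int i) { return slots[i]; }
    int size() { return size; }

    // Called reflectively by ObjectOutputStream; must be private.
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();            // writes the non-transient field: size
        for (int i = 0; i < size; i++) {
            out.writeObject(slots[i]);       // only the used slots
        }
    }

    // Called reflectively by ObjectInputStream; must be private.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();              // restores size
        slots = new String[Math.max(size, CAPACITY)];
        for (int i = 0; i < size; i++) {
            slots[i] = (String) in.readObject();
        }
    }
}

class CustomSerializationDemo {
    static CompactStack roundTrip(CompactStack stack) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(stack);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                 new ByteArrayInputStream(buffer.toByteArray()))) {
            return (CompactStack) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        CompactStack stack = new CompactStack();
        stack.push("a");
        stack.push("b");
        CompactStack copy = roundTrip(stack);
        System.out.println(copy.size() + " " + copy.get(0) + " " + copy.get(1)); // 2 a b
    }
}
```

Note that readObject has to allocate the array itself: constructors and field initializers of a Serializable class are not run during deserialization.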

5. Reading the serialization and deserialization source code

// Serialization source

   /**
     * Write the specified object to the ObjectOutputStream.  The class of the
     * object, the signature of the class, and the values of the non-transient
     * and non-static fields of the class and all of its supertypes are
     * written.  Default serialization for a class can be overridden using the
     * writeObject and the readObject methods.  Objects referenced by this
     * object are written transitively so that a complete equivalent graph of
     * objects can be reconstructed by an ObjectInputStream.
     *
     * <p>Exceptions are thrown for problems with the OutputStream and for
     * classes that should not be serialized.  All exceptions are fatal to the
     * OutputStream, which is left in an indeterminate state, and it is up to
     * the caller to ignore or recover the stream state.
     *
     * @throws  InvalidClassException Something is wrong with a class used by
     *          serialization.
     * @throws  NotSerializableException Some object to be serialized does not
     *          implement the java.io.Serializable interface.
     * @throws  IOException Any exception thrown by the underlying
     *          OutputStream.
     */
    public final void writeObject(Object obj) throws IOException {
        if (enableOverride) {
            writeObjectOverride(obj);
            return;
        }
        try {
            writeObject0(obj, false);
        } catch (IOException ex) {
            if (depth == 0) {
                writeFatalException(ex);
            }
            throw ex;
        }
    }

   /**
     * Underlying writeObject/writeUnshared implementation.
     */
    private void writeObject0(Object obj, boolean unshared)
        throws IOException
    {
        boolean oldMode = bout.setBlockDataMode(false);
        depth++;
        try {
            // handle previously written and non-replaceable objects
            int h;
            if ((obj = subs.lookup(obj)) == null) {
                writeNull();
                return;
            } else if (!unshared && (h = handles.lookup(obj)) != -1) {
                writeHandle(h);
                return;
            } else if (obj instanceof Class) {
                writeClass((Class) obj, unshared);
                return;
            } else if (obj instanceof ObjectStreamClass) {
                writeClassDesc((ObjectStreamClass) obj, unshared);
                return;
            }

            // check for replacement object
            Object orig = obj;
            Class cl = obj.getClass();
            ObjectStreamClass desc;
            for (;;) {
                // REMIND: skip this check for strings/arrays?
                Class repCl;
                desc = ObjectStreamClass.lookup(cl, true);
                if (!desc.hasWriteReplaceMethod() ||
                    (obj = desc.invokeWriteReplace(obj)) == null ||
                    (repCl = obj.getClass()) == cl)
                {
                    break;
                }
                cl = repCl;
            }
            if (enableReplace) {
                Object rep = replaceObject(obj);
                if (rep != obj && rep != null) {
                    cl = rep.getClass();
                    desc = ObjectStreamClass.lookup(cl, true);
                }
                obj = rep;
            }

            // if object replaced, run through original checks a second time
            if (obj != orig) {
                subs.assign(orig, obj);
                if (obj == null) {
                    writeNull();
                    return;
                } else if (!unshared && (h = handles.lookup(obj)) != -1) {
                    writeHandle(h);
                    return;
                } else if (obj instanceof Class) {
                    writeClass((Class) obj, unshared);
                    return;
                } else if (obj instanceof ObjectStreamClass) {
                    writeClassDesc((ObjectStreamClass) obj, unshared);
                    return;
                }
            }

            // remaining cases
            if (obj instanceof String) {
                writeString((String) obj, unshared);
            } else if (cl.isArray()) {
                writeArray(obj, desc, unshared);
            } else if (obj instanceof Enum) {
                writeEnum((Enum) obj, desc, unshared);
            } else if (obj instanceof Serializable) {
                writeOrdinaryObject(obj, desc, unshared);
            } else {
                if (extendedDebugInfo) {
                    throw new NotSerializableException(
                        cl.getName() + "\n" + debugInfoStack.toString());
                } else {
                    throw new NotSerializableException(cl.getName());
                }
            }
        } finally {
            depth--;
            bout.setBlockDataMode(oldMode);
        }
    }

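Notes:
(1) The class metadata related to the object instance is written first.
(2) The descriptions of the class's superclasses are written recursively until there are no more superclasses.
(3) Once the class metadata is done, the actual field values of the instance are written, starting from the topmost superclass.
(4) The instance data is written recursively from the top down.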
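
One branch in writeObject0 deserves a closer look: handles.lookup(obj) followed by writeHandle(h). When an object has already been written to the stream, only a small back-reference is written the second time. The sketch below observes this from the outside. Marker and SharedReferenceDemo are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical class for illustration only.
class Marker implements Serializable {
    private static final long serialVersionUID = 1L;
    String label = "m";
}

class SharedReferenceDemo {
    // Writes each object in turn to a single stream.
    static byte[] writeAll(Object... objects) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            for (Object o : objects) {
                out.writeObject(o);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads `count` objects back from a single stream.
    static Object[] readAll(byte[] bytes, int count) {
        Object[] result = new Object[count];
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            for (int i = 0; i < count; i++) {
                result[i] = in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        Marker marker = new Marker();
        byte[] once = writeAll(marker);
        byte[] twice = writeAll(marker, marker);

        // The second write is a handle: one type-code byte plus a 4-byte int.
        System.out.println(twice.length - once.length); // 5

        Object[] restored = readAll(twice, 2);
        System.out.println(restored[0] == restored[1]); // true: one shared instance
        System.out.println(restored[0] == marker);      // false: it is a copy
    }
}
```

This is also what lets cyclic object graphs serialize without looping forever.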

// Deserialization source

   /**
     * Read an object from the ObjectInputStream.  The class of the object, the
     * signature of the class, and the values of the non-transient and
     * non-static fields of the class and all of its supertypes are read.
     * Default deserializing for a class can be overriden using the writeObject
     * and readObject methods.  Objects referenced by this object are read
     * transitively so that a complete equivalent graph of objects is
     * reconstructed by readObject.
     *
     * <p>The root object is completely restored when all of its fields and the
     * objects it references are completely restored.  At this point the object
     * validation callbacks are executed in order based on their registered
     * priorities. The callbacks are registered by objects (in the readObject
     * special methods) as they are individually restored.
     *
     * <p>Exceptions are thrown for problems with the InputStream and for
     * classes that should not be deserialized.  All exceptions are fatal to
     * the InputStream and leave it in an indeterminate state; it is up to the
     * caller to ignore or recover the stream state.
     *
     * @throws  ClassNotFoundException Class of a serialized object cannot be
     *          found.
     * @throws  InvalidClassException Something is wrong with a class used by
     *          serialization.
     * @throws  StreamCorruptedException Control information in the
     *          stream is inconsistent.
     * @throws  OptionalDataException Primitive data was found in the
     *          stream instead of objects.
     * @throws  IOException Any of the usual Input/Output related exceptions.
     */
    public final Object readObject()
        throws IOException, ClassNotFoundException
    {
        if (enableOverride) {
            return readObjectOverride();
        }

        // if nested read, passHandle contains handle of enclosing object
        int outerHandle = passHandle;
        try {
            Object obj = readObject0(false);
            handles.markDependency(outerHandle, passHandle);
            ClassNotFoundException ex = handles.lookupException(passHandle);
            if (ex != null) {
                throw ex;
            }
            if (depth == 0) {
                vlist.doCallbacks();
            }
            return obj;
        } finally {
            passHandle = outerHandle;
            if (closed && depth == 0) {
                clear();
            }
        }
    }

  /**
     * Underlying readObject implementation.
     */
    private Object readObject0(boolean unshared) throws IOException {
        boolean oldMode = bin.getBlockDataMode();
        if (oldMode) {
            int remain = bin.currentBlockRemaining();
            if (remain > 0) {
                throw new OptionalDataException(remain);
            } else if (defaultDataEnd) {
                /*
                 * Fix for 4360508: stream is currently at the end of a field
                 * value block written via default serialization; since there
                 * is no terminating TC_ENDBLOCKDATA tag, simulate
                 * end-of-custom-data behavior explicitly.
                 */
                throw new OptionalDataException(true);
            }
            bin.setBlockDataMode(false);
        }

        byte tc;
        while ((tc = bin.peekByte()) == TC_RESET) {
            bin.readByte();
            handleReset();
        }

        depth++;
        try {
            switch (tc) {
                case TC_NULL:
                    return readNull();

                case TC_REFERENCE:
                    return readHandle(unshared);

                case TC_CLASS:
                    return readClass(unshared);

                case TC_CLASSDESC:
                case TC_PROXYCLASSDESC:
                    return readClassDesc(unshared);

                case TC_STRING:
                case TC_LONGSTRING:
                    return checkResolve(readString(unshared));

                case TC_ARRAY:
                    return checkResolve(readArray(unshared));

                case TC_ENUM:
                    return checkResolve(readEnum(unshared));

                case TC_OBJECT:
                    return checkResolve(readOrdinaryObject(unshared));

                case TC_EXCEPTION:
                    IOException ex = readFatalException();
                    throw new WriteAbortedException("writing aborted", ex);

                case TC_BLOCKDATA:
                case TC_BLOCKDATALONG:
                    if (oldMode) {
                        bin.setBlockDataMode(true);
                        bin.peek();             // force header read
                        throw new OptionalDataException(
                            bin.currentBlockRemaining());
                    } else {
                        throw new StreamCorruptedException(
                            "unexpected block data");
                    }

                case TC_ENDBLOCKDATA:
                    if (oldMode) {
                        throw new OptionalDataException(true);
                    } else {
                        throw new StreamCorruptedException(
                            "unexpected end of block data");
                    }

                default:
                    throw new StreamCorruptedException(
                        String.format("invalid type code: %02X", tc));
            }
        } finally {
            depth--;
            bin.setBlockDataMode(oldMode);
        }
    }
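
The switch on the type code in readObject0 can be exercised with a tiny stream. Serializing null produces just the 4-byte header followed by TC_NULL, which readObject0 turns back into null. Overwriting that byte with a value that is not a valid type code lands in the default branch. TypeCodeReadDemo is a hypothetical name used only for this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.StreamCorruptedException;
import java.io.UncheckedIOException;

class TypeCodeReadDemo {
    // Produces the smallest possible object stream: header + TC_NULL.
    static byte[] serializeNull() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads one object and describes the outcome as a string.
    static String read(byte[] bytes) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return String.valueOf(in.readObject());
        } catch (StreamCorruptedException e) {
            return "StreamCorruptedException";
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = serializeNull();
        System.out.println(bytes.length);                             // 5
        System.out.println(bytes[4] == ObjectStreamConstants.TC_NULL); // true
        System.out.println(read(bytes));                              // null

        bytes[4] = 0x00; // not a valid type code
        System.out.println(read(bytes));                              // StreamCorruptedException
    }
}
```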

Further reading, well worth a look if you are interested:
Meituan Tech Team: Serialization and Deserialization
