Differences between Binary Serialization and Serialization in Java

This article briefly compares Java binary serialization with the serialization of Java Beans to XML and the storage of serialized objects in MySQL.

Hello everybody. In this article we continue our discussion of the serialization and deserialization of Java objects.

Many developers are unsure where to persist Java objects, even though persistence is required to maintain the state of the application and to export, import, send, and receive objects over the network.

The first part of this article described what serialization and deserialization are and the situations in which it is necessary to persist data in Java. This second part details the classes developed to quantitatively compare the runtime required to serialize and deserialize objects instantiated from a POJO (Plain Old Java Object) that contains the main basic Java data types.

Binary serialization and deserialization

Binary serialization is the most traditional method for object persistence in Java. This approach uses a binary format that is written and read with a pair of ObjectOutputStream and ObjectInputStream objects, together with the FileOutputStream and FileInputStream classes that write and read the bytes to and from the file system. In our scenario the data files are written to and read from the file system of the operating system according to the value of the fifth parameter passed to the main() method of the ObjectSerDes class, described in the first part of this article.

The data recording is implemented in the saveObject() method, which receives the object to be recorded and the storage location. Reading is implemented in the loadObject() method, which takes only the storage location of the file and returns a generic object of the Object class. Listing 1 shows the contents of the saveObject() and loadObject() methods, placed inside the ObjectSerDes class.

Listing 1. Contents of saveObject() and loadObject(), which perform the binary serialization and deserialization, respectively.

// Requires: import java.io.*;

// Serializes an object in binary format into the file
// whose location is given by the second parameter
private static void saveObject(Serializable object, String filename)
        throws IOException
{
    ObjectOutputStream objstream = new ObjectOutputStream(
            new FileOutputStream(filename));
    try
    {
        // The writeObject() method automatically transforms the contents of
        // the object into bytes. A NotSerializableException is thrown if the
        // object, or any object it references, does not implement the
        // Serializable interface
        objstream.writeObject(object);
    }
    finally
    {
        // Close the file, even if the write failed
        objstream.close();
    }
}

// Deserializes the object stored in the provided path and returns it
// without casting it to a specific type
private static Object loadObject(String filename)
        throws ClassNotFoundException, IOException
{
    // Open the file for reading
    ObjectInputStream objstream = new ObjectInputStream(
            new FileInputStream(filename));
    try
    {
        // Read the bytes, create the object in memory,
        // and return it without casting
        return objstream.readObject();
    }
    finally
    {
        // Close the file, even if the read failed
        objstream.close();
    }
}

The saveObject() method first creates an object of the FileOutputStream class using the location and file name received in the second parameter. This object is then wrapped in an ObjectOutputStream object named objstream. Next, the writeObject() method transforms the content of the object to be serialized into bytes and writes them to the file. Finally, the close() method closes the file and ends the recording.

The loadObject() method creates a FileInputStream object to read the content of the file passed in its only parameter. This object is used to create the objstream object of the ObjectInputStream class. Then the code uses the readObject() method to read the bytes and rebuild the serialized object. Finally, the close() method closes the file and ends the reading. The loadObject() method returns the generic object without any type conversion; the cast is left to the caller of loadObject().
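The write-then-read cycle described above can be tried in isolation with a minimal sketch like the one below. The BinaryRoundTrip and Point names, the temporary file, and the use of try-with-resources are assumptions made for illustration only; they are not part of the ObjectSerDes class from the article.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; not part of the original article.
class BinaryRoundTrip {

    // A tiny serializable type used only for illustration.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Writes the object to the file; the stream is closed even on failure.
    static void save(Serializable object, String filename) throws IOException {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new FileOutputStream(filename))) {
            out.writeObject(object);
        }
    }

    // Reads one object back; the caller casts it to the expected type.
    static Object load(String filename) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new FileInputStream(filename))) {
            return in.readObject();
        }
    }

    // Saves a Point to a temporary file, loads it back, and returns the copy.
    static Point roundTrip(Point original) {
        try {
            Path tmp = Files.createTempFile("point", ".ser");
            try {
                save(original, tmp.toString());
                return (Point) load(tmp.toString());
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        Point copy = roundTrip(new Point(3, 4));
        System.out.println(copy.x + "," + copy.y);
    }
}
```

The cast to Point happens in the caller of load(), mirroring the way loadObject() leaves the type conversion to its caller.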

The binary serialization and deserialization test was performed with 1,000 objects of the MyObjectToSerialize class and was run ten times in a row. The average execution times for binary serialization and deserialization in the file system are presented in the graph of Figure 1.


Figure 1. Average execution time for binary serialization and deserialization of 1000 objects

The standard deviation was 0.14s for serialization and 0.03s for deserialization. The standard deviation matters because it shows how much the individual runs vary around the mean. For example, for deserialization the result was about 0.35s ± 0.03s, i.e. a typical run took between 0.32s and 0.38s.
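As a sketch of how such a mean and standard deviation could be computed from ten runs, consider the code below. The timings in main() are invented for illustration and are not the measurements of this article, and the use of the sample standard deviation (dividing by n - 1) is an assumption, since the article does not state which formula was used.

```java
// Hypothetical helper; not part of the original article.
class TimingStats {

    // Arithmetic mean of the measured times.
    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // Sample standard deviation (divides by n - 1); needs at least two values.
    static double stdDev(double[] values) {
        double m = mean(values);
        double squaredDiffs = 0.0;
        for (double v : values) {
            squaredDiffs += (v - m) * (v - m);
        }
        return Math.sqrt(squaredDiffs / (values.length - 1));
    }

    public static void main(String[] args) {
        // Invented timings in seconds for ten runs, for illustration only.
        double[] runs = {0.33, 0.36, 0.35, 0.31, 0.38, 0.34, 0.37, 0.35, 0.32, 0.39};
        System.out.printf("mean = %.3f s, std dev = %.3f s%n", mean(runs), stdDev(runs));
    }
}
```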

It is important to note that both operations in this test are very fast: even with 1,000 objects the execution took about a third of a second. The time required depends mostly on the read and write speed of the disk, which makes the binary serialization approach very fast. Serialization was a bit faster than deserialization, probably because deserialization has to rebuild the Java objects, which requires memory allocation and creates work for the garbage collector once the objects are no longer used.

Despite being one of the fastest alternatives, binary serialization/deserialization offers few features to the developer. There is no interoperability, security, compression, encryption, access control, concurrency control, or other handling capabilities offered by other approaches. The objects are simply stored as binary files inside a folder of the operating system.

XML serialization and deserialization

XML serialization is widely used for object persistence in Web application development, since it is usual to create POJOs according to the Java Bean standard so that the object content is accessible to other applications. Because it is based on the XML standard, this approach allows interoperability. It uses the FileInputStream and FileOutputStream classes to read and write XML files in the same way as binary serialization. However, for XML serialization the developer has to use the XMLEncoder and XMLDecoder classes of the java.beans package to encode and decode the file in XML format. These four classes are used inside the saveObjectXML() and loadObjectXML() methods: the former writes the object to the storage location provided by a parameter, and the latter takes the storage location of the file and returns a generic object of the Object class. Listing 2 shows the contents of the saveObjectXML() and loadObjectXML() methods, which should be placed inside the ObjectSerDes class.

Listing 2. Contents of saveObjectXML() and loadObjectXML() that perform XML serialization and deserialization.

// Requires: import java.beans.XMLDecoder; import java.beans.XMLEncoder; import java.io.*;

// Serializes an object in XML format into the file
// whose location is given by the second parameter
private static void saveObjectXML(Object object, String filename)
        throws IOException
{
    FileOutputStream os = new FileOutputStream(filename);

    // Create the object that will encode the object in XML
    XMLEncoder encoder = new XMLEncoder(os);
    try
    {
        // Serialize the object in XML
        encoder.writeObject(object);
    }
    finally
    {
        // Close the file and save the data
        encoder.close();
    }
}

// Deserializes the object stored inside an XML file and returns it
// without any data type conversion
private static Object loadObjectXML(String filename)
        throws ClassNotFoundException, IOException
{
    // Open the file for reading
    FileInputStream is = new FileInputStream(filename);

    // Create the XML decoding object
    XMLDecoder decoder = new XMLDecoder(is);
    try
    {
        // Read the XML, create the object in memory, and return it
        // without converting it to a specific data type
        return decoder.readObject();
    }
    finally
    {
        // Close the file
        decoder.close();
    }
}

The saveObjectXML() method initially creates an object of the FileOutputStream class using the location and file name provided by the second parameter. This object is used to create the XMLEncoder object that will encode the data. Next, the writeObject() method transforms the content of the object to be serialized into XML. Finally, the close() method closes the file and finishes the recording.

The loadObjectXML() method creates a FileInputStream object to read the file whose location is passed as a string in the first parameter. This object is then used to create an instance of the XMLDecoder class. Then the readObject() method loads the XML content and rebuilds the serialized object. Finally, the close() method closes the file and ends the reading. The method returns the object without any type conversion, as this is the responsibility of the caller of loadObjectXML().
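The same encode/decode cycle can be sketched without touching the file system. In the example below, the XmlRoundTrip class name, the in-memory streams, and the choice of an ArrayList as the object to serialize are assumptions made for illustration; the article's own test uses files and the MyObjectToSerialize bean.

```java
import java.beans.XMLDecoder;
import java.beans.XMLEncoder;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of the original article.
class XmlRoundTrip {

    // Encodes the object as an XML document held in memory.
    static byte[] toXml(Object bean) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the encoder writes the closing tags and flushes the output.
        try (XMLEncoder encoder = new XMLEncoder(buffer)) {
            encoder.writeObject(bean);
        }
        return buffer.toByteArray();
    }

    // Decodes the first object found in the XML document.
    static Object fromXml(byte[] xml) {
        try (XMLDecoder decoder = new XMLDecoder(new ByteArrayInputStream(xml))) {
            return decoder.readObject();
        }
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        names.add("alpha");
        names.add("beta");

        byte[] xml = toXml(names);
        // Show the generated XML, then the decoded copy.
        System.out.println(new String(xml, StandardCharsets.UTF_8));
        System.out.println(fromXml(xml));
    }
}
```

Printing the XML makes it easy to see the structure that XMLEncoder generates, which is the same kind of document shown for the test object later in the article.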

The XML serialization and deserialization test was conducted with 1,000 objects of the MyObjectToSerialize class and was executed ten consecutive times. The average execution times for XML serialization and deserialization are presented in the graph shown in Figure 2.


Figure 2. Average XML serialization and deserialization times for 1000 objects.

The standard deviation for XML serialization was 0.61s and the standard deviation for the XML deserialization was 0.04s. There wasn’t a wide variation of execution times between the tests because the standard deviation values for the two averages are small compared to the average execution times.

Listing 3 shows an excerpt of the XML file serialized using this approach. To shorten the listing, only the first, second, and last positions of the byte array are shown (the other values are represented by an ellipsis).

Listing 3. Excerpt of the XML file format used to serialize the object.

 
<?xml version="1.0" encoding="UTF-8"?> 
<java version="1.6.0_26" class="java.beans.XMLDecoder"> 
 <object class="MyObjectToSerialize"> 
  <void property="FBoll"> 
   <boolean>true</boolean> 
  </void> 
  <void property="FByte"> 
   <array class="byte" length="1024"> 
    <void index="0"> 
     <byte>-120</byte> 
    </void> 
    <void index="1"> 
     <byte>90</byte> 
    </void> 
   ...
    <void index="1023"> 
     <byte>-123</byte> 
    </void> 
   </array> 
  </void> 
  <void property="FDate"> 
   <object class="java.util.Date"> 
    <long>1193922711725087834</long> 
   </object> 
  </void> 
  <void property="FDouble"> 
   <double>0.8222971610084897</double> 
  </void> 
  <void property="FFloat"> 
   <float>0.04313314</float> 
  </void> 
  <void property="FInt"> 
   <int>728545190</int> 
  </void> 
  <void property="FLong"> 
   <long>-1207333529586291961</long> 
  </void> 
  <void property="FString"> 
   <string>XW02FQ6W2KQ6R2V2EWEB0PV0FSQ2XCNN63ACPEK7GJN230PLJO</string> 
  </void> 
  <void property="FUUID"> 
   <string>c91f312f-d4c0-4613-baf3-561e123df925</string> 
  </void> 
 </object> 
</java>

Serialization/deserialization in XML is the option that takes the longest and consumes the most disk space. The main reason is that the XML format repeats several tags throughout the file, so the process demands more time for reading from and writing to disk. However, among the options tested XML is the only one that provides interoperability, because any system that works with XML can manipulate the data. As with binary serialization, there are no features for security, compression, encryption, access control, concurrency, replication, high availability, and others that a database provides. Both XML and binary serialization simply store files inside a folder of the operating system.
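The effect of the repeated tags on size can be observed with a small sketch that serializes the same 1,024-byte array in both formats and compares the number of bytes produced. The SizeComparison class and the sample data are assumptions made for illustration; the exact sizes depend on the data and on the Java version, so no specific numbers are claimed here.

```java
import java.beans.XMLEncoder;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class; not part of the original article.
class SizeComparison {

    // Builds a 1,024-byte array with non-zero values, so that XMLEncoder
    // has to write every element (it omits values equal to the default).
    static byte[] sampleData() {
        byte[] data = new byte[1024];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i % 100 + 1);
        }
        return data;
    }

    // Number of bytes produced by binary serialization.
    static int binarySize(Object obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    // Number of bytes produced by XML serialization.
    static int xmlSize(Object obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (XMLEncoder encoder = new XMLEncoder(buffer)) {
            encoder.writeObject(obj);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        byte[] data = sampleData();
        System.out.println("binary bytes: " + binarySize(data));
        System.out.println("XML bytes:    " + xmlSize(data));
    }
}
```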

An important aspect to take into consideration is that text-based, self-describing data formats such as XML are now widely used, and that NoSQL databases have popularized a related format, JSON. There is also great demand for these formats from public APIs and Web Services that, by definition, must be able to communicate with different technologies, platforms, and solutions, as the XML standard advocates.

JSON (JavaScript Object Notation) is a lightweight, text-based format for data exchange that is often used as an alternative to XML. It is based on a subset of the JavaScript object literal notation. NoSQL databases such as MongoDB have adopted JSON-like documents to represent data. One of the claimed advantages of JSON over XML is that it is much easier to write a JSON parser. In JavaScript, JSON text could originally be parsed trivially with the eval() function, which is present in all web browsers and was important for the acceptance of JSON within the community. Today the JSON.parse() function is preferred, because eval() executes any code contained in the text.

Database serialization and deserialization

Storing objects inside a database is not the first choice for many developers because, besides developing the Java program, they need to know and maintain a database, which is an extra component of the architecture. There are also the details of the database model to take into consideration. When using a database for data storage, one must first create a data model that organizes types, relationships, and other details. For object serialization inside the database, usually a table is created to store the objects of a specific class. If you need to store more than one type of object, you must model new entities whose tables correspond to the classes of the application model.

To store objects created in Java, a serialization format must be chosen. Here we keep the binary format, so we need a table in the database that contains a BLOB (Binary Large Object) column. Listing 4 shows the SQL code that creates a MySQL table called java_objects to store the binary data.

Listing 4. SQL statement to create the table that will contain the Java objects.

CREATE TABLE java_objects
(
	id int,
	object_value blob
) engine = MyISAM;

The code in Listing 4 uses the CREATE TABLE statement to create a table with two columns: one called id, with the integer data type, and another called object_value, with the BLOB data type. The id column identifies the object and the object_value column stores its content in bytes. The statement ends with the engine option, which selects the MyISAM storage engine for the table. MyISAM does not support transactions, so reads and writes avoid the transaction overhead, which favors the execution times in this test.

Once the table is created we must configure the connection through the MySQL JDBC driver, named MySQL Connector/J, available at: http://dev.mysql.com/downloads/connector/j/. This driver requires placing its JAR file in the CLASSPATH, or an equivalent configuration within an IDE such as Eclipse or NetBeans.

To finish the implementation of the ObjectSerDes class we need to implement the following methods to handle the connection and database operations: getConnectionMySQL(), cleanDB(), saveObjectDB(), loadObjectDB() and getBytes().

First we code the getConnectionMySQL() method, which returns an object of the Connection class representing an open connection to MySQL. Additionally, the cleanDB() method cleans the database by removing all rows of the java_objects table. Listing 5 shows the code of these two methods, which need to be inside the ObjectSerDes class.

Listing 5. Code for the getConnectionMySQL() and cleanDB() methods.

// Requires: import java.io.IOException; import java.sql.*;

// This method returns a valid MySQL connection
public static Connection getConnectionMySQL() throws SQLException
{
    Connection conn = null;

    // Driver's name
    String driver = "com.mysql.jdbc.Driver";

    // Address, database, login and password for the MySQL connection
    String url = "jdbc:mysql://localhost/db_obj?user=root&password=my_pass";

    try
    {
        // Checking if the JDBC driver is installed and can be used
        Class.forName(driver);

        // Opening the connection
        conn = DriverManager.getConnection(url);
    }
    catch (ClassNotFoundException e)
    {
        // The driver JAR is not on the classpath; null is returned below
        System.err.print("ClassNotFoundException: ");
        System.err.println(e.getMessage());
    }

    // Returning the connection
    return conn;
}

// This method deletes the entire contents of the java_objects table
private static void cleanDB(Connection conn)
        throws IOException, SQLException
{
    // Statement to be sent to MySQL
    String query = "DELETE FROM java_objects";

    // The Statement class is used to send SQL statements to an RDBMS
    Statement stmt = conn.createStatement();
    try
    {
        // Send the instruction and ignore the returned result
        stmt.execute(query);
    }
    finally
    {
        // Release the statement's resources
        stmt.close();
    }
}

The getConnectionMySQL() method creates a variable for the connection that will be returned and two String variables: one stores the driver’s name and the other stores the address, database, username, and password. Then a try/catch block checks whether the driver is installed and opens the connection. The method ends by returning the connection object.

The cleanDB() method should be used in serialization tests, because we need to delete the entire contents of the table before starting a new test. This method creates a string with the DELETE FROM java_objects statement, which deletes all rows from the java_objects table, and sends this statement to MySQL through an object of the Statement class created from the Connection object received as a parameter.

The next step is the creation of the saveObjectDB(), loadObjectDB(), and getBytes() methods, which perform the serialization, the deserialization, and the transformation of the object into bytes to be stored inside MySQL, respectively. The source code of these methods is shown in Listing 6 and must be placed in the ObjectSerDes class.

Listing 6. Source code of the saveObjectDB(), loadObjectDB(), and getBytes() methods.

// Requires: import java.io.*; import java.sql.*;

// This method converts an object into a byte array.
public static byte[] getBytes(Object obj) throws java.io.IOException
{
    // The ByteArrayOutputStream class collects the object's bytes in memory
    ByteArrayOutputStream bos = new ByteArrayOutputStream();

    // Create the 'output channel'
    ObjectOutputStream oos = new ObjectOutputStream(bos);
    try
    {
        // Write the object's bytes into the in-memory buffer
        oos.writeObject(obj);
        oos.flush();
    }
    finally
    {
        // Close the channel (this also closes bos)
        oos.close();
    }

    // Get the bytes in array format and return them
    return bos.toByteArray();
}

// This method saves the object in the table java_objects
private static void saveObjectDB(Object object, Connection conn, int id)
        throws IOException, SQLException
{
    // Assembling the SQL statement. The ? characters will be filled later on.
    // The table name is written in lowercase, as in Listing 4, because MySQL
    // table names are case-sensitive on some operating systems
    String WRITE_OBJECT_SQL = "INSERT INTO java_objects(ID, OBJECT_VALUE) VALUES (?, ?)";

    // A PreparedStatement object replaces each ? with a value in the SQL statement
    PreparedStatement pstmt = conn.prepareStatement(WRITE_OBJECT_SQL);
    try
    {
        // Replacing the first ? with the id parameter
        pstmt.setInt(1, id);

        // Transforming the object into bytes
        byte[] buf = getBytes(object);

        // Replacing the second ? with the byte array
        pstmt.setObject(2, buf);

        // Executing the statement
        pstmt.executeUpdate();
    }
    finally
    {
        // Closing the PreparedStatement object
        pstmt.close();
    }
}

// Deserializes the object stored in the database
// Gets the object id and searches for it in the java_objects table
private static Object loadObjectDB(Connection conn, int id)
        throws ClassNotFoundException, IOException, SQLException
{
    // Assembling the SQL statement. The ? character will be filled later on
    String READ_OBJECT_SQL = "SELECT OBJECT_VALUE FROM java_objects WHERE ID = ?";

    // A PreparedStatement object replaces the ? with a value in the SQL statement
    PreparedStatement pstmt = conn.prepareStatement(READ_OBJECT_SQL);
    try
    {
        // Replacing the first ? with the id value
        pstmt.setInt(1, id);

        // Executing the statement and getting the result inside a ResultSet object
        ResultSet rs = pstmt.executeQuery();
        try
        {
            // Moving the read pointer to the first row; fail clearly if there is none
            if (!rs.next())
            {
                throw new SQLException("No object stored with id " + id);
            }

            // Reading the bytes
            byte[] buf = rs.getBytes(1);

            // Opening the channel that will convert the bytes into an object
            ObjectInputStream objectIn = new ObjectInputStream(
                    new ByteArrayInputStream(buf));
            try
            {
                // Converting the bytes to an object and returning it
                // without a data type conversion
                return objectIn.readObject();
            }
            finally
            {
                objectIn.close();
            }
        }
        finally
        {
            // Closing the ResultSet
            rs.close();
        }
    }
    finally
    {
        // Closing the PreparedStatement
        pstmt.close();
    }
}

The getBytes() method transforms an object into a byte array. This operation is performed by the ObjectOutputStream and ByteArrayOutputStream classes, similarly to the saveObject() method of Listing 1. The difference is that here an object of the ByteArrayOutputStream class collects the bytes in memory, and its toByteArray() method returns them as an array.

The saveObjectDB() method receives as parameters the object that will be serialized, an object representing the connection with the database, and a numeric identifier. The method begins by creating a string with the INSERT statement that will add the data to the MySQL table. This INSERT statement contains two ? characters that are filled with values through the setInt() and setObject() methods of the PreparedStatement class. Before calling the setObject() method we need to obtain the byte array through the getBytes() method, which takes as a parameter the object to be serialized and returns an array of bytes. Once the parameters are set, the executeUpdate() method of the PreparedStatement class sends the SQL to MySQL, and the saveObjectDB() method finishes after calling the close() method.

The last method that must be created is the one that deserializes the object stored in MySQL: loadObjectDB(). The first step of this method is the creation of a string that stores the SELECT statement, whose search clause compares the ID column with the value received in the method’s second parameter. Again, the SQL statement is created with the ? character, because this symbol is replaced later by a numeric value. An object of the PreparedStatement class is created and the setInt() method replaces the ? character with the numeric identifier of the object stored inside the table.

Next, the SELECT statement is executed and the output is stored in a ResultSet object, whose function is to read the data returned by the statement. This is done via the getBytes() method of the ResultSet, which reads the first and only column returned by the SELECT statement: the object_value column. This getBytes() call provides a byte array that is converted back to an object in the reverse of the way it was recorded, i.e. using the ObjectInputStream and ByteArrayInputStream classes. Finally, both the ResultSet and the PreparedStatement objects are closed with the close() method, and the method finishes by returning the object without any type conversion. Converting the value returned by loadObjectDB() to a MyObjectToSerialize object is the task of the caller.
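The conversion that happens on each side of the BLOB column can be exercised without a database. The sketch below, with the hypothetical BlobCodec class and toBytes()/fromBytes() names, shows only the object-to-bytes and bytes-to-object steps; the JDBC calls are left out on purpose, so this is not a replacement for saveObjectDB() and loadObjectDB().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical demo class; not part of the original article.
class BlobCodec {

    // Object -> bytes: what would be stored in the BLOB column.
    static byte[] toBytes(Serializable obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Bytes -> object: what would be done with the bytes read from the column.
    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("class of stored object not found", e);
        }
    }

    public static void main(String[] args) {
        byte[] stored = toBytes("hello");
        // The caller is responsible for the cast, as with loadObjectDB().
        String restored = (String) fromBytes(stored);
        System.out.println(restored + " (" + stored.length + " bytes)");
    }
}
```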

The MySQL serialization and deserialization test was conducted with 1,000 objects of the MyObjectToSerialize class and was run ten times in a row. The average execution times for MySQL serialization and deserialization are presented in the graph shown in Figure 3.


Figure 3. Average MySQL serialization and deserialization times for 1000 objects.

The standard deviation was 0.03s for serialization and 0.25s for deserialization. As in the binary serialization test, there is a noticeable variation in the measurements obtained for MySQL serialization and deserialization. However, considering the standard deviations of both averages, we consider the results of the tests statistically reliable.

Serializing with the MySQL approach took less time (0.17s) than with the binary approach (0.29s), probably because the binary approach records individual files and must ask the operating system to allocate disk space for each one, whereas MySQL stores every object in the same table. The deserialization time of the MySQL approach (0.92s) was greater than that of the binary approach (0.35s), probably because MySQL needs to assemble, evaluate, and choose an appropriate execution plan for each SELECT statement.

The MySQL approach produced execution times close to those of the binary approach. However, when we use the database for serialization/deserialization we get many additional features, such as security (access permissions on objects), encryption, compression, high availability, replication, optimization tools, dynamic allocation of data, concurrency control, and others that are provided by an RDBMS, rather than just placing files inside a folder of the operating system.

Analysis of the results

Now that we have tested the binary, XML, and MySQL serialization approaches, we can compare them and analyze their characteristics to assist developers who need to make a choice. Table 1 contains a summary of the comparison between the approaches for serialization/deserialization studied in this article.

                                              Binary     MySQL      XML
Average serialization time (1,000 objects)    0.29s      0.17s      86.42s
Average deserialization time (1,000 objects)  0.35s      0.92s      15.3s
Object size                                   1.61 KB    1.63 KB    60.2 KB
Has interoperability features?                No         No         Yes
Has additional features?                      No         Yes        No

Table 1. Comparison of the Binary, MySQL and XML serialization/deserialization of Java objects.

The data in Table 1 show that the binary and MySQL approaches have the fastest execution times. However, the developer should avoid using execution time as the only criterion for choosing the approach, because there are other important factors that should also be taken into consideration.

Binary serialization had the lowest disk space requirement (1.61 KB per object). The value of 1.63 KB per row in the database was obtained assuming that a table row contains one column with the integer data type and another with the BLOB (Binary Large Object) data type. In practice, however, the developer must remember that MySQL takes up additional space for metadata and also keeps a log file for the database. XML files end up occupying more disk space than the other alternatives because of the repetition of XML tags. None of the tested approaches used any kind of data compression. There are external tools that compress and decompress files in the operating system, but using them would change the execution times.
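Compression can also be applied inside the Java program. As a minimal sketch, and not something that was measured in this article, the binary stream can be wrapped in the GZIP streams of the java.util.zip package. The CompressedSerialization class and its method names are hypothetical, and the gain depends heavily on the data: highly repetitive content shrinks a lot, while random content, such as random test values, may barely shrink at all.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; compression was not part of the article's tests.
class CompressedSerialization {

    // Serializes the object and compresses the resulting bytes with GZIP.
    static byte[] toCompressedBytes(Serializable obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the outer stream also finishes the GZIP stream.
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new GZIPOutputStream(buffer))) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Decompresses the bytes and rebuilds the object.
    static Object fromCompressedBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(
                 new GZIPInputStream(new ByteArrayInputStream(bytes)))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("class of stored object not found", e);
        }
    }

    public static void main(String[] args) {
        // An array of zeros is highly repetitive, so it compresses very well.
        byte[] data = new byte[10000];
        byte[] packed = toCompressedBytes(data);
        System.out.println("original: " + data.length + " bytes, compressed: "
                + packed.length + " bytes");
    }
}
```

The extra CPU work of compressing and decompressing would have to be included in any timing comparison, which is why it changes the execution times discussed above.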

If we account for interoperability, XML is the only approach tested that allows the exchange of data with other applications. This factor can be decisive when choosing the approach, especially if we take into consideration that XML is a widely used standard for implementing protocols associated with Web applications.

Among all tested approaches, the one that offers additional features is MySQL. These include security, high availability, replication, optimization tools, dynamic data allocation, concurrency control, and others that are provided by an RDBMS. These features are generally desirable and, considering how close the execution times of the MySQL and binary approaches are, MySQL has the best cost/benefit ratio.

Conclusion

Serialization and deserialization are operations that write and read objects in places other than memory. These operations are widely used by developers when they need to maintain the state of the system or transfer objects between remote applications.

Therefore, developers who must choose how to serialize/deserialize their objects need to analyze three different approaches: binary serialization, serialization in XML format, and storage of objects inside a relational database.

This second part of the article showed how serialization is performed through the ObjectInputStream, ObjectOutputStream, FileInputStream, FileOutputStream, XMLEncoder, and XMLDecoder classes. These classes were used to implement the serialization and deserialization operations, which were evaluated by a test set designed to collect data and quantify the comparison between the chosen approaches.

For each technique explored, the comparison measured the average execution time required to serialize/deserialize a simple POJO, the size of the serialized objects, the capacity to interoperate with other applications, and the presence of additional features.

The collected data provide evidence that all three options have advantages and disadvantages in addition to exclusive features. Therefore, developers must analyze the requirements of their application to seek the best solution based on the detailed characteristics of the serialization/deserialization approaches.



Mauro Pichiliani holds a Master of Science degree in collaborative systems from the Aeronautics Institute of Technology (ITA) in Brazil. He is a specialist in database technologies with more than 8 years of experience in the industry...
