Java deserialization speed

I am writing a Java application that, among other things, has to read a text dictionary file (each line is one word) and store it in a HashSet. Every time I start the application, the same file is read again (6MB Unicode file).

That seemed expensive, so I decided to serialize the resulting HashSet and store it in a binary file, expecting the application to start faster. Instead, it got slower: loading went from about 2.5 seconds to about 5 seconds once I switched to deserialization.

Is this the expected result? I thought that in a case like this, deserializing a prebuilt object should be faster than rebuilding it from text.



2 answers


The problem is not the serialization mechanism itself; it is the data structure you are serializing.

You have a very efficient, natural representation of these words: a simple list in a text file. It's quick to read.

You have built a different data structure to hold them: a hash table. A hash table needs more memory than a plain list, but in exchange, looking up a word is very fast.

That trade-off also makes serialization slower: naively serializing the hash table writes more data than the plain word list, so the file is larger and takes longer to read back.
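The size claim is easy to check on your own data. Here is a minimal sketch, assuming a made-up word list (`word0`, `word1`, ...) in place of the real dictionary; the class and method names are illustrative only. It encodes the same set once as newline-separated UTF-8 text and once with standard Java serialization, then prints both sizes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper class, not part of the original question.
class SerializedSizeDemo {

    // Plain-text form: one word per line, UTF-8 encoded.
    static byte[] asText(Set<String> words) {
        return String.join("\n", words).getBytes(StandardCharsets.UTF_8);
    }

    // Standard Java serialization of a HashSet holding the same words.
    static byte[] asSerialized(Set<String> words) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new HashSet<>(words));
        }
        return bytes.toByteArray();
    }

    // Reads the set back, to confirm the round trip preserves the contents.
    @SuppressWarnings("unchecked")
    static Set<String> fromSerialized(byte[] data)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Set<String>) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample data; substitute your real dictionary to measure it.
        Set<String> words = new HashSet<>();
        for (int i = 0; i < 100_000; i++) {
            words.add("word" + i);
        }
        System.out.println("text bytes:       " + asText(words).length);
        System.out.println("serialized bytes: " + asSerialized(words).length);
    }
}
```

Size is only part of the cost. Deserialization also has to rebuild the set in memory, so it is worth timing both paths on the real file rather than relying on the byte counts alone.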

I think you should stick to simple text file reading.
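For reference, a plain-text loader can be sketched as follows. It assumes the file is UTF-8 (the question only says "Unicode") and takes a rough word-count hint so the HashSet can be sized up front. The class name, the `expectedWords` parameter, and the 500,000 figure in `main` are assumptions for illustration, not values from the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.Set;

// Hypothetical loader class, not part of the original question.
class DictionaryLoader {

    // Reads one word per line into a HashSet.
    // expectedWords is only a sizing hint; a wrong value costs performance, not correctness.
    static Set<String> load(Path file, int expectedWords) throws IOException {
        // Size the table so expectedWords entries fit under the default
        // 0.75 load factor without triggering a rehash while loading.
        Set<String> words = new HashSet<>((int) (expectedWords / 0.75f) + 1);
        // Assumption: the file is UTF-8 encoded.
        try (BufferedReader reader =
                 Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.isEmpty()) {
                    words.add(line);
                }
            }
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java DictionaryLoader <word-list-file>");
            return;
        }
        long start = System.nanoTime();
        Set<String> words = load(Paths.get(args[0]), 500_000); // guessed size hint
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(words.size() + " words loaded in " + millis + " ms");
    }
}
```

Presizing the set avoids repeated table growth during the load. Whether that makes a measurable difference for a 6 MB file is something to time rather than assume.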



The answer above is correct: Java serialization and deserialization carry significant overhead. If you need to speed up the loading of the dictionary (or ...), consider the following approaches:



  • Reading the file with the java.nio.* classes may speed up the process.
  • If your application does not need the dictionary immediately at startup, consider loading it asynchronously on a separate thread. This does not make the load itself any faster, but the application's GUI (for example) can start without waiting for it.
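Both suggestions can be combined in a short sketch. It reads the file with `Files.lines` from `java.nio.file` and runs the load on a background thread via `CompletableFuture.supplyAsync`. The class name is hypothetical, the file is assumed to be UTF-8, and whether the NIO path is actually faster than a `BufferedReader` on your machine should be measured rather than assumed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical class, not part of the original answer.
class AsyncDictionary {

    // Blocking load: streams the lines and collects them into a HashSet.
    static Set<String> load(Path file) {
        // Assumption: the file is UTF-8 encoded.
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            return lines.filter(line -> !line.isEmpty())
                        .collect(Collectors.toCollection(HashSet::new));
        } catch (IOException e) {
            // supplyAsync cannot propagate checked exceptions, so wrap it.
            throw new UncheckedIOException(e);
        }
    }

    // Starts the load on the common fork-join pool and returns immediately.
    static CompletableFuture<Set<String>> loadAsync(Path file) {
        return CompletableFuture.supplyAsync(() -> load(file));
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java AsyncDictionary <word-list-file>");
            return;
        }
        CompletableFuture<Set<String>> dictionary = loadAsync(Paths.get(args[0]));
        // The rest of startup (for example, showing the GUI) can proceed here.
        System.out.println("Startup continues while the dictionary loads...");
        // join() blocks only at the point where the words are actually needed.
        Set<String> words = dictionary.join();
        System.out.println(words.size() + " words available");
    }
}
```

The total load time is unchanged. The gain is that nothing waits for the dictionary until the first lookup calls `join()`.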






