Be Careful of Deserialization

A bit over a week ago, Foxglove Security posted an article about security issues with Apache Commons-Collections. More generally, though, the post covers problems with serialization and deserialization in Java. While the post tends to focus on server-side issues, Android apps can run into the same problems.

We often have to convert some sort of packed data structure into a corresponding object model. That data structure could be JSON, protobuf, XML, or any of countless others… including Java’s own serialization format. If you have used an ObjectInputStream to read in data, you have used Java’s own serialization format.

If you are 100% certain that the data that you are deserializing is your own data and has not been tampered with, there are no problems. The difficulty arises when developers are wrong in that assessment, and the serialized data can be replaced by something else:

  • through a MITM (Martian-in-the-middle) attack when downloading serialized data from some server

  • by malware replacing serialized data that you have in publicly-accessible locations on disk (e.g., external storage)

  • by malware running with superuser privileges replacing serialized data that you have on internal storage

  • by design, because the app is supposed to be downloading serialized data from arbitrary sources

The danger arises if your application code, using the deserialized data, does something unsafe with that data. The article focuses on the readObject() method that you can implement on a Serializable class, and in particular on Apache Commons-Collections for having a remote code execution vulnerability in a readObject() method. An attacker can craft a serialized payload that, once you deserialize it, uses the exploit to do what the attacker wants.
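To illustrate why readObject() matters, here is a minimal sketch. Gadget and ReadObjectDemo are hypothetical names invented for this example, and the "side effect" is just a counter standing in for whatever a genuinely vulnerable class might do. It is not the Commons-Collections exploit, only a demonstration that readObject() runs before your code gets a chance to reject the object.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class standing in for a library type with a risky readObject().
class Gadget implements Serializable {
    private static final long serialVersionUID = 1L;

    // Counts how often readObject() ran, so the demo can observe it.
    static int sideEffects = 0;

    private final String command;

    Gadget(String command) {
        this.command = command;
    }

    // Invoked by ObjectInputStream during deserialization, before the
    // caller ever sees (or casts) the resulting object.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        // A vulnerable class might act on attacker-controlled state here.
        sideEffects++;
    }
}

class ReadObjectDemo {
    // Serializes any object with Java's built-in format.
    static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    // The caller expects a String, but the stream decides what gets built.
    static String readExpectedString(byte[] data)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (String) in.readObject();
        }
    }
}
```

If readExpectedString() is handed a serialized Gadget, the cast fails with a ClassCastException, but Gadget.sideEffects has already been incremented by that point. The type check in your code comes too late to prevent whatever readObject() did.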

While the article focuses on Serializable, the same sorts of problems can happen with any sort of data that you are parsing. For example, I get very nervous when I see code that reads in JSON and then uses Java reflection to go create instances of classes referred to in the JSON. A hacked version of the JSON could create an instance of anything, and while your code may crash as a result (e.g., ClassCastException), if there is some vulnerability in that class’ constructor, it may be too late.
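One way to avoid the reflection problem is to map type names from the input onto a fixed allow-list, rather than handing them to Class.forName(). The sketch below assumes a hypothetical Shape hierarchy and a type name string that has already been pulled out of the JSON by whatever parser you use; none of these names come from the article.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical object model for the example.
interface Shape {
    String describe();
}

class Circle implements Shape {
    public String describe() { return "circle"; }
}

class Square implements Shape {
    public String describe() { return "square"; }
}

class ShapeFactory {
    // Only these type names, as found in the parsed input, may be instantiated.
    private static final Map<String, Supplier<Shape>> ALLOWED = new HashMap<>();

    static {
        ALLOWED.put("circle", Circle::new);
        ALLOWED.put("square", Square::new);
    }

    // Risky pattern: the input chooses which class gets constructed.
    static Object createUnchecked(String className)
            throws ReflectiveOperationException {
        return Class.forName(className).getDeclaredConstructor().newInstance();
    }

    // Safer pattern: unknown names are rejected before anything is constructed.
    static Shape create(String typeName) {
        Supplier<Shape> supplier = ALLOWED.get(typeName);
        if (supplier == null) {
            throw new IllegalArgumentException("Unsupported type: " + typeName);
        }
        return supplier.get();
    }
}
```

With createUnchecked(), any class on the classpath with a no-argument constructor is fair game, and its constructor runs before a cast can fail. With create(), the set of constructors that the input can trigger is exactly the set you listed.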

Given MITM and root-level attacks, it is difficult to ensure that your data has not been tampered with. In some cases perhaps a digital signature can be used, but many times that is not possible.
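Where a signature is feasible, the check itself is short. This is a minimal sketch, assuming the data was signed elsewhere (e.g., on your server) with an EC private key using SHA256withECDSA, and that the app already holds the matching public key. SignedPayloadVerifier is a hypothetical name, and how the public key gets into the app safely is outside the scope of the example.

```java
import java.security.GeneralSecurityException;
import java.security.PublicKey;
import java.security.Signature;

class SignedPayloadVerifier {
    // Returns true only if the signature matches the payload for this key.
    // Call this BEFORE parsing or deserializing the payload.
    static boolean isAuthentic(byte[] payload, byte[] signature,
                               PublicKey publicKey) {
        try {
            Signature verifier = Signature.getInstance("SHA256withECDSA");
            verifier.initVerify(publicKey);
            verifier.update(payload);
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            // Malformed signatures and unusable keys count as "not authentic".
            return false;
        }
    }
}
```

The important design point is ordering: verify the raw bytes first, and only hand them to a parser if the check passes. Note that this does nothing against an attacker with superuser privileges who can replace the app's copy of the public key as well.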

Beyond that, reduce your attack surface. Do not use Java serialization, as you are not in control of how the deserializing occurs. Use something else where you declare what the rules are for converting the data into the object model, even if those rules are merely references to fields (e.g., default Gson parsing rules). Apply your own validation to the input wherever possible before creating your object model, to ensure that you are parsing something that seems to be legitimate. And consider any objects that you are using to hold the parsed data to be part of your public API, and subject them to the same sort of scrutiny that you would apply to any other public API, with respect to potential vulnerabilities.
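As a rough sketch of "validate before creating your object model": the example below assumes the raw input has already been parsed into simple string fields (here a Map, standing in for whatever your parser produces). UserProfile, its fields, and the specific limits are all hypothetical, chosen only to show the shape of the approach.

```java
import java.util.Map;

// Hypothetical model object; it can only be created through validation.
final class UserProfile {
    final String name;
    final int age;

    private UserProfile(String name, int age) {
        this.name = name;
        this.age = age;
    }

    // Builds the object only after every field passes an explicit check.
    static UserProfile fromParsedFields(Map<String, String> fields) {
        String name = fields.get("name");
        if (name == null || name.isEmpty() || name.length() > 64) {
            throw new IllegalArgumentException("Invalid name");
        }

        int age;
        try {
            // parseInt also rejects a missing (null) value.
            age = Integer.parseInt(fields.get("age"));
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Invalid age", e);
        }
        if (age < 0 || age > 150) {
            throw new IllegalArgumentException("Age out of range");
        }

        return new UserProfile(name, age);
    }
}
```

Because the constructor is private, there is no path by which unvalidated input ends up in a UserProfile, which makes the class easier to audit as part of that "public API".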