11/6/11

Deep copy & Shallow copy - Why basics matter

Recently, we ran into an issue in our application, a weird bug that was not easily reproducible. It goes like this:

When I try to look at data for User X, the application sometimes pulls up the details of X and sometimes it gets me the data for some other random User Y. There was no relation between X and Y. This was happening more under load and it was pretty difficult to reproduce.

After digging through the freaking code for hours and hours we finally figured out the problem.It happened for two reasons:

First Reason:
The programmer who wrote the code was careless or lacked a clear understanding of programming fundamentals.

Second Reason:
A shallow copy of the data was being returned instead of a deep copy.

Explanation:
We were using the memory cache to cache the common user metadata from the database so as to cut down the load on the database servers. We then created a class (a Singleton) to encapsulate the access to the cache. This class has an instance of a cache object that talks to the actual memory. The getters in this instance returned shallow copies of data i.e actual references to memory. It was the responsibility of the Singleton class to create a deep copy of the data before returning it to the calling code.

This critical piece of code was missing and so the Singleton was essentially returning a reference to the data in cache.The code that called the Singleton was then modifying the shallow copy of the metadata, unaware that it was actually modifying the values in cache. So when User Y requested data, since the class was a singleton(and therefore using the same cache instance as well), it was returning a modified version of the metadata. When this metadata was combined with other data, specific to user Y, it resulted in weird values and as a result the user ended up seeing some other data.

Solution:
The fix was very simple. In the Singleton class, we had to create a deep copy of the object returned from the cache, before passing it to the calling code, so any changes made to that object did not impact the original values in cache.