ThreadLocal to the rescue: fixing issues caused by bad caching code

A lot of code written to improve API performance relies on caching. Caching is a general strategy for avoiding repeated hits to the DB: data is kept in a cache for a while, sometimes 15-30 minutes.

This cache is sometimes a central one such as Redis or Memcached, or it can be a JVM cache such as a Map or a List that holds the data for some time.

Distributed caches are hosted on separate servers, so every hit still has a cost. Developers are therefore generally more careful about how often they hit them to get data.

If a developer overuses the distributed cache, the extra hits are reported and caught by QA during performance testing, or by infra teams through the dashboards they use for monitoring.

The problem is the JVM cache. Here developers get lazy: whenever they need the object, they know it is there in the JVM cache, so they just pick it up from there and do not bother passing it to the functions where it is needed.

In a multi-server environment the JVM cache creates a much worse problem: the same object is cached separately on each server, and a server may perform an update without first fetching the latest value from the DB. The first update then gets overwritten by the second one, which was computed from the stale value picked from the cache. This is generally rectified by using a central cache and avoiding the JVM cache altogether.
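The overwrite described above can be reproduced in a few lines. This is only a sketch: the class StaleJvmCacheExample, the Server class, the "balance" key and the amounts are all made up for illustration, and the DB is reduced to a plain Map.

```java
import java.util.HashMap;
import java.util.Map;

class StaleJvmCacheExample {
    // Shared database, reduced to a map for illustration (assumption).
    static final Map<String, Integer> DB = new HashMap<>();

    // One instance per application server, each with its own JVM cache.
    static class Server {
        private final Map<String, Integer> jvmCache = new HashMap<>();

        // Loads the value into this server's cache only if it is not there yet.
        int read(String key) {
            return jvmCache.computeIfAbsent(key, DB::get);
        }

        // Updates from the cached value without re-reading the DB first.
        void addToBalance(String key, int amount) {
            int updated = read(key) + amount;
            jvmCache.put(key, updated);
            DB.put(key, updated);
        }
    }

    static int run() {
        DB.put("balance", 100);
        Server a = new Server();
        Server b = new Server();
        a.read("balance");             // both servers now cache 100
        b.read("balance");
        a.addToBalance("balance", 10); // DB becomes 110
        b.addToBalance("balance", 20); // computed from the stale 100, DB becomes 120
        return DB.get("balance");      // 120, although 130 was intended
    }
}
```

Server b never sees the 110 written by server a, so the first update is lost.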

Example:

In this example the developer picks up userDetails from the JVM cache in several different functions. This does not create any performance problem until we decide to move the cache to a distributed caching solution. If we simply swap the JVM cache for a Redis cache, there will be 5 Redis hits for each call to collectPayment.
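A minimal sketch of what such code might look like. Only collectPayment and userDetails come from the description above; the class name, the five helper methods and the lookup counter are hypothetical and exist just to make the number of cache lookups visible.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class RepeatedLookupExample {
    // Hypothetical value object standing in for the cached user record.
    static class UserDetails {
        final String userId;
        final String email;
        UserDetails(String userId, String email) {
            this.userId = userId;
            this.email = email;
        }
    }

    // The JVM cache. The counter records every lookup made against it.
    static final Map<String, UserDetails> CACHE = new ConcurrentHashMap<>();
    static final AtomicInteger LOOKUPS = new AtomicInteger();

    static UserDetails getUserDetails(String userId) {
        LOOKUPS.incrementAndGet(); // becomes one Redis round trip after a naive migration
        return CACHE.get(userId);
    }

    static void collectPayment(String userId) {
        validateUser(userId);
        checkLimits(userId);
        applyDiscount(userId);
        chargeCard(userId);
        sendReceipt(userId);
    }

    // Each helper fetches the same object again instead of receiving it as a parameter.
    static void validateUser(String userId)  { getUserDetails(userId); }
    static void checkLimits(String userId)   { getUserDetails(userId); }
    static void applyDiscount(String userId) { getUserDetails(userId); }
    static void chargeCard(String userId)    { getUserDetails(userId); }
    static void sendReceipt(String userId)   { getUserDetails(userId); }
}
```

With an in-memory Map the five lookups cost almost nothing, which is why the pattern goes unnoticed until the cache moves out of the JVM.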

The correct way is to move the cache out of the JVM and pass the cached value to the functions down the line. This is clean code and creates only 1 Redis hit per call to collectPayment, but it requires a full regression of the system and an update of all the unit tests for the functions whose signatures change.
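A sketch of this cleaner version, under the same assumptions as before: the helper names are hypothetical, and Redis is simulated with a Map plus a hit counter rather than a real client.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class PassValueDownExample {
    // Hypothetical value object standing in for the cached user record.
    static class UserDetails {
        final String userId;
        final String email;
        UserDetails(String userId, String email) {
            this.userId = userId;
            this.email = email;
        }
    }

    // Stand-in for Redis (assumption): a map plus a counter of round trips.
    static final Map<String, UserDetails> REDIS = new ConcurrentHashMap<>();
    static final AtomicInteger REDIS_HITS = new AtomicInteger();

    static UserDetails fetchUserDetails(String userId) {
        REDIS_HITS.incrementAndGet();
        return REDIS.get(userId);
    }

    static void collectPayment(String userId) {
        // Fetch once at the top, then hand the object to every helper.
        UserDetails userDetails = fetchUserDetails(userId);
        validateUser(userDetails);
        checkLimits(userDetails);
        applyDiscount(userDetails);
        chargeCard(userDetails);
        sendReceipt(userDetails);
    }

    // Signatures now take the object itself, so no helper touches the cache.
    static void validateUser(UserDetails userDetails)  { /* uses userDetails */ }
    static void checkLimits(UserDetails userDetails)   { /* uses userDetails */ }
    static void applyDiscount(UserDetails userDetails) { /* uses userDetails */ }
    static void chargeCard(UserDetails userDetails)    { /* uses userDetails */ }
    static void sendReceipt(UserDetails userDetails)   { /* uses userDetails */ }
}
```

Every helper signature changes here, which is exactly what makes this refactoring expensive in a large code base.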

Workaround where the method signatures do not change: we still keep the object in the JVM, but in a ThreadLocal, so its scope is limited to the current thread. The object is cleared as soon as the call finishes, and a fresh copy is fetched from Redis the next time we need it. Because application servers usually reuse pooled threads, this clearing has to be done explicitly, for example by calling remove() on the ThreadLocal in a finally block. The good part is that the method signatures do not change, and the functions can still call the updated getter to obtain the object. This saves a lot of effort, and the code can be corrected gradually along with other enhancements. This approach makes only 1 Redis call per call to collectPayment.
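A sketch of the ThreadLocal workaround, again with hypothetical helper names and a Map standing in for Redis. The helpers keep their original String-based signatures. Only getUserDetails changes, along with a try/finally in collectPayment.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class ThreadScopedCacheExample {
    // Hypothetical value object standing in for the cached user record.
    static class UserDetails {
        final String userId;
        final String email;
        UserDetails(String userId, String email) {
            this.userId = userId;
            this.email = email;
        }
    }

    // Stand-in for Redis (assumption): a map plus a counter of round trips.
    static final Map<String, UserDetails> REDIS = new ConcurrentHashMap<>();
    static final AtomicInteger REDIS_HITS = new AtomicInteger();

    // Holds the object for the duration of one call on the current thread only.
    private static final ThreadLocal<UserDetails> CURRENT_USER = new ThreadLocal<>();

    static UserDetails getUserDetails(String userId) {
        UserDetails cached = CURRENT_USER.get();
        if (cached == null || !cached.userId.equals(userId)) {
            REDIS_HITS.incrementAndGet(); // the only Redis round trip per call
            cached = REDIS.get(userId);
            CURRENT_USER.set(cached);
        }
        return cached;
    }

    static void collectPayment(String userId) {
        try {
            validateUser(userId);
            checkLimits(userId);
            applyDiscount(userId);
            chargeCard(userId);
            sendReceipt(userId);
        } finally {
            // Pooled threads are reused, so clear the value when the call ends.
            CURRENT_USER.remove();
        }
    }

    // Helper signatures are unchanged; they still look the object up themselves.
    static void validateUser(String userId)  { getUserDetails(userId); }
    static void checkLimits(String userId)   { getUserDetails(userId); }
    static void applyDiscount(String userId) { getUserDetails(userId); }
    static void chargeCard(String userId)    { getUserDetails(userId); }
    static void sendReceipt(String userId)   { getUserDetails(userId); }
}
```

The remove() in the finally block is what limits the object's lifetime to a single call. Without it, a reused worker thread could serve a stale copy to a later request for the same user.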

Reference: https://docs.oracle.com/javase/7/docs/api/index.html?java/lang/ThreadLocal.html
