Scala Pitfalls: concurrent.Map.getOrElseUpdate is not atomic

A surprising pitfall in Scala's Standard library: Scala's Traits for Concurrent Maps inherit several methods that are unsafe to use in concurrency situation, including getOrElseUpdate.

This problem is discussed in Scala ticket Sl-7943 and has been fixed recently in TrieMap (but not the wrapper of Java's ConcurrentHashMap). As of Scala 2.11.5, the fix for TrieMap has not made it to a release yet.

Background: ConcurrentMap and the putIfAbsent idiom

ConcurrentHashMaps are a popular primitive to build concurrency-friendly code in the Java ecosystem.

Scala provides a wrapper trait and alternative implementation for CHMs in the form of concurrent.Map and TrieMap.

Like the Java version, these traits provide concurrency-safe primitives like putIfAbsent that can be used to implement concurrency-friendly get-or-put idiom in a HashMap.

Classical example for a Thread-safe getOrPut:

// create a concurrent map by wrapping a juc.ConcurrentHashMap
concurrent.Map map = new java.util.concurrent.ConcurrentHashMap[MyKey,MyHandler]()

def getOrCreate(MyKey key): MyHandler: MyHandler = {
  // retrieve the key from the map if present...
  val handler = map.get(myKey).getOrElse {
    // if not present,  allocate a new Handler
    val newHandler = new MyHandler()
    // putIfAbsent atomically inserts the Handler into the map, if not previously
    // installed. Otherwise, returns the previously present entry (and does not modify)
    val maybePresentHandler = map.putIfAbsent(myKey, newHandler)
    // if maybePresentHandler has a value, we got pre-empted => abandon our new Handler
    // and return the present one. Otherwise, return newHandler.
    maybePresentHandler.getOrElse(newHandler)
    }
  }
}

The (present = putIfAbsent(...)) != None is fairly common in concurrency-friendly Java. It is fast and safe. However, the code above is hardly elegant.

GetOrElseUpdate: tempting, but broken

Now, looking at the Scala API doc, it would seem that the getOrElseUpdate method provides a much more elegant alternative to achieve the same goal:

    def getOrElseUpdate(key: A, op:  B): B 
If given key is already in this map, returns associated value.
Otherwise, computes value from given expression op, stores with key in map and returns that value.

This would lend itself to the following code:

// Appealling, but broken - do not use!
def getOrCreateBroken(key: MyKey): MyHandler = {
  map.getOrElseUpdate(key, new MyHandler())
}

However, only a look at the source code reveals, that this does not work. It turns out, getOrElseUpdate is inherited from the base trait MapLike, and not overriden anywhere up to JConcurrentMapMapper which wraps the Scala wrapper around HashMap.

Summary

Scala's concurrent.Map interface (and the corresponding wrapper to Java's ConcurrentHashMap) is problematic, because it inherits a wealths of methods, not all of which are thread-safe. Be very careful when using those maps. Generally speaking, only the methods directly ported from ConcurrentHashMap are safe, while all the convenience methods from Scala's MapLike are not.

You may want to consider just using java.util.ConcurrentHashMap directly.

More generally, this can also serve as a cautionary tale on rich interfaces and traits that provide implementations – while convenient, they can make it very hard to provide proper semantic guarantees in Subclasses. To the point where even the Scala Library designers missed this obvious hole in the standard library.


Functional programming in Scala

I did a short presentation (1.5 h) on functional programming with Scala for the software development teams at one of our customers.

The aim was to give a glimpse of functional programming in general, show some of the features of Scala by example and give a rough idea how domain specific functional code for a life insurance could look like in Scala.

The source code of the slides as well as all code examples are available on GitHub.

See the presentation slides (in german) on SlideShare: