Wednesday, September 24, 2008

Declarative Synchronization with Java 5's ReadWriteLock


Don't be scared by the long post! It's only long because of the verbose Java syntax in the code samples... my apologies in advance.



Java 5 included some great concurrency primitives, one of which is ReadWriteLock, implemented by ReentrantReadWriteLock. It's one object with two locks: a read lock, which can be held simultaneously by multiple threads, and a write lock, which is exclusive. Concurrency is increased because many threads can read a shared resource while only one thread at a time can write to it. The traditional alternative, the synchronized keyword, provides an exclusive lock no matter what the context. The Javadocs are pretty interesting and worth a read.



Using the thing is quite simple. If you need a read or write lock, you request it and call lock()... making sure to unlock it in a finally block. Imagine a silly class providing String data to callers, backed by a HashMap, which can be refreshed. Getting data (reading) can be done by many clients... but refreshing (writing) should be performed exclusively by a single thread with no read locks held.


import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.concurrent.locks.ReadWriteLock;

public class ResourceProvider {

    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<String, String> data = new HashMap<String, String>();

    public String getResource(String key) throws Exception {
        lock.readLock().lock();
        try {
            return data.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    public void refresh() throws Exception {
        lock.writeLock().lock();
        try {
            //reload the resources into memory
        } finally {
            lock.writeLock().unlock();
        }
    }
}


Is this a good use of ReentrantReadWriteLock? Well, this ResourceProvider class is not immutable, so right off the bat I'm unimpressed with the design. Also, using the synchronized keyword would have been a lot simpler.
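For comparison, here's roughly what the synchronized version might look like (a quick sketch of my own, using a hypothetical SynchronizedResourceProvider, not a class from this post):


import java.util.HashMap;
import java.util.Map;

public class SynchronizedResourceProvider {

    private final Map<String, String> data = new HashMap<String, String>();

    // one exclusive monitor guards everything, so even two readers block each other
    public synchronized String getResource(String key) {
        return data.get(key);
    }

    public synchronized void refresh() {
        //reload the resources into memory
    }
}


I would ask, "Is there profiler data to support the additional complexity of the new lock?" Premature optimization and all that... Plus, the lock-finally-unlock idiom is concerning. It looks kinda easy to forget a step in there. Imagine a class with 10 read methods and one refresh method. That'd be a lot of lock-finally-unlocks! Maybe there is some guidance from the JDK...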



You know what class I love? AccessController. Its API requires you to give it an action, which it executes on your behalf. Kinda like how in Groovy you'd just give a closure to an object to execute rather than explicitly getting its state. This is the Hollywood Principle: "Don't call us, we'll call you". Don't expose an iterator; expose a method that accepts a closure. Don't expose JVM security privileges; expose a method that accepts a PrivilegedAction.
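If you haven't used it, the AccessController idiom looks something like this (a quick sketch of the JDK idiom, not code from this post):


import java.security.AccessController;
import java.security.PrivilegedAction;

// hand AccessController the work to do; it executes the block
// on your behalf, with your code's privileges in effect
String home = AccessController.doPrivileged(new PrivilegedAction<String>() {
    public String run() {
        return System.getProperty("user.home");
    }
});


And, in the case of ReadWriteLock, don't expose the locks to be locked and unlocked by the callers; instead, let the caller pass you a function object to execute within the context of either a read or write lock. This is easily written, and it's named ResourceController in this example: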


import java.util.concurrent.Callable;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.concurrent.locks.ReadWriteLock;

public class ResourceController {

    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    public <T> T withReadLock(Callable<T> block) throws Exception {
        lock.readLock().lock();
        try {
            return block.call();
        } finally {
            lock.readLock().unlock();
        }
    }

    public <T> T withWriteLock(Callable<T> block) throws Exception {
        lock.writeLock().lock();
        try {
            return block.call();
        } finally {
            lock.writeLock().unlock();
        }
    }
}


This hides the complexity of ReentrantReadWriteLock from the callers... the only thing they need to know about is the Callable interface (which they should know about). And you'll never forget to unlock the lock, nor do you have crummy lock-finally-unlock boilerplate littering your codebase. Nice separation of concerns... get to show off your mad generic skillz... blah, blah, blah. Whether or not the calling code takes a complexity/syntax cruft hit is debatable:

public String getResource(final String key) throws Exception {
    return controller.withReadLock(new Callable<String>() {
        public String call() throws Exception {
            return data.get(key);
        }
    });
}

public void refresh() throws Exception {
    controller.withWriteLock(new Callable<Void>() {
        public Void call() throws Exception {
            //reload the resources into memory
            return null;
        }
    });
}


Hmmm. This is just way more dense than the finally blocks in the first example. There's just a lot more to take in! There are four lines of boilerplate to support the one-line method calls. The advantage of this strategy is that the compiler will check your correctness. You can't "forget" to create an anonymous class the way you might forget a finally block. Closures would possibly help us, if they were available. They are in Groovy, and the code cleans up considerably... the "as Callable" bit is in there to satisfy the ResourceController written in Java:
public String getResource(final String key) throws Exception {
    return controller.withReadLock({
        return data.get(key)
    } as Callable);
}

public void refresh() throws Exception {
    controller.withWriteLock({
        //reload the resources into memory
        return null;
    } as Callable);
}


Wouldn't it be nice if you could remove all that boilerplate and just declare your intent with annotations? This version of the ResourceProvider is the cleanest of the bunch... all the locking is moved out of the class and replaced with two annotations: @WithReadLock and @WithWriteLock:


import java.util.HashMap;
import java.util.Map;

public class DefaultResourceProvider implements ResourceProvider {

    private final Map<String, String> data = new HashMap<String, String>();

    @WithReadLock
    public String getResource(String key) throws Exception {
        return data.get(key);
    }

    @WithWriteLock
    public void refresh() throws Exception {
        // reload the settings
    }
}


The code to make all this work is fairly simple. You need to declare the two annotations, and then create a wrapper class, used with a Proxy, that reads and applies the annotations at runtime. It sounds harder to implement than it actually is.


First step, create the annotations so they can be applied to methods and read back at runtime:


import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
public @interface WithReadLock {
}

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
public @interface WithWriteLock {
}


The second step is to create an InvocationHandler that can wrap your object. When a method is called on the proxy, the invoke() method of the invocation handler is called. It needs to examine the target (the object that was annotated) and check whether a @WithReadLock or @WithWriteLock annotation is present on the invoked method... if one is, it executes the method with the appropriate lock held...


import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.concurrent.locks.ReadWriteLock;

public class ResourceController implements InvocationHandler {

    private final Object target;
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    ResourceController(Object target) {
        this.target = target;
    }

    public Object invoke(Object proxy, Method method, Object[] args) {
        try {
            //try to find the method on the target
            final Method targetMethod = target.getClass().getMethod(
                    method.getName(),
                    method.getParameterTypes());

            if (targetMethod.isAnnotationPresent(WithReadLock.class)) {
                lock.readLock().lock();
                try {
                    return targetMethod.invoke(target, args);
                } finally {
                    lock.readLock().unlock();
                }
            } else if (targetMethod.isAnnotationPresent(WithWriteLock.class)) {
                lock.writeLock().lock();
                try {
                    return targetMethod.invoke(target, args);
                } finally {
                    lock.writeLock().unlock();
                }
            } else {
                return targetMethod.invoke(target, args);
            }
        } catch (Exception ex) {
            throw new RuntimeException(ex);
        }
    }
}


There are a couple downsides to this approach. For one, instantiating a proxied version of ResourceProvider is just nasty:


InvocationHandler handler = new ResourceController(
        new DefaultResourceProvider());

ResourceProvider provider = (ResourceProvider) Proxy.newProxyInstance(
        DefaultResourceProvider.class.getClassLoader(),
        new Class[] { ResourceProvider.class },
        handler);


Yuck! But did you notice how ResourceProvider got renamed? It is now called DefaultResourceProvider and implements a ResourceProvider interface. This is because java.lang.reflect.Proxy can only proxy interfaces. Not ideal. Groovy allows you to wrap non-final classes in a variety of ways (ProxyGenerator, Proxy-o-Matic, and invokeMethod), but it was not obvious to me how to cleanly process these annotations with those approaches. Anyway, you're not seriously looking at applying Groovy to a concurrency problem that profiling has shown to be so performance-critical that a ReadWriteLock is even needed, are you? A last option to consider is Spring AOP to proxy the thing... but locking is an implementation detail and hardly a cross-cutting concern. It seems a very bad fit for this behavior.



But is it even a good idea to separate the locking code from the implementation this way? It is both possible and easy to create a ResourceProvider that is unusable and does not fulfill the contract it's meant to provide. Without being proxied, the object is downright dangerous to your system (the sketch after this paragraph shows the failure mode). Do you want your system and compiler to allow you to create an object in this state? I'm very hesitant to say yes. The JPA transactional annotations are great because your framework generally handles enforcing them. And web service annotations work great for the same reason. But there is no framework that enforces @WithReadLock. It just seems incredibly dangerous, and I have to recommend the middle approach: requiring an anonymous inner class and hiding the ReentrantReadWriteLock behind a ResourceController abstraction. Sorry the syntax is so verbose, but it seems the right design decision.
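To make the danger concrete, here's the failure mode (a quick sketch using the classes from this post):


// Nothing enforces the annotations. This compiles and runs,
// but no read or write lock is ever acquired.
ResourceProvider unguarded = new DefaultResourceProvider();
unguarded.refresh();                           // no write lock held
String value = unguarded.getResource("key");   // no read lock held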



So there you have it: Declarative Synchronization with Java 5's ReadWriteLock, which I don't even recommend using. Do you feel slightly cheated?


Source and tests available

4 comments:

Unknown said...

Good article but your example of allowing n-reads and 1-write to a hashmap can cause troubles. HashMap is not designed for any type of concurrent access, end of story. Others have blogged about this:

http://lightbody.net/blog/2005/07/hashmapget_can_cause_an_infini.html

I recommend you use something else, something that you know works well under a multi-read single write scenario.

Marcio Aguiar said...

Nice! I really like these new annotations uses. There's going to be a new bunch in the JDK 7 release.

I have one suggestion: use aspects to insert the code in methods with these annotations. It's less intrusive. I use this technique to wrap methods in a database transaction.

Taylor said...

Cool. Terracotta has to deal with this all the time in the context of cluster reads/writes.

In addition to regular synchronized semantics and ReentrantReadWriteLocks, we also support converting normal synchronized code declarations into read/write locks - and you can do that using annotations.

Of course it's all in the context of clustering, so it's not a general technique, but might be interesting to you and/or readers of this post nonetheless.

I blogged about how to convert synchronized into read locks using Terracotta here: Stupid JVM Tricks - Read locks from just synchronized.

Documentation for Terracotta annotations are here: Terracotta Annotations

Hamlet D'Arcy said...

@Marcio - I just can't bring myself to think the annotation solution is a good idea. Any class /should/ work on its own without failing. Otherwise it's not a pojo and it's just unsafe to use. Separating the synchronization policy of your class into a second location while not enforcing that that policy be used is just dangerous. Separation of concerns is good... but in this case I feel it's gone too far. These things shouldn't be separated.