Friday, December 17, 2010

CopyOnWrite Wrappers Part 2 (now CIS Wrappers)

There was some good feedback on my first pass. To summarize:


1. The implementation is not a "true" CopyOnWrite. It is more of a "concurrent iteration safe wrapper". Implying that it is CopyOnWrite will confuse users.

2. Does the code pass the JSR-166 Unit tests?

3. For speed: Why are the getters synchronized? Could you use atomics?

4. What are you trying to solve, why not use a (some other class)?


I have updated the code, at http://flyingspaniel.wikidot.com/cow to address those comments:

1. To better indicate that these are not true CopyOnWrite, they are now named CISListWrapper and CISMapWrapper, where CIS stands for "Concurrent iteration safe". There are improved comments. Users who are familiar with CopyOnWrite behavior should not be confused or disappointed.

2. In addition to my own unit test (CISWrapperTest.java) I modified some of the JSR 166 unit tests, renaming them, making them more generic, and taking out a few tests (serialization) that didn't really apply (or work for me). This required a few changes in my code - mainly adding equals(), hashCode() and toString() methods that I had neglected.

3. As implemented, the reads do need to be synchronized. (also here). I wasn't concerned about ultimate speed, and this has the tremendous advantage of being more "idiot-proof". The wrapped Collection need not be synchronized. Removing the synchronization might require a user to add a Collections.synchronizedXXX() wrapper around the underlying collection, adding one more layer and eliminating any speed benefit.

I did consider changing the synchronization from the wraper to the wrapee. That is, instead of the current

public synchronized boolean containsKey(Object key) {
return wrapee.containsKey(key);
}

use instead:

public boolean containsKey(Object key) {
synchronized(wrapee) {
return wrapee.containsKey(key);
}
}

This is "left as an exercise for the reader" and might offer modest speed increases if the underlying collection were itself synchronized. If you really want the utmost in speed on simple reads, extend or use something like ConcurrentHashMap, or use one of the "true" CopyOnWriteMaps:

org.apache.mina.util.CopyOnWriteMap



4. So why use the CIS code? The "true" CopyOnWrite wrappers make a copy of your Map, and, for the most part, they copy the data into a basic HashMap or TreeMap. If that behavior is suitable, you are better off using their wrappers. Of course, if that basic behavior is suitable and you are seeking ultimate speed, you could consider rewriting your code to use the "standard" java.util.concurrent.ConcurrentHashMap.

If you have an existing List or Map that is not thread-safe or iteration-safe, and it has special or complex behavior that is not a simple HashMap or TreeMap, then this pure wrapper is useful. For example, you have a class com.mycompany.FunkyMap and it:
  1. validates inputs and throws Exceptions
  2. has special behavior for null keys or values
  3. has unusual rules for sorting
  4. implements some security rules on puts and gets. (and throws Exceptions)
  5. logs stuff
  6. encrypts values and stores them on a database
  7. is some facade or proxy (say, for an ORM)
  8. is buggy and you have set some breakpoints in your IDE or added printlns.
  9. is a singleton

In order to keep this behavior in a pure CopyOnWrite, the copy would have to itself be a FunkyMap, not a TreeMap. One could use reflection, but there goes all the speed bonus. To their credit, the Atlassian Utilities do provide for the possibility of subclasses. It's a small bit of work and may not be suitable for all cases.



Friday, December 3, 2010

CopyOnWrite Wrappers

Java has a CopyOnWriteArrayList is a very useful class in java.util.concurrent which, when iterating, takes "snapshots" of the underlying array, and therefore never throws a ConcurrentModificationException.  It is highly effective and efficient when you need to preclude interference among concurrent threads.  But it only implements two types of collection behavior: ArrayList and a CopyOnWriteArraySet.  Unlike many of the other java.util.Collections goodies, like Collections.unmodifiableList, it is not implemented as a wrapper class, but as an actual class.  So, if you want behavior other than an ArrayList or ArraySet, or you need to protect one of your own special List implementations, you are out of luck.  Nor is there any version for a Map, i.e. there is no CopyOnWriteMap.

So I implemented it.  The files are on my wiki page at http://flyingspaniel.wikidot.com/cow.

COWListWrapper wraps any List, providing "CopyOnWrite" behavior for iteration.
COWMapWrapper wraps a Map, providing "CopyOnWrite" behavior when you iterate using keySet(), entrySet(), or values().

There are some hopefully useful utility classes that they use:

UnmodifiableCollection is an actual class, not a wrapper utility, to wrap an underlying Collection with unmodifiable behavior.  It includes static inner classes, UnmodifiableCollection.List and UnmodifiableCollection.Set, that add List or Set behavior.

UnmodifiableIterator is a class implementing ListIterator that works efficiently with the above classes and prohibits modification during iteration.

Enjoy!

A couple of notes.  For any CopyOnWrite implementation, even the "official" Java ones, you only get the iteration safety if you actually use the iterator( ).  If you use the old fashioned "C-style" integer loop

for (int i=0; i < myList.size(); i++) {
  doSomethingWith( myList.get(i));
}

and a separate thread is modifying myList, all bets are off.  You should use the new style:

for (E element:myList) {
  doSomethingWith(element);
}

The standard Java CopyOnWrite classes know exactly what collection behavior (ArrayList or ArraySet)  to implement and, at construction, they make a safe copy any incoming data, and thereafter are completely divorced from the original collection.  My wrapper classes do not, and cannot, know all possible List or Map behaviors.  They use the passed in (wrapped) List or Map for storage and most of the implementation.  For example, a call to add(Object o) is passed to the underlying object, and it implements the behavior and storage.  In other words, my wrappers do not make a "safe" copy of the data at construction time, and remain linked to the underlying collection.  You should not use the wrapped collection directly - all calls and modifications must be done through the the wrapper.  The "safe" copy of data is done only for iteration.

Recently added:
  UnmodifiableCollection.toString() implemented (useful in the unit test)
  COWWrapperTest  unit tests