Wednesday, December 30, 2009

Answering the Clojure vs. Ruby & Scala Challenge

Lau Jensen has done an interesting comparison of the relatively "new" languages Clojure, Ruby & Scala, comparing them for lines of code and speed on a sample "interview" problem: counting unique words in a directory.  Below is my Java version of the solution.  (You can also find it here)  One could squeeze a few more LOC out of it, but, IMO, that's not the point. The point is to show that a relatively normal and "readable" Java program, using no foreign libraries, does the same job in a reasonably similar number of lines of code to the others.  I think you could find some Apache or similar code to do some of this work.

The original code is verbose "thanks" to all the generics definitions.


Note added after first posting:

You can save considerable time (about 15%) by pre-compiling the regular expression.  Add a line

Pattern splitOnWhitespace = Pattern.compile("[ \t]");

then change line 18, the line.split() code to

for (String s : splitOnWhitespace.split(line))
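
Here is a minimal, self-contained sketch of that change in isolation. The class name SplitSketch and the tokens() helper are made up for illustration and are not part of the program; the empty-string check is an addition of mine, since two separators in a row produce an empty token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the WordCounter program.
class SplitSketch {
    // Compiled once and reused for every line, instead of handing the
    // regex string to String.split() on each call.
    static final Pattern SPLIT_ON_WHITESPACE = Pattern.compile("[ \t]");

    static List<String> tokens(String line) {
        List<String> out = new ArrayList<String>();
        for (String s : SPLIT_ON_WHITESPACE.split(line))
            if (!s.isEmpty())   // consecutive separators yield empty strings
                out.add(s);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("the quick\tbrown  fox"));
    }
}
```

Pattern.split() returns the same array that String.split() would for the same regex, so the loop body does not need to change.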

I've done this below, plus fixed one other issue about the definition of a "word".

I haven't done a full timing comparison, but hopefully Lau will run one shortly.

import java.io.*;
import java.util.*;
import java.util.regex.*;

public class WordCounter {

 public static void main(String[] args) throws IOException {
  long timeStart = System.currentTimeMillis();
  File rootDir = new File("C:/temp/20_newsgroups");
  CountingSet counter = new CountingSet();
  Pattern wordPattern = Pattern.compile("\\w+");

  for (File groupDirectory : rootDir.listFiles())
   if (groupDirectory.isDirectory())
    for (File f : groupDirectory.listFiles()) {
     if (f.isFile()) {
      BufferedReader reader = new BufferedReader(new FileReader(f));
      String line;
      while ((line = reader.readLine()) != null) {
       Matcher matcher = wordPattern.matcher(line);
       while (matcher.find())
        counter.add(matcher.group());
      }
      reader.close();
     }
    }

   PrintWriter pw = new PrintWriter("C:/temp/counts-alphabetical-java.txt");
   for (Map.Entry<String, Integer> me : counter.entrySet())
    pw.println(me.getKey() + " : " + me.getValue());
   pw.close();
   pw = new PrintWriter("C:/temp/counts-decreasing-java.txt");
   spewInverted(counter, pw);
   pw.close();
   System.out.println("Finished in " + 0.001
            * (System.currentTimeMillis() - timeStart) + " seconds");
   }

 static void spewInverted(Map<String, Integer> in, PrintWriter pw) {
  ArrayList<Map.Entry<String, Integer>> list = new ArrayList<Map.Entry<String, Integer>>(
            in.entrySet());
  Collections.sort(list, new Comparator<Map.Entry<String, Integer>>() {
   public int compare(Map.Entry<String, Integer> o1,
Map.Entry<String, Integer> o2) {
    return o2.getValue().compareTo(o1.getValue());
   }
  });

  for (Map.Entry<String, Integer> entry : list)
   pw.println(entry.getKey() + " : " + entry.getValue());
 }

}

class CountingSet extends TreeMap<String, Integer> {
 void add(String s) {
  Integer i = get(s);
  put(s, (i== null) ? Integer.valueOf(1) : Integer.valueOf(i+1));
 }
}


Monday, December 28, 2009

Java Event Handling part 3: Double Dispatch

In the past two blogs, I presented an alternative route to Java event handling. Here's a final twist allowing one to easily listen to multiple types of events. It works best when you are in control of the events, i.e., they are "business logic" under your control. The technique is called "double dispatch".

1) Define a base class for your events. In my example code, you are running an email spam business, so there is a SpamEvent, with several (internal) subclasses.

2) Define an interface, e.g. SpamListener, with calls to listen to every subclass of event. Note that the names of the calls need not be distinctive, since the signature of the argument varies. For example, the name of the call could still be simply handleEvent().

3) Each subclass of the base event should implement a method
public void doubleDispatch(SpamListener listener) {
listener.handleEvent(this);
}

4) You need to implement a genericized listener (as in my previous two posts) for the base class, which calls doubleDispatch, e.g.


public void handleEvent(SpamEvent event) {
event.doubleDispatch(theSpamListener);
}

What happens is that this one listener, registered for all of your events, tells each event to double dispatch. The event knows what type it is, and calls the appropriate method in theSpamListener.
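
The four steps can be sketched in one small file. This is only an illustration under my own assumptions: the subclass names SpamSentEvent and SpamBouncedEvent and the RecordingListener class are hypothetical, and the generic listener from step 4 is replaced by a direct call to doubleDispatch().

```java
import java.util.ArrayList;
import java.util.EventObject;
import java.util.List;

class DoubleDispatchSketch {
    public static void main(String[] args) {
        RecordingListener listener = new RecordingListener();
        SpamEvent[] events = { new SpamSentEvent("demo"), new SpamBouncedEvent("demo") };
        // The static type here is only SpamEvent; each event picks the overload itself.
        for (SpamEvent e : events)
            e.doubleDispatch(listener);
        System.out.println(listener.log);
    }
}

// Step 1: the base class for all events.
abstract class SpamEvent extends EventObject {
    SpamEvent(Object source) { super(source); }
    public abstract void doubleDispatch(SpamListener listener);
}

// Step 3: every subclass implements doubleDispatch() with the same one-line body.
// Inside each subclass "this" has the concrete type, so the matching overload is chosen.
class SpamSentEvent extends SpamEvent {
    SpamSentEvent(Object source) { super(source); }
    @Override public void doubleDispatch(SpamListener listener) { listener.handleEvent(this); }
}

class SpamBouncedEvent extends SpamEvent {
    SpamBouncedEvent(Object source) { super(source); }
    @Override public void doubleDispatch(SpamListener listener) { listener.handleEvent(this); }
}

// Step 2: one overload per event subclass, all with the same name.
interface SpamListener {
    void handleEvent(SpamSentEvent e);
    void handleEvent(SpamBouncedEvent e);
}

// Hypothetical listener that just records which overload was called.
class RecordingListener implements SpamListener {
    final List<String> log = new ArrayList<String>();
    public void handleEvent(SpamSentEvent e) { log.add("sent"); }
    public void handleEvent(SpamBouncedEvent e) { log.add("bounced"); }
}
```

Note that the body of doubleDispatch() cannot be moved up into SpamEvent: there, "this" would have the static type SpamEvent, and no overload accepts that.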

Example source code (with JUnit tests) for this is available at my wiki:

http://flyingspaniel.wikidot.com/java-event-handling

Friday, December 25, 2009

Swing Event Handling Part 2

The last post discussed some possible improvements to Swing event handling, along with a few drawbacks.

The first was that the code presented could only easily fire one type of event. Which is easy to fix. In Java, or any similar language, whenever you want to expand from "one type of thing" to "multiple types", consider a Map. We define Broadcasters, which is essentially a

HashMap<Class<? extends EventObject>, Broadcaster>



The three main methods are straightforward:
public synchronized void addListener(Class eventClass, EventListener l) {
Broadcaster b = classMap.get(eventClass);
if (b != null)
b.addListener(l);
else if (l instanceof GenericListener) {  // automatically handle these
b = new Broadcaster.Generic();
b.addListener(l);
classMap.put(eventClass, b);
}
else throw new RuntimeException("no broadcaster for " + eventClass);
}

public void fireEvent(EventObject event) {
Broadcaster b = classMap.get(event.getClass());
if (b != null)
b.fireEvent(event);
}


public void removeListener(Class eventClass, EventListener l) {
classMap.get(eventClass).removeListener(l);
}



One problem solved. As for having multiple listeners in a class, that's trickier. A class cannot declare that it implements GenericListener for two different event types (say, GenericListener<Foo> and GenericListener<Bar>) because, after erasure, these are the same interface. One solution is to use an inner class. I'm not a big fan of anonymous inner classes, but a lot of programmers (and example code) use these for their listener code anyway, so this is no different from, and no worse than, much current practice, e.g.





addListener(Foo.class, new GenericListener<Foo>() {

@Override
public void handleEvent(Foo event) {
// do something here
}

});
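
To make the inner-class workaround concrete, here is a rough sketch of one class listening to two event types. Everything in it is an assumption for illustration: the GenericListener interface is my guess at the shape used in these posts, FooEvent and BarEvent are invented events, and MiniBroadcasters is a stripped-down stand-in for Broadcasters, not the real class.

```java
import java.util.ArrayList;
import java.util.EventListener;
import java.util.EventObject;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TwoListenersSketch {
    final List<String> seen = new ArrayList<String>();

    // One anonymous inner class per event type; the outer class implements neither interface.
    final GenericListener<FooEvent> fooListener = new GenericListener<FooEvent>() {
        @Override public void handleEvent(FooEvent event) { seen.add("foo"); }
    };
    final GenericListener<BarEvent> barListener = new GenericListener<BarEvent>() {
        @Override public void handleEvent(BarEvent event) { seen.add("bar"); }
    };

    public static void main(String[] args) {
        TwoListenersSketch sketch = new TwoListenersSketch();
        MiniBroadcasters b = new MiniBroadcasters();
        b.addListener(FooEvent.class, sketch.fooListener);
        b.addListener(BarEvent.class, sketch.barListener);
        b.fireEvent(new BarEvent("demo"));
        b.fireEvent(new FooEvent("demo"));
        System.out.println(sketch.seen);
    }
}

// Assumed shape of the generic listener from the earlier posts.
interface GenericListener<E extends EventObject> extends EventListener {
    void handleEvent(E event);
}

// Invented event classes for the demo.
class FooEvent extends EventObject { FooEvent(Object source) { super(source); } }
class BarEvent extends EventObject { BarEvent(Object source) { super(source); } }

// Hypothetical, simplified stand-in for Broadcasters: a map from event class to listeners.
class MiniBroadcasters {
    private final Map<Class<?>, List<GenericListener<?>>> classMap =
            new HashMap<Class<?>, List<GenericListener<?>>>();

    synchronized <E extends EventObject> void addListener(Class<E> eventClass, GenericListener<E> l) {
        List<GenericListener<?>> list = classMap.get(eventClass);
        if (list == null) {
            list = new ArrayList<GenericListener<?>>();
            classMap.put(eventClass, list);
        }
        list.add(l);
    }

    @SuppressWarnings("unchecked")
    synchronized void fireEvent(EventObject event) {
        List<GenericListener<?>> list = classMap.get(event.getClass());
        if (list != null)
            for (GenericListener<?> l : list)
                // Safe in practice: listeners are only stored under their own event class.
                ((GenericListener<EventObject>) l).handleEvent(event);
    }
}
```

Tying the Class argument and the listener's type parameter together in addListener() lets the compiler reject a Foo listener registered under Bar.class, which the raw-typed version cannot do.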

Monday, December 14, 2009

Swing Event Handling - Possible Improvements

Though the concept is certainly fine, I've never been thrilled with Java's implementation of event firing and handling.  You define an event class, which is fine and useful: different events may carry different data, so this makes perfect sense.  But, by convention at least, you also need to define an interface for the EventListener, e.g.

public interface FooListener extends EventListener {
    public void someRandomHelpfulName(FooEvent e);
}

Now, this has some advantages, as we will see later, but it has definite drawbacks.  The first one is that in many shops the name of the method call is something like  "handleFooEvent", which is completely redundant.

This isn't all that big a deal, but Swing's EventListenerList, which many use to hold the listeners, is a lousy design. 

Sunday, December 6, 2009

Commitment Issues

Maik Jablonski, who blogs about db4o, correctly commented that the first version of the ZK - db4o code never called commit().  I have updated the code to do so.

One part is very easy : IOODB adds a method,

     public void commit();

and the two implementations, Db4oDB and NeoDatisDB implement that with calls to commit.

A little bit more work is needed in GenericController.

After the three calls that modify the database (add, update and delete), we add a check.  Here is add:

public void onClick$add() {      
  if (current != null) {
    current = (T) Objects.clone(current);
    getDB().store(current);
    if (shouldCommit(current))
      getDB().commit();
  }
}

I then added a new method, shouldCommit().  For now, it just returns true.  In a "real" app, this might vary.  You could keep a thread-safe counter and commit every N stores, or you could keep a timer and commit based on that.  You could commit only after storing certain "important" classes, e.g. ones listed in a HashSet:

protected boolean shouldCommit(T current) {
  return important.contains(current.getClass());
}

You could have a separate thread working with a timer to trigger commits, in which case shouldCommit() would just return false.
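
As a sketch of the "commit every N" idea, the counter could look something like the following. The class name EveryNthCommit is made up, and it is not wired into GenericController here; shouldCommit(current) would simply delegate to it.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of the ZK - db4o code.
class EveryNthCommit {
    private final int n;
    private final AtomicInteger stores = new AtomicInteger();

    EveryNthCommit(int n) {
        if (n <= 0)
            throw new IllegalArgumentException("n must be positive");
        this.n = n;
    }

    /** Returns true on every n-th call, counting across all threads. */
    boolean shouldCommit() {
        return stores.incrementAndGet() % n == 0;
    }

    public static void main(String[] args) {
        EveryNthCommit policy = new EveryNthCommit(3);
        for (int i = 1; i <= 7; i++)
            System.out.println("store " + i + " -> commit? " + policy.shouldCommit());
    }
}
```

AtomicInteger keeps the count consistent without a synchronized block, so concurrent requests cannot both see the same count.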

In a "really real" app, there'd be a CommitmentStrategy, a CommitmentFactory, and some Spring framework to inject some commitment into the relationship.  :-)

I will upload the new files.

BTW, I'm using Alex Gorbatchev's syntax highlighter.  Thanks Alex!