Wednesday, December 30, 2009

Answering the Clojure vs. Ruby & Scala Challenge

Lau Jensen has done an interesting comparison of the relatively "new" languages Clojure, Ruby & Scala, comparing them for lines of code and speed on a sample "interview" problem: counting unique words in a directory. Below is my Java version of the solution. (You can also find it here) One could squeeze a few more LOC out of it, but, IMO, that's not the point. The point is to show that a relatively normal and "readable" Java program, using no foreign libraries, does the same job in a reasonably similar number of lines of code to the others. I think you could find some Apache or similar code to do some of this work.

The original code is verbose "thanks" to all the generics definitions.


Note added after first posting:

You can save considerable time (about 15%) by pre-compiling the regular expression.  Add a line

Pattern splitOnWhitespace = Pattern.compile("[ \t]");

then change line 18, the line.split() code to

for (String s : splitOnWhitespace.split(line))
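
As a small, self-contained illustration of that change (the class and method names here are made up for the sketch and are not part of the program below), the two ways of splitting give the same tokens; the difference is only that the precompiled Pattern is built once instead of on every call:

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class SplitSketch {

    // Compiled once and reused for every line.
    static final Pattern SPLIT_ON_WHITESPACE = Pattern.compile("[ \t]");

    // Reuses the precompiled pattern.
    static String[] splitPrecompiled(String line) {
        return SPLIT_ON_WHITESPACE.split(line);
    }

    // Hands the regex string to String.split on every call.
    static String[] splitEachTime(String line) {
        return line.split("[ \t]");
    }

    public static void main(String[] args) {
        String line = "the quick\tbrown fox";
        System.out.println(Arrays.toString(splitPrecompiled(line)));
        System.out.println(Arrays.toString(splitEachTime(line)));
    }
}
```

How much time this saves will depend on the JVM and the input, so treat the 15% above as a measurement on my machine rather than a general rule.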

I've done this below, plus fixed one other issue about the definition of a "word".

I haven't done a full comparison timing, but hopefully Lau will run one shortly.

import java.io.*;
import java.util.*;
import java.util.regex.*;

public class WordCounter {

 public static void main(String[] args) throws IOException {
  long timeStart = System.currentTimeMillis();
  File rootDir = new File("C:/temp/20_newsgroups");
  CountingSet counter = new CountingSet();
  Pattern wordPattern = Pattern.compile("\\w+");

  for (File groupDirectory : rootDir.listFiles())
   if (groupDirectory.isDirectory())
    for (File f : groupDirectory.listFiles()) {
     if (f.isFile()) {
      BufferedReader reader = new BufferedReader(new FileReader(f));
      String line;
      while ((line = reader.readLine()) != null) {
       Matcher matcher = wordPattern.matcher(line);
       while (matcher.find())
        counter.add(matcher.group());
      }
      reader.close();
     }
    }

  PrintWriter pw = new PrintWriter("C:/temp/counts-alphabetical-java.txt");
  for (Map.Entry<String, Integer> me : counter.entrySet())
   pw.println(me.getKey() + " : " + me.getValue());
  pw.close();
  pw = new PrintWriter("C:/temp/counts-decreasing-java.txt");
  spewInverted(counter, pw);
  pw.close();
  System.out.println("Finished in " + 0.001
    * (System.currentTimeMillis() - timeStart) + " seconds");
 }

 static void spewInverted(Map<String, Integer> in, PrintWriter pw) {
  ArrayList<Map.Entry<String, Integer>> list =
    new ArrayList<Map.Entry<String, Integer>>(in.entrySet());
  Collections.sort(list, new Comparator<Map.Entry<String, Integer>>() {
   public int compare(Map.Entry<String, Integer> o1,
     Map.Entry<String, Integer> o2) {
    return o2.getValue().compareTo(o1.getValue());
   }
  });

  for (Map.Entry<String, Integer> entry : list)
   pw.println(entry.getKey() + " : " + entry.getValue());
 }

}

class CountingSet extends TreeMap<String, Integer> {
 void add(String s) {
  Integer i = get(s);
  put(s, (i== null) ? Integer.valueOf(1) : Integer.valueOf(i+1));
 }
}


Monday, December 28, 2009

Java Event Handling part 3: Double Dispatch

In the past two blogs, I presented an alternative route to Java event handling. Here's a final twist allowing one to easily listen to multiple types of events. It works best when you are in control of the events, i.e., they are "business logic" under your control. The technique is called "double dispatch".

1) Define a base class for your events. In my example code, you are running an email spam business, so there is a SpamEvent, with several (internal) subclasses.

2) Define an interface, e.g. SpamListener, with calls to listen to every subclass of event. Note that the names of the calls need not be distinctive, since the signature of the argument varies. For example, the name of the call could still be simply handleEvent().

3) Each subclass of the base event should implement a method
    public void doubleDispatch(SpamListener listener) {
        listener.handleEvent(this);
    }

4) You need to implement a genericized listener (as in my previous two posts) for the base class, which calls doubleDispatch, e.g.


    public void handleEvent(SpamEvent event) {
        event.doubleDispatch(theSpamListener);
    }

What happens is that this one listener, registered for all of your events, tells each event to double dispatch. The event knows what type it is, and calls the appropriate method in theSpamListener.
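
Putting the four steps together, here is a minimal sketch of the whole round trip. Only SpamEvent, SpamListener, doubleDispatch, handleEvent and theSpamListener come from the description above; the two event subclasses, RecordingSpamListener and SpamDispatcher are names I made up for illustration, and the real example code on the wiki may be organized differently.

```java
import java.util.ArrayList;
import java.util.List;

// Step 1: base class for all events.
abstract class SpamEvent {
    public abstract void doubleDispatch(SpamListener listener);
}

// Hypothetical subclass: a spam email was sent.
class SpamSentEvent extends SpamEvent {
    @Override
    public void doubleDispatch(SpamListener listener) {
        // "this" has static type SpamSentEvent, so the matching overload is chosen.
        listener.handleEvent(this);
    }
}

// Hypothetical subclass: a spam email bounced.
class SpamBouncedEvent extends SpamEvent {
    @Override
    public void doubleDispatch(SpamListener listener) {
        listener.handleEvent(this);
    }
}

// Step 2: one overload per event subclass, all with the same name.
interface SpamListener {
    void handleEvent(SpamSentEvent event);
    void handleEvent(SpamBouncedEvent event);
}

// A listener that just records which overload was called.
class RecordingSpamListener implements SpamListener {
    final List<String> log = new ArrayList<>();

    @Override
    public void handleEvent(SpamSentEvent event) { log.add("sent"); }

    @Override
    public void handleEvent(SpamBouncedEvent event) { log.add("bounced"); }
}

// Step 4: the single entry point that only knows about the base class.
class SpamDispatcher {
    private final SpamListener theSpamListener;

    SpamDispatcher(SpamListener theSpamListener) {
        this.theSpamListener = theSpamListener;
    }

    public void handleEvent(SpamEvent event) {
        event.doubleDispatch(theSpamListener);
    }
}
```

Passing a SpamBouncedEvent and then a SpamSentEvent to SpamDispatcher.handleEvent() leaves the recording listener's log as "bounced", "sent", with no instanceof checks anywhere.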

Example source code (with JUnit tests) for this is available at my wiki:

http://flyingspaniel.wikidot.com/java-event-handling

Friday, December 25, 2009

Swing Event Handling Part 2

The last post discussed some possible improvements to Swing event handling, which came with a few drawbacks.

The first was that the code presented could only easily fire one type of event, which is easy to fix. In Java, or any similar language, whenever you want to expand from "one type of thing" to "multiple types", consider a Map. We define Broadcasters, which is essentially a

HashMap<Class<? extends EventObject>, Broadcaster>



The three main methods are straightforward:
public synchronized void addListener(Class eventClass, EventListener l) {
    Broadcaster b = classMap.get(eventClass);
    if (b != null)
        b.addListener(l);
    else if (l instanceof GenericListener) {  // automatically handle these
        b = new Broadcaster.Generic();
        b.addListener(l);
        classMap.put(eventClass, b);
    }
    else throw new RuntimeException("no broadcaster for " + eventClass);
}

public void fireEvent(EventObject event) {
    Broadcaster b = classMap.get(event.getClass());
    if (b != null)
        b.fireEvent(event);
}

public void removeListener(Class eventClass, EventListener l) {
    classMap.get(eventClass).removeListener(l);
}
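
Since Broadcaster itself isn't shown in this post, here is a stripped-down, self-contained sketch of the same "map keyed by event class" idea. EventRouter, SimpleListener, FooEvent and BarEvent are hypothetical stand-ins, not the classes from my code, and the sketch keeps a plain list of listeners per class instead of a Broadcaster.

```java
import java.util.ArrayList;
import java.util.EventObject;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical generic listener, standing in for GenericListener.
interface SimpleListener<E extends EventObject> {
    void handleEvent(E event);
}

// Two hypothetical event types.
class FooEvent extends EventObject {
    FooEvent(Object source) { super(source); }
}

class BarEvent extends EventObject {
    BarEvent(Object source) { super(source); }
}

// Routes each event to the listeners registered for its exact class.
class EventRouter {
    private final Map<Class<? extends EventObject>, List<SimpleListener<?>>> classMap =
            new HashMap<>();

    public synchronized <E extends EventObject> void addListener(
            Class<E> eventClass, SimpleListener<E> l) {
        classMap.computeIfAbsent(eventClass, k -> new ArrayList<>()).add(l);
    }

    @SuppressWarnings("unchecked")
    public synchronized void fireEvent(EventObject event) {
        List<SimpleListener<?>> listeners = classMap.get(event.getClass());
        if (listeners == null)
            return; // nobody registered for this event class
        for (SimpleListener<?> l : listeners)
            // Safe in practice: addListener only stores listeners under their own event class.
            ((SimpleListener<EventObject>) l).handleEvent(event);
    }
}
```

Note that, as in the methods above, the lookup uses event.getClass(), so a listener registered for a class does not see events of its subclasses.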



One problem solved. As for having multiple listeners in a class, that's trickier. A class cannot declare that it implements GenericListener for two different event types because, after erasure, these are the same interface. One solution is to use an inner class. I'm not a big fan of anonymous inner classes, but a lot of programmers (and example code) use these for their listener code anyway, so this is no different from, and no worse than, much current practice, e.g.





addListener(Foo.class, new GenericListener<Foo>() {

    @Override
    public void handleEvent(Foo event) {
        // do something here
    }

});
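
To make the erasure problem and the inner-class workaround concrete, here is a small sketch. The GenericListener declared here is a simplified, hypothetical version (my real one is tied to EventObject and EventListener), and Foo, Bar and Widget are placeholder names.

```java
// Simplified, hypothetical version of the generic listener interface.
interface GenericListener<E> {
    void handleEvent(E event);
}

class Foo {}
class Bar {}

// This does NOT compile: the same generic interface cannot be implemented
// twice with different type arguments.
// class Both implements GenericListener<Foo>, GenericListener<Bar> { ... }

// Workaround: one anonymous inner class per event type, each with access
// to the enclosing object's state.
class Widget {
    int foos;
    int bars;

    final GenericListener<Foo> fooListener = new GenericListener<Foo>() {
        @Override
        public void handleEvent(Foo event) {
            foos++;
        }
    };

    final GenericListener<Bar> barListener = new GenericListener<Bar>() {
        @Override
        public void handleEvent(Bar event) {
            bars++;
        }
    };
}
```

Each listener field can then be registered separately, one per event class.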

Monday, December 14, 2009

Swing Event Handling - Possible Improvements

Though the concept is certainly fine, I've never been thrilled with Java's implementation of event firing and handling. You define an event class, which is fine and useful: different events may carry different data, so this makes perfect sense. But, by convention at least, you also need to define an interface for the EventListener, e.g.

public interface FooListener extends EventListener {
    public void someRandomHelpfulName(FooEvent e);
}

Now, this has some advantages, as we will see later, but it has definite drawbacks.  The first one is that in many shops the name of the method call is something like  "handleFooEvent", which is completely redundant.

This isn't all that big a deal, but Swing's EventListenerList, which many use to hold the listeners, is a lousy design. 

Sunday, December 6, 2009

Commitment Issues

Maik Jablonski, who blogs about db4o, correctly commented that the first version of the ZK - db4o code had no calls to commit(). I have updated the code to add them.

One part is very easy : IOODB adds a method,

     public void commit();

and the two implementations, Db4oDB and NeoDatisDB implement that with calls to commit.

A little bit more work is needed in GenericController.

After the three calls that modify the database (add, update and delete), we add a check.  Here is add:

public void onClick$add() {      
  if (current != null) {
    current = (T) Objects.clone(current);
    getDB().store(current);
    if (shouldCommit(current))
      getDB().commit();
  }
}

I then added a new method, shouldCommit(). For now, it just returns true. In a "real" app, this might vary. You could keep a thread-safe counter and commit every N stores, or keep a timer and commit based on that. You could also commit only after certain "important" classes, e.g. ones stored in a HashSet:

protected boolean shouldCommit(T current) {
  return important.contains(current.getClass());
}

You could have a separate thread working with a timer to trigger commits, in which case shouldCommit() would just return false.
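
As one possible take on the "commit every N" idea, here is a minimal sketch. EveryNCommitPolicy is a hypothetical helper, not something in the uploaded files; GenericController.shouldCommit() could delegate to an instance of it.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: answers "true" on every n-th call, thread-safely.
class EveryNCommitPolicy {
    private final int n;
    private final AtomicInteger pending = new AtomicInteger();

    EveryNCommitPolicy(int n) {
        if (n <= 0)
            throw new IllegalArgumentException("n must be positive");
        this.n = n;
    }

    // The argument is ignored here; it is kept so the signature lines up
    // with shouldCommit(T current) in the controller.
    boolean shouldCommit(Object current) {
        // Counter cycles through 1, 2, ..., n-1, 0, so it never overflows.
        return pending.updateAndGet(c -> (c + 1) % n) == 0;
    }
}
```

With n = 3, the first two calls return false and the third returns true, then the cycle repeats. Anything stored since the last commit is still uncommitted if the app dies in between, so N is a trade-off between speed and how much work you are willing to lose.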

In a "really real" app, there'd be a CommitmentStrategy, a CommitmentFactory, and some Spring framework to inject some commitment into the relationship.  :-)

I will upload the new files.

BTW, I'm using Alex Gorbatchev's syntax highlighter.  Thanks Alex!

Saturday, November 28, 2009

Using ZK with an OODB such as db4o

A recent job opening I applied for asked for GWT or ZK experience.  I had played very briefly with GWT when it first appeared, but had no idea what ZK was.  So I looked at it.  It is another AJAX / Rich Internet Application framework, similar to GWT in that you still write most of your underlying code in Java.  The one big difference from GWT is that ZK is server-centric, running most of the application on the server, whereas GWT is client-centric, running most of the app on the client.  Naturally, both sides tout themselves as "better".

One of the ZK demos is a simple TODO list, where the entries get stored into a relational database (they use HSQLDB). You can download the ZK demo project file here. Look inside to get the .war file and the source code. They define a simple java-bean Event.java, plus a couple of simple, relatively straightforward XML files. I chose to work from their version with a separate java file & class EventController.java, instead of binding all this logic into an XML page. EventController.java looks reasonable: there are a couple of fields required to interact with the specific GUI, otherwise it's very clean. But then you look at the DAO, EventDAO.java. It's big.

The "problem" is in the persistence layer, binding the TodoEvents to the database. Now, there's nothing wrong with EventDAO.java. It's just a real shame that for such a nice, lightweight, concise framework like ZK, the DAO code requires over 150 lines of boilerplate Java code to handle the JDBC calls. For a single object. For a more "real" app, with a few kinds of persistent objects, this code would grow accordingly. At some point, one would consider using an ORM like Hibernate. But that has its own "activation energy" hurdle. While reading the ZK forums, I came across a post that suggested the use of db4o. Having used db4o a little myself, this idea really rang true. I didn't see where anybody was doing this yet (see Jease for a project just starting that seems to be on this track).

Will using an OODB make your ZK code simpler?  Here's my first pass. And the answer is a resounding YES.  I am neither a ZK nor a db4o expert, so there are probably some improvements that can be made.

 TodoEvent.java has no change.

Their EventController is replaced with GenericController.java.  It uses generics to make it more expandable for future classes, though this may be getting too cute. Otherwise, it is similar to the original code.  It has the same slightly "kludgy" fields from the original, so as to link back to the web GUI

   private T current;
   Listbox box;


One difference is that the add() method must clone the object. This is because OODBs (at least the two I'm using) do not differentiate between an update and an insert; they just use "store". In technical terms, their definition of object identity differs from that of an RDB. Fortunately, ZK provides a utility class, Objects, to do most of the work.

   public void onClick$add() {     
      if (current != null) {
         current = (T) Objects.clone(current);
         getDB().store(current);
      }
   }
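
To see why the clone matters, here is a toy model of "store" semantics. This is NOT how db4o or NeoDatis work internally; ToyObjectStore and Todo are hypothetical classes that only mimic the one property described above (storing the same object again is an update, storing a different object is an insert), and plain Cloneable stands in for ZK's Objects.clone.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Toy stand-in for an OODB: objects are tracked by reference identity.
class ToyObjectStore {
    private final Set<Object> stored =
            Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());

    // Storing an already-known reference changes nothing here (an "update");
    // storing a new reference adds an entry (an "insert").
    void store(Object o) {
        stored.add(o);
    }

    int size() {
        return stored.size();
    }
}

// Hypothetical bean, loosely modelled on a to-do entry.
class Todo implements Cloneable {
    String name;

    Todo(String name) {
        this.name = name;
    }

    @Override
    public Todo clone() {
        try {
            return (Todo) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: Todo is Cloneable
        }
    }
}
```

Storing the same Todo twice leaves one entry in the toy store, while storing todo.clone() creates a second one, which is the behavior "add" needs.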



Just like the original EventController delegated persistence to an EventDAO, GenericController delegates to a "DAO", a simple interface IOODB.java.  I made two implementations, one for db4o, Db4oDB.java and another for NeoDatis, NeoDatisDB.java.  For this demo, the DB to use is defined in GenericController.DBHolder, but in a real life app this would be determined somewhere else.

The final java class is TodoEventController.java, which is a near-trivial extension of GenericController. It adds one method, getAllEvents(). This is because the index.zul file refers to "win$composer.allEvents". ZK uses reflection to call back to the object, so we need to implement a method with a name it can find. Obviously, one could edit the index.zul file instead. The only change in index.zul is to refer to our new controller class:

<window id="win" title="To do list" width="640px" border="normal" apply="com.flyingspaniel.zk.oodb.TodoEventController">

In conclusion, from this simple foundation, one can support most basic object types, on two different OODBs.  It's far easier than writing specific JDBC code, and less heavyweight than configuring Hibernate.  Give it a try!  And please let me know what you think and how this can be improved.  All the source code can be found at this Wikisite.

Thursday, November 26, 2009

Getting Started

This is my new blog on "good" software design.  One thing I've learned is that there are always at least two good ways to do something.  Hopefully, this blog will help you narrow them down.

In reviewing what links and other blogs I should list here, I ran across a great quote from one of my strongly-opinionated heroes, Allen Holub. It's about hiring, and matches my recent experience, where companies want somebody who knows some exact "hot" technology and accept no substitutes.
When you select for experience with specific technologies, for example, you reject extremely qualified people who could learn the required skills in a few days, while hiring incompetent programmers who have been using the technology incorrectly for years.