Wednesday, July 23, 2014

Groovy-Like XML for Java. Simple and Sane.

Parsing and navigating through XML in Java is a pain.  The org.w3c.dom.* classes are numerous, messy, and "old style", with no Collections, no Generics, no varargs.  XPath helps a lot with the navigation part, but is still a bit complex and messy.
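
For a sense of what that looks like, here is a minimal sketch that uses only the standard JDK classes (javax.xml.parsers and javax.xml.xpath) to pull a latitude and longitude out of a geocoder-style document. The class name PlainJavaXml, the SAMPLE string, and its values are made up for illustration; the element names are assumed to match the geocoder response used later in this post.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class PlainJavaXml {
    // Hypothetical sample shaped like a geocoder response; not real data.
    static final String SAMPLE =
        "<GeocodeResponse><result><geometry><location>"
        + "<lat>37.5</lat><lng>-122.25</lng>"
        + "</location></geometry></result></GeocodeResponse>";

    // W3C XPath indices are 1-based: result[1] is the first <result>.
    private static final String LOCATION =
        "/GeocodeResponse/result[1]/geometry/location";

    static double[] latLng(String xml) {
        try {
            // Step 1: build a DOM tree from the XML text.
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

            // Step 2: navigate with XPath; the result is an Object to cast.
            XPath xpath = XPathFactory.newInstance().newXPath();
            double lat = (Double) xpath.evaluate(
                LOCATION + "/lat", doc, XPathConstants.NUMBER);
            double lng = (Double) xpath.evaluate(
                LOCATION + "/lng", doc, XPathConstants.NUMBER);
            return new double[] { lat, lng };
        } catch (Exception e) {
            // Parsing and XPath evaluation both throw checked exceptions.
            throw new IllegalStateException("Could not read location", e);
        }
    }

    public static void main(String[] args) {
        double[] latLng = latLng(SAMPLE);
        System.out.println(latLng[0] + ", " + latLng[1]);
    }
}
```

Even with XPath doing the navigation, there are two factories, an untyped result to cast, and checked exceptions to deal with before any data comes out.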

Groovy, with XmlParser and XmlSlurper and their associated classes, makes this amazingly, dramatically easier.  Simple and Sane.  For example, Chapter 2 of Making Java Groovy parses the Google geocoder XML data to retrieve a latitude and longitude.  Below are the essentials of the code.  The full code, which is not much longer, is on GitHub here.

String url = 'http://maps.google.com/maps/api/geocode/xml?' + somemore...
def response = new XmlSlurper().parse(url)
stadium.latitude = response.result[0].geometry.location.lat.toDouble()
stadium.longitude = response.result[0].geometry.location.lng.toDouble()
The parsing is trivial, and navigating to the data (location.lat or location.lng) is also simple, following the familiar dot notation.

Can you do anything like this in pure Java?  Not quite.  So I wrote a small library, xen, to mimic much of how Groovy does things.  The full Geocoder.java code is here, snippet below:


String url = BASE + URLEncoder.encode(address, "UTF-8");
Xen response = new XenParser().parse(url);

// Option 1: XPath slash style, 1-based indices
latLng[0] = response.toDouble("result[1]/geometry/location/lat");
latLng[1] = response.one("result[1]/geometry/location/lng").toDouble();

// Option 2: Groovy dot style, 0-based indices
latLng[0] = response.toDouble(".result[0].geometry.location.lat");
latLng[1] = response.one(".result[0].geometry.location.lng").toDouble();
Pretty close, eh?

The main difference is that we can't use the dot notation directly on an object, but we can use a very similar slash notation based upon XPath syntax.  If you use XPath notation, one major difference from Groovy is that array indices in W3C XPath are 1-based, not 0-based, so the first result element is result[1], not result[0].  However, if the "path" starts with a . followed by a letter, as in Option 2, the path is treated as Groovy-style "dot notation", with 0-based indices.
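
To make the index difference concrete, here is a minimal sketch of how a dot-style path could be translated into the equivalent XPath.  This is not xen's actual implementation; the class DotPath and the method toXPath are hypothetical names, and the sketch assumes element names contain no dots.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DotPath {
    // Matches a numeric index such as [0] or [12].
    private static final Pattern INDEX = Pattern.compile("\\[(\\d+)\\]");

    // Hypothetical helper: converts ".a[0].b" to "a[1]/b".
    // Anything that does not start with '.' plus a letter is returned
    // unchanged, so XPath forms such as "./a" or "../a" pass through.
    static String toXPath(String path) {
        boolean dotStyle = path.length() > 1
            && path.charAt(0) == '.'
            && Character.isLetter(path.charAt(1));
        if (!dotStyle) {
            return path;
        }

        // Shift every 0-based index up by one to get a 1-based index.
        Matcher m = INDEX.matcher(path.substring(1));
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            int zeroBased = Integer.parseInt(m.group(1));
            m.appendReplacement(sb, "[" + (zeroBased + 1) + "]");
        }
        m.appendTail(sb);

        // Assumption: dots only ever separate steps, so each becomes '/'.
        return sb.toString().replace('.', '/');
    }

    public static void main(String[] args) {
        System.out.println(toXPath(".result[0].geometry.location.lat"));
        System.out.println(toXPath("result[1]/geometry/location/lng"));
    }
}
```

Requiring a letter after the leading dot is what keeps the two styles apart: a legal XPath can begin with "./" or "../", but not with a dot immediately followed by a name.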

So, if you want to greatly simplify parsing and navigating through XML, and/or you love how Groovy does things, please check out my (very beta!) xen library, which allows you to do it in Java.  Currently it is compiled against Java 6, but I think it should be fine in Java 5.  So if you need to support some Android device, or can't or don't want to integrate Groovy into your Java projects, this could be very useful.

Xen library
JavaDocs
README

The README discusses various design decisions, particularly how my design converged upon many aspects of the Groovy design.  More discussion will appear in later posts.  And be warned: this is still a very early version, 0.0.2, so there are probably bugs, some mistakes, and upcoming API changes.


Node for Java Programmers

At a recent BayNode Meetup, I gave a 15-minute presentation on "Node for Java Programmers".  It was mainly notes on common things I did wrong coming from the Java world, and ideas or idioms to deal with them.

I got some good feedback and positive responses, and recently edited the presentation.

Here is a link to it (on Google Docs).