XPath and dom4j

One of my favorite features of dom4j is its integrated XPath support. If you are using dom4j for your domain model, the XPath support gives you a built-in query engine for your model.

XPath support is defined in the Node interface. (dom4j Elements extend Node, so every XML element has XPath support.) The three methods I use most are valueOf(), selectNodes() and selectSingleNode().

valueOf() gives you the text value resulting from an XPath expression. For example, to find the value of the type attribute of a node, evaluate node.valueOf("@type"). The XPath expression can be as complex as you like. For example, you can find the type of the second Item child by evaluating node.valueOf("Item[2]/@type"). (XPath positions count from 1, so Item[2] really is the second Item.)
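To make those two calls concrete, here is a minimal sketch. It assumes dom4j and its Jaxen XPath dependency are on the classpath; the Order document and the class name ValueOfExample are invented for illustration.

```java
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

public class ValueOfExample {
    // Made-up sample document, invented for this illustration.
    static final String XML =
        "<Order type='rush'>"
      + "<Item type='book'/>"
      + "<Item type='pen'/>"
      + "</Order>";

    // Parse the sample and return its root element (an Element is also a Node).
    static Element root() {
        try {
            return DocumentHelper.parseText(XML).getRootElement();
        } catch (DocumentException e) {
            throw new IllegalStateException("sample XML should always parse", e);
        }
    }

    // Value of the type attribute on the context node itself.
    static String orderType() {
        return root().valueOf("@type");
    }

    // Value of the type attribute on the second Item child.
    static String secondItemType() {
        return root().valueOf("Item[2]/@type");
    }

    public static void main(String[] args) {
        System.out.println(orderType());      // prints "rush"
        System.out.println(secondItemType()); // prints "pen"
    }
}
```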

That example gives a taste of the power of XPath. It is really amazing how much you can pack into one line of code. (You can learn more about XPath with this tutorial.)

Use selectNodes() or selectSingleNode() when you want the actual nodes for further processing. Both methods take an XPath expression: selectNodes() returns a list of every node matching the expression, and selectSingleNode() returns only the first match.
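A short sketch of both methods follows, again assuming dom4j plus Jaxen on the classpath and using a made-up Order document. The list is declared as List<?> so that the code should compile whether selectNodes() returns a raw List (older dom4j releases) or a List<Node> (dom4j 2.x).

```java
import java.util.List;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.Node;

public class SelectNodesExample {
    // Made-up sample document, invented for this illustration.
    static final String XML =
        "<Order type='rush'>"
      + "<Item type='book'/>"
      + "<Item type='pen'/>"
      + "</Order>";

    static Element root() {
        try {
            return DocumentHelper.parseText(XML).getRootElement();
        } catch (DocumentException e) {
            throw new IllegalStateException("sample XML should always parse", e);
        }
    }

    // selectNodes() returns every node that matches the expression.
    static int countItems() {
        List<?> items = root().selectNodes("Item");
        return items.size();
    }

    // selectSingleNode() returns only the first match, which can then be
    // queried further with the same XPath methods.
    static String firstItemType() {
        Node first = root().selectSingleNode("Item");
        return first.valueOf("@type");
    }

    public static void main(String[] args) {
        System.out.println(countItems());    // prints 2
        System.out.println(firstItemType()); // prints "book"
    }
}
```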

Again, you can drill down as deeply as you want, and qualify the names at each level with predicates such as [@type="question"]. Here is an expression I used to find all the Option elements that were children of input questions in my model:

/LearningObject/LearningPoint/Slide[@type="question"]/Question[@typecode="input"]/Options/Option
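To see what that expression selects, here is a sketch that runs it against a small stand-in document. The element and attribute names come from the expression itself, but the document content (the slides, the second question with typecode "choice", the option texts) is invented, and dom4j with Jaxen is assumed to be on the classpath.

```java
import java.util.ArrayList;
import java.util.List;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Node;

public class InputOptionsExample {
    // Invented stand-in for the real model: one content slide, one input
    // question with two options, and one non-input question with one option.
    static final String XML =
        "<LearningObject><LearningPoint>"
      + "<Slide type='content'><Text>Intro</Text></Slide>"
      + "<Slide type='question'><Question typecode='input'>"
      + "<Options><Option>4</Option><Option>four</Option></Options>"
      + "</Question></Slide>"
      + "<Slide type='question'><Question typecode='choice'>"
      + "<Options><Option>Paris</Option></Options>"
      + "</Question></Slide>"
      + "</LearningPoint></LearningObject>";

    static final String INPUT_OPTIONS =
        "/LearningObject/LearningPoint/Slide[@type=\"question\"]"
      + "/Question[@typecode=\"input\"]/Options/Option";

    // Returns the text of every Option that belongs to an input question.
    static List<String> inputOptionTexts() {
        Document doc;
        try {
            doc = DocumentHelper.parseText(XML);
        } catch (DocumentException e) {
            throw new IllegalStateException("sample XML should always parse", e);
        }
        List<String> texts = new ArrayList<>();
        for (Object match : doc.selectNodes(INPUT_OPTIONS)) {
            texts.add(((Node) match).getText());
        }
        return texts;
    }

    public static void main(String[] args) {
        System.out.println(inputOptionTexts()); // prints [4, four]
    }
}
```

The Option under the "choice" question is skipped because the predicate on Question filters it out before the path descends into Options.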

These three XPath methods are so simple that I use them for almost all access to the data, even when more specialized methods are available. For example, I write node.valueOf("text()") instead of node.getText(). This way I only have to remember three accessor names, and it is very easy to change the access: if I later need node.valueOf("Text/text()"), I just edit the XPath expression.
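The sketch below shows that kind of change on two made-up Note documents, assuming dom4j and Jaxen are on the classpath. Moving the text into a nested Text element only requires a different expression string; the accessor call stays the same.

```java
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

public class TextAccessExample {
    // Two invented layouts for the same data: text directly in the element,
    // and text wrapped in a child Text element.
    static final String FLAT = "<Note>hello</Note>";
    static final String NESTED = "<Note><Text>hello</Text></Note>";

    static Element parseRoot(String xml) {
        try {
            return DocumentHelper.parseText(xml).getRootElement();
        } catch (DocumentException e) {
            throw new IllegalStateException("sample XML should always parse", e);
        }
    }

    // The single accessor used for both layouts; only the expression differs.
    static String read(String xml, String xpath) {
        return parseRoot(xml).valueOf(xpath);
    }

    public static void main(String[] args) {
        System.out.println(read(FLAT, "text()"));        // prints "hello"
        System.out.println(parseRoot(FLAT).getText());   // prints "hello"
        System.out.println(read(NESTED, "Text/text()")); // prints "hello"
    }
}
```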

OK, enough! XPath rocks! dom4j rocks! Check it out!