Python Rocks! and other rants 29.4.2004
Weblog of Kent S Johnson

2004-04-29

Why I love Python 2

Python makes it very easy to build complex data structures. One place this is handy is with data-driven programming.

For Meccano I wrote a simple walk-by-rule engine. It walks the tree of domain data and applies callbacks at indicated points. The walk is driven from a tree structure that can be quite large and deeply nested. (I have written about the rule engine before.)

Here is a simple example using some of the same techniques. As you read the example, imagine that the list of rules might be hundreds of lines long and deeply nested. Later on I will indicate some other ways the example might be extended.

Assume we are given a dictionary and we are to print its contents in a nested form where the nesting and order of keys in the output is given by the location of dictionary keys in a structure built from nested lists.

The essential idea of the problem is to use a staticly defined nested list to drive the formatting.

Python version

The Python version is short and sweet (14 lines, 349 chars). The data structures are defined easily and the output generation is simple:

data = { 'a':1, 'b':2, 'c':3, 'd':4, 'e':5 }

formatData = [ 'a', [ 'b', [ 'd', 'e' ], 'c' ] ]


def output(format, indent):
  for item in format:
      if type(item) == type([]):
          output(item, indent+2)
      else:
          val = data[item]
          print '%*s%s: %s' % (indent, ' ', item, val )

output(formatData, 0)

The output is:

a: 1
 b: 2
   d: 4
   e: 5
 c: 3

Java version

The Java version is long and ugly. It is 44 lines and 1127 chars - over three times the size of the Python version! The Map is defined in code. The nested list needs extra (Object[]) casts that greatly reduce readability. The code is much more verbose; this is always the case with Java collection code vs Python:

import java.util.HashMap;
import java.util.Map;

public class Structure {

   static Map data = new HashMap();
   
   static {
       data.put("a", new Integer(1));
       data.put("b", new Integer(2));
       data.put("c", new Integer(3));
       data.put("d", new Integer(4));
       data.put("e", new Integer(5));
   }

   static Object[] struct = {
       "a", 
       new Object[]{
           "b", new Object[]{
               "d", "e"
           },
           "c"
       }
   };

   public static void output(Object[] format, int indent) {
       for (int i = 0; i < format.length; i++) {
           Object item = format[i];
           if (item instanceof Object[]) {
               output((Object[])item, indent+2);
           }
           else {
               Integer val = (Integer)data.get(item);
               for (int j=0; j<indent; j++)
                   System.out.print(' ');
               System.out.println(item + ": " + val);
           }
       }
   }
   
   public static void main(String[] argv) {
       output(struct, 0);
   }
}

Reading data from a file

Now suppose you want to put the configuration data in a file that can be changed at runtime and reloaded as needed.

In Python, all you have to do is move the data structure definitions into a separate Python module. Client code imports the data module and reloads it before each use to re-read the source if it has changed.

In Java, typically the configuration data will be put into a text (non-code) format. XML works well for storing hierarchical data so it would be an obvious choice. Now, you have to define an XML format to hold the data and write code to load and parse the data.

So a hidden benefit of Python is that it includes a parser with the runtime. The parser can read text files and create native collections. This is a huge plus for Python!

More Python benefits

Imagine that part of the nested structure is a class or function name. In Python, the configuration module is code so it can define classes and functions that are referenced directly in the data. Or you can import the module that defines the class or function, then put a reference to it in the data.

In Java, you would typically use separate compiled modules to define the classes and reflection to reference them. The use of reflection further complicates the configuration parser or the client code.

What if parts of the data are repeated? In Python, it's no problem! A repeated section of the configuration can be defined separately and included into the main structure by reference. With an XML representation, the data would likely be repeated in multiple locations in the file.

This all works

I'm not just making this up for the sake of argument - these are all techniques I have used in production code. Python data structures are wonderfully flexible and easy to use!

posted at 20:10:40    #    comment []    trackback []
April 2004
MoTuWeThFrSaSu
    1 2 3 4
5 6 7 8 91011
12131415161718
19202122232425
2627282930  
Mar
2004
 May
2004

Comments about life, the universe and Python, from the imagination of Kent S Johnson.

XML-Image Letterimage

BlogRoll

© 2004, Kent Johnson