Why I love Python
Weblog of Kent S Johnson

There are lots of reasons why I like Python. I've collected some of them here.

Amazingly easy collections

Python has excellent built-in support for lists and dictionaries. They are supported in the language syntax itself.

Here is an example. Suppose you need to take a list of triples of strings in the form

[ language code, item id, localized string ]

and build a two level Map from language code => item id => data triple. I'll show this in Java and Python. Both examples include a simple test driver which prints:

This is a test
C'est un essai
no data
Another test
no data

Java version

Here is the Java version. It is 56 lines and 1797 characters. The functional part of the code (excluding main()) is 38 lines.

import java.util.*;

public class Test {

  private Map _map = new HashMap();
  
  public Test(String[][] data) {
      // Convert the input data to a two-level Map from language code => course ID => locale data
      for (int i = 0; i < data.length; i++) {
              String[] itemData = data[i];
              
          String lang = itemData[0];
          
          Map langMap = (Map)_map.get(lang);
          if (langMap == null) {
              langMap = new HashMap();
              _map.put(lang, langMap);
          }
          
          String id = itemData[1];
          langMap.put(id, itemData);
      }
  }
  
  public String lookup(String lang, String id, String defaultData) {
      Map langMap = (Map)_map.get(lang);
      if (langMap == null) return defaultData;
      
      String[] itemData = (String[])langMap.get(id);
      if (itemData == null) return defaultData;
      
      String title = itemData[2];
      if (title == null || title.length() == 0)
          return defaultData;

      return title;
  }
  
  
  public static void main(String[] args) {
      String[][] data = {
          { "en", "123", "This is a test" },
          { "fr", "123", "C'est un essai" },
          { "es", "123", "" },
          { "en", "345", "Another test" }
      };
      
      Test test = new Test(data);
      
      System.out.println(test.lookup("en", "123", "no data"));
      System.out.println(test.lookup("fr", "123", "no data"));
      System.out.println(test.lookup("es", "123", "no data"));
      System.out.println(test.lookup("en", "345", "no data"));
      System.out.println(test.lookup("fr", "345", "no data"));       
  }
}

Python version

And here is the Python version. It is 34 lines and 1036 characters. The functional part of the code (excluding main) is 17 lines. That is roughly 40% shorter than the Java version.

class Test:
  
  def __init__(self, data):
      # Convert the input data to a two-level Map from language code => course ID => locale data
      self._map = {}
      
      for itemData in data:
          lang, id = itemData[:2]
          self._map.setdefault(lang, {})[id] = itemData


  def lookup(self, lang, id, defaultData):
      itemData = self._map.get(lang, {}).get(id)
      if  not itemData:
          return defaultData
      
      return itemData[2] or defaultData
  
  
if __name__ == '__main__':
  data = [
          [ "en", "123", "This is a test" ],
          [ "fr", "123", "C'est un essai" ],
          [ "es", "123", "" ],
          [ "en", "345", "Another test" ]
      ]
      
  test = Test(data);

  print test.lookup("en", "123", "no data")
  print test.lookup("fr", "123", "no data")
  print test.lookup("es", "123", "no data")
  print test.lookup("en", "345", "no data")
  print test.lookup("fr", "345", "no data")

I know which version I prefer!

Easy nesting of data structures

Python makes it very easy to build complex data structures. One place this is handy is with data-driven programming.

For Meccano I wrote a simple walk-by-rule engine. It walks the tree of domain data and applies callbacks at indicated points. The walk is driven from a tree structure that can be quite large and deeply nested. (I have written about the rule engine before.)

Here is a simple example using some of the same techniques. As you read the example, imagine that the list of rules might be hundreds of lines long and deeply nested. Later on I will indicate some other ways the example might be extended.

Assume we are given a dictionary and we are to print its contents in a nested form where the nesting and order of keys in the output is given by the location of dictionary keys in a structure built from nested lists.

The essential idea of the problem is to use a staticly defined nested list to drive the formatting.

Python version

The Python version is short and sweet (14 lines, 349 chars). The data structures are defined easily and the output generation is simple:

data = { 'a':1, 'b':2, 'c':3, 'd':4, 'e':5 }

formatData = [ 'a', [ 'b', [ 'd', 'e' ], 'c' ] ]


def output(format, indent):
  for item in format:
      if type(item) == type([]):
          output(item, indent+2)
      else:
          val = data[item]
          print '%*s%s: %s' % (indent, ' ', item, val )

output(formatData, 0)

The output is:

a: 1
 b: 2
   d: 4
   e: 5
 c: 3

Java version

The Java version is long and ugly. It is 44 lines and 1127 chars - over three times the size of the Python version! The Map is defined in code. The nested list needs extra (Object[]) casts that greatly reduce readability. The code is much more verbose; this is always the case with Java collection code vs Python:

import java.util.HashMap;
import java.util.Map;

public class Structure {

   static Map data = new HashMap();
   
   static {
       data.put("a", new Integer(1));
       data.put("b", new Integer(2));
       data.put("c", new Integer(3));
       data.put("d", new Integer(4));
       data.put("e", new Integer(5));
   }

   static Object[] struct = {
       "a", 
       new Object[]{
           "b", new Object[]{
               "d", "e"
           },
           "c"
       }
   };

   public static void output(Object[] format, int indent) {
       for (int i = 0; i < format.length; i++) {
           Object item = format[i];
           if (item instanceof Object[]) {
               output((Object[])item, indent+2);
           }
           else {
               Integer val = (Integer)data.get(item);
               for (int j=0; j<indent; j++)
                   System.out.print(' ');
               System.out.println(item + ": " + val);
           }
       }
   }
   
   public static void main(String[] argv) {
       output(struct, 0);
   }
}

Reading data from a file

Now suppose you want to put the configuration data in a file that can be changed at runtime and reloaded as needed.

In Python, all you have to do is move the data structure definitions into a separate Python module. Client code imports the data module and reloads it before each use to re-read the source if it has changed.

In Java, typically the configuration data will be put into a text (non-code) format. XML works well for storing hierarchical data so it would be an obvious choice. Now, you have to define an XML format to hold the data and write code to load and parse the data.

So a hidden benefit of Python is that it includes a parser with the runtime. The parser can read text files and create native collections. This is a huge plus for Python!

More Python benefits

Imagine that part of the nested structure is a class or function name. In Python, the configuration module is code so it can define classes and functions that are referenced directly in the data. Or you can import the module that defines the class or function, then put a reference to it in the data.

In Java, you would typically use separate compiled modules to define the classes and reflection to reference them. The use of reflection further complicates the configuration parser or the client code.

What if parts of the data are repeated? In Python, it's no problem! A repeated section of the configuration can be defined separately and included into the main structure by reference. With an XML representation, the data would likely be repeated in multiple locations in the file.

This all works

I'm not just making this up for the sake of argument - these are all techniques I have used in production code. Python data structures are wonderfully flexible and easy to use!

String handling

Sometimes it's the little things that make my day. For example, string handling and tuple unpacking.

It's so easy to work with strings in Python! Here is the code to drop the last character from a string:

s = s[:-1]

In Java that becomes

s = s.substring(0, s.length()-1);

How about splitting a string on the last instance of '/', where at least one / is assumed to be present?

i = s.rindex('/')
prefix = s[:i]
lastElement = s[i+1:]

In Java it is about the same, though much more verbose:

int i = s.lastIndexOf('/');
String prefix = s.substring(0, i);
String lastElement = s.substring(i+1);

In Python, I can easily make this into a function that returns both prefix and lastElement. With tuple-unpacking, the client code can assign the values to two varibles:

def splitPath(path):
    i = path.rindex('/')
    return path[:i], path[i+1:]

# client code
prefix, last = splitPath(aPath)

There is no reasonable equivalent in Java.

I am working on a small database application. It has a utility class that queries the database, returning the result as a list of lists, one for each row. Client code can easily unpack the row lists into individial variables. This is what it looks like:

data = self.dba.queryAsList('select topicid, parentid, groupid from topic')
for topicid, parentid, groupid in data:
  # process the data

Sweet!

Note: I know, the best way to split a file path in Python is to use os.path.split(). The point of the post is to show how easy it is to work with strings; maybe I'll talk about the standard libraries another time. And in fact, the original problem domain was not file paths.

First class functions

Python's first-class functions make life easier in so many ways. A Java programmer will say, "I can do the same thing with interfaces and anonymous inner classes," but it's just not the same.

The most obvious use is for callbacks of any kind. The simplest example is a function that runs a command. In Python, the command can be passed directly into the runner as a function. For example:

# The run function just calls the callback
def run(cmd):
    cmd()

# A simple command
def myCommand():
    print 'myCommand'

# Run the command
run(myCommand)  # prints 'myCommand'

You can pass an instance method to runner using a bound method:

class RunnerClient:
    def useRunner(self):
        run(self.command)
    
    def command(self):
        print 'RunnerClient.command'

RunnerClient().useRunner()  # prints 'RunnerClient.command'

This can easily be extended to pass any number of arguments to the command given to runner, and to return any number of results:

# The run function
def run(cmd, *args):
    return cmd(*args)

# A simple command
def myCommand(x, y):
    print 'x = %d, y = %d' % (x, y)
    return x+10, y+10

# Run the command
z, w = run(myCommand, 10, 20)   # prints 'x = 10, y = 20'
print z, w  # prints '20 30'

Alternatively, the command arguments can be bound to a function of no arguments using lambda:

# The run function
def run(cmd):
    return cmd()

# A simple command
def myCommand(x, y):
    print 'x = %d, y = %d' % (x, y)
    return x+10, y+10

# Run the command
z, w = run(lambda : myCommand(10, 20))

That's easy! What does it look like in Java? Let's start with the simplest case above. We need an interface for the command, a Runner class and an instantiation of the command. I made Command and Runner inner classes to keep everything in one module; normally they would be top-level classes. Here is the code:

public class RunnerClient {
    private interface Command {
        public void execute();
    }

    private static class Runner {
        public static void run(Command cmd) {
            cmd.execute();
        }
    }

    public void useRunner() {
        Command cmd = new Command() {
            public void execute() {
                System.out.println("myCommand");
            }
        };

        Runner.run(cmd);
    }

    public static void main(String[] args) {
        new RunnerClient().useRunner();
    }
}

Yikes! That's nasty. So much noise! There is the visual noise of the extra punctuation and type declarations, and the conceptual noise of class declarations.

If you want the callback to be a member function, it's not too much different. The anonymous inner class can forward to the method of the main class:

public class RunnerClient {
    private interface Command {
        public void execute();
    }

    private static class Runner {
        public static void run(Command cmd) {
            cmd.execute();
        }
    }

    public void useRunner() {
        Command cmd = new Command() {
            public void execute() {
                command();
            }
        };

        Runner.run(cmd);
    }

    public void command() {
        System.out.println("RunnerClient.command()");
    }

    public static void main(String[] args) {
        new RunnerClient().useRunner();
    }
}

To pass arguments to the command...I'm not sure I want to think about that. Probably the simplest solution is to make a specific Command implementation to hold the arguments to the command. You need a different helper class for each command signature. Maybe there is a way to do it with introspection. Any way you cut it, it's going to be painful.

last change 2005-05-12 08:09:36

May 2005
MoTuWeThFrSaSu
       1
2 3 4 5 6 7 8
9101112131415
16171819202122
23242526272829
3031     
Apr
2005
 Jun
2005

Let me count the ways...

XML-Image Letterimage

BlogRoll

© 2005, Kent Johnson