Python Rocks! and other rants - Agile Development

Weblog of Kent S Johnson

2005-12-03

How I write code

I tend to design from the bottom up - not exclusively, but in general I make small parts and combine them to make larger parts until I have something that does what I want. I refactor constantly as my understanding of a problem and the solution increases. This way I always have complete working code for some section of the problem. I rarely use stubs of any kind.

To start I will take some small section of the problem and think about what kind of data and operations on the data I need to solve it. For a very simple problem I might just write some functions to operate on the data. As I expand into larger parts of the problem I might find that several functions are operating on the same data and decide that they belong in a class. Or it might be clear from the start that I want to create a class around the data.
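Here is a minimal sketch of that progression. The word-counting problem and all of the names in it are made up for illustration; they are not from any real project.

```python
# Hypothetical example: counting words, first with functions, then with a class.

# Stage 1: plain functions that all operate on the same dict of counts.
def add_word(counts, word):
    """Record one occurrence of word in the counts dict."""
    counts[word] = counts.get(word, 0) + 1


def most_common(counts):
    """Return the word with the highest count, or None if counts is empty."""
    if not counts:
        return None
    return max(counts, key=counts.get)


# Stage 2: both functions take the same data, so they move into a class
# that owns that data.
class WordCounter:
    def __init__(self):
        self.counts = {}

    def add(self, word):
        self.counts[word] = self.counts.get(word, 0) + 1

    def most_common(self):
        if not self.counts:
            return None
        return max(self.counts, key=self.counts.get)
```

The class does nothing the functions could not do; it just puts the data and the operations on it in one place.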

When one chunk is done to my satisfaction, I take on another, and another. I am creating building blocks, then using the building blocks to create larger blocks. Some of the blocks are classes, others are functions.

I write unit tests as I go, sometimes test-first, sometimes test-after, but always alternating writing code with writing tests so I know the code works and I have a safety net when I need to refactor or make other major changes.

At any time I may discover that I made a bad decision earlier, or realize that there is a better way to structure the code or data. Then I stop and rework until I am happy with what I have. The unit tests give me confidence that I haven't broken anything in the process. It's a very organic process; I sometimes think of it as growing a program.
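As a small illustration of the safety net, here is a sketch using Python's unittest module. The mean() function is a hypothetical stand-in for whatever building block is being written.

```python
import unittest


def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    if not values:
        raise ValueError("mean() requires at least one value")
    return sum(values) / len(values)


class MeanTest(unittest.TestCase):
    def test_single_value(self):
        self.assertEqual(mean([4]), 4)

    def test_several_values(self):
        self.assertEqual(mean([1, 2, 3]), 2)

    def test_empty_is_an_error(self):
        with self.assertRaises(ValueError):
            mean([])
```

Saved in a module, these tests can be run with python -m unittest. If mean() is later rewritten, the same three tests show right away whether its behavior changed.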

(from a post to the Python-tutor list)

posted at 07:01:04

2005-08-21

Unit testing is an enabling technology

I'm reading Pragmatic Unit Testing by Andrew Hunt and David Thomas. I'm a big fan of unit testing and I think one of the biggest benefits is often overlooked.

The direct benefits of unit testing are clear and substantial: better tested, more reliable and often better designed code. The indirect benefit that is often neglected is unit tests as an enabling technology.

Have you ever looked at a pile of code and thought,

This is a mess, I should rewrite it
There's a lot of dead code here I should rip out
This would be easier to understand if I broke it up into smaller pieces
There's a lot of duplication here I should factor out
If I refactored this a little it would be a lot easier to add this new feature

How often is that thought followed by action, and how often by the thought, "I can't do that, there's too much risk of breaking something"?

This is where unit tests shine long after they are written. If the code in question has extensive tests, you can change it with confidence instead of fear, knowing that any breakage will be quickly discovered. Unit tests provide a safety net. There is a qualitative shift from coding in fear to coding with confidence.

Think of it - no longer do you have to live with messy code, dead code, monster methods, duplicated code, designs that don't reflect your needs! How sweet!

I recently finished making some modifications to some file-generation code. When I started there were three large modules that created three large files. The modules and the files were similar but they shared little code; the first module had been copied and modified to make the other two. There were no tests for any of it. My job was to add some options to all three generated files.

The first thing I did was set up a test framework. The generated files are XML so I created some reference files and used XmlUnit to compare the generated files with the references.
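The project itself used XmlUnit, which is a Java library. As a rough Python stand-in for the same idea, the sketch below compares generated XML with a reference by canonicalizing both with the standard library (xml.etree.ElementTree.canonicalize, available in Python 3.8 and later). The helper name same_xml is an assumption for illustration.

```python
import xml.etree.ElementTree as ET


def same_xml(generated, reference):
    """Return True if two XML strings are equivalent after canonicalization.

    Canonical form sorts attributes and normalizes empty elements;
    strip_text=True also ignores whitespace around text content, so
    differences in indentation do not cause a mismatch.
    """
    return (ET.canonicalize(generated, strip_text=True)
            == ET.canonicalize(reference, strip_text=True))
```

In a test, the reference string would be read from a checked-in reference file and the generated string would come from the code under test.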

Next I started factoring out duplicated code into a common base class. I focused on the parts I had to change, with a goal of only having to make a change in one place, in the base class, rather than duplicating the change three times.

The result? The three modules are dramatically smaller. Most of the option testing is centralized in the base class. The subclasses contain a top-level driver function and many small callbacks that customize the output to their particular requirements. There is probably less code in the four modules I ended up with than there was in the original three because of all the duplication I removed.
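The shape of the result was roughly as follows. This is a much-simplified sketch with hypothetical names and plain-text output instead of XML; it only shows how the option check lives in the base class while each subclass keeps a driver and small callbacks.

```python
class FileGenerator:
    """Hypothetical base class holding the shared logic and option handling."""

    def __init__(self, include_comments=False):
        self.include_comments = include_comments

    def section(self, name, items):
        """Build the lines for one section, honoring the options."""
        lines = []
        if self.include_comments:  # the option is checked in one place only
            lines.append("# " + name)
        lines.extend(self.format_item(item) for item in items)
        return lines

    def format_item(self, item):
        """Callback that each subclass overrides."""
        raise NotImplementedError


class UserFile(FileGenerator):
    def generate(self, users):  # top-level driver
        return "\n".join(self.section("users", users))

    def format_item(self, item):  # small callback
        return "user=" + item


class GroupFile(FileGenerator):
    def generate(self, groups):  # top-level driver
        return "\n".join(self.section("groups", groups))

    def format_item(self, item):  # small callback
        return "group:" + item.upper()
```

Adding a new option now means changing section() once rather than editing each generator separately.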

Now that's a reason to write unit tests!

posted at 11:28:00

2005-04-17

Will unit testing slow you down?

I am trying to encourage unit testing and test-driven development at work. As far as I know, only a few developers here are really test-infected.

I have asked several developers, "Do you write unit tests? If not, why not?". The universal response to the second question is, "I don't have time." This strikes me as strange because in my experience writing unit tests helps me to work faster, not slower. Why is that?

The immediate, short-term benefit of unit testing is that I can quickly and easily run the code I am working on. I can generally run a unit test for a single module in one or two mouse clicks. Most of my tests run in a few seconds. So when I make a change to a piece of code, I can find out almost instantly whether it works or not. As an extra benefit, work is more fun because I am writing code instead of running manual tests, and I stay in the flow because I am thinking about code instead of switching gears to run tests.

An intangible benefit is the confidence I have that the code is working because the tests pass. It's a great feeling to write a module with a test suite and know without a doubt that the module is doing what I want it to.

A long-term benefit that affects development speed is the impact on code quality. This has several facets. First, the code is likely to have few defects because it is thoroughly tested. This cuts down on the time I must spend later in debugging and rework. Second, because unit tests impose some constraints on modularity and coupling, the code tends to be well-structured. Finally, with the safety net of the unit tests I have freedom to refactor as needed, so the structure remains appropriate to the job at hand. Each of these facets improves the readability, maintainability and reusability of the code, and that directly impacts productivity.

I don't want to gloss over the downside of unit testing. There are occasional speed bumps. Typically they come when I have to test something new and must take the time to work out how to write the test and integrate it into my build. This doesn't happen too often; usually I can reuse a similar setup from another part of the project, or a framework I already have.

The initial hump that keeps people from unit testing at all is one of these speed bumps. To get started, you do have to figure out how to use a test framework such as JUnit. I recommend asking for help - JUnit is really not that hard to use, and a simple example can go a long way.

Then there are the times when I make a change that breaks a test and I have to go fix it. For example I might change a data format, a test data set, or the signature of a function. This is annoying but easy to deal with.

Even with these drawbacks, I see unit testing as a huge win for productivity.

posted at 09:22:08

2004-11-23

Never answer the same question twice

When a user comes to me with a question about a program I have written, I like to do two things. First, answer the question. Second, change the program so the question won't come up again.

This is particularly appropriate with error messages. If a user has to ask me what an error message means, the message isn't doing its job. I rewrite it so that next time someone sees it, they won't have to ask me about it. It helps to have just explained it to a live user.

posted at 07:50:08

2004-06-29

StranglerApplication

Martin Fowler has written an interesting note about how to replace a legacy application with something new. He suggests adding new functionality around the old application until you finally take over all the old functions. He calls this a StranglerApplication.

The most interesting part of the note is the pointer to a paper describing a specific project. Parts of this project are similar to what I did with Meccano:

Work closely with the users
Find ways to build confidence in the agile approach
Deliver real value as early as possible
Let the customers try out the new product and get them hooked on it

posted at 08:18:08

2004-05-27

Uncle Bob says, "Go well, not fast"

Robert Martin argues eloquently for taking the time to make your code right the first time. If you focus only on speed then you end up dragging so much weight around that speed is impossible. If you do it right you can keep moving. Recommended reading. Highly recommended practice!

posted at 18:28:16

2004-05-25

Unit testing a complex procedure

I am working on a unit test for a complicated, multi-step procedure. Conceptually it is something like this:

def complicatedStuff(self):
    self.step1()
    self.step2()
    self.step3()
    # etc...

Ideally I would like to write tests for each step:

def test_step1(self):
    # self.obj is the object under test
    # set up to test step 1...
    self.obj.step1()
    # check that step 1 was successful...

def test_step2(self):
    # etc...

The problem is that the setup for each step is complex. The best way to set up to test step3() is to do step1() and step2(). So I have settled for a single test method that has the same structure as complicatedStuff():

def test_complicatedStuff(self):
    # set up for step1...
    self.obj.step1()
    # check that step 1 was successful...
    self.obj.step2()
    # check that step 2 was successful...
    self.obj.step3()
    # etc...

This smells. It is a clear violation of Don't Repeat Yourself - the structure of complicatedStuff() is duplicated. As a result it is fragile. If the structure of complicatedStuff() changes, test_complicatedStuff() has to change the same way. On the other hand, it works, which is worth a lot!
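For concreteness, here is a runnable toy version of that single test. The Pipeline class and its three steps are invented for illustration; the real procedure is far more involved.

```python
import unittest


class Pipeline:
    """Hypothetical object under test; each step builds on the previous one."""

    def __init__(self):
        self.data = None

    def step1(self):
        self.data = [3, 1, 2]

    def step2(self):
        self.data = sorted(self.data)

    def step3(self):
        self.data = [x * 10 for x in self.data]

    def complicatedStuff(self):
        self.step1()
        self.step2()
        self.step3()


class PipelineTest(unittest.TestCase):
    def setUp(self):
        self.obj = Pipeline()

    def test_complicatedStuff(self):
        # Same sequence as complicatedStuff(), with a check after each step.
        self.obj.step1()
        self.assertEqual(self.obj.data, [3, 1, 2])
        self.obj.step2()
        self.assertEqual(self.obj.data, [1, 2, 3])
        self.obj.step3()
        self.assertEqual(self.obj.data, [10, 20, 30])
```

The duplication is easy to see here: the test repeats the call order of complicatedStuff() line for line.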

posted at 08:43:44

2004-05-24

Database unit testing is HARD

I am working on a project that makes some brain-twisting changes to a database. It is the first database work I have done in a while, the changes are a bit tricky, and the consequences of failure grim (breaking large production databases in use by many thousands of customers daily), so I am writing unit tests for everything using DbUnit.

DbUnit has one feature that I really like - the setUp() method of a test case can initialize the database to a known state. This feature alone is enough to adopt DbUnit.

What really stands out is how slowly I am making progress. There are many reasons for this: I am learning the problem domain as I go along, and the problem has a number of wrinkles to it. But part of it is that writing the unit tests is just plain hard.

Often unit testing is pretty simple - pass a few parameters to a function, check the result. Repeat for a few different sets of parameters. Sometimes there is a structure to be set up or checked. It's easy to work in small bites.

For this project, each test case requires the database to be set up. With DbUnit, this means creating an XML file that reflects the desired state of the database tables. These files are hard to read and hard to create when the table has foreign keys to another table. In my case, one of the tables represents a tree structure so it is essentially a list of parent-child relationships.

So first I have to figure out what will make a good test case. Then I create the XML file, either by hand editing or by somehow getting the database into the desired state and dumping it to XML. Finally I can write the actual test. This usually involves writing some queries to figure out if the database is in the correct state.

Then I can actually write the code to make the test pass. No wonder it is going slowly!
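The cycle above can be sketched in plain Python. This is not DbUnit: it uses the standard sqlite3 module with an in-memory database as a stand-in, and the node table, the TREE_ROWS data set and delete_subtree() are all hypothetical. The point is only the shape: setUp() loads a known state, the code under test runs, and a query checks the resulting state.

```python
import sqlite3
import unittest

# Known starting state: (id, parent_id, name). A parent_id of None marks the root.
TREE_ROWS = [
    (1, None, "root"),
    (2, 1, "left"),
    (3, 1, "right"),
    (4, 2, "leaf"),
]


def delete_subtree(conn, node_id):
    """Hypothetical code under test: delete a node and all of its descendants."""
    rows = conn.execute(
        "SELECT id FROM node WHERE parent_id = ?", (node_id,)).fetchall()
    for (child_id,) in rows:
        delete_subtree(conn, child_id)
    conn.execute("DELETE FROM node WHERE id = ?", (node_id,))


class TreeTest(unittest.TestCase):
    def setUp(self):
        # Put the database into a known state before every test.
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE node ("
            "id INTEGER PRIMARY KEY, "
            "parent_id INTEGER REFERENCES node(id), "
            "name TEXT)")
        self.conn.executemany("INSERT INTO node VALUES (?, ?, ?)", TREE_ROWS)

    def tearDown(self):
        self.conn.close()

    def test_delete_subtree(self):
        delete_subtree(self.conn, 2)
        # Query the database to check that it ended up in the expected state.
        remaining = self.conn.execute(
            "SELECT id FROM node ORDER BY id").fetchall()
        self.assertEqual(remaining, [(1,), (3,)])
```

Even in this toy, most of the code is data set and state checking rather than the logic being tested.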

By the way, I am writing the project in Jython. DbUnit works well with Jython because you don't have to subclass a DbUnit test case class - you can use DbUnit through independent objects and static assertions. I am writing the test cases with Python's unittest module and calling DbUnit as a library.

posted at 19:10:56

2004-05-03

When to design, when to code?

Thinking about when to design and when to write code leads to an illuminating distinction between different development styles.

A (sadly) popular style of development is code-and-fix programming. In this style, the goal is to do the minimum required to get something that appears to work. At its worst this is coding without design, at any rate without thinking seriously about design or architecture or long-term viability. It leads to unreadable, unmaintainable code, cascading defects and many other problems.

Big Design Up Front (BDUF) is a response to this style. BDUF attempts to figure out the solution in the abstract before starting to implement it. This is design without coding. This approach has many problems as well. It is hard to figure out a good design without the hands-on knowledge you get from writing code. It is brittle and unresponsive to changing requirements. It takes considerable effort to create and maintain design documents.

The agile approach is a middle way. It disdains both sloppy hacking and excessive design. Agile programmers think about what might work and try it. They refactor when they have a better idea or when the requirements change. They keep growing the design.

Agile development is sometimes feared because without up-front design it must be "just hacking". If you aren't designing up front, with formal process and documents, you must not be designing at all. This fear is born from the reaction to code-and-fix programming and thinking that code-and-fix is the only alternative to BDUF. This is a false dichotomy and a false perception of agile development.

Design and coding work best when taken together. If you try to think your way through the design without writing code you lack the on-the-ground knowledge you get from coding. If you just start writing code with the sole goal of getting something that seems to work, you are hacking in the worst sense of the word. But if you consider each addition to the code carefully and keep the code clean at all times you will end up with a thing of beauty - well-designed code that is superbly suited to the task at hand.

Note: Ned Batchelder's blog about the similarity between diamond cutting and the decisions that must be made while designing a software system was the spark that led me to this idea.

posted at 08:58:40

2004-04-24

Agile Prophecies of Dr Seuss

Everything I need to know I learned from The Cat in the Hat.

Just read it :-)

posted at 22:27:12

Don't Repeat Yourself

Don't Repeat Yourself and its special case Once and Only Once are two of the most important principles of good development. Read this essay for more.

posted at 14:18:40

2004-04-19

Continuous Design

I have written before about Growing a design. Key to the success of this technique is keeping your code clean using principles such as Don't Repeat Yourself and You Aren't Going to Need It.

In this article, Jim Shore chronicles his experience with this process. I particularly like the sidebar Design Goals in Continuous Design which summarizes much of what makes this technique work.

posted at 08:41:36

Inversion of Control Frameworks

Whenever you hide an implementation class behind an interface, you have the problem of instantiating the concrete instances of the interface and giving them to the client code.

There are several ways to do this:

The client code can instantiate the instance directly
The instance can be stored in a global resource such as a singleton or registry
The code that instantiates the client can also create the instance and pass it to the client
An instance can be created from a global property using reflection

Each of these techniques has disadvantages:

If the client creates the instance then you can't substitute a different implementation without changing the client, and the benefit of using the interface is reduced.
Using a global registry, singleton or property makes the client depend on the global facility which makes testing and reuse more difficult.
Reflection is complicated when the instance has dependencies of its own, for example when it needs configuration data or depends on other interfaces.

A solution to this problem that is gaining popularity is to use a framework with support for Inversion of Control (or, as Martin Fowler calls it, Dependency Injection). With this technique, client code can be written with no dependencies on global resources. The framework takes care of initializing the required instances and providing them on demand.

Martin Fowler has an article that explains the technique. Two frameworks that use this technique are Spring and PicoContainer.
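To make the idea concrete without any framework, here is a hand-rolled sketch in Python. Every name in it is hypothetical, and build_notifier() only plays the role that a container such as Spring or PicoContainer would play; it does not show how those frameworks are configured.

```python
class MessageSender:
    """The interface the client depends on."""

    def send(self, text):
        raise NotImplementedError


class ConsoleSender(MessageSender):
    """One concrete implementation: print the message."""

    def send(self, text):
        print(text)


class RecordingSender(MessageSender):
    """Another implementation, handy in tests: remember what was sent."""

    def __init__(self):
        self.sent = []

    def send(self, text):
        self.sent.append(text)


class Notifier:
    """Client code: it is handed a sender and never creates or looks one up."""

    def __init__(self, sender):
        self.sender = sender

    def notify(self, user):
        self.sender.send("Hello, %s" % user)


def build_notifier():
    """Stand-in for the framework: pick the implementation and wire it in."""
    return Notifier(ConsoleSender())
```

Because Notifier has no dependency on a global registry or a concrete class, a test can pass it a RecordingSender and inspect the sent list afterwards.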

posted at 08:28:48

2004-04-07

Build Integrity In

Chapter 6 of Lean Software Development is Build Integrity In. This is a subject near and dear to me because I am passionate about quality.

Quality is free!

If you keep your codebase clean and expressive it will be supple, and it will support your need for change as the project moves forward.

If you let the codebase get crufty and brittle, your progress will slow to a crawl as change becomes harder and bugs keep cropping up. It won't happen in the first release, maybe not even the second, but it will happen.

I have seen both kinds of projects and the clean ones are a whole lot more fun after they have been through a few release cycles.

posted at 15:35:28

