Updated: 4/8/05; 10:41:54 AM.
Feeding The Snake
Thoughts on Python, Science, Culture, and Life
        

Tuesday, March 29, 2005

What does your spam say about you?
Am I the only one who's fascinated by spam? I'm not so stupid as to actually click on any links in a message, and I certainly wouldn't ever buy anything from one of the ads. However, I started looking through my spam folders a few weeks ago amid fears that legitimate messages were getting filed there, and I've continued reading the subject lines long after those fears faded. I guess I feel that spam holds the secret to either our society's id or my own, but I can never decide which.

I like the way that spam adapts to current trends so quickly. The spammers jumped on eBay and PayPal right away, with messages pretending to tell me that I need to log into my account. It's slimy, but I respect the fact that the spammers are so current. And, hey, maybe they're trying to tell me something when they send me all those ads about losing weight and taking Viagra.
4:35:20 PM


GMPY (Finally, a new FTS Post)
My apologies for all of the old posts showing up in FTS. I changed to RadioUserland from bzero in a somewhat bumpy transition.

I really like the GMPY module, which consists of a set of Python bindings for the GNU Multiple Precision Arithmetic Library (GMP). Maybe it's just a sign that my life is boring, but I really enjoy playing around with arbitrary precision math, and computing long representations of e and pi. In any case, to help evangelize the cause, I put together the following guide, "Getting Started and Having Fun with GMPY". This will ultimately become part of the GMPY documentation, but, in the meantime, I would welcome any feedback.
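
For a taste of what "long representations of e" means in practice, here is a minimal sketch. It deliberately does not use GMPY: it relies on Python's built-in arbitrary-size integers so that it runs without any extra install. The function name `e_digits` and the `guard` parameter are illustrative choices, not part of GMPY or of the guide mentioned above.

```python
def e_digits(digits, guard=10):
    """Return e truncated to `digits` decimal places, as a string.

    Sums the series e = 1/0! + 1/1! + 1/2! + ... using scaled integers.
    `guard` extra digits absorb the rounding error of the integer divisions.
    """
    scale = 10 ** (digits + guard)
    total = 0
    term = scale          # 1/0!, scaled up to an integer
    k = 0
    while term:           # stop once a term underflows to zero
        total += term
        k += 1
        term //= k        # 1/k! from 1/(k-1)!
    text = str(total // 10 ** guard)   # drop the guard digits
    return text[0] + "." + text[1:]


print(e_digits(50))
```

A library such as GMPY is aimed at doing this kind of arithmetic much faster for very large numbers; the sketch only shows the idea.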
3:36:55 PM


Tagging versus Sorting
Note: this is an archive of a post from the old bzero version of Feeding the Snake that I wanted to include with the new RadioUserland incarnation of the blog.


I've been playing around with del.icio.us for a couple of days now. I had commented several times to myself that I didn't really get the site, and then, in the course of one day, I spent hours inserting bookmarks and tagging them in various ways. It was fun and therapeutic in an odd way.

I realize that everyone and their brother has already blogged about tagging vs sorting. The basic line is that traditionally we've sorted items into bins, but sorting itself can be limited. If in my closet I have a drawer for socks, and another drawer for athletic gear, where do I put my athletic socks?

That's a silly example, but a real problem I have is with sorting scientific papers, of which I have thousands. When I used to put these in file cabinets I had a weird scheme where, if I had more than a few papers by an author, I would put them in a folder under that author's name, and otherwise I would put them in a folder under the subject name. Not a great idea, but one that has kinda' worked for a few years.

Now that I'm slowly moving into the century of the fruitbat, I'm trying to update my system a bit more. Almost all of my papers are in PDF files on my laptop hard drive. The above author/subject folder sort has persisted, but with soft-links I can now put links in the subject folders to the files that are in author folders. However, the system is brittle, since I frequently just drop a file into an author's directory, and then forget to put the paper in all of the subject folders that it would belong in. Or, I create a new subject and forget to track down all of the files in various author folders that it would correspond to.

A system like the del.icio.us tagging would work ideally. I've noticed a proliferation of collection managers like tellico for cataloging books, CDs, etc. It would be nice to use/build a PDF-specific collection manager that could optionally extract information from the PDF file itself. If anyone knows of anything along these lines, please leave a comment.
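
The difference between tagging and folder sorting can be sketched in a few lines of Python. This is only an illustration of the idea, not an existing tool: the class `TagIndex`, its methods, and the file names are all hypothetical. Each paper lives in one place and carries any number of tags, so an author and several subjects can point at the same file without soft-links.

```python
from collections import defaultdict


class TagIndex:
    """Map tags to files and files to tags (a hypothetical helper)."""

    def __init__(self):
        self._by_tag = defaultdict(set)
        self._by_file = defaultdict(set)

    def tag(self, path, *tags):
        """Attach one or more tags to a file path."""
        for name in tags:
            name = name.lower()
            self._by_tag[name].add(path)
            self._by_file[path].add(name)

    def find(self, *tags):
        """Return the files that carry every one of the given tags."""
        sets = [self._by_tag.get(name.lower(), set()) for name in tags]
        if not sets:
            return set()
        return set.intersection(*sets)

    def tags_for(self, path):
        """Return the sorted tags attached to a file."""
        return sorted(self._by_file.get(path, ()))


index = TagIndex()
# Made-up file names, standing in for PDFs on disk
index.tag("papers/smith2003.pdf", "smith", "dft", "surfaces")
index.tag("papers/jones2001.pdf", "jones", "dft")

print(index.find("dft"))              # both papers
print(index.find("dft", "surfaces"))  # only the Smith paper
```

Adding a new subject later is just another `tag` call on the files that fit, which avoids the bookkeeping that makes the folder scheme brittle.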


9:44:07 AM

Mathematica Style Notebook in Python
Note: this is an archive of an old post from the bzero version of Feeding the Snake that I wanted to include with the current RadioUserland incarnation.

Now that matplotlib has solved my plotting issues for the time being, I can turn my attention to the Python shell...

My favorite Mathematica feature is the notebook. For those who haven't used it, the notebook is an interactive shell, but allows plotting to be inlined and saved with the shell session. This feature makes it really nice for data analysis, since you can save the steps that generated the graphics with the graphics themselves.

I would love something like this in Python. So would a lot of people. The topic pops up particularly often on the IPython mailing lists.

The closest I've seen to this is the Python plugin to the TeXmacs package. I've written some extensions to the package to allow it to inline other types of graphics (I was using biggles at the time) and to do so more automatically. It all worked OK, but it wasn't really what I'm looking for.

I've started thinking about how to implement what I *am* looking for. It would have to be something along the lines of a Python shell running in a wxPython window. It would be nice to be able to use IPython optionally as the shell, although I'm not such an IPython bigot that I feel I would have to require this.

Python actually makes it pretty easy to write your own shell; I posted my little solution to this in a PyCookbook recipe.
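
To give a sense of how little code a custom shell needs, here is a minimal sketch built on the standard library's `code.InteractiveConsole`. It is not the cookbook recipe mentioned above; the class name `TinyShell` and its `history` attribute are made up for illustration. The subclass records every line it is fed, which is the kind of hook a notebook would need in order to save a session.

```python
import code


class TinyShell(code.InteractiveConsole):
    """An interactive console that keeps a transcript of its input."""

    def __init__(self, namespace=None):
        super().__init__(locals={} if namespace is None else namespace)
        self.history = []

    def push(self, line, *args, **kwargs):
        # Record the line, then let the base class compile and run it.
        # The return value is True while a statement is still incomplete.
        self.history.append(line)
        return super().push(line, *args, **kwargs)


shell = TinyShell()
shell.push("x = 6 * 7")
needs_more = shell.push("def double(n):")   # True: the block is still open
shell.push("    return 2 * n")
finished = not shell.push("")               # a blank line closes the block

print(shell.locals["x"], shell.locals["double"](21))
# shell.interact() would start a read-eval-print loop on stdin
```

The same `push` hook is where a graphical front end could capture output or figures alongside the commands that produced them.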

Matplotlib has already done a large part of the requisite work in the way they have built their pylab library. One would just have to inline the images after they were computed. Doesn't sound hard, but, then again, I haven't done it yet.


9:39:11 AM

Python Plotting and Matplotlib
Note: this is an archive of an old post from the bzero version of Feeding the Snake that I wanted to include with the new RadioUserland version of the blog.

There seem to be almost too many options for Python plotting now. I'm a scientist, and so being able to display my data is a big deal to me, which means that I try to stay on top of all of this, but I'm having a harder and harder time. I normally try to predict which package has the largest mindshare, and go with that one, since I figure that package will be easiest for my collaborators to use. Here are the ones that I know about.

The first module I used was the gnuplot module, since I had already used gnuplot itself for a long time, and was familiar with the commands. In many ways it still is a great module, but I stopped using it because it wouldn't plot directly from a Python interactive session. Since gnuplot is a stand-alone plotting package, it has a great many features, many of which are available from gnuplot-py.

I switched to biggles, mostly because I could run it from Emacs, which is a real feature for me, since I really like to be able to edit a Python script in Emacs and then hit C-c C-c and execute it. Biggles does nearly everything I want, but sometimes the labels come out looking a bit amateurish.

I first spotted PyX on the Python module for TeXmacs. PyX appears to focus mostly on PostScript output, and, because of that, has a great many unique features. I always feel like I should use PyX, but I've never actually broken down and used it.

Also in the category of things I feel like I should use but don't are the SciPy plotting packages plt and gplt. I love SciPy, I use SciPy, I've given a talk at a SciPy meeting, I've met many of the people at Enthought. But I've never used the plotting packages.

Which brings me to the hot kid right now, Matplotlib. Another one I've never used, but I'm extremely tempted by it. They seem to have mindshare to burn, and the package appears to be very capable. Certainly the plots like this are pretty damn impressive! Later: tried to install matplotlib and failed, both from CVS and from the point release. However, the program is cool enough that I'll probably try again. More later.


I have seen the future of Python plotting, and its name is matplotlib...

I played around a little bit more with matplotlib today. Never was able to build from source on OS X, but found a link to a precompiled binary that installed the matplotlib libraries into my version of MacPython. Nice.

On first inspection, the code is very impressive, by far the most professional plotting package I've used from Python. The ability to use different backends, although it complicates the build procedure, is genius.

I have to insert into the examples either

    import matplotlib
    matplotlib.use('Agg')

to simply render to a file, or

    import matplotlib
    matplotlib.use('WXAgg')

to use wxPython to render the images to the screen. Most everything else seems to work out of the box.

My only other complaint is that since the program uses wxPython to render to the screen, I can't run from an Emacs window, since Emacs uses 'python' and not the required 'pythonw'. If anyone knows of a simple hack around this, I'd appreciate hearing about it. Probably if I bothered to read python.el something would suggest itself to me.

However, to be honest, I don't care. Matplotlib generates publication quality output. Biggles, my previous favorite, no matter how nice and how convenient, did not. I think I have a new favorite.


9:36:43 AM

Optimizing Python
Note: this is an archive of a post from the old bzero version of Feeding the Snake that I wanted to include with the new RadioUserland version.

I was flipping through the Wikipedia article on Python earlier today. Really good article. I guess I shouldn't be surprised, since I've seen so many good articles there, but I guess I always am amazed to see other people get Python.

At the bottom of many Wikipedia articles is a set of external links. On that list I found this article on Python Performance Tips. It's good, complete, and certainly mirrors much of my own experience with the language.

I spent a lot of time reading a book by Stephan Goedecker called Performance Optimization of Numerically Intensive Code. I spent a lot of time wondering whether similar analysis could be made of Python. And spent too much time trying to optimize dot products in Python without any brilliant insights. The above web page is the best reference I've found.
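
The dot-product experiments can be reproduced with nothing but the standard library. The sketch below is illustrative only: the three function names are made up, the vectors are arbitrary test data, and the timings depend heavily on the machine and the Python version, so no winner is claimed here.

```python
import operator
import timeit


def dot_loop(a, b):
    """Dot product with an explicit index loop."""
    total = 0.0
    for i in range(len(a)):
        total += a[i] * b[i]
    return total


def dot_zip(a, b):
    """Dot product with a generator expression over zipped pairs."""
    return sum(x * y for x, y in zip(a, b))


def dot_map(a, b):
    """Dot product that pushes the multiply into map() and operator.mul."""
    return sum(map(operator.mul, a, b))


# Arbitrary test vectors of pure-Python floats
a = [float(i) for i in range(1000)]
b = [float(i % 7) for i in range(1000)]

for fn in (dot_loop, dot_zip, dot_map):
    seconds = timeit.timeit(lambda: fn(a, b), number=200)
    print("%-10s %.4f s" % (fn.__name__, seconds))
```

Measuring each variant with `timeit` rather than guessing is the main point; for large numeric arrays an array library is usually the better tool than any pure-Python loop.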


9:31:52 AM

Compiling and Linking to Math Libraries on OS X
If you're interested in squeezing speed out of your Numeric Python applications, you will have to build Numpy using optimized math libraries.

Mac OS X comes with a set of math libraries built in, called vecLib.

The secret code for linking to these libraries is
-Wl,-framework -Wl,veclib
(which I have an incredibly hard time remembering).

The High Performance Computing Tools for OS X website is great. I just downloaded the version of gfortran from that site, and got it running on my computer. The one point that I didn't find on the website is that you need to make sure that your version of cctools is fairly recent (mine wasn't). I found a link to cctools-528 that installed without incident, and I'm now running gfortran without problem.


9:28:24 AM

Vimes Release
Note: this is an older post from the bzero-based version of Feeding the Snake that I'm including in the new RadioUserland version of the blog.


I released a new program called Vimes this week over on SourceForge. Vimes (the Visual Interface to Materials Simulations) is a program for displaying and controlling atomistic simulation programs. Vimes is written in Python, and uses OpenGL to draw the molecules and wxWidgets for the widgets.

There's a nice article over at InfoWorld by Jon Udell about the Unsung Heroes of Open Source. I would nominate Mike Fletcher, the prime moving force behind Python OpenGL. Mike has given me tons of free help with Python/OpenGL programming simply because he's a nice guy.


9:21:55 AM

Simple Python Programming
Note: the following is a repeat of an old post from my bzero-based weblog that I'm moving over to RadioUserland.

I'm going to talk about how my Python programming has changed in the years that I've been using the language. I have my own little Python package that I call Pistol, which supposedly stands for Python Scientific Toolkit. It contains little applications that I use in my day-to-day work (computational scientist at a National Laboratory). Some of the scripts have evolved over several years, and I'm always amused looking back at how my use of the language features has changed. It's not, and never will be, a full-fledged project hosted at SourceForge, just a set of tools that I install on every machine that I plan on using.

My absolute, all-time favorite toy in Matlab is the way they can make a graphical view of a matrix using the commands pcolor or spy. These commands give you an instant view of where the big values in a matrix are, which can be important if you're developing algorithms to exploit that structure.

Shortly after moving to Python, I realized that it was easier to simply write my own little versions of these in Python to view Numeric arrays than to constantly keep converting the matrices to/from Matlab format.

When someone comes from a static language like Fortran to a dynamic language like Python, their initial forays into the language have a sort of "deer-in-the-headlights" look to them, like there are simply too many shiny knobs to play with. My initial attempts at these scripts are big, have classes to drive them, and use Tkinter to render the matrices.

I've rewritten the scripts several times, and there are still tons of things wrong with them, but here's my current attempt...

    def spy_matrix_pil(A, fname='tmp.png', cutoff=0.1, do_outline=0,
                       height=300, width=300):
        """Draw a filled rectangle for each element with abs(A[i,j]) > cutoff."""
        # Pillow-style import; the original PIL used "import Image, ImageDraw"
        from PIL import Image, ImageDraw
        img = Image.new("RGB", (width, height), (255, 255, 255))
        draw = ImageDraw.Draw(img)
        n, m = A.shape
        # Every matrix element needs at least one pixel
        if n > width or m > height:
            raise ValueError("Rectangle too big %d %d %d %d"
                             % (n, m, width, height))
        for i in range(n):
            # The first index runs along the x axis of the image
            xmin = width*i/float(n)
            xmax = width*(i+1)/float(n)
            for j in range(m):
                ymin = height*j/float(m)
                ymax = height*(j+1)/float(m)
                if abs(A[i,j]) > cutoff:
                    if do_outline:
                        draw.rectangle((xmin, ymin, xmax, ymax),
                                       fill=(0, 0, 255), outline=(0, 0, 0))
                    else:
                        draw.rectangle((xmin, ymin, xmax, ymax),
                                       fill=(0, 0, 255))
        img.save(fname)
        return
Short and sweet. I started using PIL instead of Tkinter, which simplified the structure a great deal. There are lots of things that I'd like to change (most notably the fact that the size of the matrix shouldn't be related to the number of pixels in the image), but this performs a lot of work for me.
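
One way to decouple the matrix size from the image size is to bin the matrix onto a fixed grid, marking a cell whenever any element that falls into it exceeds the cutoff. The sketch below is an assumption about how that could look, not part of Pistol: `spy_grid` and `spy_text` are hypothetical names, the input is a plain nested list, and the output is text rather than an image so that the example has no dependencies.

```python
def spy_grid(A, rows=20, cols=40, cutoff=0.1):
    """Bin matrix A (a nested list) onto a grid of at most rows x cols cells.

    A cell is True if any element mapped into it has abs(value) > cutoff.
    """
    n, m = len(A), len(A[0])
    rows, cols = min(rows, n), min(cols, m)
    grid = [[False] * cols for _ in range(rows)]
    for i in range(n):
        gi = i * rows // n              # grid row that matrix row i falls in
        for j in range(m):
            if abs(A[i][j]) > cutoff:
                grid[gi][j * cols // m] = True
    return grid


def spy_text(A, **kwargs):
    """Render the binned matrix as text: '#' for marked cells, '.' otherwise."""
    return "\n".join("".join("#" if cell else "." for cell in row)
                     for row in spy_grid(A, **kwargs))


# A 12x12 tridiagonal test matrix, shown on a 6x6 grid
size = 12
tridiag = [[1.0 if abs(i - j) <= 1 else 0.0 for j in range(size)]
           for i in range(size)]
print(spy_text(tridiag, rows=6, cols=6))
```

The same binning could feed the rectangle-drawing loop above, with one rectangle per grid cell instead of one per matrix element.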
9:19:59 AM

© Copyright 2005 Rick Muller.
 