Saturday, December 8, 2007

Running fast code in Python

As I have suspected, the numbers I got for execution time of the Haskell fibonacci program (look at my previous posts) are wrong, in the sense that I compiled the program without optimisation, as most normal users will do in real applications. That is why I feel that I should redeem myself and give the numbers for the program with optimisation turned on. So, when compiling the program using

ghc -O2 fib.hs -o fib -no-recomp (thanks to Don Stewart)

execution time is 0.64s (mean time of five runs). This time is much closer to the one reported on the Haskell blog (0.48s). Ok, so my hat goes off to Haskell---it really is faster than Python some 50 times.
But, being a Python fan, I couldn't help but feel that something is wrong here. Is Haskell really that good? It's fast, easy to parallelize, pure functional and it probably cooks for you after a hard day at work. So, why should I use Python?
Well, the first thing that comes to my mind is the fact that ghc generated a 925kB exe file from a 1kB file that computes fibonacci numbers. In comparison, gcc (3.4.2) generated 16 kB for a fibonacci program that ran 0.75s (compiled using -O2). While it is still impresive that Haskell was faster than C, I cannot help but think that it is just too much to have an exe file almost 1000 times larger than the source. While this is certainly a drawback of Haskell, it still wasn't enough to return my faith in Python. Then, another thing came to my mind---if Haskell can use optimisations, why can't Python? I mean, if I used -O2 in ghc, why can't I do somehting similar in Python?
Of course, running Python with -OO gave no real speedup---it shook off about one second in execution time. Instead of 26s, I got 25s for the non-parallel case, and instead of 18s, I got 17s for the non-parallel case. But, the great thing about Python are its packages. I remember a friend of mine telling me about a thing called Psyco. It was some kind of a package that optimised execution of Python code, so I decided to give it a shot. After downloading and installing Psyco (, I just added two lines to my source (I give the full source here):

from time import time
import psyco

def fib(n):
if n == 0 or n == 1:
return n
return fib(n-1) + fib(n-2)

if __name__ == "__main__":
start_time = time()
for i in range(36):
print "n=%d => %d" % (i, fib(i))

print "Time elapsed: %f" % (time() - start_time)

So, the lines I added were import psyco, and psyco.full(). I ran this thing and voila---the time of execution was 1.93s (mean time of five runs). Wow! That was all I could think. Remember, this was the non-parallel case, so it went from 25 seconds to 1.9s. Or, in terms of Haskell speed, Haskell now beat Python by on 3x. And with no extra effort (ok, adding two lines to the source) and no parallelization.
Unfortunately, psyco didn't help the parallel version---it still took 18s. But still, I think this is pretty amazing and it definitely brought back faith in Python. So, it's three times slower, but the synatx is much cleaner and it doesn't generate a thousand times bigger executable. I'm sticking to Python :)
(Note: some parts of this post are biased and down right offensive to Haskell fans (just kidding about the last one:) ). My intention is not to start a flame war or anything like that, I just felt that someone should stand up and take Python's side:). Seriously though, if you have any objections on my methodology, I would like to hear from you.)


Will McGugan said...

I wrote a faster version, just for the heck of it.

Kyle said...

Most of the bloat is Haskell's run time system. It's a fixed cost, and for a larger executable it won't be nearly so large a code explosion.


Bobberino said...

There is something funky about your origianl code. I ran the following in straight Python without Psyco and computed the same sequence in about 4 microseconds.
import time

def fib(n): # write Fibonacci series up to n
"""Print a Fibonacci series up to n."""
a, b = 0, 1
while b < n:
print b
a, b = b, a+b

startTime = time.clock()


print 'elapsed time: ', (time.clock() - startTime)

But your original code runs in about 45 seconds!! on my machine. That's a 5 order of magnitude difference! I would be interested in finding out why yours takes so long. I haven't had much of a chance to look at it in detail yet, but it computes the first 20 or so values fairly fast, but it then really bogs down. Have you traced the the run time memory or the CPu usage?

Bobberino said...

Okay, I've been out of school for a while, but its starting to come back. The problem, I think, is with your algorithm. You are doing a recursive call on yourself twice. I really don't know how Python handles that. If you would refactor that to do a single recursive call, I'm thinking that you will get significant improvements. Unless there is some funny bug for this kind of strange algorithm in Python, I don't think there is any problem with Python's speed. My simple rather standard Pythonesque solution is nothing fancy and it completed the same Fibonacci series in 3 microseconds. You used Psyco and only cut it in half to about 18 seconds! You may as well use a 300 baud modem (yes,I am that old). Even Will using other techniques dramatically and rather impressively improved the run time of that algorithm, but still only got around 3 milliseconds, which is still fairly slow, and it looks like it is slowing down exponentially. You might want to try 300 Fibonacci numbers and see how long that takes. Hey, maybe I'll try that as well. Anybody fressh out school can enlighten us a bit on second order recurrence relations?

Omaha Python Users Group said...

I compiled the simplistic version of Fib w/ shedskin and Time elapsed: 0.782071

see my comments @
and shedskin @

Filox said...

Ok, I see that my last post made a lot of confusion. I will try to clarify things a bit. So, the recursive fibonacci algorithm (the one I used) has an exponential time complexity. For a nice explanation on why is this so, see
So, why would I use something so slow when there are clearly better (faster) algorithms? The whole point of this and one of my previous posts was not to show off different algorithms for computing the fibonacci numbers (there is a well-known formula to compute the nth number in the sequence directyl, without having to compute the first n-1 numbers), but to see how different languages handle these intensive recursive calls. As you could read in my posts, C and Haskell obviously handle it well (they make code optimisations and probably have less function call overhead), while Python does not. So, using psyco was an attempt to make Python code run fast, without changing the complexity of the algorithm, that is, see how much can we speed up an existing Python program without changing its source. Imagine that we were not talking about fibonacci, but about some huge project that ran very slow. Would you really rewrite the whole thing or would you simply add two lines to the source? I hope I made things a little clearer...

Bobberino said...

Thanks for clearing that up. I did come in a bit late and the perspective that I saw was how "fast" these times were. The article looks interesting. One thing I noticed when I ran your algorithm with n quite large was that it very quickly took all the memory in my system, but then hardly used any CPU. So it isn't even processor time where Python bogs down. It must be in keeping track and allocating space for the recursive calls like you said.

akaihola said...

I assume the compiled Haskell program runs stand-alone? To be fair in the size comparison, you should either compile the Python program to a stand-alone executable or count the shared Python installation as part of the "compiled size". On my Ubuntu system that alone takes around 10 megabytes.