Comments on Daily rant: "Holy shmoly Haskell doesn't smoke Python away (that much)", a blog by Filox.

Filox (2008-04-01 05:43):
@stephan
The whole point was to compare the same program (the same approach) in different languages. That is why I used the naive Fibonacci instead of an iterative approach. I could easily have used the closed-form formula for the n-th Fibonacci number directly if I had wanted fast code.

stephan (2008-04-01 05:17):
But in Python you wouldn't go for a recursive approach like the one in Haskell; that's just crazy and builds a senselessly large stack.

You either iterate:

    def fib(n):
        if n < 2:
            return n
        else:
            a = 0
            b = 1
            x = 0
            for i in xrange(2, n + 1):
                x = a + b
                a = b
                b = x
            return x

    for i in xrange(36):
        print "n=%d => %d" % (i, fib(i))

Or you cache:

    fib_dict = {}

    def fib(n):
        if n > 1:
            try:
                return fib_dict[n]
            except KeyError:
                x = fib(n - 1) + fib(n - 2)
                fib_dict[n] = x
                return x
        else:
            return n

    for i in range(36):
        print "n=%d => %d" % (i, fib(i))

Both versions will easily compete with, or even outperform, the parallel Haskell version.

Don Stewart (2007-12-06 23:15):
@filox
Thanks for the clarification.

The main problem to address, though, is that you are testing for speed yet didn't compile the Haskell with the optimiser on. Without optimisations I get numbers similar to yours:

    $ ghc A.hs -o A
    $ time ./A
    ...
    n=35 => 9227465
    ./A  3.76s user 0.09s system 99% cpu 3.861 total

However, compiling with -O2, as all production Haskell does:

    $ ghc -O2 A.hs -o B -no-recomp
    $ time ./B
    ...
    n=35 => 9227465
    ./B  0.48s user 0.01s system 99% cpu 0.492 total

we get the same numbers as in my original post: still a 50x speedup over serial Python, and 36x over the parallelised version.

If we then go ahead and parallelise the Haskell code at the top level (here is one parallel top level):

    main = mapM_ putStr $
        parMap rnf (\i -> printf "n=%d => %d\n" i (fib i)) [0..35]

we get good multicore speedups again:

    $ time ./A +RTS -N2
    ...
    n=35 => 9227465
    ./A +RTS -N2  0.28s user 0.00s system 158% cpu 0.174 total

I'd argue the result is much the same as I originally stated: a 50x speedup over Python, far more than the 6x you report, and still easier to parallelise.

Filox (2007-12-04 07:48):
Don:
You are right, this is just top-level parallelization. It is different from the Haskell program, but my goal was only to show that some parallelization is also possible in Python. When I get some free time, I will try to write a program that parallelizes in a way similar to the Haskell one.
adept:
I used GHC 6.8.1; there was an error in the code that was later corrected. Again, when I get some time I'll run the corrected code.

ADEpt (2007-12-03 12:51):
And regarding "pseq: not in scope": you need GHC 6.8.1 to use that.

Don Stewart (2007-12-03 08:30):
And you parallelise the code differently, I think? (Spawning one thread for each top-level fib call?)

Can you clarify whether each recursive fib call is parallelised, or just the initial call for some value of n?

Doing only the top-level parallelisation is much easier, but it is a different program from the one presented. (Good to see it's possible in Python, though!)

Don Stewart (2007-12-03 08:18):
That code is compiled without optimisations. Compile with:

    ghc -O2

or:

    ghc -O2 -optc-O

The followup post (http://cgi.cse.unsw.edu.au/~dons/blog/2007/11/29#smoking-4core) explored the use of `par` and `pseq` some more, and corrected a bug in the original parallelisation strategy.

Don Stewart (2007-12-03 08:11):
This comment has been removed by the author.
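For reference, the "naive Fibonacci" that both sides of the thread are benchmarking is never reproduced in the comments themselves. Here is a minimal sketch of that doubly recursive definition in Python; the function name fib and the n=35 upper bound are taken from the discussion above, and the exact code in the original posts may differ:

```python
def fib(n):
    # Naive doubly recursive definition: deliberately unoptimized,
    # so the identical algorithm can be timed across languages.
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

# Print a small prefix of the benchmark's table; the original posts
# run this up to n=35, which takes a while in pure Python.
for i in range(10):
    print("n=%d => %d" % (i, fib(i)))
```

This is the exponential-time version that the iterative and memoized rewrites in the thread replace: its call tree grows roughly like phi^n, while caching or iterating collapses the work to O(n), which is why the commenters disagree about whether it is a fair cross-language benchmark.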