Guido is wrong! for loops, declarativeness and Charlie Mingus

10Feb07
  1. Guido is wrong:

    [Image: brain-computer.jpg – a diagram contrasting “brainy” and “computery” constructs]

  2. Each jazz musician when he takes a horn in his hand – trumpet, bass, saxophone, drums – whatever instrument he plays – each soloist, that is, when he begins to ad lib on a given composition with a title and improvise a new creative melody, this man is taking the place of a composer. He is saying, “listen, I am going to give you a new complete idea with a new set of chord changes. I am going to give you a new melodic conception on a tune you are familiar with. I am a composer.” That’s what he is saying.

    I have noticed that there are many kinds of composers in this so-called jazz. For instance, there are musicians who simply take rhythmic patterns and very spare notes – very limited invention melodically – and play in a soulful swinging way. […] I, myself, came to enjoy the players who didn’t only just swing but who invented new rhythmic patterns, along with new melodic concepts. And those people are: Art Tatum, Bud Powell, Max Roach, Sonny Rollins, Lester Young, Dizzy Gillespie and Charles Parker, who is the greatest genius of all to me because he changed the whole era around. […] I admire anyone who can come up with something original. But not originality alone, because there can be originality in stupidity, with no musical description of any emotion or any beauty the man has seen, or any kind of life he has lived. For instance, a man says he played with feeling. Now he can play with feeling and have no melodic concept at all. That’s often what happens in jazz: I have found very little value left after the average guy takes his first eight bars – not to mention two or three choruses, because then it just becomes repetition, riffs and patterns, instead of spontaneous creativity. I could never get Bird to play over two choruses. Now, kids play fifty thousand if you let them. Who is that good?

    Today, things are at the other extreme. Everything is supposed to be invented, the guys never repeat anything at all and probably couldn’t. They don’t even write down their own tunes, they just make them up as they sit on the bandstand. It’s all right, I don’t question it. I know and hear what they are doing. But the validity remains to be seen – what comes, what is left, after you hear the melody and after you hear the solo. Unless you just want to hear the feeling, as they say. […] The music on this record is involved with my trying to say what the hell I am here for. And similar ideas. Another one is: let my children hear music – for God’s sake – they have had enough noise.

    […] I think the music on this record is serious in every sense. I say, let my children have music. I said it earlier. For God’s sake, rid this society of some of the noise so that those who have ears will be able to use them some place listening to good music. When I say good I don’t mean that today’s music is bad because it is loud. I mean the structures have paid no attention to the past history of music. Nothing is simple. It’s as if people came to Manhattan and acted like it was still full of trees and grass and Indians instead of concrete and tall buildings. It’s like a tailor cutting clothes without knowing the design. It’s like living in a vacuum and not paying attention to anything that came before you. […]

    Let my children have music! Let them hear live music. Not noise. My children! You do what you want with your own.

    – Charles Mingus

    Liner notes by Charles Mingus for the album “Let My Children Hear Music”
    on Columbia Records, 1971.

  3. My main “profession” is econometrician; I’m a programmer by calling, by hobby and by obsession. Like many working econometricians nowadays (I’m actually a graduate student and don’t have a real job), I find myself doing a lot of programming in GNU R.

    Econometrics, like much of “general” numerics, tries to fit as much algorithmics as possible into the form of matrix multiplications. This is mainly a strategy to be able to understand the computability of stuff — since we don’t know much CS theory — but it also helps minimize the goddamn programming: after you’ve proven stuff on the chalkboard, you can just type out the matrix operations in the order you want them executed. Unfortunately, a few recent simulation-based methods — Monte Carlo/bootstrapping stuff — don’t quite fit that mold. Thus, a good part of graduate-level econometrics courses are now ersatz programming lessons taken in front of a computer with GNU R.

    Now, R’s for is really a foreach: it’s not FOR counter = start TO end STEP increment or FOR (counter = start; counter <= constraint; counter += increment) but simply FOR counter IN array. As seen in the diagram above, foreach is a “brainier”, less “computery” construct than plain for; a lot of loops that would go


    for i = 1 to length(X) {
        a <- X[i]
        do_stuff_with(a)
    }

    become simply


    for a in X do_stuff_with(a)

    which is probably what R’s designers had in mind when they made for a foreach. This is, nevertheless, less of a boon than it seems: when you foreach directly over the array, you lose track of an index variable — that is, you can do stuff with the array, but it’s really hard to return another array or manipulate its structure any further. Frankly, the only use I can see for this direct construct is show-type procedures or simple folds. So what do we do if we want to return another array?

    One way to go about this is to write a snoc — a function that appends an element to the end of the array. An implementation of map would go like


    map <- function (f, x) {
        y <- c()                          # start empty; snoc(xs, e) is just c(xs, e)
        for (a in x) y <- snoc(y, f(a))   # reassign: R copies on append
        return(y)
    }

    Of course, no one does this, for two basic reasons. First of all, it’s untrustworthy: if one of the computations fails, you end up with a y array that’s shorter than the x array, whereas with preallocation the failed entries would simply remain NA. The second reason is shared with list-based maps: not all work in econometrics is done with cross-section data! Very often we want observations to depend on the past, which mainly indicates we need another data structure for time-series data. R has one, but it’s involved enough that most people still write code for vectors and matrices, not time series; time-series reasoning is inherently recursive, and somehow R’s lexical model can’t handle that.
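    (To make that recursive flavor concrete, here is a minimal sketch, in Haskell rather than R, of a hypothetical AR(1) recursion where each observation is built from the previous one; the coefficient 0.9 and the name ar1 are made up for illustration:)

    -- Hypothetical AR(1): y_t = 0.9 * y_{t-1} + e_t.
    -- scanl carries the previous observation forward, making the
    -- recursion on the past explicit instead of index-driven.
    ar1 :: Double -> [Double] -> [Double]
    ar1 y0 es = scanl (\yprev e -> 0.9 * yprev + e) y0 es

    -- ar1 0 [1, 1, 1]  ==  [0.0, 1.0, 1.9, 2.71]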

    In the end, we end up using our precious foreach as a simple for:


    map <- function (f, x) {
        y <- rep(NA, length(x))
        for (i in 1:length(x)) y[i] <- f(x[i])
        return(y)
    }

    Clearly, foreach is almost useless in my field.
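    (As an aside: in a language like Haskell, the lost index can be recovered without any counter at all, by zipping the array with [0..]; a minimal sketch, with a hypothetical helper name:)

    -- withIndex pairs every element with its position, counter-free.
    withIndex :: [a] -> [(Int, a)]
    withIndex = zip [0 ..]

    -- withIndex "abc"  ==  [(0,'a'), (1,'b'), (2,'c')]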

  4. The foreach construct is nevertheless a step forward in the process of liberating the coder from bean-counting. First, you have spaghetti programming:


    i = 0;
    A:
    y[i] = dance_around_and_do_stuff(X[i]);
    i = i + 1;
    if (i < 10) goto A;
    keep_on_chugging

    Then you have while, which attempts to emulate constraint reasoning even though it’s really not:


    i = 0;
    while (i < 10) { y[i] = dance_around_and_do_stuff(X[i]); i = i + 1; }

    The next step is abstracting the iteration pattern:


    for (i in 1:10) { y[i] = dance_around_and_do_stuff(X[i]); }
    keep_on_going_about

    Finally, foreach does away with the useless counter variable:


    foreach (a in X) { y.append(dance_around_and_do_stuff(a)); }
    keep_on_going_about

    Of course, the logical next step is to do away with the useless intermediate-storage variable:


    y = map dance_around_and_do_stuff X

    Ahhh. Much better. Why are people complaining about this?
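    (For the record, that last step runs as is in Haskell, with a made-up stand-in for dance_around_and_do_stuff:)

    -- danceAroundAndDoStuff is hypothetical; any function would do.
    danceAroundAndDoStuff :: Int -> Int
    danceAroundAndDoStuff x = 3 * x + 1

    main :: IO ()
    main = print (map danceAroundAndDoStuff [0 .. 9])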

  5. A spoonful of sugar makes the medicine go down:


    y = [dance_around_and_do_stuff a | a <- X]

    is alternate syntax for the previous expression involving map. It’s more verbose — basically because it eschews the currying and reintroduces the intermediate variable. This form is quite fun, though, because it allows you to introduce some filters as well:


    y = [dance_around_and_do_stuff a | a <- X, even a]

    Yay! This is almost constraint programming! Well, it isn’t, really, but it’s much better pseudoconstraint programming than while loops. You can use this template to brute-force your number theory and discrete math exercises right now in GHC.
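    (A minimal sketch of the kind of brute-forcing I mean, runnable in GHC; the bound of 25 is arbitrary:)

    -- All Pythagorean triples with sides up to 25, by brute force.
    triples :: [(Int, Int, Int)]
    triples = [ (a, b, c) | c <- [1 .. 25], b <- [1 .. c], a <- [1 .. b]
                          , a^2 + b^2 == c^2 ]

    main :: IO ()
    main = print triples   -- [(3,4,5),(6,8,10),(5,12,13),...]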

  6. Not all for-loops map (tee hee) directly into map constructs, though this is clearly the most common pattern. What can we do to keep the bean-counting down when they don’t?

    Please complete the following sequences:

    1. 1,2,3,4,5,…
    2. 1,2,4,8,16…
    3. 1,1,2,3,5,8…
    4. 1, 4, 3, 12, 11, …

    How are you supposed to know what goes next? You basically assume that the rules that have been holding … continue to hold. That’s how we come by most of the knowledge we possess: induction. Induction is unreliable, unscientific, and yet it is what we do all the time. Induction is responsible for great beauty, like a Charlie Mingus album, and great horrors, like becoming a racist after having been mugged a couple of times in the Bronx. I mean, go ask a cognitive scientist or seven whether this is more or less basic than for loops.

    Now, to be able to reason clearly about induction, you need to make explicit the rule you assume will continue to hold. Functional languages often force you to reverse your thought process in this sense, by having you define the new case in terms of the old cases — for example


    bignum 0 = 1                  -- the base case must come first,
    bignum n = 2 * bignum (n-1)   -- or the n pattern would swallow 0

    In languages with (n+k) patterns, we don’t even need this:


    bignum 0 = 1
    bignum (n+1) = 2 * bignum n

    The for-loop version of this is


    bignum = 1;
    for (i = 1; i <= n; i++) {
        bignum = 2 * bignum;
    }

    Go ask a cognitive scientist or seven which of these is a “natural” thought process. Feh.

  7. Now, recursion and induction are concepts that can express each other. Recursion is interesting because you do away with useless bean counting; optimizing compilers should turn this into whatever your computer can grok anyway. The fun starts when you start abstracting away the recursion patterns:


    -- NB: this shadows the Prelude’s own iterate (more on that below)
    iterate 0 base f = [base]
    iterate n base f = xs ++ [f (last xs)] where xs = iterate (n-1) base f

    Like map, this does away with a lot of repetitive loop coding:


    bignum n = iterate n 1 (*2)                  -- [1, 2, 4, ..., 2^n]
    smallernum n = iterate n 1 (\x -> x + (x/2))

    I really don’t understand your brain if you don’t find this a lot more productive than chugging away at for loops.
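    (Incidentally, GHC’s Prelude already ships an infinite cousin of this combinator, also called iterate, which is what the definition above shadows; a quick sketch of the correspondence, with a hypothetical bignum':)

    -- Prelude.iterate f x == [x, f x, f (f x), ...], lazily infinite,
    -- so the n-step version above is just a take away:
    bignum' :: Int -> [Integer]
    bignum' n = take (n + 1) (Prelude.iterate (* 2) 1)   -- [1, 2, ..., 2^n]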

  8. Now, get this: good compilers know theorems about recursion patterns. Take map fusion:

    map (f . g) = (map f) . (map g)

    Say that, in this sad world where you write for loops by hand, you want functions that double and square a bunch of numbers.

    function double (list) {
        newlist = list.new
        for (i in 1:length(list)) {
            newlist.append(2*list[i]);
        }
        return (newlist)
    }

    function square (list) {
        newlist = list.new
        for (i in 1:length(list)) {
            newlist.append(list[i]^2);
        }
        return (newlist)
    }

    Can you compose these babies? Of course you can.

    x = double (square (list))

    This is what your computer does:

    newlist = list.new
    for (i in 1:length(list)) {
        newlist.append(list[i]^2);
    }
    newnewlist = list.new
    for (i in 1:length(newlist)) {
        newnewlist.append(2*newlist[i]);
    }

    But wait. It should know that this is equivalent to

    function doublesquare (list) {
        newlist = list.new
        for (i in 1:length(list)) {
            newlist.append(2*list[i]^2);
        }
        return (newlist)
    }

    Well, with map fusion it does!
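    (Here is the same pipeline in Haskell, a minimal sketch; the fused version is exactly what the law above licenses, one traversal instead of two:)

    double, square :: [Int] -> [Int]
    double = map (2 *)
    square = map (^ 2)

    -- map f . map g == map (f . g), so the composition fuses:
    doublesquare :: [Int] -> [Int]
    doublesquare = map ((2 *) . (^ 2))

    main :: IO ()
    main = print (double (square [1 .. 5]) == doublesquare [1 .. 5])   -- True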

  9. Well, this is too long already. I don’t really expect to have persuaded you that, if the revolution comes and your for-loops are forcibly taken away from you, you will have so much fun you won’t miss your handcuffs. But anyway, if you’ve read this far you at least have a good idea of why this econometrician is so passionate about functional programming.


4 Responses to “Guido is wrong! for loops, declarativeness and Charlie Mingus”

  1. Nice! It’s really good to see the beauty of Haskell even when not coding myself! I completely agree with you. I think a major advantage of Haskell over other languages is composability (except for when you’re in a Monad, of course). It might be interesting to note that map, for example, is really easy to make multi-threaded (the compiler could even do that automatically), whereas a compiler for a language with for-loops and side-effects is not able to make that decision. I think Haskell’s going to win big time on the ease of making applications run on multiple cores and/or processors.

  2. Not Guido

    What was your point? That list comprehensions aren’t a superset of & always better than map (which I recall something about Guido saying, once upon a time)?

  3. casey

    I see your point about the drawback of foreach losing the index counter.

    That’s why I commonly wrap sequences in Python with the enumerate function. It maps every element of the sequence to a tuple containing (index, value).

  4. We have the same avatar! I think somebody owes me money…

