### Reddit discussion considered productive

Wow! I didn’t expect ~~the Spanish Inquisition~~ to land on the front page of programming.reddit.com (I didn’t even submit my post to the site), let alone to spark such a lively discussion there. I would just reply there, but I somehow feel it’s better to leave a more “permanent” record — even if only for my own future reference — of my thoughts on the issues raised.

Vintermann’s interpretation of my point was this:

The issue for me is that from the syntax, you could believe

```haskell
greet = do {
  putStr ("You're " ++ getLine ++ " years old! \n");
  putStr ("Congratulations! \n");
}
```

was valid Haskell, which it isn’t.

This is especially true since `<-` as destructive update/assignment is used in many pseudocode languages in introductory books — particularly Knuth’s TAOCP — as well as in some real languages (like The Other Language I Use, GNU R).
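For contrast, here is a sketch of what a valid version of that snippet might look like: the result of `getLine` has to be bound with `<-` inside do-notation before it can be spliced into a string (the `greeting` helper is a name I made up to keep the pure part separate):

```haskell
-- Hypothetical pure helper: builds the message without any IO.
greeting :: String -> String
greeting age = "You're " ++ age ++ " years old!\nCongratulations!\n"

-- A valid version of the quoted snippet: getLine's result is an
-- IO String, so it must be bound with <- before being concatenated.
greet :: IO ()
greet = do
  age <- getLine
  putStr (greeting age)
```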

Ricercar adds

(…) [M]onads are not limited to one interpretation, whereas “do notation” is strongly tied to the imperative interpretation.

This is much like “Lawyer Language” that is confusing precisely because it looks somewhat like natural language, when it really isn’t.

Well, yes, I couldn’t agree more. Maybe I’m unconsciously selecting the favorable comments — the ones that further illustrate my point. But let’s examine pjdelport’s reply:

That’s hardly fair; programming language constructs don’t carry the same expectation of consistency that natural languages do. Both their syntax and semantics are wildly variable and arbitrarily incompatible, for even the most common things.

There are two problems with this remark. First, ambiguity in natural language is in fact expected and discounted — often large chunks of natural-language exchange are confirmation requests and clarifications — while programming languages are mostly expected to be “logical” and consistent. Second, the fact that `head` in Beta means an iterator while in Haskell it means a cons-list “destructor” just means the languages are different — much as “cola” evokes “soda” in English while in Portuguese it means “glue”.

The problem here, as stated by ricercar, is that do-notation is by design an analogy to imperative (I should rather say, destructive-update sequential-evaluation) programming, and yet the analogy breaks the moment you step out of the IO monad — for example, when working with Parsec. Maybe the key distinction being missed here is the one between an analogy and an abstraction; this is particularly clear since this thread chunk is already in reply to a comment where pjdelport remarks that assumption breakage is structural to programming-language abstractions, which it just isn’t. (Ever seen assumption breakage with closures?)
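To see the analogy break without pulling in Parsec itself, here is a toy hand-rolled parser monad (all names made up; a stand-in for what Parsec does at much greater scale). In its do-blocks, `x <- p` means “run parser `p` on the remaining input and bind its result”, not destructive update of a variable:

```haskell
-- A minimal parser: a function from input to (result, leftover input),
-- or Nothing on failure. A toy stand-in for Parsec's Parser type.
newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> fmap (\(a, r) -> (f a, r)) (p s)

instance Applicative Parser where
  pure a = Parser $ \s -> Just (a, s)
  Parser pf <*> Parser pa = Parser $ \s -> case pf s of
    Nothing     -> Nothing
    Just (f, r) -> fmap (\(a, r') -> (f a, r')) (pa r)

instance Monad Parser where
  Parser p >>= f = Parser $ \s -> case p s of
    Nothing     -> Nothing
    Just (a, r) -> runParser (f a) r

-- Consume one character, failing on empty input.
item :: Parser Char
item = Parser $ \s -> case s of
  []       -> Nothing
  (c : cs) -> Just (c, cs)

-- The do-block threads parse state and possible failure,
-- not an "outside world".
twoChars :: Parser (Char, Char)
twoChars = do
  a <- item
  b <- item
  pure (a, b)
```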

Monads are an abstraction, and do-notation is just syntactic sugar — that is, an analogy, or a “little story”.

There’s a “little story” to be told about abstraction versus analogy here. In probability, we often work with “events” that are sets, and with countable intersections and unions of sets (that is, with elements of an event algebra). When two events are disjoint — i.e. when they can’t happen together, and their intersection is empty — the operator “+” is often used as union notation instead of the standard U-cup. This is for clarity, but it preserves compatibility with the fact that the probability of the disjoint union (“sum”) of two events is the sum of the probabilities of each event, as per the Kolmogorov axioms. (I’m thinking now that it should break for complementary sets of zero Lebesgue measure, but I probably don’t remember the technical details anymore.) That is, P(A+B) = P(A) + P(B).

On the other hand, some authors omit the logical “and” (or the inverted-U cap for intersection) when talking about joint events, and just concatenate event symbols as in symbolic multiplication notation. The problem is that we can only say that P(AB) = P(A)P(B) when A and B are independent events — a property that can no longer be clearly expressed in set notation, and which is often axiomatized, with “independence” being defined in terms of that very equation.

In other words, while “event sum” seemed to be constructing an “event algebra” entirely parallel to the arithmetic algebra over the probabilities of the events, the introduction of “event multiplication” shows that this notation is a leaky analogy, not an abstraction that would allow computation to be done on the events and translated to probabilities at the very last moment. Statistical theory later allows for such “separation of concerns”, but alas, it’s not as simple as carrying notation over.
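The two cases can be checked concretely on a toy finite sample space (two fair dice, uniform probability; all names here are made up for illustration). Disjoint “sums” always add; “products” of probabilities only match joint events when the events happen to be independent:

```haskell
-- Toy sample space: two fair dice, each outcome equally likely.
omega :: [(Int, Int)]
omega = [ (i, j) | i <- [1 .. 6], j <- [1 .. 6] ]

-- Probability of an event (a predicate on outcomes) under the
-- uniform measure: favorable outcomes over total outcomes.
prob :: ((Int, Int) -> Bool) -> Double
prob ev = fromIntegral (length (filter ev omega))
        / fromIntegral (length omega)

-- Two events that are independent by construction: one looks only
-- at the first die, the other only at the second.
firstIsEven, secondIsHigh :: (Int, Int) -> Bool
firstIsEven  (i, _) = even i
secondIsHigh (_, j) = j > 4
```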

Other subthreads carried on over this analogy-vs-abstraction theme in one way or another. Cale emphasizes the plural nature of monads:

There’s also no one true way of thinking about monads. My Monads as Containers article describes another way to look at things, which sometimes works better for things like List, Maybe, various Tree monads, and even Reader, and eventually for more computation-like monads (State, IO) if you stretch your brain enough.

I love Monads as Containers — it’s an analogy (that is, mental syntactic sugar!) without the pretense of being an abstraction of the big-T Truth. It’s also great as an intermediate step toward understanding monads, since it answers the “ontological” question (what *are* the goddamn monads?) that popped up in my head, without the snobjerk answer that monads are endofunctors on the category of types.

It also hints (although this is much better explained in his recent, badly-titled yet great The Monad Laws) at how monads are implemented in “plain Haskell”: a “box thing” like a particular monad has to be a particular type, and something abstracting away the structure of certain box-things has to be a type class.
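A minimal sketch of that division of labor (the `Box` type and its names are made up for illustration): the particular container is a concrete type, while the shared structure of all such box-things lives in the `Functor`/`Applicative`/`Monad` type classes:

```haskell
-- A made-up one-slot container: a particular type.
newtype Box a = Box a deriving (Show, Eq)

-- The structure common to all "box things" is captured by
-- type classes, not by the type itself.
instance Functor Box where
  fmap f (Box a) = Box (f a)

instance Applicative Box where
  pure = Box
  Box f <*> Box a = Box (f a)

instance Monad Box where
  Box a >>= f = f a   -- open the box, feed its content onward

-- Do-notation now works for Box, with no IO in sight.
doubled :: Box Int
doubled = do
  x <- Box 21
  pure (x * 2)
```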

The monads-as-containers view also manages not to be a leaky analogy, albeit at the cost of being overly general. All syntactic sugar is an analogy at some point; for example, square-bracket notation for cons-lists emphasizes that (a) all elements of a list must be of the same type, (b) the structure of a cons-list is decidedly non-hierarchical (certainly not a tree), (c) cons-lists can be empty or have just one element, and one-element lists are still not the same as the elements themselves, and (d) cons-lists are containers. Conversely, it’s not hard to think of syntactic sugar for Cale’s monads-as-containers analogy, though its overly general nature might lead to false analogy combination via false syllogism (flowerpots are containers, monads are containers, therefore flowerpots are monads!).
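The square-bracket sugar and points (c) above can be spelled out directly (names made up for illustration):

```haskell
-- Square-bracket sugar versus the cons cells it stands for.
sugared :: [Int]
sugared = [1, 2, 3]

desugared :: [Int]
desugared = 1 : 2 : 3 : []   -- same list, built from cons and nil

-- A one-element list is still a list, not the element itself:
-- `single` has type [Int], not Int, and [] is a perfectly fine list.
single, none :: [Int]
single = [1]
none   = []
```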

The bottom line of abstraction-versus-analogy analysis is that analogies are very personal — a mapping between an outer world of hard technical facts and an inner world of squishy intuitive thoughts — and mandated analogies require a fine balancing act (not too general, not too specific) that’s often very hard to pull off.

In basic calculus the derivative-as-angle analogy is often used, and the fit is nearly perfect because there are hard results guaranteeing it; integral-as-area already starts to leak because of negative-valued regions of the domain (which means it’s not general enough as an analogy). Moreover, this pair of analogies doesn’t yield the basic result of the field (the fundamental theorem of calculus), and working with them alone forever means losing out big time. This is why limits have to be taught to every student, and why the analogies are presented as firmly specified theoretical results, so that flowerpots-are-monads-style analogy mismatch doesn’t run amok.
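For reference, the result that neither analogy delivers on its own, in its usual statement:

```latex
% Fundamental theorem of calculus: for f continuous on [a,b]
% with antiderivative F (i.e. F' = f),
\int_a^b f(x)\,dx = F(b) - F(a)
```

Neither “angle” nor “area” intuition alone suggests why these two notions should be inverse to each other; that takes limits.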

The flip side to what so far looks like a definitive condemnation of do-notation for being a leaky analogy is that analogy is not the only role of syntactic sugar, and, more importantly, not the role people think of when engaging in syntax-based programming-language religious wars. Do-notation does in fact enable the programmer to write monadic functions that would not otherwise (due to mental stack overflow) be feasible to write.

The fact that it’s all we have for the time being does not mean that (a) it’s already good enough, with no room for improvement, or that (b) it’s a good idea to mention it on page 3 of every tutorial, way before polymorphic types and type classes are even introduced. If basic IO must be mentioned (so that compilable programs can be produced), *please* just define

`mangle f = getContents >>= putStr . f`

and vaguely mention that the outside world flows through `putStr . f`. Hey, maybe there’s a monads-as-aqueducts analogy waiting to be born there!
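A complete beginner-sized program in that style might look like this (the `shout` example is mine; note that the Prelude’s standard `interact` function is essentially `mangle` already):

```haskell
import Data.Char (toUpper)

-- The definition from the post: the outside world flows in through
-- getContents and out through putStr . f.
mangle :: (String -> String) -> IO ()
mangle f = getContents >>= putStr . f

-- A made-up pure transformation: upper-case everything on stdin.
shout :: String -> String
shout = map toUpper

main :: IO ()
main = mangle shout
```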

I’d like to once again thank all the fine folks participating in this discussion for their input and for provoking the new (for my brain) thoughts I’m trying to document here.

Filed under: How to teach Haskell, Mathematics and Haskell | 4 Comments

`interact` is your friend.

You write: “Moreover, this pair of analogies [derivative-as-angle, integral-as-area] doesn’t yield the basic result of the field (the fundamental theorem of calculus), and working directly with them forever means losing out big time…”

Derivative-as-angle is a bad idea to start with. It doesn’t even generalize properly to a two-dimensional domain. It should have long ago been completely replaced by the much more general (and still easy to understand) local-approximation-by-a-linear-function.

However, integral-as-area generalizes quite well, even to the most abstract setting (measurable function on a measure space). What doesn’t generalize is one concrete and very simple-minded method of /calculating/ the integral (the so-called ‘Riemann-Integral’).

I mean, let’s say you have f(x) = x^3

Integral from (-a) to a of x^3 is 0. But the *area* is 2*integral of 0 to a of x^3.

See?
