Even a complete idiot like me can write a simple parser in 20 minutes in Haskell.

While killing time, it struck me that I wanted an English-to-Nadsat translator. Wikipedia pointed me to this page, where there is a translator for Windows. The very same page has the dictionary it uses, in the following form:

A
appy polly loggies : apologies :: School boy speak
appy polly loggy : apology :: School boy speak
B
baboochka : old woman :: Russian (babooshka/grandmother)
baboochkas : old women :: Russian (babooshka/grandmother)
baddiwad : bad :: School boy speak
baddiwadest : baddest :: School boy speak
banda : band :: Russian (banda/band, gang)
bandas : bands :: Russian (banda/band, gang)
bezoomny : mad :: Russian (byezoomiyi/mad, insane)

Yes, this can probably be handled with regexes, since the format is regular enough, but that’s too complicated for me. I’m not smart enough for that finite automata stuff.

So I copied all that text as-is and saved it as nadsat.dict. Then I began writing the parser.


module Nadsat where

import Text.ParserCombinators.Parsec
import qualified Data.Map as Map
import Data.Char

The first thing I wanted was to weed out the “A”, “B”, etc. headers:


header = oneOf ['A'..'Z']

Then I wrote a parser for one line of the dictionary:


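word = do {
-- Skip a letter header ("B" alone on its line) if one precedes this entry.
optional (try (header >> newline));
nadsat <- anyChar `manyTill` (char ':');
english <- anyChar `manyTill` (string "::");
etymology <- anyChar `manyTill` newline; -- parsed but not used
-- Keep letters only: drops the padding spaces (and any spaces inside phrases).
return $ Map.singleton (filter isAlpha english) (filter isAlpha nadsat);
}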
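
To see what the line parser actually returns, it can be run on a single line with Parsec's parse, no file needed. This is just a sketch: entry and parseLine are names I made up for the example, and entry is a trimmed copy of the parser above that only handles one entry line.

```haskell
import Data.Char (isAlpha)
import qualified Data.Map as Map
import Text.ParserCombinators.Parsec

-- `entry` is a hypothetical name: a copy of the line parser,
-- kept separate so it can be tried on its own.
entry :: Parser (Map.Map String String)
entry = do
  nadsat  <- anyChar `manyTill` char ':'
  english <- anyChar `manyTill` string "::"
  _       <- anyChar `manyTill` newline   -- etymology, thrown away
  return $ Map.singleton (filter isAlpha english) (filter isAlpha nadsat)

-- Run the parser on one line; "sample" is only a label for error messages.
parseLine :: String -> Either ParseError [(String, String)]
parseLine = fmap Map.toList . parse entry "sample"

main :: IO ()
main = do
  print (parseLine "baddiwad : bad :: School boy speak\n")
  -- prints: Right [("bad","baddiwad")]
  print (parseLine "baboochka : old woman :: Russian (babooshka/grandmother)\n")
  -- prints: Right [("oldwoman","baboochka")]
```

Note the second result: filter isAlpha also removes the space inside "old woman", so multi-word entries end up as a single run of letters.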

Given that, writing the parser for a whole dictionary is easy:


dict = do {
skipMany header; -- skip any header at the very top of the file
entries <- many word;
return $ Map.unions entries;
}
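
For comparison, here is a stricter, line-by-line alternative that treats every line as either a header or an entry and keeps the spaces inside phrases. It is only a sketch under a couple of assumptions: every line ends in a newline and there are no blank lines. The names trim, headerLine, entryLine, dictLines and sample are mine, not part of the code above.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)
import Data.Maybe (catMaybes)
import qualified Data.Map as Map
import Text.ParserCombinators.Parsec

-- Hypothetical helper: strip whitespace at both ends, keep inner spaces.
trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace

-- A header is one capital letter alone on its line; it yields no entry.
headerLine :: Parser (Maybe (String, String))
headerLine = try (upper >> newline) >> return Nothing

-- "nadsat : english :: etymology", returned as (english, nadsat).
entryLine :: Parser (Maybe (String, String))
entryLine = do
  nadsat  <- many1 (noneOf ":\n")
  _       <- char ':'
  english <- many1 (noneOf ":\n")
  _       <- string "::"
  _       <- manyTill anyChar newline     -- etymology, ignored
  return (Just (trim english, trim nadsat))

dictLines :: Parser (Map.Map String String)
dictLines = do
  ls <- many (headerLine <|> entryLine)
  eof                                     -- fail loudly on leftover input
  return (Map.fromList (catMaybes ls))

sample :: String
sample = unlines
  [ "A"
  , "appy polly loggies : apologies :: School boy speak"
  , "B"
  , "baboochka : old woman :: Russian (babooshka/grandmother)"
  ]

main :: IO ()
main = print (Map.toList <$> parse dictLines "sample" sample)
-- prints: Right [("apologies","appy polly loggies"),("old woman","baboochka")]
```

The eof at the end is a design choice: without it, a malformed line in the middle of the file would silently end the dictionary instead of reporting an error.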

The trickiest part (and it’s not really tricky) is this:

nadsatDict = parseFromFile dict "nadsat.dict"

Strictly speaking, nadsatDict is not a function but an IO action, of type IO (Either ParseError (Map.Map [Char] [Char])), because parsing the file might fail. Luckily, Either ParseError is an instance of Functor, so fmap lets us apply a function to the Map inside the Either without unwrapping it by hand.

This function does the bulk of the translating, given a Map containing the dictionary. It looks up each whitespace-separated word and leaves unknown words untouched.
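
Here is a small sketch of what fmap does with an Either sitting inside IO. So that it runs without a dictionary file, it uses a stand-in called fakeDict (my invention) and, as a simplifying assumption, String as the error type instead of ParseError.

```haskell
import qualified Data.Map as Map

type Dict = Map.Map String String

-- Stand-in for nadsatDict so the example runs without a file.
-- Assumption: the error type is String here, not Parsec's ParseError.
fakeDict :: IO (Either String Dict)
fakeDict = return (Right (Map.fromList [("bad", "baddiwad")]))

-- fmap reaches inside the Either: a Right value is transformed,
-- a Left value is passed through unchanged.
sizeOf :: Either String Dict -> Either String Int
sizeOf = fmap Map.size

main :: IO ()
main = do
  result <- fakeDict
  print (sizeOf result)                     -- prints: Right 1
  print (sizeOf (Left "no such file"))      -- prints: Left "no such file"
  -- The explicit alternative to fmap: pattern match on both cases.
  case result of
    Left err -> putStrLn ("parse failed: " ++ err)
    Right d  -> print (Map.lookup "bad" d)  -- prints: Just "baddiwad"
```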


subst s m = unwords $ map (\x-> Map.findWithDefault x x m) (words s)
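
A quick way to try subst is with a hand-made Map. The toy dictionary below (toyDict, my name) uses two entries from the listing above; the type signature is one I added for the example.

```haskell
import qualified Data.Map as Map

subst :: String -> Map.Map String String -> String
subst s m = unwords $ map (\x -> Map.findWithDefault x x m) (words s)

-- Hypothetical two-entry dictionary, English to Nadsat.
toyDict :: Map.Map String String
toyDict = Map.fromList [("bad", "baddiwad"), ("mad", "bezoomny")]

main :: IO ()
main = do
  putStrLn (subst "he is bad and mad" toyDict)
  -- prints: he is baddiwad and bezoomny
  putStrLn (subst "Bad, mad." toyDict)
  -- prints: Bad, mad.
```

The second call shows the limits of matching whole whitespace-separated words: capital letters and attached punctuation prevent a lookup from succeeding.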

Finally, we write main, trivially (modulo the fmap thing I mentioned before):


main = do { text <- getContents; d <- nadsatDict; print $ fmap (subst text) d; }

Bingo! Nadsat translator in 20 minutes. Even a complete idiot like me, never having had programming lessons or a CS education at all, can write a simple parser in Haskell in a few minutes. Can your language do that?



3 Responses to “The 20-minute parser”  

  1. Yes, I know I could use join findWithDefault instead of the lambda expression in subst.
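
For the curious, a sketch of what that looks like. For functions, join f x is f x x, so join Map.findWithDefault x m equals Map.findWithDefault x x m; the dictionary argument still has to be moved into place, which I do here with flip. The name substJoin is made up for the example.

```haskell
import Control.Monad (join)
import qualified Data.Map as Map

-- join on functions duplicates the argument: join f x = f x x.
-- flip then lets us supply the Map first and the word second.
substJoin :: String -> Map.Map String String -> String
substJoin s m = unwords $ map (flip (join Map.findWithDefault) m) (words s)

main :: IO ()
main = putStrLn (substJoin "he is bad" (Map.fromList [("bad", "baddiwad")]))
-- prints: he is baddiwad
```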

