Finally some hacking of my own: a counter datatype

11Feb07

While examining ways to refactor the PFP library, it struck me that its strategy of storing repeated values and dealing with displaying them as an aggregate later was somewhat generalizable. This is a first stab at a counter datatype with constant-time update and (probably) O(m) lookup — where m is the sum of counts. The alternate strategy, using a Map, has O(log(n)) lookup and update time — where n is the number of counters.

What do we want to do here?


module Counter (count, inc, dec, empty, fromCounter, toCounter) where

We want to store a list of values together with counts, e.g.


*Counter> example
"Rock" 18
"Jazz" 27
"Classical" 13
" Etc." 35

We should be able to retrieve any count


*Counter> count "Rock" example
18

We want cheap inc


*Counter> inc "Rock" example
"Rock" 19
"Jazz" 27
"Classical" 13
" Etc." 35

inc should handle gracefully (and still cheaply) new values:


*Counter> inc "Traditional" example
"Traditional" 1
"Rock" 18
"Jazz" 27
"Classical" 13
" Etc." 35

There should be a functioning dec that also handles invalid cases gracefully:


*Counter> dec "Classical" example
"Rock" 18
"Jazz" 27
"Classical" 12
" Etc." 35
*Counter> dec "Mbembe" example
"Rock" 18
"Jazz" 27
"Classical" 13
" Etc." 35

There should be a simply empty counter so new counters can be constructed via inc chains:


*Counter> inc False $ inc True $ inc False $ inc True empty
False 2
True 2

Finally, there should be convenience functions to construct a counter from a list of (key, count) pairs


*Counter> toCounter [("Bjork",15),("Gentle Giant",12)]
"Bjork" 15
"Gentle Giant" 12

and to destruct a counter into such a list


*Counter> fromCounter example
[("Rock",18),("Jazz",27),("Classical",13),(" Etc.",35)]

Here’s how we implement it.

A counter is merely a list of values.


data Counter a = C {unC :: [a] }

unC is not a field name, it’s an automatically generated destructor. Maybe I’m catching a bad habit from Erwig.

We increment a key’s counter by consing it to the beginning of that list:


inc, dec :: (Eq a)=> a -> Counter a -> Counter a
inc x = C . (x:) . unC

We decrement a counter by removing the first appearance of the key in the list


dec x (C y) = case (findIndex (==x) y) of
Just a -> C (take a y ++ drop (a+1) y)
Nothing -> C y

This is the empty counter:


empty :: Counter a
empty = C []

We retrieve a counter by, oh me, counting:


count ::(Eq a) => a -> Counter a -> Int
count x = length . filter (==x) . unC

The counter is trivially destructed into a list of (key,value) pairs from this function:


fromCounter :: (Eq a) => Counter a -> [(a, Int)]
fromCounter (C x ) = map (\y->(y,count y (C x))) (nub x)

To construct a counter, we employ an auxiliary function for n-increasing a count; we don’t export it elsewhere.


toCounter :: (Eq a) => [(a, Int)] -> Counter a
toCounter [] = empty
toCounter ((a,b):xs) = ninc b a (toCounter xs) where
ninc :: (Eq a) => Int -> a -> Counter a -> Counter a
ninc 0 x y = y
ninc n x y = inc x (ninc (n-1) x y)

Finally, our Show method. Beware, it’s kind of ugly. First we display one key and its count


countShow :: (Eq a, Show a) => a -> Counter a -> String
countShow y x = (show y) ++ "\t" ++ (show (count y x))++"\n"

Then we define this ugly auxiliary function to keep track both of the whole counter and of the part that remains to count.


show_ :: (Eq a, Show a) => [a] -> [a]-> String
show_ _ [] = ""
show_ everything remaining = countShow (head remaining) (C everything) ++ show_ everything (tail remaining)

From here show is trivial.


instance (Eq a, Show a) => Show (Counter a) where show (C x) = show_ x (nub x)

I hope I’m not reinventing the wheel. Or if I am, I hope it’s a smart wheel I’m reinventing!



8 Responses to “Finally some hacking of my own: a counter datatype”

  1. 1 Twan van Laarhoven

    ‘dec’ and ‘count’ will (in the worst case) have traverse the entire list. This means they both are O(n). This makes fromCounter O(n^2)! Using a “Map a Int” is a much better choice, fromCounter would be O(n), and all other operations O(log n).

    I am also not sure if your semantics of dec are the best choice. If you simply allow a negative count then the following laws hold:
    * inc a . dec a == dec a . inc a == id
    * totalCount . dec a == subtract 1 . totalCount
    where totalCount = sum . map snd . fromCounter

    The implementation also becomes simpler if you use Data.Map, because it handles most stuff for you:
    > data Counter a = C { unC :: M.Map a Int }
    > empty = C M.empty
    > inc a = C . M.insertWith (+) a (+1) . unC
    > dec a = C . M.alter dec’ a . unC
    where
    dec’ Nothing = Just (-1)
    dec’ (Just 1) = Nothing
    dec’ (Just c) = Just (c – 1)
    > count a = M.findWithDefault 0 a . unC
    > toCounter = C . M.fromList
    > fromCounter = M.toList . unC
    > showCounter = unlines . map (\a c -> show a ++ “\t” ++ show c) . fromCounter

  2. Well, as I said, I explicitly wanted very cheap inc. count will be only occasional, and dec/toCounter/fromCounter are included only for completeness’ sake.

  3. Why use that double list traversal for dropping an element, instead of using
    dec x (C y) = C . deleteFirstsBy (==) y [x]

    Your n-inc can also be written using the design choices you’ve made:
    ninc n x (C y) = C (y ++ replicate n x)
    I would even go so far as to do
    toCounter xs = C (concatMap (uncurry replicate) xs)

    As for show-ing the whole thing, why not just do
    show = unlines . map (\(x,y) -> show x ++ “\t” ++ show y) . (map(\x->(head x,length x))) . group . sort

  4. Points taken!

  5. 5 kalmar

    It seems “bad” to have the Show instance require an Ord context.

  6. It’s an Eq context. And how else do you propose having a counter over an Eqless map? Data.Map probably requires it too.

  7. 7 kalmar

    No, I mean in response to Mikael’s group . sort thingy. A Show instance shouldn’t, in my opinion, introduce any new constraints other than Show. I’m just saying that if your structure doesn’t require Ord, the Show instance definitely shouldn’t add it. If you change it to use a Map, then you have an Ord constraint, but it’s introduced by the structure not by the Show instance.

  8. That’s because Show is doing on-the-fly most of the work that in a Map implementation is done upon insertion. This is a conscious design decision, to have cheap inc.


Leave a comment