### Finally some hacking of my own: a counter datatype

While examining ways to refactor the PFP library, it struck me that its strategy of storing repeated values and dealing with displaying them as an aggregate later was somewhat generalizable. This is a first stab at a counter datatype with constant-time update and (probably) O(m) lookup — where m is the sum of counts. The alternate strategy, using a Map, has O(log(n)) lookup and update time — where n is the number of counters.

What do we want to do here?

module Counter (count, inc, dec, empty, fromCounter, toCounter) where

We want to store a list of values together with counts, e.g.

*Counter> example

"Rock" 18

"Jazz" 27

"Classical" 13

" Etc." 35

We should be able to retrieve any count

*Counter> count "Rock" example

18

We want cheap `inc`

*Counter> inc "Rock" example

"Rock" 19

"Jazz" 27

"Classical" 13

" Etc." 35

`inc`

should handle gracefully (and still cheaply) new values:

*Counter> inc "Traditional" example

"Traditional" 1

"Rock" 18

"Jazz" 27

"Classical" 13

" Etc." 35

There should be a functioning `dec`

that also handles invalid cases gracefully:

*Counter> dec "Classical" example

"Rock" 18

"Jazz" 27

"Classical" 12

" Etc." 35

*Counter> dec "Mbembe" example

"Rock" 18

"Jazz" 27

"Classical" 13

" Etc." 35

There should be a simply `empty`

counter so new counters can be constructed via `inc`

chains:

*Counter> inc False $ inc True $ inc False $ inc True empty

False 2

True 2

Finally, there should be convenience functions to construct a counter from a list of (key, count) pairs

*Counter> toCounter [("Bjork",15),("Gentle Giant",12)]

"Bjork" 15

"Gentle Giant" 12

and to destruct a counter into such a list

*Counter> fromCounter example

[("Rock",18),("Jazz",27),("Classical",13),(" Etc.",35)]

Here’s how we implement it.

A counter is merely a list of values.

data Counter a = C {unC :: [a] }

`unC`

is not a field name, it’s an automatically generated destructor. Maybe I’m catching a bad habit from Erwig.

We increment a key’s counter by consing it to the beginning of that list:

inc, dec :: (Eq a)=> a -> Counter a -> Counter a

inc x = C . (x:) . unC

We decrement a counter by removing the first appearance of the key in the list

dec x (C y) = case (findIndex (==x) y) of

Just a -> C (take a y ++ drop (a+1) y)

Nothing -> C y

This is the empty counter:

empty :: Counter a

empty = C []

We retrieve a counter by, oh me, counting:

count ::(Eq a) => a -> Counter a -> Int

count x = length . filter (==x) . unC

The counter is trivially destructed into a list of (key,value) pairs from this function:

fromCounter :: (Eq a) => Counter a -> [(a, Int)]

fromCounter (C x ) = map (\y->(y,count y (C x))) (nub x)

To construct a counter, we employ an auxiliary function for n-increasing a count; we don’t export it elsewhere.

toCounter :: (Eq a) => [(a, Int)] -> Counter a

toCounter [] = empty

toCounter ((a,b):xs) = ninc b a (toCounter xs) where

ninc :: (Eq a) => Int -> a -> Counter a -> Counter a

ninc 0 x y = y

ninc n x y = inc x (ninc (n-1) x y)

Finally, our `Show`

method. Beware, it’s kind of ugly. First we display one key and its count

countShow :: (Eq a, Show a) => a -> Counter a -> String

countShow y x = (show y) ++ "\t" ++ (show (count y x))++"\n"

Then we define this ugly auxiliary function to keep track both of the whole counter and of the part that remains to count.

show_ :: (Eq a, Show a) => [a] -> [a]-> String

show_ _ [] = ""

show_ everything remaining = countShow (head remaining) (C everything) ++ show_ everything (tail remaining)

From here `show`

is trivial.

instance (Eq a, Show a) => Show (Counter a) where show (C x) = show_ x (nub x)

I hope I’m not reinventing the wheel. Or if I am, I hope it’s a smart wheel I’m reinventing!

Filed under: Uncategorized | 8 Comments

‘dec’ and ‘count’ will (in the worst case) have traverse the entire list. This means they both are O(n). This makes fromCounter O(n^2)! Using a “Map a Int” is a much better choice, fromCounter would be O(n), and all other operations O(log n).

I am also not sure if your semantics of dec are the best choice. If you simply allow a negative count then the following laws hold:

* inc a . dec a == dec a . inc a == id

* totalCount . dec a == subtract 1 . totalCount

where totalCount = sum . map snd . fromCounter

The implementation also becomes simpler if you use Data.Map, because it handles most stuff for you:

> data Counter a = C { unC :: M.Map a Int }

> empty = C M.empty

> inc a = C . M.insertWith (+) a (+1) . unC

> dec a = C . M.alter dec’ a . unC

where

dec’ Nothing = Just (-1)

dec’ (Just 1) = Nothing

dec’ (Just c) = Just (c – 1)

> count a = M.findWithDefault 0 a . unC

> toCounter = C . M.fromList

> fromCounter = M.toList . unC

> showCounter = unlines . map (\a c -> show a ++ “\t” ++ show c) . fromCounter

Well, as I said, I explicitly wanted very cheap inc. count will be only occasional, and dec/toCounter/fromCounter are included only for completeness’ sake.

Why use that double list traversal for dropping an element, instead of using

dec x (C y) = C . deleteFirstsBy (==) y [x]

Your n-inc can also be written using the design choices you’ve made:

ninc n x (C y) = C (y ++ replicate n x)

I would even go so far as to do

toCounter xs = C (concatMap (uncurry replicate) xs)

As for show-ing the whole thing, why not just do

show = unlines . map (\(x,y) -> show x ++ “\t” ++ show y) . (map(\x->(head x,length x))) . group . sort

Points taken!

It seems “bad” to have the Show instance require an Ord context.

It’s an Eq context. And how else do you propose having a counter over an Eqless map? Data.Map probably requires it too.

No, I mean in response to Mikael’s group . sort thingy. A Show instance shouldn’t, in my opinion, introduce any new constraints other than Show. I’m just saying that if your structure doesn’t require Ord, the Show instance definitely shouldn’t add it. If you change it to use a Map, then you have an Ord constraint, but it’s introduced by the structure not by the Show instance.

That’s because Show is doing on-the-fly most of the work that in a Map implementation is done upon insertion. This is a conscious design decision, to have cheap inc.