Hieroglyph HOWTO, part I: Sparklines

Written by Jeff Heard on February 3rd, 2009
> module Main where
>
> import Data.List
> import Graphics.Rendering.Hieroglyph
> import System.Environment (getArgs)
> import qualified Data.ByteString.Lazy.Char8 as BStr
>

This is a simple demonstration of Hieroglyph for a non-interactive application.  The drawing is output directly to a PNG file of the specified width and height.

Formally, a visualization is a function from data to a Visual, where a Visual is an unstructured collection of Primitives.  These Primitives are documented in the module Graphics.Rendering.Hieroglyph.Primitives.  A purely functional framework such as this affords the programmer several advantages. The first is that the drawing closely resembles the data: it is a straightforward transformation, not unlike a stylesheet, and entirely without side effects or dependencies outside the drawing itself. Another advantage is that the drawing itself can be transformed and read by code. For example, the Visual can be indexed in an R-Tree or P-Tree, and collision detection (mouse detection) becomes straightforward to implement. A third advantage is that, because the drawing algorithm is a straightforward breadth-first search without frills, it is easy to reason about why your drawing didn’t come out the way you expected it to. A final advantage is that you can be lazy about what you draw. Being declarative instead of imperative about drawing means that you don’t have to break up your code or write separate code to change the level of detail or to do off-screen culling. These become just a series of data transformations, or a matter of how much of your Visual you pass to the renderer.
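
To make that last advantage concrete, here is a minimal sketch of culling and level of detail written as ordinary list transformations. The types Pt, Polyline and Viewport and the functions visible, cull and decimate are toy stand-ins invented for this illustration; they are not Hieroglyph's own types or API.

```haskell
-- Toy stand-ins for illustration only; these are NOT Hieroglyph's types.
data Pt = Pt Double Double deriving (Eq, Show)

-- A "primitive" here is just a polyline: a list of points.
newtype Polyline = Polyline [Pt] deriving (Eq, Show)

-- Axis-aligned viewport: left, top, right, bottom.
data Viewport = Viewport Double Double Double Double

-- True when at least one vertex of the polyline lies inside the viewport.
-- (A real culler would test bounding boxes; vertex tests keep the sketch short.)
visible :: Viewport -> Polyline -> Bool
visible (Viewport l t r b) (Polyline pts) = any inside pts
  where inside (Pt x y) = x >= l && x <= r && y >= t && y <= b

-- Off-screen culling is then an ordinary filter over the drawing.
cull :: Viewport -> [Polyline] -> [Polyline]
cull vp = filter (visible vp)

-- Level of detail is another plain transformation: keep every k-th vertex.
decimate :: Int -> Polyline -> Polyline
decimate k (Polyline pts) = Polyline (everyKth pts)
  where
    step = max 1 k
    everyKth []     = []
    everyKth (p:ps) = p : everyKth (drop (step - 1) ps)
```

Something like map (decimate 4) (cull viewport drawing) would then shrink a drawing before it ever reaches a renderer, with no changes to the drawing code itself.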

Alright. If I haven’t convinced you by now that you ought to be doing your 2D
drawing in a purely functional way, I’m not going to.

To create a visualization in Hieroglyph, we write a function or functions from our data to a Visual. Then we use a transfer operation to take the Visual and put it to the page in main.

Our first simple visualization will be a sparkline. Sparklines are inline unscaled line graphs meant to replace or accent a table in a block of text. Let’s see how we do that. First of all, here’s the type signature of our sparkline. Notice that there is no IO in it: we are defining a drawing, but we aren’t actually doing the drawing yet.

> sparkline :: Point -> Double -> Double -> [Double] -> Primitive

Note that Point is a 2D point defined as two Doubles x and y by Hieroglyph. Primitive is the instance of Visual we’re using, since a sparkline can be represented by a single path. Now, on to the code.

> sparkline (Point startx starty) width height values = path{ begin=point0, segments=map Line points }

That one line defines the drawing itself, a line beginning at the leftmost x whose segments are compressed horizontally into the space we define by “width” and whose y values are compressed to height and scaled according to the values we passed in. A sparkline. It took me four lines of text, but it takes Hieroglyph and Haskell one line of code, with a bit of addenda.

>    where (point0:points) = zipWith Point xvals yvals
>          xvals = iterate (+(width/n)) startx
>          yvals = map (remap mx mn starty (starty+height)) values
>          (mx,mn,_,_,_,n) = stats values

Here we define our points in terms of x values and y values. The x values are a list of Doubles starting at the leftmost point, startx, and continuing in increments of our width divided by the number of data points we’re trying to compress into that space. The list is infinite, but zipWith stops as soon as the y values run out. The y values are defined by remapping the values from their natural range, defined by mx and mn, to the geometry of our drawing, defined by height and starty. We define the stats function shortly and the remap function only a little further down. Note briefly, though, the underscores. Those mean that stats returns more values than we’re actually going to use, and we’re ignoring them. Thanks to the magic of laziness in Haskell, the ignored mean, variance, and standard deviation are never actually calculated, saving us a bit of work.
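
If you want to check the geometry without Hieroglyph installed, the same arithmetic can be sketched with plain tuples. The name sparkPoints is made up for this example, tuples stand in for Hieroglyph's Point, and maximum and minimum replace the stats function purely to keep the sketch short.

```haskell
-- Plain tuples stand in for Hieroglyph's Point; sparkPoints is a hypothetical name.
sparkPoints :: (Double, Double) -> Double -> Double -> [Double] -> [(Double, Double)]
sparkPoints (startx, starty) width height values = zip xvals yvals
  where
    n     = fromIntegral (length values)
    mx    = maximum values
    mn    = minimum values
    -- Infinite list of x positions; zip stops when yvals runs out.
    xvals = iterate (+ (width / n)) startx
    -- Largest value maps to starty (top), smallest to starty + height (bottom).
    yvals = map (\v -> (v - mn) / (mx - mn) * negate height + starty + height) values

-- sparkPoints (0, 0) 100 20 [1, 3, 2, 5]
--   == [(0.0,20.0),(25.0,10.0),(50.0,15.0),(75.0,0.0)]
```

The y axis is flipped on purpose: image coordinates grow downward, so the largest value has to land on the smallest y.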

I’m going to go ahead and define main for you before I define anything else, because then you can see how we go from a Visual, which is an abstract and purely functional entity, to drawing on paper, which is not. Here goes.

> main = do

This line grabs the program arguments in the order we expect them by pattern matching on the list getArgs returns. It’s not generally condoned to do things this way, since any other number of arguments makes the match fail at run time, but it prevents us from littering our example with bulky parameter checking code and from littering our import list with other modules for you to find and compile.

>   [dataset,widthAsc,heightAsc,outputfile] <- getArgs
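
For the curious, here is roughly what the parameter checking we are skipping might look like. This is only a sketch: Options, its field names, parseArgs and the program name in the usage message are all hypothetical, and the only library call is readMaybe from Text.Read in base.

```haskell
import Text.Read (readMaybe)

-- Hypothetical record holding the four validated arguments.
data Options = Options
  { optData   :: FilePath
  , optWidth  :: Double
  , optHeight :: Double
  , optOutput :: FilePath
  } deriving (Eq, Show)

-- Validate the argument list instead of pattern matching on it blindly.
parseArgs :: [String] -> Either String Options
parseArgs [dataset, widthAsc, heightAsc, outputfile] =
  case (readMaybe widthAsc, readMaybe heightAsc) of
    (Just w, Just h)
      | w > 0 && h > 0 -> Right (Options dataset w h outputfile)
      | otherwise      -> Left "width and height must be positive"
    _ -> Left "width and height must be numbers"
parseArgs _ = Left "usage: sparkline DATAFILE WIDTH HEIGHT OUTPUT.png"
```

In main you would run parseArgs over the result of getArgs and print the Left message before exiting. It is more code than the one-line pattern match, which is exactly why the example leaves it out.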

There’s a lot going on in the next line. We map the function to the left of `fmap` over the result of the action on the right, which is of type “IO BStr.ByteString” (appropriately, since it comes from disc). Reading the left side from right to left, it splits the string into lines, unpacks each line from our lazy representation into an individual Haskell String, and then reads those Strings into Doubles.

>   values <- (map (read . BStr.unpack) . BStr.lines) `fmap` BStr.readFile dataset
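
You can try the pure half of that pipeline on an in-memory ByteString, with no file involved. The names parseValues, parseValuesSafe and sample are invented for this sketch. The second function is an assumption about what a more forgiving parser might do, since read throws on a blank or malformed line.

```haskell
import qualified Data.ByteString.Lazy.Char8 as BStr
import Data.Maybe (mapMaybe)
import Text.Read (readMaybe)

-- Same pipeline as in main, applied to an in-memory ByteString instead of a file.
parseValues :: BStr.ByteString -> [Double]
parseValues = map (read . BStr.unpack) . BStr.lines

-- Hypothetical forgiving variant: silently skip lines that are not numbers.
parseValuesSafe :: BStr.ByteString -> [Double]
parseValuesSafe = mapMaybe (readMaybe . BStr.unpack) . BStr.lines

sample :: [Double]
sample = parseValues (BStr.pack "1.5\n2\n3.25\n")
-- sample == [1.5,2.0,3.25]
```

Whether skipping bad lines is the right behaviour depends on your data; for a quick sparkline it is usually preferable to a crash.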

Now, we briefly read our PNG width and height from the arguments, and then progress to the visualization.

>   let width = read widthAsc
>       height = read heightAsc
>       visualization = (sparkline origin width height values){ attribs = whiteStroke }
>       whiteStroke = plain{ strokeRGBA=white }

Here. These two short lines create our sparkline and set its stroke to white when we show it.
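
If the braces after the parenthesised call look odd, that is Haskell's record update syntax applied to the result of a function. Here is a toy illustration. The field names mirror the ones used above, but Attribs, Prim, mkPath, strokeWidth, vertices and the colour values are made up for this sketch and are not Hieroglyph's definitions.

```haskell
-- Toy records for illustration; NOT Hieroglyph's definitions.
data Attribs = Attribs
  { strokeRGBA  :: (Double, Double, Double, Double)
  , strokeWidth :: Double
  } deriving (Eq, Show)

data Prim = Prim
  { attribs  :: Attribs
  , vertices :: [(Double, Double)]
  } deriving (Eq, Show)

-- Default attributes: an opaque black stroke, one unit wide (assumed values).
plain :: Attribs
plain = Attribs { strokeRGBA = (0, 0, 0, 1), strokeWidth = 1 }

white :: (Double, Double, Double, Double)
white = (1, 1, 1, 1)

mkPath :: [(Double, Double)] -> Prim
mkPath vs = Prim { attribs = plain, vertices = vs }

-- Record update binds tighter than function application, hence the
-- parentheses around the call before the braces.
styled :: Prim
styled = (mkPath [(0, 0), (1, 1)]) { attribs = plain { strokeRGBA = white } }
```

The update copies the record and changes only the named field, so everything else in the toy plain value is left as it was.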

The next line is the main action of the program and transfers the drawing we’ve done to the physical drawing on disc.

>   renderToPNG outputfile (round width) (round height) visualization

And that’s it for main. Now to define the stats function as a function from a list of doubles to a tuple of crunched math:

> stats :: [Double] -> (Double,Double,Double,Double,Double,Double)

First define the finished product. The function foldl’ takes an initial value and folds the list into it, like one might fold egg-whites into a cake batter. The (x,x,x,x*x,1) is the initial value, defined by the first value in the list of data.

> stats (x:xs) = finish . foldl' stats' (x,x,x,x*x,1) $ xs
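
One caveat: because stats only matches (x:xs), it has no case for an empty list. As a smaller illustration of the same seed-from-the-head trick, here is a sketch that tracks just the extrema and returns Nothing for empty input. The name extrema is invented for this example, and forcing the accumulator with seq is my own addition rather than something the code above does.

```haskell
import Data.List (foldl')

-- Hypothetical total variant: Nothing for an empty list instead of a
-- pattern-match failure.
extrema :: [Double] -> Maybe (Double, Double)
extrema []     = Nothing
extrema (x:xs) = Just (foldl' step (x, x) xs)
  where
    -- Force both components so no chain of thunks builds up inside the tuple.
    step (mx, mn) v =
      let mx' = max v mx
          mn' = min v mn
      in mx' `seq` mn' `seq` (mx', mn')

-- extrema [3, 1, 4, 1, 5] == Just (5.0,1.0)
```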

Now we define stats’, our folding function, which calculates the global extrema, list length, sum, and sum of squares all at once. Why, when Haskell defines for us so conveniently functions for maximum, minimum, sum, and length, would we want to do this? Simple. This is done in one list traversal instead of several. One for loop as opposed to five.

> stats' (mx,mn,s,ss,n) x = ( max x mx
>                           , min x mn
>                           , s + x
>                           , ss + x*x
>                           , n+1 )

Now we get to the finish. We’ve calculated the sum and sum of squares, but not the mean or standard deviation. We replace those now, and the task of computing summary stats is finished.

> finish (mx,mn,s,ss,n) = (mx,mn,av,va,stdev,n)
>    where av = s/n
>          va = ss/(n-1) - n*av*av/(n-1)
>          stdev = sqrt va
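
As a sanity check on that variance formula, the sketch below compares the one-pass sum-of-squares form with the textbook two-pass form. The names variance1 and variance2 are invented here. Both compute the sample variance, dividing by n-1.

```haskell
import Data.List (foldl')

-- One traversal: running sum, sum of squares, and count.
variance1 :: [Double] -> Double
variance1 xs = ss / (n - 1) - n * av * av / (n - 1)
  where
    (s, ss, n) = foldl' step (0, 0, 0) xs
    step (a, b, c) x = (a + x, b + x * x, c + 1)
    av = s / n

-- Two traversals: mean first, then squared deviations from it.
variance2 :: [Double] -> Double
variance2 xs = sum [(x - av) ^ (2 :: Int) | x <- xs] / (n - 1)
  where
    n  = fromIntegral (length xs)
    av = sum xs / n
```

For [2,4,4,4,5,5,7,9] both come out at 32/7, about 4.571. The one-pass form can lose precision when the mean is large compared with the spread, which is harmless for a sparkline but worth knowing if you reuse it elsewhere.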

One last function to define before we’re done. Remap takes a value x in the source range [mn,mx] and maps it linearly onto the target range [mn',mx'].

> remap :: Double -> Double -> Double -> Double -> Double -> Double
> remap mx mn mx' mn' x = (x-mn) / (mx-mn) * (mx'-mn') + mn'
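
A few worked values make the direction of the mapping easier to see. The sketch repeats remap so it runs on its own, and adds remapSafe, a hypothetical guard of my own for a constant series, where mx equals mn and the division would otherwise produce NaN.

```haskell
-- Copied from the post so the example runs on its own.
remap :: Double -> Double -> Double -> Double -> Double -> Double
remap mx mn mx' mn' x = (x - mn) / (mx - mn) * (mx' - mn') + mn'

-- Hypothetical guard: a constant series has mx == mn, which would divide
-- by zero, so place every point midway between the two targets instead.
remapSafe :: Double -> Double -> Double -> Double -> Double -> Double
remapSafe mx mn mx' mn' x
  | mx == mn  = (mx' + mn') / 2
  | otherwise = remap mx mn mx' mn' x

-- With source range [0,10], mx' = 0 and mn' = 100:
--   remap 10 0 0 100 0  == 100.0   (smallest value lands on mn')
--   remap 10 0 0 100 5  == 50.0
--   remap 10 0 0 100 10 == 0.0     (largest value lands on mx')
```

This is how sparkline uses it: it passes starty as mx' and starty+height as mn', so the largest value is drawn at the top of the image.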

That simple, and in a very few lines of code, and with no graphics modules imported other than Hieroglyph, we’ve created a sparkline!

Hieroglyph contains primitives for lines, splines, arcs, quads, text, and loading and placing images. These can be used as we’ve seen in this module, in static images, or they can be used in interactive, dynamic visualizations by adding the Interactive scaffolding module.

2 Comments so far ↓

  1. Thomas Davie says:

    “Formally, a Drawing is an n-ary tree consisting of drawing Contexts and sub-
    drawings. The contexts describe modifications to the state of the pen and paper”

    So… not functional at all :(.

    Bob

  2. Jeff Heard says:

    Actually, I just can’t self-edit. That was old text about a vis being a tree. It hasn’t been a tree since two versions ago or so. A visualization in Hieroglyph is a function from data to class Visual, where Visual is any unstructured collection of primitives.
