MDS

...now browsing by tag

 
 

A quick buster example

Thursday, July 9th, 2009

Some people have asked of late for examples of Buster.  What follows is the (annotated) code that I used to generate the FDL video of the particles swarming around the lambda in the last post:

module Main where

import qualified Data.Array.IO as A
import Graphics.Rendering.Hieroglyph
import Graphics.Rendering.Hieroglyph.OpenGL
import App.EventBus
import IsoFDL
import Random
import Data.Colour
import Data.Colour.Names
import System.Random
import Control.Concurrent

dim = 480
minSeedCoord = fromIntegral $ quot dim 2 - 80
maxSeedCoord = fromIntegral $ quot dim 2 + 80
minIdealDist = (-100)
maxIdealDist = 100
npts = 100
basestrength = 4 

randomRIOs :: Double -> Double -> IO [Double]
randomRIOs a b = do
    g <- getStdGen
    return $ randomRs (a, b) g

This is the only behaviour we define, adjustLayout, and it runs a single layout iteration over the data.  We take the old locales of points, compute an FDL iteration, take the new locales of points, and produce the event, which is “Visible” (this is important, as Hieroglyph looks for all items in the Visible group, assumes they are o the class Visual, and attempts to render them.  We also produce a re-rendering request, and return the list of rerender and our layout as a future to be used at the appropriate time by the buster framework.  One final thing to note about our event is that it should live for 10 iterations.  Often geometry is persistent, but this was used to give the “contrail” effect to the particles as they move, effectively layering slightly transparent geometry over the top of older slightly transparent geometry to create the effect.

The Hieroglyph code is also fairly straightforward.  We apply linewidth and line color combinators (using the colours in Russell O’Connor’s excellent Data.Colour library) to a list of paths which are slightly exaggerated (that’s the +/-(x1-x0) and +/-(y1-y0) operations on the Points in the path) vectors along the path that was most recently traveled.

adjustLayout :: [Attractor] -> (Arr (Int,Int) Double) -> (Arr Int Double) -> (Arr Int Double) -> Behaviour [EData HieroglyphGLRuntime]
adjustLayout poles dists xs ys b = do
    pts <- externalizeLayout xs ys
    layoutIteration basestrength poles xs ys dists
    pts' <- externalizeLayout xs ys
    event' <- produce "Visible" "" (show . head $ pts') (Iterations 10)
                . (:[])
                . EOther
                . Geometry
                . linewidth 1
                . strokecolour (dissolve 0.3 (opaque darkblue))
                . name "points"
                $ zipWith (\(x0,y0) (x1,y1) -> path{ begin=Point (x0-(x1-x0)) (y0-(y1-y0)), segments=[Line $ Point (x1+(x1-x0)) (y1+(y1-y0))] }) pts pts'
    rerender <- produce "Hieroglyph" "" "Rerender" once []
    future b . return $ [event',rerender]

The next function, behaviour would normally be pure, but because we have some data that is internal to the adjustLayout behaviour, it’s got a do.  One of the things that I will change in the near future about Hieroglyph and OpenGL is to make the events it can respond to more polymorphic.  Currently I require that to use Hieroglyph in OpenGL that the event data type is [EData HieroglyphGLRuntime].  That’s pretty inane if I do say so myself, and in the future, there will be a constraint more along the lines of (HieroglyphInteractiveEventData a) => a rather than a static type to be determined at compile time.  However, with the current state of things, this is our behaviour, and it’s still pretty straightforward.  renderBehaviour depends on the result of adjustLayout.  That’s it.  The <~< establishes the dependency relationship and buster takes care of the rest.  The “poles” are a series of attractors (defined in IsoFDL.hs), which are static fields attached to a fixed point in space, attracting or repelling depending on the constant given them.

behaviour = do
    randomDistances <- randomRIOs minIdealDist maxIdealDist >>= A.newListArray ((0,0),(npts-1,npts-1)) . take (npts^2) . drop (2*npts)
    randomXs <- randomRIOs minSeedCoord maxSeedCoord >>= A.newListArray (0,npts-1) . take npts
    randomYs <- randomRIOs minSeedCoord maxSeedCoord >>= A.newListArray (0,npts-1) . take npts . drop npts

    poles <- mapM (\(x,y) -> compileAttractor . Attractor x y . take npts $ repeat 150)
        [(100,(440-100)), (170,(440-160)), (220,(440-220)), (160,(440-280)),
         (100,(440-320)), (280,(440-280)), (350,(440-320)), (380,(440-300))]

    return $ renderBehaviour <~< adjustLayout poles randomDistances randomXs randomYs

main = behaviour >>= boilerplateOpenGLMain  [initializeBus "Testing Force Directed layouts" dim dim]

Finally, our main function is incredibly simple.  We simply bind the behaviour to the boilerplate buster OpenGL main.  The other argument to boilerplateOpenGLMain is a list of “widgets” to be bound to the bus at the beginning of the application.

An ugly force-directed layout implementation

Thursday, July 9th, 2009

Updated 7.9.2009: I’ve added another video showing the effect of attractors on the layout

Unalbe to show flash video

I’ve been working on visualizations silently of late.  Today, though, I threw together a quick FDL (force-directed layout) algorithm based on POV-Ray’s static-field isosurface or “blob” equation, which is (sorry, but I’m using plaintext)

density = strength * (1 - (distance / radius)^2)^2

The documented equation has a problem, in that if distance > radius, the density approaches infinity fast, so I change it to be:

density = strength * (1 - (min(distance, radius) / radius)^2)^2

That solves the infinity problem, and now all there is to the algorithm is to apply it:

adjust :: Double -> Double -> Double -> Double -> Double -> Double -> (Double,Double)
adjust sx sy ss sr dx dy = if isNaN vx || isNaN vy then (dx,dy) else (dx+force * vx, dy+force * vy)
    where force = density ss sr dist
          dist = sqrt ((dx-sx)^^2 + (dy-sy)^^2)
          vx = (dx-sx) / dist
          vy = (dy-sy) / dist

Note that it is possible for dist to be NaN for a single point-to-point interaction, so we correct for that.  Basically, this function is a single adjustment from point set to a point set that fits our force constraint a little better.  In the following code, we take a static vector, <sx,sy> and a movable vector <dx,dy> and take the distance.  We apply the density function to the distance using ss as our strength and sr as the radius or distance at which the density function falls off to zero.  We return the translated point P’ which is translated along the normal vector <vx,vy> by the variable “force”. Simple and straightforward.

Now I know… by my title, you’re thinking, “What’s ugly about that?” Well, when I code fast, I use shortcuts.  The following code, which uses this adjustment function as a kernel for the actual force-directed layout algorithm is imperative.  I’ve tried FDL and MDS (multi-dimensional scaling) algorithms before using lists and tries and always ran into enough overhead that it significantly diminished the number of points that are viable to run a single iteration in real-time.  I’m sure there’s a way to do it, but I solicit the community’s help in suggesting a faster way than this.  Yes, I could have used STUArray and avoided IO, but that doesn’t really eliminate the imperative nature of things.

layoutIteration :: Double -> [Attractor] -> (Arr Int Double) -> (Arr Int Double) -> (Arr (Int,Int) Double) -> IO ()
layoutIteration alpha attractors xsArray ysArray radii = do
    bounds <- A.getBounds radii
    forM_ (A.range bounds) $ \(r,c) -> when (r /= c) $ do
        radius <- radii -| (r,c)
        x0 <- xsArray -| c
        y0 <- ysArray -| c
        x1 <- xsArray -| r
        y1 <- ysArray -| r
        let (x',y') = adjust x0 y0 alpha radius x1 y1

        xsArray =| r $ x'
        ysArray =| r $ y'

    forM_ attractors $ \(CAttractor ax ay arArray) -> do
        bounds <- A.getBounds arArray
        forM_ (A.range bounds) $ \ix -> do
            x0 <- xsArray -| ix
            y0 <- ysArray -| ix
            ar <- arArray -| ix
            let (x',y') = adjust ax ay (-alpha) ar x0 y0
            xsArray =| ix $ x'
            ysArray =| ix $ y'

As you can see, this is pretty ugly, even with the aliased (-|) readArray and (=|) writeArray binary operators cleaning things up a bit.  It’s straightforward, and random access to the points is O(1).  There are algorithms for which this is more important than this one, and I can see a list comprehension version coalescing even as I write this, but this kind of code is O(N^2). The constants are sometimes (depending on the adjustment function) quite high, so what I really need is the fastest version possible, probably taking a lot of advantage of fusion in the compiler.  The other problem, which is more subtle,  is that points are updated multiple times in one pass and we always want to use the latest point.  Any functional solution would have to take this into account (and storing all the update deltas and taking their centroid won’t work, because each update wouldn’t then be based on the point’s most current position).

This one seems to handle up to about 250 points quite well, which is decent for an FDL algorithm (keep in mind, that’s 10000 distance calculations per iteration, all of which have to happen before sending any points to the video card and other such matters — doing this 30-50 times a second along with other functionality is harder than it sounds).  Handling more than that would require one of two modifications to the algorithm: either select a random N moves to make per iteration or subdivide the point set into sqrt(N) size blocks, perform the FDL on each of those, and then on their centroid, translating them all by the delta.

So essentially, there were two purposes to this exercise.  One was coding an FDL algorithm in Haskell, but the other was trying out an FDL I hadn’t seen tried before.  This one in particular does not result in the usual circular pattern arising and also handles negative forces as easily as positive forces with a simple, low-overhead function. Oh, and of course the other was showing that I could get realtime performance of such an algorithm (that previously people insisted had to be done in C to be done properly) with a relatively naive implementation.

Force directed layout algorithms are useful for tasks like graph layouts and other dense datasets as well as for performing a sort of “visual” clustering.   Here’s an example showing the starting state, with all points clustered in a square in the middle of the screen:

screenshot-testing-force-directed-layouts-1

And here is one after the layout has run for a few hundred iterations:

screenshot-testing-force-directed-layoutsThe evenness reflected here is because the ideal distances I selected are random.  If there is a system to the distances, there will be a system to the layout.  The neat thing about this algorithm is that given any set of input points and distances, the result of the algorithm at any point is deterministic. Follows is the code that I used to generate the above pictures and a short movie of the forces in action.

This is the code for the algorithm and a test program in Hieroglyph/Buster

A video of the algorithm in action

Unalbe to show flash video