Are maps the new gauges?

Over the past couple of years, most data visualization vendors have been adding spatial and mapping-related functionality to their product suites. The first iterations were cumbersome to use, with special geographic data types that needed to be projected onto custom maps. Today it is much, much simpler, with capabilities to automatically map geography-related attributes (such as state and zip code). This lets existing data sets be plotted onto maps without the need for spatial references such as longitude/latitude or complex vector shapes. When doing this for the first time it is almost magical. You select a measure, specify some geographical attributes and presto: bars appear on the map in the right places. For us data enthusiasts this leads to a mapping frenzy where we take every data set in our repository and project it onto maps in more and more intricate ways. This was exactly what happened when I first started playing around with gauges (speedometers, thermometers etc.) and other “fancy” visualizations when they became available oh so many years ago. Today I roll my eyes at that kind of wasted “artistry”: so many pixels, so little information. So after having cooled down from my initial childish joy over a new way to display data, I started thinking about its value.

When it comes to data visualizations I always ask myself: Does this add value to the data compared to displaying it in a simple table? With gauges it's pretty easy to answer that one: No. With maps? A little more difficult. The thing is: Maps encode information that is useful in itself and is universally understood. Location, distance and area are all easily grasped by basically anyone looking at a map. Plotting data points onto a map can add value by leveraging this. Here are some examples:

  • Highlight clusters through color coding.
  • Give a scale of density of some occurrence.
  • Show the distance between occurrences of something.

However, the data itself must be of a kind where this information is not readily apparent. For instance, a map of the US with states color coded by the percentage they contribute to total sales (who has not seen this?) does not add any value compared to a table. The map is not adding any context to the data; it is basically there for show. Much like the good old gauges. My point is that the data needs to be geographically relevant. What we show has to relate to the information inherently present in geographic encoding. The volume of data also has to be big enough that these relationships are not obvious, or that significant work would be needed to categorize the data before the relationships made sense. A good example of this is the “Chicago Crime Data” sample data set provided with the public preview of GeoFlow for Excel. Here we see how the map adds a lot of understanding to a data set that is geographically relevant. Deducing the insights we get from simply looking at the clustering in the map would be impossible by simply scrolling through the data set. If we were to present this in tabular form we would have a very hard time conveying the spatial information a map gives us. A lot of upfront work would need to be done to create the kind of clusters and spatial information the map gives us.
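To give a feel for the "upfront work" a table would need, here is a minimal sketch in Python that counts points per grid cell, which is roughly the density information a map shows at a glance. The function name, the cell size and the coordinates are all made up for illustration; this is not how GeoFlow works and the points are not taken from the Chicago data set.

```python
import math
from collections import Counter


def bin_points(points, cell_size=0.01):
    """Count (lat, lon) points per grid cell of cell_size degrees per side."""
    counts = Counter()
    for lat, lon in points:
        # Flooring the scaled coordinate gives the index of the cell it falls in.
        cell = (math.floor(lat / cell_size), math.floor(lon / cell_size))
        counts[cell] += 1
    return counts


# Hypothetical coordinates, for illustration only.
incidents = [(41.881, -87.623), (41.882, -87.624), (41.955, -87.705)]
density = bin_points(incidents)
densest_cell, hits = density.most_common(1)[0]
```

Even with this done, the result is just a list of cell indexes and counts. You would still have to explain where each cell is, which is exactly what the map gives you for free.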

So in short: Are maps the new gauges? I would say not really. There is true value to be exploited by projecting data points onto a map. But as always, the right tool should be used for the job at hand.

SQL Server Analysis Services StressTester Beta 1.1.3

Fixes numerous bugs including:

  • Thread-related issues.
  • Wrong timing of queries.
  • Only one instance of the server / client is allowed to run at a time on the same machine.
  • Query counts not updating correctly.
  • Network code made a little more efficient by sending query results from the client to the server in batches of five.
  • Overall memory consumption lowered significantly in the Server.
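For the curious, batching results can be sketched roughly like this. This is a Python illustration only, not the StressTester code; the class name, the `send` callback and the method names are assumptions made for the example.

```python
class ResultBatcher:
    """Collects query results and hands them to `send` in fixed-size batches."""

    def __init__(self, send, batch_size=5):
        self._send = send              # hypothetical callback that does the network send
        self._batch_size = batch_size
        self._pending = []

    def add(self, result):
        self._pending.append(result)
        # Send as soon as a full batch has been collected.
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self):
        # Send whatever is left, e.g. when the test ends on a partial batch.
        if self._pending:
            self._send(list(self._pending))
            self._pending.clear()


sent = []
batcher = ResultBatcher(sent.append)
for query_id in range(12):
    batcher.add(query_id)
batcher.flush()
```

Twelve results end up as three sends instead of twelve. The final `flush()` matters, otherwise the last partial batch never reaches the server.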

SQL Server Analysis Services StressTester Beta 1.1.2

Update: There is an issue with how the test progress is displayed in both the server and client. This is due to the new multithreading support, which wreaks havoc with my variables and/or events. This does not affect the execution of the test itself, just the progress report.

Note that I have changed the version scheme to align with the CodePlex scheme.

This release fixes some nasty bugs and adds a couple of features:

  • Fixed a bug where the server would crash if clients could not connect to the target.
  • Added tool-tips to most controls.
  • Added a log window to the server that displays server and client activity.
  • Added the option to run multiple threads on clients.
  • Expanded the delay feature so that client threads pause a random number of milliseconds between a low and high value before issuing the next query.
  • Made the server a lot less resource (CPU) intensive.
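The random delay between queries works roughly along these lines. This is a Python sketch for illustration, not the actual client code; the function names and the injectable `execute` and `sleep` parameters are assumptions.

```python
import random
import time


def random_delay_ms(low_ms, high_ms):
    """Pick a whole number of milliseconds between low_ms and high_ms, inclusive."""
    if low_ms > high_ms:
        raise ValueError("low_ms must not exceed high_ms")
    return random.randint(low_ms, high_ms)


def run_queries(queries, execute, low_ms, high_ms, sleep=time.sleep):
    """Run each query, pausing a random delay before issuing the next one."""
    for index, query in enumerate(queries):
        execute(query)
        # No pause is needed after the last query.
        if index < len(queries) - 1:
            sleep(random_delay_ms(low_ms, high_ms) / 1000.0)


executed, pauses = [], []
run_queries(["q1", "q2", "q3"], executed.append, 100, 500, sleep=pauses.append)
```

Each client thread would run its own loop like this, so the threads drift apart instead of hitting the target in lockstep.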

Note that existing tests (.test) will break with this release.

SQL Server Analysis Services Stress Tester Alpha 2.1 release

Some minor stuff:

  • Query editor now accepts newlines
  • Added an option to clear the SSAS cache before executing a test

If you have saved tests in the previous version, these will break unless you add the following to the XML under the <test> root:

Yes, I am still planning to do some documenting 😉