Information visualization is a very applied field that prides itself on its useful real-world applications. At the same time, it is missing a culture of reflection that would allow us to distill deeper knowledge from individual systems. Such knowledge would be easier to transfer between systems and also provide a stronger foundation for the field.

My goal is to develop a deeper understanding of visualization. By building models both top-down and bottom-up, I hope to develop a framework that will allow us to get a better grasp on how and why visualization works, and what makes it unique. This includes not only insights into perceptual and cognitive aspects of visualization, but also a clearer picture of what kinds of problems are best solved using visual means (and why).

While the visual encoding of individual data items is well understood, how these marks work together is not. My student Caroline Ziemkiewicz and I have performed a number of experiments that highlight the role the overall visual structure plays in visualization, and have made some progress in understanding how it works. After finding unexplained differences in the performance of different tree visualizations (like treemaps and node-link diagrams), we decided to investigate this class first. We were able to show that the linguistic metaphor study participants were primed with (levels or containment) affected their performance, depending on whether the visual metaphor used was compatible with the one used in the question.

To take this further, we decided to look at the role of design elements in the perception of visualizations. Our study not only showed a strong semantic influence of seemingly unimportant design decisions (such as borders around the parts of a pie chart, space between parts, etc.), but also hinted at a physically inspired interpretation of even very simple, abstract charts. We also investigated the role simulation plays in the perception of charts and visualizations. Study participants seem to interpret the simple shapes they see as physical objects and perform simple simulations of their potential movements when reading a chart. This has important consequences not just for the overall semantics, but also for the values people remember from charts.

Because a large number of studies is necessary to test different hypotheses, we came up with the novel idea of running them on Amazon's Mechanical Turk service. Designing studies for this environment presents certain challenges, but it also provides a means of getting data back very quickly, and from a broader range of people than in the typical university lab study. User studies are an important part of my research, and have been since the beginning of my work in visualization. Understanding not just how to best conduct them, but also their limitations, is a particular interest of mine.

Empirical studies are undoubtedly key to understanding how visualization works, but I believe that we also need other models to think about visualization. I have argued that visualization is not a hard science, and that we need to look around more for models of thinking beyond psychology and the technical sciences. One area that I believe to be especially promising is art theory and criticism. The aesthetic criterion of the sublime is a particularly effective means of understanding the difference between visualization, information graphics, data art, etc. But even within visualization, using Nelson Goodman's ideas about visual languages and notationality provides a good way of understanding why different kinds of visualization present different kinds of problems and require different ways of thinking about them.

While I consider theory to be important, I do not value it only for its own sake. Like most of my peers, I have developed useful tools and techniques that are applied to real-world data and problems. My work on Semantic Depth of Field (SDOF) is used in a system for in-silico drug discovery, and has many other applications. The Parallel Sets technique is one of only a handful of ways of visualizing categorical data. The Parallel Sets program was published as an open-source project, and is being actively developed and maintained. I have also worked on wire transaction data for fraud detection and on extracting the hidden social networks from a database of terrorist attacks. The use of visualization for communicating data is also a particular interest of mine.

Developing better systems requires a deeper understanding of how visualization works. We know a few things, but there are still vast areas we hardly know exist. We need to keep digging deeper and building a more solid theoretical foundation for the practical work this field values. We cannot expect to build ever better and more useful systems without understanding what they are made of.

The engineering side of visualization is well developed and produces useful systems, but the science of visualization is still in its infancy.