InfoVis recap, part I

The conference is not actually over, but I only registered for two days of it (back when I foolishly thought that I should be frugal and conserve my ed benefits). So I can write up my impressions, in no particular order.

A lot of the paper talks were on networks and trees, which are not directly related to what the EPDC is doing, but they were interesting nonetheless. Also, I was excited by how much I could understand, having taken Discrete. Some of the talks addressed ways to make scatterplots and line plots more legible when there's a lot of overlap. A few papers took the approach of trying to "thin" the data while preserving overall patterns and any outliers. Other people developed interfaces that would let people view subsets of the data. This stuff seems like it might be useful to us, but I don't know how much time I'll have to play around with it. [It seems like my job description should really belong to two different people. To get away from the distraction of the little requests that come in all day long, I'm going to do a little SQL workshop for some of the team, but I expect that for some complicated select queries (and all delete queries), they will still have to ask for my assistance.]

Cool fact that I didn't know (or didn't remember): every planar graph has a planar rectilinear (straight-line) representation, meaning it can be drawn with no edge crossings and without resorting to curved lines. In practice, this doesn't really matter because the resulting straight-line drawing is unlikely to be easily intelligible.

I was surprised to learn that social scientists often(?) use matrix representations when studying social networks.
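For anyone else who hasn't seen the matrix approach, here is a minimal sketch in Python of what such a representation might look like. The names and friendships are made up for illustration (they are not from any of the talks), and `format_matrix` is just a hypothetical helper for printing. Row i, column j holds a 1 if person i named person j as a friend.

```python
# Hypothetical data: directed "named as a friend" pairs (invented for illustration).
people = ["Ann", "Bea", "Cal", "Dan"]
named = [("Ann", "Bea"), ("Bea", "Ann"), ("Cal", "Dan"), ("Dan", "Bea")]

# Map each name to its row/column position.
index = {name: i for i, name in enumerate(people)}

# matrix[i][j] == 1 means people[i] named people[j] as a friend.
matrix = [[0] * len(people) for _ in people]
for source, target in named:
    matrix[index[source]][index[target]] = 1


def format_matrix(people, matrix):
    """Return the matrix as text, one labeled row per person."""
    width = max(len(p) for p in people)
    # Column headers are just the first letter of each name.
    header = " " * width + " " + " ".join(p[0] for p in people)
    rows = [
        p.ljust(width) + " " + " ".join(str(cell) for cell in row)
        for p, row in zip(people, matrix)
    ]
    return "\n".join([header] + rows)


print(format_matrix(people, matrix))
```

As far as I understand it, the appeal is that a matrix never has crossing lines, and if you order the rows and columns so that groups sit next to each other, clusters show up as dense blocks.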

Someone made an interesting observation that most of the visualizations that we see seem to be developed by the authors in order to illustrate a point about their data. This is different from researchers actually using visualizations to find patterns in their data.

There was an image used in one of the presentations which is not in the paper, and I haven't been able to find it with Google, so I'll try to describe it. It illustrated a friendship network among a group of 4th-graders (probably around 40 kids), and was drawn in the 1930s. Apparently each kid was asked to name who his/her friends were, and the resulting relationships were shown with single or double-pointed arrows connecting nodes (kids), depending on whether the friend-naming was reciprocated. It's a small data set, but you can look at it and read such an interesting story out of it. For example, there were two girls who named each other as their only friend, and were not named by anyone else. And friendships were completely same-gender except for one boy who named a girl as a friend. The presenter pointed out that she didn't name him back. It was really cool. Anyway, the point that this was supposed to illustrate was the value of clustering. In this case, all the girls were clustered separately from the boys. This made it easy to spot the one lone line connecting the two groups, and to draw some conclusions.
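The kind of reading described above can also be done in a few lines of code. This is only a sketch in Python with invented names and friendships (not the real 1930s data, which I don't have): it separates reciprocated from one-way namings and picks out the namings that cross between the two clusters.

```python
# Hypothetical data, loosely modeled on the description above.
group = {
    "Amy": "girls", "Bev": "girls", "Cat": "girls", "Dee": "girls",
    "Ed": "boys", "Gus": "boys", "Hal": "boys",
}

# Each pair (a, b) means "a named b as a friend".
named = {
    ("Amy", "Bev"), ("Bev", "Amy"),   # an isolated mutual pair
    ("Cat", "Dee"), ("Dee", "Cat"),
    ("Ed", "Gus"), ("Gus", "Ed"),
    ("Hal", "Ed"),                    # not reciprocated
    ("Hal", "Cat"),                   # the one boy-to-girl naming
}

# Double-pointed arrows: both directions are present. Store each pair once.
mutual = sorted({tuple(sorted((a, b))) for (a, b) in named if (b, a) in named})

# Single-pointed arrows: the reverse direction is missing.
one_way = sorted((a, b) for (a, b) in named if (b, a) not in named)

# Namings that connect the two clusters.
cross_group = sorted((a, b) for (a, b) in named if group[a] != group[b])

print("mutual:", mutual)
print("one-way:", one_way)
print("cross-group:", cross_group)
```

With this made-up data, the only cross-group entry is Hal naming Cat, and it also shows up in the one-way list, which is the same story the picture told at a glance.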

The session on geographic visualization was pretty good. Worldmapper, which I adore, was there, and I spoke with Anna briefly. The code that they use (with modifications) is publicly available, but unfortunately my skills are inadequate for battling the error messages. I would *love* to be able to make cartograms. Textmap was kind of interesting, but it looks like there's a very limited number of words that you can look at. Here's one for George W. Bush. The map is way down at the bottom of the page, but I like the part up near the top where they have a list of Bush's "favorite things" (based on how frequently these words appear in the same article as his name). By this criterion, it claims that his favorite song is "Dixie Chicks: Shut Up and Sing." Which I doubt. hehe.

Sites that I've been to before but want to explore again: Visual Complexity and Information Aesthetics.

A lot of the papers that were presented used generic data sets to illustrate some new technique for drawing networks/trees/graphs/whatever, but I found the most interesting presentations to be those that explored actual data. One of my favorites was Network Visualization by Semantic Substrates, which demonstrates a particular technique as applied to data about citations of cases in the Supreme Court, federal courts, and circuit courts. The paper is available from that page, and I recommend reading it even though it's hard to get an idea of the coolness of the different display options from the paper alone. I haven't looked at them, but it looks like there's a PPT and a movie on the site also, which might be worth watching.

The winner for best paper was Hierarchical Edge Bundles, presented by Danny Holten, from TU/e, and I can't even find his page on the website because only some of the pages are translated into English... and I don't read Dutch. So here's a figure from the paper:

EdgeBundles

Unfortunately I'm having issues with Flickr so you can't click to see a larger size, but the basic idea is to bundle similar paths so that you can see overall trends. I have this paper (as well as all the other papers from the conference) in electronic form so if you'd like to see a more legible version, let me know and I'll send it to you.

Things I want to learn more about: parallel coordinate plots, circular hierarchy pie chart type things (not sure what these are actually called). One example is here:

circular

from MagnaView. It seems like a space-efficient way to present some types of data. I'm very tired so I'm not going to explain what this shows, but it's cool.

Tomorrow, more links.
