I saw this post already about Understanding Shakespeare using data visualization techniques, and I’m just not sure how I feel about it. The play is presented as a grid of word clouds: characters across, acts and scenes down. The theory is that you can learn about a character’s progression through the play by looking at how their word frequency changes. Look, for example, at Hamlet. Tell me what you see. I can’t see anything enlightening, but maybe I’m missing it.

I think what you could do with this is add another layer of semantic detail. Imagine if you could group all the “light” and “dark” words together, and then look at Romeo and Juliet. Or Macbeth. Then, I think, you might start to see patterns. Or what if you could pick out and compare the usage of “you” versus “thou” in certain interactions between characters? I’m often told that this is a very important key to their relationships.

Somebody does a version of this technique every year, building a tag cloud of the current President’s State of the Union address. Over time, that’s fascinating. You see how some presidents spent more of their time talking about the Depression and economic issues, then some had to deal with war, Germany, Russia … all the way up to modern times, where the word terrorism shows up and never goes away.

I wonder if somebody could chart Shakespeare’s usage over time, and see how his own vocabulary expanded. To be valid, though, we’d really need to know when he wrote everything, and I don’t think we can ever really know that.
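
For what it's worth, here is a rough sketch in Python of the grouping idea above: instead of raw word frequency, count how often each character uses words from a few hand-picked groups ("light" versus "dark", "you" versus "thou"). The word lists, the `group_counts` function, and the three-line sample are all my own made-up illustrations, not anything from the visualization in that post, and a real attempt would need the full text and much better word lists.

```python
import re
from collections import Counter, defaultdict

# Hypothetical word groups, picked by hand purely for illustration.
WORD_GROUPS = {
    "light": {"light", "bright", "sun", "day", "star", "torch"},
    "dark": {"dark", "night", "black", "shadow", "death"},
    "you": {"you", "your", "yours"},
    "thou": {"thou", "thee", "thy", "thine"},
}


def group_counts(lines):
    """Tally word-group usage per speaker.

    lines: iterable of (speaker, text) pairs.
    Returns a dict mapping speaker -> Counter of group name -> count.
    """
    totals = defaultdict(Counter)
    for speaker, text in lines:
        # Lowercase the line and split it into simple word tokens.
        for word in re.findall(r"[a-z']+", text.lower()):
            for group, members in WORD_GROUPS.items():
                if word in members:
                    totals[speaker][group] += 1
    return totals


# A tiny sample from Romeo and Juliet, just to exercise the function.
sample = [
    ("ROMEO", "But, soft! what light through yonder window breaks? "
              "It is the east, and Juliet is the sun."),
    ("JULIET", "O Romeo, Romeo! wherefore art thou Romeo? "
               "Deny thy father and refuse thy name."),
    ("JULIET", "Thou know'st the mask of night is on my face."),
]

counts = group_counts(sample)
for speaker, tally in counts.items():
    print(speaker, dict(tally))
```

Run the same tally scene by scene rather than over the whole play and you would have the numbers to draw the same character-by-scene grid, only with groups of words in each cell instead of single words. Whether any patterns actually show up is another question.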