We inhabit a world where the boundaries between literacy and numeracy have become increasingly fuzzy, and where all media have become fundamentally multimedia. Likewise, as consumers of information, we have grown increasingly accustomed to the role of data visualization in everyday life. From complex maps of World Cup matchups, to responsive infographics that guide our home-buying and rental choices, to animated plots that teach us what we each can be doing to help flatten the curve, interactive data visualization has become a critical part of how we understand the world around us and make decisions.
As data visualization becomes more widespread, the ability to distill large amounts of data for a general audience has become one of the most valued skills a data scientist can have, and you don’t have to be a graphic designer to do it! Below is a Q&A I hosted with Laura Lorenz, Senior Software Engineer at Prefect, to discuss how you can make rich, interactive, responsible visualizations part of your data engineering toolkit.
Bengfort: Why is interactivity important to visualizations?
Lorenz: Whenever I bring a static visualization to a meeting, usually for some first-pass analysis embedded into a PowerPoint (the old school way), there are always more questions. There’s always another viewpoint I missed. Answering those questions becomes so much more self-serve if the visualization is built interactively first. Sometimes there’s a whole other slice that shows a really critical distinction. Locking your data into one analysis isn’t necessary anymore with all the interactive libraries we have today. And sometimes, you can’t even really “see” the story of a visualization without the benefit of an animation. Time series data is a great example: there’s often a great benefit to animating it to showcase the passage of time. You recently recommended the late Hans Rosling’s TED Talk about this topic, which I would also recommend to everyone. He shows how adding interactivity to a few visualizations can reveal incredible insights.
Bengfort: What tools or libraries do data scientists need to learn to produce interactive visualizations?
Bengfort: The enthusiastic response of the data visualization community to the COVID crisis has drawn mixed feedback; how much do data scientists need to understand about the data they're plotting? How should we navigate the responsibilities of data visualization?
Lorenz: I’m glad you phrased it in terms of ‘responsibilities’. While we were brainstorming the curriculum for the Certificate in Advanced Data Science, we were coming up with the program outcomes we wanted students to walk away with. When we got to the visualization outcomes, we were brainstorming adjectives for good visualizations like “robust”, “rich”, “interactive”, “compelling”, and then we stopped and said, “None of these are about ethics”, at which point I suggested the word “responsible”. It really clicked. Your visualizations will be responsible for telling a story, for changing people’s minds, which changes their behavior. Visualizations are memorable and easy to share; they have a lot of power. Regarding the COVID crisis, the “flatten the curve” visualizations were the backbone of a public health communications strategy that depended on a visual snapshot someone could share easily on Facebook. That graph is going to be part of mainstream consciousness for a long time. There’s no one answer to this, but it’s something you have to be cognizant of all the time. There are some basic rules of thumb, you know, keeping your axis ranges similar, that type of thing, which can be memorized. But there is also an intuition you can build.
Bengfort: How much data is it possible to visualize? Is there such a thing as too much data?
It is easy to see that data science requires thoughtful consideration of how we use data to aid decision making or even automate it. Design and creativity, combined with rigorous and programmatic methods, are required to ensure that data visualization is not just successful, but also effective and responsible in communicating insights to information consumers. If you’re interested in learning more about interactive data visualization, Laura is teaching a course on the subject in the Certificate in Advanced Data Science program this fall.