As the volume of Big Data grows, so do the career prospects for the experts who analyze this information, solve complex problems, and make informed, data-driven decisions for the future.
Those experts are data analysts and data scientists.
What is the difference between the two? It’s complicated, in part because the disciplines are so closely related. But perhaps one way to differentiate these growing fields is to introduce yet another term, one that applies specifically to data science: data product.
Solving Complex Problems
What is a data product?
“The traditional answer to this question is usually ‘any application that combines data and algorithms,’” said Benjamin Bengfort, Ph.D., Program Director for the Data Science Certificate Programs at Georgetown University. “But to be frank, if you’re writing software and you’re not combining data with algorithms, then what are you doing? After all, data is the currency of programming.”
Bengfort suggested a more workable definition.
“More specifically, we might say that a data product is the combination of data with statistical algorithms that are used for inference or prediction.” And this kind of process, a hypothesis-driven analysis that employs machine learning to solve complex problems and make predictions, is data science.
“The outcome of data science,” Bengfort said, “is a technology that automates decision making—usually at the individual level.”
So, what is data analytics? It is also question-driven, but in this case, the principal purpose is not to create hypotheses that could result in new data products, but to thoroughly analyze existing datasets to more deeply understand the processes from which they were derived.
“The primary product of data analytics is insights,” Bengfort said. “The primary outcome of data analysis is data-driven decision-making—usually at the strategic level.”
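To make the distinction concrete, here is a minimal sketch in Python. It is an illustration only, not something drawn from Bengfort or the Georgetown curriculum: the customer records, the function names, and the choice of a simple least-squares line are all assumptions made for the example. The first function summarizes existing data to yield an insight, in the spirit of analytics; the other two fit a small statistical model and use it to predict an outcome for an individual, in the spirit of a data product.

```python
from statistics import mean

# Hypothetical, made-up records: (weekly_visits, monthly_spend_usd) per customer.
records = [(1, 4.0), (2, 7.0), (3, 10.0), (4, 13.0)]


def summarize(rows):
    """Analytics: describe the existing data to produce an insight."""
    return {
        "avg_visits": mean(v for v, _ in rows),
        "avg_spend": mean(s for _, s in rows),
    }


def fit_line(rows):
    """Data product, step 1: least-squares fit of spend = slope * visits + intercept."""
    xs = [v for v, _ in rows]
    ys = [s for _, s in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in rows) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return slope, y_bar - slope * x_bar


def predict_spend(model, visits):
    """Data product, step 2: predict spend for one individual customer."""
    slope, intercept = model
    return slope * visits + intercept


insight = summarize(records)      # an insight for a strategic decision
model = fit_line(records)         # a model that can automate a decision
print(insight)
print(predict_spend(model, 6))    # prediction for a customer not in the data
```

The summary informs a decision that a person still has to make, while the fitted model can be called automatically for each new customer, which loosely mirrors the strategic-versus-individual contrast Bengfort draws.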
A Focus on Ethics
In Georgetown’s data certificate programs, students engage in both data analytics and data science. The main differentiator is the student’s level of experience in both fields. For example, the 12-week Data Analytics Bootcamp provides students with a solid foundation in both data analytics and data science. The Certificate in Data Science—a cohort program consisting of eight courses over six months—goes deeper into data science and assures employers that those completing the program are well-versed in the topic.
Recent participants have worked for a variety of private- and public-sector employers, including the Department of Transportation, Bureau of Labor Statistics, Department of Defense, International Monetary Fund, U.S. Army, Google, and Amazon.
One aspect of all the certificate programs that is unique to Georgetown is a focus on professional ethics.
“I would say that if you put 10 data scientists in a room and asked them about the ethical implications of their work, nine of them would say, ‘There are no ethical implications—it’s just math,’” said Bengfort, a Navy veteran. “To me, that’s the equivalent of saying, ‘I was just following orders.’ But as professionals, and especially mid-career professionals, we are not drones that are bound by the math. We do have ways to influence what we’re doing, and we have a responsibility to consider the implications.”