Do we really need data scientists to parse our way through all that big data, or will programmers and engineers and admins handle things OK?
In a kind of weird coincidence, the same subject recently found its way into several business and technology trade press pubs -- almost as though it had been deliberately placed there. The topic was “the next big thing” in corporate career paths. It described a degree that you, if you’re unemployed, should be working to obtain to ensure your re-entry into a workplace that has left you behind, or what you should be pressing your children to pursue instead of those silly sheepskins in fields like philosophy, fine arts or history.
What was this high-and-to-the-right profession that was sure to propel its practitioners into the very Valhalla of corporate corner officedom? The authors of the pieces I read defined the career simply and inauspiciously as “data scientist.”
Wow, I thought. The moniker sounded somehow technical and even computer related. It was also very similar to “data management professional,” an idea I’ve been pressing for a long time to underscore the need to wrangle all our unruly bits into a form that will enable them to be protected, preserved and stored more efficiently. I always suspected that someone had been reading those words I had dedicated to defining this professional discipline within the framework of IT skills, knowledge and disciplines. Who cared if they hadn’t captured every nuance or syllable of “data management professional”? At least they were referring to it as “data science.” I was thrilled.
But when I read on, I saw that a data scientist was defined as someone who could read and interpret the results of big data analytics software. That’s a lot different from what I had in mind -- sort of a subset of a subset of a subset in my Venn diagram of future IT disciplines. In fact, the very idea of a data scientist, thus described, made me think of parallels in other careers. If you’re old enough, you may remember someone pumping gas into your car when you went to a filling station, someone operating the elevator and asking you which floor you wanted, or perhaps a human being taking your deposits and cashing your checks with a smile and a lollipop. Data scientists are like that, in my mind.
Think about it. Perhaps the first big data analytics will require an interpreter. Early models may be difficult to set up without the skills of someone who knows how to collect certain types of data and compare data meaningfully to yield still-cryptic grist for the empirical grindstone. Think of all those political analysts on 24-hour cable news networks who need to parse all the little bits of data to fill time and make meaningless observations about the present and its potential impact on the future. Data scientists may perform the same role with early big data analytics.
However, if IBM’s Jeff Jonas is correct, and big data analytics will eventually produce algorithms that mimic human reflection to solve increasingly self-generating questions, we won’t need someone to read tea leaves in the machine cup. The results will be set forth as statements of truth or self-generated action plans. The data science guy will be out of work as quickly as the oil lamp guy.
On the other hand, if algorithms do parse data in a reflective way that mimics too well the human mind, we may need a data scientist less than we will a disk whisperer or a data psychologist. We’ll need to help these big data algorithms deal with issues ranging from their upbringing to their psychoses and neuroses. What if the algorithm is used to spot voting fraud, only it finds no instances of voting fraud and begins to conclude that its mission is a cruel, cynical, politically motivated hoax intended to disenfranchise blocs of voters who may in fact vote for another candidate? Or what if the algorithm is let loose on financial trades, where it gets so enthusiastic that, instead of recommending market trades, it starts making them itself -- a lot of them and really fast -- without asking anybody? In case you haven’t noticed, these things are already happening and they’re creating a need for someone who can listen to the algorithm, help it get in touch with its inner statements or reconcile its behavior with its stimuli.
Sounds pretty complex. The alternative might be that we need a lot more competent programmers, system engineers and database admins. Only those names don’t sound quite as sexy as “data scientist.”
BIO: Jon William Toigo is a 30-year IT veteran, CEO and managing principal of Toigo Partners International, and chairman of the Data Management Institute.