Driverless cars. Self-diagnosing X-rays. Simultaneous translation machines. Thanks to the field of data science, these things are no longer science fiction.
Rowben Consulting’s Stephen Rooke spoke with Data Scientists John Ward and Munish Mehta and discovered how this in-demand profession is fundamentally changing the way we live.
Modern businesses track everything, resulting in incredible caches of data. Mine this data appropriately and you can strike gold. Hidden within is information that can lead to incredible opportunities – increased productivity, new revenue streams and amazing advancements.
Cue the rise of the Data Scientist – a multi-disciplinary professional versed in mathematics, computer science and analytics who has the know-how to extract useful knowledge from this data mass.
“It’s a multi-skilled job,” says Munish Mehta, who’s been a Data Scientist for over 10 years. “It’s not something just an IT person can do. A statistician alone cannot be a Data Scientist. You need to have a range of skills and know your stuff. Apart from understanding the science, you need mathematical skills, statistical skills. You need to know technology and understand programming languages.”
The number one skill requirement? “Math, math, math,” says John Ward, a fellow Data Scientist with more than eight years’ experience. “You screw that up and you make some absolutely fundamental mistakes.”
To an outsider, the profession is somewhat shrouded in mystery. What exactly does a Data Scientist do?
“The core thing is to be able to understand the data,” explains John. “Ninety per cent of our job is to make data functional.”
“A Data Scientist should look at the data with the goal of getting the required information out of it,” adds Munish. “They should ask, ‘How do I get that information out? How do I write a program to do that? How am I going to structure the data?’ It’s problem solving.”
As an example, Munish tells me about a former job in which he had to extract 13 pieces of information from 100,000 unstructured customer portfolios. He built a data warehouse and designed a program that could grab the information. Once up and running, the program did the job within 30 seconds. The results fundamentally changed how the company operated.
But while Data Scientists are seriously in demand in today’s job market – a trend that shows no signs of abating given their significant organisational value – that hasn’t always been the case.
“I remember finishing up my studies and thinking I would just cruise into a role. Nah, did not happen,” laughs John.
“Everyone said, ‘Wow, these are really interesting skills… but yeah, not interested, see you later’.
“Then all of a sudden, from 2009-2010 onwards, bang – I’ve never been out of work.”
And that work has been impressive.
John’s role on a project looking at teaching skills about six years ago – before the ‘Data Scientist’ tag was even in popular use – yielded significant results and fundamentally changed the way TAFE teachers are taught.
“They were spending all this money training the teachers, but nobody had ever thought about how to actually structure the training,” he explains.
“We got all this data from teachers and we modelled it. We found out exactly which skills built upon other skills. We found out what you need to know to be an effective teacher and developed a system that could work out what form of training each teacher needed to make it to the next level.”
Munish tells me about his similarly groundbreaking work on a global genetics project that looked at genetic predispositions for type 1 diabetes. It involved data from tens of thousands of people from over 100 countries, and Munish was one of the Data Scientists tasked with interpreting it.
“I had to build my own supercomputer,” explains Munish. “I learned to program it and structure the data, because there is no pre-defined structure in genetics. At any given point I’d have half a million jobs running on the desktop.
“I still have that computer,” he adds. “We named it Alexander.”
Why Alexander, I ask?
“Because we thought it could conquer the world.”
With the incredible range of skills required to successfully work on these kinds of projects, what road do you take to develop them? Often a meandering one, it would seem. John started off studying languages at university and was a simultaneous translator for years. “Worst job in the world,” he declares.
“I would be on all these trade missions, everyone else would go out at the end of the mission, enjoy themselves, but you are always working. It was awful.”
He moved on to a postgraduate diploma in Statistics in his late 20s, which led to a Masters in Statistical Psychometrics. Now he’s doing a PhD in Applied Maths.
Munish took a different path, first completing a degree in Applied Mathematical Science, focusing on mathematics, applied statistics and physics. Then came a postgraduate diploma in computer science, which was where he excelled.
“It opened my eyes to technology,” he explains.
With this long list of job skill requirements, what sort of person is attracted to – and indeed suited to – the field of data science?
“I was watching this documentary the other day – Jiro Dreams of Sushi,” says John. “It’s a brilliant documentary about a Japanese guy who set up a small sushi outlet in the 1940s. He’s now 85 years old and has three Michelin stars. He’s the king of sushi. But it got me thinking – with what he does, there is so much repetition. With what we do, there is never any repetition – no two things are ever the same.
“You can’t do this job unless you have a sort of tenacity. You can imagine going in every single day, coming up with new ideas and trying to crack problems. You have to have a passion,” he explains.
“I start early and I usually don’t finish until late. I like to work on a project and get totally immersed in it, which is very easy to do in data science. I’ll work maybe 2, 3, 4 weeks and then have a break.”
For Munish, it’s his fascination with mathematics and computer science that has drawn him into the field and kept him there. “I just love it. It just comes naturally. I don’t know how,” he says.
“You’re either a Data Scientist or you’re not – that can’t be trained,” says John.
Being involved in a field at the technological forefront does, however, mean making a non-negotiable commitment to ongoing education.
“I reckon I would spend 80 per cent of my reading time just keeping up,” says John. “There are so many emerging technologies.”
One of these emerging technologies is deep learning. It is, they assure me, going to completely change the way we live. Are they serious?
“Seriously. I am really serious,” says John. “Deep learning is going to change the way we do business in every way, shape and form.”
“Driverless cars, that is deep learning. Simultaneous translation machines, yep, deep learning.”
Deep learning is built on neural nets, they tell me. Around since the late 1970s, neural nets try to mimic the way in which the brain operates. Mathematically, they are set up like a brain and essentially find patterns in data.
“Neural nets can be used to look at X-rays of broken bones. In seconds, they can tell you what sort of break it is, where it is, and then send it – already analysed – to the doctor,” explains John.
Financial advisors are going to become more like guidance counsellors, they say, merely explaining to their clients what the algorithms have found.
“Neural nets are going to fundamentally disrupt the way we do business. Professions will be changed,” says John.
But while Data Scientists are finding meaning in data and adding significant value – and in many cases fundamentally changing our way of life – both Munish and John caution that there are limits to what they can do.
“Everyone thinks data science is a silver bullet. It’s not,” says John.
“I know a statistician who works with an AFL club who was hired to look at anatomical statistics in players – things like strength. Someone who cost the club a lot of money got a bad injury and the club was furious, saying to the statistician, ‘Why didn’t you predict it?’
“There are limitations, the biggest of which are the laws of nature,” he says.
“One of the laws of nature is that time goes forward. So when you do a prediction, you are always driving forward, looking in the rear-view mirror. Always. You can’t pierce the fog of the future. You can only look at the past.”