Iacopo Ghisio, Head of Artificial Intelligence and Machine Learning Dept. at Gruppo MutuiOnline
As most of you probably know, Data Science is not a new discovery in the world of innovation. Artificial Intelligence, Machine Learning, and Advanced Analytics are just rebrandings of a science born roughly 100 years ago but held back by a lack of computational resources. Thanks to the boost in hardware resources over the last decade, Data Science is now enjoying its deserved success.
What companies have seen since then is a ramp-up in the reputation and salaries of all those people smart, lucky, passionate, and competent enough to build their careers in the complex ecosystem of tools and technologies that enables Data Science at the corporate level.
Anyone who has had to hire team members will probably recognize this: it is common to find, or to be offered, a “highly skilled technical profile active in the DS world willing to change for an opportunity.” That’s it! Such profiles are few compared to the demand, so either you have the right budget or you won’t be able to hire them. And even if you do hire one, you may realize only later that the person is not a good fit for your organization, or that the results are not in line with expectations. This happens not because of the person’s real skills (which, by the way, you have to be able to assess during the interview), not because the technology does not fit your IT department, and not because the head of the department cannot make this person effective in the job, but simply because your company is unique.
Your processes, teams, and operational procedures already run like a Swiss clock, but you probably do not have data in the right format, or any data at all. Data Scientists need data above all to work well, and you need to provide it. OK, back to the beginning: a Data Scientist is sometimes considered a unicorn because of the skill set required to hit the ground running and save both the day and the budget.
A data scientist needs to:
• Understand the process in which she will act and that most likely she will change
• Understand the business that will be the primary client demanding results
• Use a technology that will fit into the company’s IT department without too much disruption
• Be able to provide a self-standing, low-maintenance, high-performing software solution
• Oh yes, crunch the data and quickly create well-performing ML models, so that the company can be a first mover in the market
In short, a data scientist should combine the background of a mathematician or physicist to build the model, an IT architect to turn it into a performant solution, a lean expert to properly review the affected process, and a business consultant to turn a mathematical result into a business suggestion. I’m sorry if this isn’t Wikipedia’s definition of a unicorn, but it looks just as difficult to find. We’re back to the beginning again: you need to pay a lot for rare goods!
The above was my fresh start at my current company: building a data science department with very little operational knowledge about it but with big expectations in terms of results. Yes, I faced all the issues I described. Since I’m still here, I probably found one of the ways to succeed. I hope you can benefit from these lessons:
You will most likely not find a person who is an expert in your own business, so don’t make this a blocker. You just need an organization flexible enough to welcome a new member, teach them the basics needed to survive, and remove the fences from other teams’ backyards.
Your processes will need to change, so accept it and do not be afraid of change. Data must be collected correctly, and you are probably not doing this right unless you already have a data expert in-house; results must fit into your processes, and yours are probably not ready for a machine to provide them.
Unless you’re a startup, your technology is probably outdated, so be ready to adopt a new one. In data science, you have two choices: Python or R. Some years ago, only one of them had the right set of tools to be production-ready, and since “it works on my machine” is not an option, I really had just one choice: Python. Nowadays, Java, .NET, or other languages may also have a surrounding framework rich enough that you do not have to build models from scratch, but pay close attention to the community: you can’t afford to write everything from scratch.
Writing models (and understanding them) is certainly necessary, but a background in writing production-quality code is important too; a suitable trade-off is to accept less knowledge of core data science in exchange for a good understanding of software engineering. Last but not least, the ability to explain results and talk to different audiences is key to success: at least one person in your team of geeks needs to have communication skills.
Seniority is needed, but it is not mandatory for every team member; you can grow the rest of your team in the same time it takes a senior hire to become productive in the company. Later on, you will have more effective bandwidth