Well, let me start by saying it is really weird to go to an event about Big Data, AI, and similar topics in Milan. In spite of the presence of some data-driven organizations and super smart individuals, Italy is not known as a homeland for data scientists, nor is it the first country that pops up in innovation rankings. Things are slowly changing (see what Diego Piacentini is doing with his Digital Team) and this event is also a good signal for the country.
So, brave choice to organize the BA & Data Management Forum in Milan, but a good one, because it gathers many specialists that you don’t often see at this type of event.
Pretty standard format: a two-day event with ca. 10 talks per day as well as a couple of panels. I was really impressed with some of the presentations, so I will try to summarize a few aspects of those talks in the form of personal takeaways.
I. Creating a data team is a custom process
There is no single recipe for assembling a data science team. I have written a lot about what to look for when hiring data scientists and how to assess the big data maturity stage of a company, but I found out during the conference that there are many more nuances and issues to consider:
- Budgeting problems: it is hard to justify the expense of a full data science team in advance. C-level management is often skeptical about the value of data analytics, and it will hardly allocate a proper budget to hire 4–5 data scientists at once (this is, in my opinion, the minimum size of a DS team). Bringing in data experts who will train people internally is a good way to start building analytics capabilities. A second budgeting problem concerns the projects themselves: not all big data projects can be funded, so project prioritization and KPI monitoring are essential;
- Integration problems: it is not so easy to understand where a data science team should sit. It is a good idea to have an external Center of Excellence (CoE), physically and operationally separated from the business. However, this could create internal integration problems for the organization, which can be mitigated by hiring internally in the first place. On the one hand, internal hires may need extra training to be brought up to speed with big data tools and skills; on the other hand, they may ease the communication problem with other departments and bring solid knowledge of the business and the existing processes. Extra value point: look for different backgrounds, even internally;
- Governance problems: data governance and organizational governance are essential for the correct development and deployment of data science projects. While data governance is really relevant for security and privacy issues, company governance is the reason why a DS team fails or succeeds. Clear policies and a single top executive to lead the team’s effort (CEO, CFO and CTO are the most common choices) are ways to reduce this class of problems.
‘Governance over data is paramount to empower business units and guarantee a fair and cost-efficient approach’ (Davide Cervellin, Head of EU Analytics, eBay)
- Cultural problems: a data science team should by definition adopt a startup culture: its members should be fine with failing, they should know how to deal with uncertainty, and they need to act according to open and transparent processes. The team should follow an agile approach and work across teams and hierarchy layers. This creates a cultural clash within big organizations, and therefore the team leadership has the burden of smoothing this issue by setting clear goals for the team and establishing collaborative relationships with the rest of the company. Team leaders should also work hard at managing expectations correctly as well as fighting the company’s resistance to change;
- Data problems: I left this one for last because it is obvious: there are ALWAYS data issues, in any organization. They might take the form of sealed data silos, dirty/messy datasets, or simply mismatches between technologies, data, and goals. Sorry, there is no off-the-shelf solution that can be implemented as is, and this is why a DS team is so important.
II. Data science is a new paradigm shift
I believe that the real value of emerging technologies (e.g., big data, AI, machine learning, etc.) lies in changing two completely different models: our research frameworks and existing companies’ business models:
- Research Models: our thinking process is historically based on the scientific method (observe, hypothesize, deduce, experiment, and synthesize), but I believe that big data and AI are taking some of these steps out of the game. Actually, this can be visually summarized in the following figure, which distinguishes between traditional ML (which follows the scientific method step by step) and the new deep learning flow:
Even if everything sticks to the ‘I-measured-them-all’ framework (where you measure every single thing/action), the research paradigm is also shifting from ‘encoding uncertainty in assumptions’ to ‘modeling uncertainty and questioning initial assumptions’. This is somehow helping to manage the ‘black-box explainability’ problem.
‘Be fair with the data: let them ask you questions’ (Carlo Torniai, Head of DS and Analytics, Pirelli)
- Business Models: big data and AI have also created new business models, as shown by Natalino Busa, Chief of Data Science at Teradata (see his presentations here):
Particularly interesting is the ‘Ephemeral Computing’ model, which is an example of data-platform-as-a-service (dPaaS). This model stores data for a very short time and is highly suitable for data exploration (it works entirely in the cloud):
This is only one example of the new business models, which in turn are creating new ways to monetize big data.
From a purely business perspective, instead, big data projects can be classified into four different cases (Dr. Scorbureanu’s classification):
- Direct Evidence: this is the result of a business-focused PoC, useful to get familiar with big data technologies;
- Initiative Optimization: this case leverages a ‘transformational program’ to fund big data analytics projects aimed at making organizational processes more efficient;
- Business Impact: it leverages improvements in business processes without cannibalizing funding from special programs;
- Corporate Asset: this case is more radical and fundamental because it uses big data to stay competitive or to gain competitive advantages within the industry.
III. There are sectors and areas that are hotter
It seems clear that there are areas and sectors in which big data and AI are having an immediate impact.
Natural Language Processing is, by all means, the most active area of research. I have already written a bit on the topic, and I have also drafted a landscape of companies working specifically on speech recognition.
‘The state-of-art of speech recognition today has raised a lot since 2012, with deep-q networks (DQNs), deep belief networks (DBN), long short-term memory RNN, Gated Recurrent Unit (GRU), Sequence-to-sequence Learning (Sutskever et al., 2014), and Tensor Product Representations (for a great overview on speech recognition, look at Deng and Li, 2013)’ — Full article HERE.
Insurance and banking (and finance overall) are, almost by definition, likely two of the sectors most affected by big data and AI. In the insurance industry specifically, we are observing applications to several use cases, for instance claim processing, customer engagement, telematics and underwriting (for the full list of use cases and startups in the space, see here).
Finally, there is an old player who is looking at the big data game with new and fresh eyes, i.e., the Smart Government. Singapore and Dubai (UAE) seem to be the most advanced examples of this emerging space (called smart cities or smart nations), and it might be useful to look at one example of an integrated platform they work with:
IV. Final Thoughts
There are a few final general hints I took away from the conference that I would like to point out, all of which address misconceptions and mistakes that occur within data wannabe-organizations:
- Clearly define your strategy: defining the target is not synonymous with defining your strategy;
- Experience is different from age: trust younger generations more!
- Communication is key: communicate and make visible (internally and externally) what the DS team is doing;
- Focus on business: all these technologies may take you away from your final goal, which is doing business. So, remember to be a ‘business-driven tech-enabled’ company.
Waiting for the next AI event…many more conferences coming soon, so stay tuned!