A couple of months ago I attended the Chief Data Officer Summit, organized by Corinium Intelligence. They organize a rather different type of event, not so much in terms of format as in terms of audience.
As the name suggests, the conference does not simply deal with CDOs but specifically targets them. In other words, it is an event about CDOs for CDOs, and you might find yourself in a room with 60 CDOs (and CTOs) from companies across every industry.
The format is quite standard though: two days of talks with a few discussion groups, usually held in really nice hotels in central London.
As usual, I will now try to summarize my three personal takeaways from the conference. But first…
What is a Chief Data Officer exactly?
Apparently, it is a new role, born in a lighter form right after the financial crisis out of the need for a central figure to deal with technology, regulation, and reporting.
The CDO, therefore, is basically the guy who acts as a liaison between the CTO (the tech guy) and the CAO/Head of Data Science (the data guy) and takes care of data quality and data management.
Ultimately, his goal is to guarantee that everyone can access the right data in no time.
In that sense, a CDO is the guy in charge of ‘democratizing data’ within the company.
It is not a static role: it has evolved from being a simple facilitator to being a data governor, tasked with defining data management policies, frameworks, procedures, and tools. In other words, he is a kind of ‘Chief of Data Engineers’ (if we agree on the distinction between data scientists, who actually deal with modeling, and data engineers, who deal with data preparation and data flow).
“The difference between a CIO and a CDO (apart from the words data and information…) is best described using the bucket and water analogy. The CIO is responsible for the bucket: ensuring that it is complete without any holes in it, that the bucket is the right size with just a little bit of spare room but not too much, and that it’s all in a safe place.
The CDO is responsible for the liquid you put in the bucket: ensuring that it is the right liquid, the right amount, and that it’s not contaminated. The CDO is also responsible for what happens to the liquid, and for making sure the clean, vital liquid is available for the business to slake its thirst.” (Caroline Carruthers, Chief Data Officer, Network Rail, and Peter Jackson, Head of Data, Southern Water)
Interestingly enough, the role of the CDO as described here is both vertical and horizontal. It spans the entire organization, even though the CDO still needs to report to someone else in the organizational chart. Who the CDO reports to is largely determined by the organization he operates in.
It is also true that not every company has a CDO, so how do you decide whether to get one? Well, simply out of internal necessity, because of strict incoming regulation, or because all your business intelligence projects are failing due to data issues. If you have any of these problems, you might need someone who pushes the “fail-fast” principle as the data approach to be adopted throughout the entire organization, who considers data a company asset, and who wants to set the fundamentals to allow fast trial-and-error experimentation.
The CDO, then, is the person responsible for the end-to-end data workflow.
Finally, if the CDO does his job properly, you will see two different outcomes: first of all, the board will stop asking for quality data and will have a clear picture of what every team is doing. Second, and most important, a good CDO aims to create an organization where a CDO has no reason to exist.
It is counterintuitive, but basically a CDO has done a great job when the company no longer needs a CDO, because every line of business will be responsible and liable for its own data.
In order to reach this final goal, he needs to prove from the beginning that not investing in higher data quality and frictionless data transfer is a source of inefficiency in business operations, resulting in non-optimized IT operations and making compliance, as well as analytics, much less effective.
I. The Artificial Intelligence hype is real
What became clear to me at this conference is that many big companies are not even thinking about AI, because they have much more pressing problems to solve first.
When you are still dealing with merging different data silos or understanding who is in charge of what data, implementing machine learning algorithms is the last of your concerns.
To me, this translates into a twofold classification: companies from the pre-information era and companies of the post-information era.
Roughly speaking, I think of the pre-information era as the age before Google came along, and of the post-information era as the time after the company was founded. Even though this classification is not strict, companies founded in, or with businesses belonging to, the pre-information era have a data lineage that is really hard to deal with. These are the companies that actually employ a CDO, and they are the vast majority out there.
Companies of the post-information era (Facebook, for instance, but also every startup out there) don’t need a CDO because they have been dealing with data since inception (and thinking about it with a long-term view). It is no accident that these are the ones that implement AI with more ease.
With this basic classification in mind (and considering that the post-information era created far more services companies than products companies), you can draw your own conclusions on the AI hype.
This is also why innovation is so hard for pre-information era companies: they feel the struggle between the difficulty of developing internal capabilities (and the consequent need to acquire them from outside) and the impossibility of doing so organically, because of the lack of integrable systems and processes. Counterintuitively, post-information era companies (which in theory should need to innovate less than their pre-information peers) are the ones pursuing the most aggressive acquisition or expansion strategies.
II. Data privacy, governance, and quality matter
There are two important concepts to be considered from a data protection point of view: fairness and minimization.
Fairness concerns how data are obtained, and the transparency required from the organizations collecting them, especially about their potential future uses.
Data minimization, instead, concerns gathering only the right amount of data. Although big data is usually understood as “all data”, and even though relevant correlations are often drawn from unexpected data merged together, this is no excuse for collecting every data point or retaining records longer than required.
Furthermore, no matter how strong data privacy protections are, people may not be open to sharing data, for different reasons: either lack of trust or because they have something to hide. This may generate an adverse selection problem that is not always considered, because clients who do not want to share might be perceived as hiding something relevant (and think about the importance of this adverse selection problem when it comes to governmental or fiscal issues).
Private does not necessarily mean secret: shared information can still remain confidential and have value for both parties, the clients and the companies.
In practice, most of the time spent by any CDO is indeed on data governance and privacy. Although there exist organizations that are implementing a ‘privacy by design’ approach, many companies need to deal with their data and organizational legacy (and it is not by chance that the companies using a privacy-by-design approach are the ones belonging to the post-information era).
Why would you go with that approach? Of course, because it is less costly for the company itself.
III. Paradigm shift for personal data
One of the most interesting ideas I heard concerned new ways to drive trust and growth around personal data, proposed by Stephen Deadman (Chief Privacy Officer at Facebook). He thinks that creating mechanisms for building trust, transparency, and control is essential to deliver the most impact. Here are the five shifts he proposed:
Shift #1: From education to confidence
“Consumers are not going to understand every business model, or everything the government does. But there are mediators who need to know, like Parliament, journalists, and consumer groups” — Jeroen Terstegge
Shift #2: From partial value to full value
“I think you need to see the relationship with customers as a partnership. Partnership implies joint ownership of data, education, joint decision-making on what is being done with data” — Igor Ostrowski
Shift #3: From restrictive to enabling
“The regulator and the industry should talk more, and it’s also important to listen to users. We need to find the right balance between what users want and what public protection needs” — Ignacio Gonzales Royo
Shift #4: From compliance to sustainable customer relationships
“Companies need to rebuild trust by redrawing lines. Showing customers the lines they will and won’t cross” — Gesa Diekman
Shift #5: From good intentions to good outcomes
“By definition, regulators are not very innovative. That’s why we need to turn to companies to come up with innovative answers and solutions that provide meaningful user controls” — Henry Chang
Those shifts would help to create a ‘sustainable data environment’, which could be defined, in Stephen’s words, as:
i) People using data-driven services feel confident, and the value exchange is fair;
ii) Policymakers and regulators have a united agenda: to maximize the benefits of data while minimizing the harms;
iii) Organizations are visibly demonstrating their responsible and accountable practices;
iv) Solutions to concerns that arise are designed and iterated to be effective and to reflect the realities of human behavior.
Waiting for the next data event… many more conferences are coming soon, so stay tuned!