Why facilitating data sharing is of great importance to Artificial Intelligence

This content was created by the Data Sharing Coalition, one of the founding partners of the CoE-DSC.

The Dutch AI Coalition is a public-private partnership of 350+ government, industry, educational and research institutions and civil society organisations. All have committed themselves to accelerating the development of Artificial Intelligence (AI) in the Netherlands and connecting AI initiatives in the country. On August 27th, at the upcoming Community Meeting of the Data Sharing Coalition, the AI Coalition will present the main findings of its report “Verantwoord datadelen voor AI” (“Responsible data sharing for AI”). In advance, we spoke with Frans van Ette, coordinator of the AI Coalition’s working group Data Sharing.

“The development of AI has gained momentum worldwide. AI makes fundamental changes possible in all kinds of areas, with major consequences. Take, for example, the healthcare sector, where AI can detect lung cancer better than specialists themselves on the basis of image recognition,” says Frans. “Since AI has a huge impact on all business sectors, our private lives and society as a whole, the Netherlands needs to contribute to its further development. This allows us to determine how AI is applied to social and economic challenges, to maintain and expand our international competitive position, and to help solve economic and social issues. Since there is no way back to a world without it, investing in AI is not a choice, it’s a must.”

More data means more knowledge

“When AI systems gain more access to data, fundamental changes in the healthcare sector and in plenty of other domains and sectors will accelerate,” says Frans. “After all, more access to relevant data is crucial for the development of AI. With machine learning, a form of AI, a system learns from experience. That experience is based on data: things that have happened and have been recorded. Based on this data, an algorithm can predict and classify outcomes much better and improve itself. In other words: more access to data leads to faster AI implementation and higher accuracy, which, in the end, results in a better service for the consumer. This is also why sharing data is so important; by doing that, AI systems will have more and better access to relevant data.”

The reliability of data is even more important for AI

“With AI systems, even more than with systems in which AI is not applied, the reliability and quality of data are crucial,” says Frans. “The more reliable the data, the better the predictions, conclusions and recommendations the system can make. After all, an AI application based on machine learning is dependent on data for making decisions. A well-known example of how this can go wrong is the AI tool that Amazon used to recruit new employees. Through a bias in the historical data, the system had taught itself that male candidates were preferred over female candidates in senior positions, simply because male candidates were over-represented in the management layer of the organisation. When data is shared to be used in an AI system, it is essential that the quality of the data is thoroughly checked before it feeds the system, so that no wrong conclusions are drawn based on that data.”

Choose an integrated approach to facilitate data sharing

“Many organisations still lack knowledge about what is possible with AI; and unknown makes unloved,” says Frans. “Whether or not the data is meant for AI purposes, organisations are often hesitant about sharing their data in general because of the consequences, for both legal and economic reasons. For example, how do you deal with sharing personal data? This is an even more sensitive matter within AI because new insights are gained on the basis of that data. This also applies to competition-sensitive data: the correlations AI discovers can make the economic risks of sharing data even bigger. That is why an integrated approach, in which organisational, business and legal aspects are taken into account alongside the technical ones, is very important to facilitate data sharing,” Frans emphasises.

Two coalitions that can learn from each other

Both the Data Sharing Coalition and the AI Coalition endorse the importance of data sovereignty and interoperability. Frans: “Given the greater possibilities that AI provides to extract value from data, data sovereignty is an important principle, enabling consumers and organisations to be in control over the data they share. Besides, for reliable data sharing and the avoidance of vendor lock-in, interoperability is key. The basic principles for sharing data for both the Data Sharing Coalition and the AI Coalition are the same and we apply the same building blocks to facilitate data sharing. Although our starting point might differ, it is extremely valuable that knowledge is shared to accelerate data sharing.”

On August 27, the AI Coalition will give a presentation during the Community Meeting in which it will further discuss its report “Responsible data sharing for AI”. Do you want to attend? Please send us an email: info@coe-dsc.nl


Read more

The benefits of combining data spaces and Privacy Enhancing Technologies

Data spaces and Privacy Enhancing Technologies (PETs) have a common goal: making insights from data accessible in a confidential manner. But the two are being developed by different communities. This must change. By applying PETs within data spaces, confidentially exchanging insights from (privacy-sensitive) data becomes more scalable.

White paper: Unlocking the Potential of Data Spaces

Parties interested in deploying a data space need to use the right technologies and need to make sure they get the business and governance of the data space right. This is easier said than done, because there is relatively little guidance on how to deploy a data space successfully. What guidance can be given? On behalf of the CoE-DSC, a white paper has been written about this topic by Gijs van Houwelingen et al., TNO.