Q&A with: BranchKey

This content was created by the Data Sharing Coalition, one of the founding partners of the CoE-DSC.

The Data Sharing Coalition supports organisations in realising use cases at scale to unlock the value potential of data sharing, and helps organisations create the trust mechanisms required to share data in a trusted and secure way. In our blog section ‘Q&A with’, you can learn more about our participants and their thoughts, vision and ideas about data sharing. Robin Schut, co-founder at BranchKey, shares his thoughts.

1. Could you introduce your organisation?

BranchKey is a young start-up and Platform-As-A-Service company focused on providing Federated Machine Learning (FedML) technology. FedML is a technology that connects a distributed network of machine learning models and finds the optimum model using information from each individual model. On our platform, organisations can collaborate with each other and together train Artificial Intelligence models in a secure environment. These organisations use federated learning to gain insights from external data sources and share those without exposing data, especially sensitive data. Data that is used to train an AI model remains in its location, but the insights of those models are shared with other organisations that they choose to collaborate with.

It is our mission to facilitate collaboration on AI development by allowing our users to share intelligence and knowledge through AI models on our platform. We do that by standardising model architectures and data requirements and by providing the infrastructure for information to be shared between parties. We hope that, when AI models are trained collectively, many different viewpoints will be taken into consideration. We believe this is the responsible way to develop AI.

It is our vision that everyone in the world is responsible for the development of AI. Artificial Intelligence is a powerful and promising technology that can have an impact in ways that were previously hard to imagine, as we saw in 2016 when Lee Sedol was beaten by Google DeepMind’s AlphaGo. Therefore, it’s in society’s best interest to democratise the development of AI applications to make sure the development happens in a controlled and responsible way.

2. To what extent is your organisation involved in data sharing (within and across sectors)?

BranchKey is an infrastructure provider of FedML technology. Organisations can use our technology to learn from data sources within and across sectors by collaborating with other organisations.

Companies are actively looking for external data sources to train their AI models, but they are often held back by privacy regulations such as the GDPR and by governance requirements, both internal and external. Federated machine learning can solve these issues. Think of an AI model as a network of nodes and connections; the strengths of these connections are often referred to as model parameters. To train an AI model, you need to show the model data and give it a set of instructions for learning from that data. During the training phase, these connections strengthen or weaken depending on the data. It’s exactly these connections that you’re interested in, not the data itself. Our platform does not share copies of data; it shares model parameters. If the goal of your data sharing initiative is to jointly train an AI model, then Federated Machine Learning is a great fit.
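To make the idea of sharing parameters rather than data more concrete, here is a minimal sketch in Python. It uses plain federated averaging on a tiny linear model, which is an assumption chosen for illustration and not necessarily how the BranchKey platform works. All names (`local_train`, `federated_average`, `run_rounds`, `client_a`, `client_b`) and the toy data are hypothetical.

```python
from typing import List, Tuple

Params = List[float]                 # [weight, bias] of a tiny linear model
Dataset = List[Tuple[float, float]]  # (x, y) pairs that stay with their owner


def local_train(params: Params, data: Dataset,
                lr: float = 0.05, epochs: int = 20) -> Params:
    """Run a few epochs of gradient descent on one client's private data."""
    w, b = params
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error for y = w * x + b
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def federated_average(updates: List[Tuple[Params, int]]) -> Params:
    """Average client parameters, weighted by each client's sample count."""
    total = sum(n for _, n in updates)
    size = len(updates[0][0])
    return [sum(p[i] * n for p, n in updates) / total for i in range(size)]


def run_rounds(clients: List[Dataset], rounds: int = 50) -> Params:
    """Alternate local training and averaging; only parameters are exchanged."""
    global_params: Params = [0.0, 0.0]
    for _ in range(rounds):
        updates = [(local_train(global_params, data), len(data))
                   for data in clients]
        global_params = federated_average(updates)
    return global_params


# Hypothetical private data sets, both following y = 2x + 1
client_a: Dataset = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
client_b: Dataset = [(3.0, 7.0), (4.0, 9.0)]

print(run_rounds([client_a, client_b]))  # values close to [2.0, 1.0]
```

In this sketch, the raw `(x, y)` pairs are only ever read inside `local_train`. The coordinating step in `federated_average` sees parameters and sample counts, nothing else. A production system would typically add secure transport, authentication and possibly further privacy techniques on top, none of which are shown here.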

Outside of the Data Sharing Coalition, we are currently involved in a use case in the energy sector in which an AI model is used to predict the future energy demand of several buildings. By acting on these predictions (for example, by sending instructions to the central heating system to turn on or off at advantageous hours), a lot of energy can be saved. All the buildings have full data autonomy, which means the data can’t leave the building. This way, the building owner stays in control of their own data and prevents other companies from accessing it without permission. The added value of federated learning is that the AI models can learn from patterns that emerged at other buildings. Hence, the predictive model can generate better predictions, which in turn leads to more energy savings.


3. Why is or should sharing data be important for your industry or domain?

There is a famous saying in the field of Artificial Intelligence that a model is only as good as the data it sees (“Garbage in, garbage out”). Training a model on just one data set can lead to a wide range of problems. One problem is that biases can occur in models because they are trained on a data set that contains a predominant feature. An example of the negative effects of this bias is categorising an individual based on ethnicity or gender. To overcome this problem, AI models need to be enriched with many different data sources. The more varied data an AI model sees, the less bias it tends to develop, which in turn lowers the chance of negative effects.

4. What are the most promising data sharing developments and trends you see in your sector?

Many initiatives are taking place to govern how AI models are trained (for example to minimise or eliminate negative effects). These initiatives range from standardising data formats to open-sourcing data. For example, the scale-up Huggingface.co publishes a lot of open-source data and benchmarks AI models. Another promising development is the counterpart of the traditional training approach, in which data is brought to the algorithm. In that traditional approach, you physically need an export of the data source on the server where the AI model resides. However, new developments are emerging in which algorithms are brought to the data source and complex computation tasks are pushed out to the edge, rather than everything being centralised in the cloud. Our technology can be categorised under the latter.

5. How do you see the future of data sharing, and what steps are you currently taking in that direction?

In our field, we hope to see federated learning and other privacy-preserving technologies widely adopted across many sectors. We see a future in which data is made available to train AI models in a safe and responsible way. BranchKey is working hard to provide scalable infrastructure for companies so they can work together on AI models. However, we are only one piece of a very large puzzle. We would like to see many other companies engage in responsible data sharing and hope that other start-ups will also come up with innovative new technologies in this domain.

6. Why are you participating in the Data Sharing Coalition?

We are participating in the Data Sharing Coalition to learn from many different sectors what their reasons are to share data. Furthermore, we are participating to learn from use cases and learn about frameworks that are needed to engage in a data sharing endeavour. By listening to participants at Community Meetings and talking about experiences from other participants, we can improve our own services and platform.
