Research

The modern world is full of artificial, abstract environments that challenge people’s natural intelligence. The goal of our research is to develop artificial intelligence that helps people master these challenges.

Modern Artificial Intelligence has adopted a data-driven approach using machine learning techniques. The behavior and performance of the resulting systems is largely determined by the quality and quantity of the data used to train them. Recent research at LIA has focused on how to obtain and manage such data.

Game Theory for Data Science

Crowdsourcing and human computation are essential techniques for collecting data for use in machine learning. Providing high-quality data requires effort, and when effort goes unrewarded, a selection bias arises: only participants with ulterior motives contribute data.


For rewards to incentivize high-quality data contributions, they must depend on the quality of the data. Since the correct data is not known in the first place, this seems an impossible goal. However, it is possible to design games such that providing high-quality data is the highest-paying equilibrium strategy for the participants! In such games, participants obtain a reward when the data they contribute is more consistent than expected with that contributed by others.
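
As a concrete illustration, the sketch below implements a payment rule in the spirit of the peer truth serum: a worker's report is compared with that of a randomly chosen peer, and agreement is rewarded inversely to how likely it was under a public prior. The function name, the prior, and the scaling are illustrative assumptions, not our published mechanism.

```python
# Sketch of a peer-consistency payment rule (peer-truth-serum flavored).
# The prior and scale are illustrative assumptions.

def peer_payment(report, peer_report, prior, scale=1.0):
    """Reward agreement with a random peer, weighted by surprise:
    matching on an answer the prior considers unlikely pays more
    than matching on the expected answer."""
    if report == peer_report:
        return scale * (1.0 / prior[report] - 1.0)
    return -scale  # disagreement is penalized

# Example: binary quality labels with a prior of 80% "good".
prior = {"good": 0.8, "bad": 0.2}
print(peer_payment("bad", "bad", prior))    # surprising agreement: 4.0
print(peer_payment("good", "good", prior))  # expected agreement: 0.25
print(peer_payment("good", "bad", prior))   # disagreement: -1.0
```

Because agreement on unlikely answers pays disproportionately more, simply reporting the most popular answer is not the best strategy; reporting what one actually observed is.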


We have designed several such games, starting in 2003 with simple agreement rules, evolving to automatically designed payment rules in 2006, and finally to detail-free mechanisms from 2011, leading up to the peer truth serum. A good overview of our and other groups’ work can be found in the book Game Theory for Data Science.


The same principles can be used to construct completely decentralized and transparent oracles for questions of public interest, implemented on blockchains. We have proposed the Infochain mechanism and participated in the design of the Orthos mechanism.

Federated Learning

Data relevant to a particular task is collected by many different entities: banks collect data about their individual clients, hospitals about their individual patients, and merchants about their individual customers. This data is subject to privacy regulations and cannot be shared openly. However, techniques of federated learning allow multiple data owners to form a federation and jointly learn a model while maintaining privacy of the data.
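
The sketch below shows a federated-averaging style protocol on a toy linear model, as a hedged illustration of the general idea rather than any specific system: clients train locally on data that never leaves their machines, and the server only ever sees model parameters.

```python
# Minimal federated-averaging sketch on a linear model (illustrative only).
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """Client side: a few gradient steps on private data."""
    w = weights.copy()
    for _ in range(epochs):
        w -= lr * X.T @ (X @ w - y) / len(y)  # least-squares gradient
    return w

def federated_round(weights, clients):
    """Server side: average client models, weighted by data size."""
    updates = [(local_update(weights, X, y), len(y)) for X, y in clients]
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(3):  # three data owners, each with private samples
    X = rng.normal(size=(50, 2))
    clients.append((X, X @ true_w + 0.1 * rng.normal(size=50)))

w = np.zeros(2)
for _ in range(20):
    w = federated_round(w, clients)
print(w)  # approaches [2, -1] without ever pooling the raw data
```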


The fact that the data remains private makes it difficult to assess the quality of the individual contributions. We have developed techniques for evaluating data quality while keeping the data private. These make it possible to filter out poor-quality data and keep it from degrading the joint model, and to pay for data according to its quality.
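
One way to make this concrete, shown in the hedged sketch below (an illustrative peer-consistency check, not our exact method), is to score each client's model update by its agreement with the consensus of the other clients: a contributor whose update points away from everyone else's is flagged without ever inspecting their data.

```python
# Illustrative contribution scoring from model updates alone: cosine
# similarity of each client's update to the mean of the others' updates.
import numpy as np

def quality_scores(updates):
    scores = []
    for i, u in enumerate(updates):
        peers = np.mean([v for j, v in enumerate(updates) if j != i], axis=0)
        cos = u @ peers / (np.linalg.norm(u) * np.linalg.norm(peers) + 1e-12)
        scores.append(float(cos))
    return scores

good = [np.array([1.0, 1.0]), np.array([0.9, 1.1]), np.array([1.1, 0.9])]
bad = np.array([-1.0, -1.0])  # e.g. from mislabeled or poisoned data
print(quality_scores(good + [bad]))  # last score is strongly negative
```

Low-scoring updates can then be excluded from aggregation, or paid proportionally less.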

Privacy-preserving AI

LIA has been active for over 20 years in privacy-preserving Artificial Intelligence based on distributed computation. In 2008, we proposed what has become the most cited work on privacy-preserving distributed constraint optimization, based on the DPOP algorithm: P-DPOP provides complete privacy using multiparty computation.

One drawback of P-DPOP is that the solution itself reveals information about the participants' objectives. More recently, we have also shown how to solve distributed constraint optimization problems using stochastic sampling with differential privacy guarantees; the advantage is that the solution itself then satisfies the differential privacy constraint.
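
The standard building block behind such sampling, shown in the sketch below as an illustrative stand-in for our approach, is the exponential mechanism: candidate solutions are sampled with probability exponential in their utility, which bounds how much the chosen solution can reveal about any individual's preferences.

```python
# Exponential mechanism sketch: privately select a good candidate.
import math, random

def exp_mechanism(candidates, utility, epsilon, sensitivity, seed=None):
    rng = random.Random(seed)
    weights = [math.exp(epsilon * utility(c) / (2 * sensitivity))
               for c in candidates]
    return rng.choices(candidates, weights=weights)[0]

# Toy constraint-optimization flavor: pick a joint assignment whose
# utilities depend on private preferences.
assignments = [("a1", "b1"), ("a1", "b2"), ("a2", "b1"), ("a2", "b2")]
util = {("a1", "b1"): 3, ("a1", "b2"): 7, ("a2", "b1"): 5, ("a2", "b2"): 6}
print(exp_mechanism(assignments, util.get, epsilon=1.0, sensitivity=1.0))
```

Higher-utility assignments are chosen more often, but no single assignment is certain, which is what yields the differential privacy guarantee.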

Privacy-preserving techniques are of particular interest in machine learning. By obfuscating the training data with suitable random noise, it is possible to give differential privacy guarantees on the amount of information that the learned model reveals about the training data. However, in many practical scenarios the noise that has to be added for meaningful privacy guarantees is so large that the resulting model is no longer useful.
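
The trade-off is easy to see in the Gaussian mechanism, the standard noise-addition primitive behind such guarantees; the sketch below (with illustrative parameters) releases a differentially private mean:

```python
# Gaussian mechanism sketch: clip each record's influence, then add
# noise calibrated to the clipping bound and the privacy budget.
import numpy as np

def private_mean(data, clip=1.0, epsilon=1.0, delta=1e-5, rng=None):
    rng = rng or np.random.default_rng()
    clipped = np.clip(data, -clip, clip)
    sensitivity = 2 * clip / len(data)  # max effect of replacing one record
    sigma = sensitivity * np.sqrt(2 * np.log(1.25 / delta)) / epsilon
    return clipped.mean() + rng.normal(scale=sigma)

data = np.random.default_rng(0).normal(0.3, 0.1, size=1000)
print(private_mean(data, epsilon=1.0))   # close to the true mean of 0.3
print(private_mean(data, epsilon=0.01))  # strong privacy: mostly noise
```

At small epsilon the added noise dwarfs the signal, which is exactly the problem described above.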

In a concept called Bayesian Differential Privacy, we propose to tailor the obfuscation to the actual distribution of the data. The privacy guarantees are weaker for data that falls outside the training distribution, but in return much stronger guarantees can be achieved for data that follows it. We have also shown how the same concept can be applied in a federated learning setting and to generate synthetic data.

Intelligent Agents

Autonomous agents are becoming ubiquitous, helping people manage the complexities of the modern world. Our goal is to make such agents behave intelligently. We have a long-standing line of research in distributed constraint optimization for solving coordination problems among distributed agents, and we have recently developed even more efficient techniques using sampling, as well as methods for learning to coordinate the use of resources through a decentralized protocol.


Reinforcement learning is the problem of learning how to act in an unknown environment through interaction and limited reinforcement. It is one of the most general problems in AI, with applications such as game playing, resource management, optimization and optimal control. Reinforcement learning with multiple agents is still a largely unsolved problem. We have shown that it can be solved much more efficiently when agents can observe a common, ergodic signal: by learning policies that depend on this common signal, agents break the coordination problem into smaller subproblems and converge to a stable solution much more rapidly.
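
The sketch below gives a hedged toy version of this idea: two independent learners repeatedly pick one of two resources and collide if they pick the same one. Conditioning their learned policies on a shared random signal lets them settle into complementary choices, like drivers at a traffic light. The game and parameters are invented for illustration.

```python
# Toy multi-agent learning conditioned on a common signal (illustrative).
import random

n_signals, n_actions = 2, 2
# Q[agent][signal][action]: each agent learns independently.
Q = [[[0.0] * n_actions for _ in range(n_signals)] for _ in range(2)]

def act(agent, signal, eps=0.1):
    if random.random() < eps:                       # occasional exploration
        return random.randrange(n_actions)
    return max(range(n_actions), key=lambda a: Q[agent][signal][a])

random.seed(1)
for _ in range(20000):
    s = random.randrange(n_signals)                 # common signal
    a0, a1 = act(0, s), act(1, s)
    r = 0.0 if a0 == a1 else 1.0                    # collision pays nothing
    for agent, a in ((0, a0), (1, a1)):
        Q[agent][s][a] += 0.05 * (r - Q[agent][s][a])

for s in range(n_signals):                          # greedy policy per signal
    print(s, [max(range(n_actions), key=lambda a: Q[ag][s][a]) for ag in (0, 1)])
```

Each signal value effectively becomes a separate, smaller coordination subproblem, which is why convergence is so much faster than learning over the joint action space.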

Another issue is coordination in very large multi-agent systems, for example for allocating resources or scheduling meetings. We have developed a heuristic algorithm called the Altruistic Matching Heuristic (ALMA) that makes it possible, for the first time, to solve large-scale distributed resource allocation problems in constant time. Various versions have been developed, including one that gives differential privacy guarantees for the preferences of individual agents.
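
The hedged sketch below captures the flavor of such a heuristic (a deliberate simplification, not the published ALMA algorithm): every agent greedily claims its most-preferred free resource, and on a collision it backs off with a probability that grows as its loss from settling for an alternative shrinks, so the agents with the least at stake yield first.

```python
# Simplified ALMA-flavored decentralized matching (illustrative only).
import random

def alma_match(utilities, rounds=100, seed=0):
    """utilities[i][r]: value agent i assigns to resource r (>= 2 each)."""
    rng = random.Random(seed)
    # Each agent starts by targeting its highest-valued resource.
    choice = [max(range(len(u)), key=u.__getitem__) for u in utilities]
    taken, free = {}, [True] * len(utilities)
    for _ in range(rounds):
        for i, u in enumerate(utilities):
            if not free[i]:
                continue
            r = choice[i]
            if r not in taken:
                taken[r] = i                  # claim succeeds
                free[i] = False
            else:
                vals = sorted(u, reverse=True)
                loss = vals[0] - vals[1]      # cost of yielding
                if rng.random() < 1.0 - loss:  # small loss: likely yield
                    rest = [x for x in range(len(u)) if x != r]
                    choice[i] = max(rest, key=u.__getitem__)
    return taken

utilities = [[0.9, 0.5, 0.1], [0.8, 0.7, 0.2], [0.6, 0.3, 0.5]]
print(alma_match(utilities))  # e.g. {0: 0, 1: 1, 2: 2}: resource -> agent
```

Because each agent reacts only to collisions on its own target, no global communication is needed, which is what makes constant-time behavior possible.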

Natural Language Processing


Recent research in Natural Language Processing at LIA focuses on generating natural-language explanations for the recommendations produced by a neural recommender algorithm. We have also shown how to use critiques of these explanations to modify the recommendations so that they better suit the user's wishes: through multi-step critiquing, the user can navigate to the items that best fit their preferences.
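
As a toy illustration of the critiquing loop (a hedged sketch, not our published model), one can place items and critiqueable keyphrases in a shared embedding space and let each critique shift the user's representation away from the criticized keyphrase before re-ranking:

```python
# Toy multi-step critiquing sketch: a critique moves the user vector
# away from the criticized keyphrase; recommendations are re-ranked.
import numpy as np

keyphrases = {"noisy": np.array([1.0, 0.0]), "central": np.array([0.0, 1.0])}
items = {"hotel_a": np.array([0.9, 0.8]),   # central but noisy
         "hotel_b": np.array([0.1, 0.9])}   # central and quiet

def recommend(user):
    return max(items, key=lambda i: float(user @ items[i]))

user = np.array([0.5, 1.0])
print(recommend(user))                      # hotel_a
user = user - keyphrases["noisy"]           # critique: "too noisy"
print(recommend(user))                      # hotel_b after one step
```

Repeating this step with further critiques walks the user through the item space toward their actual preferences.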

More recently, we have been investigating how to apply the same idea to the output of large language models, and we have obtained encouraging results that will be published soon.

In earlier work, we used sentiment analysis as a case study for exploring how to overcome accuracy barriers in machine learning through targeted crowdsourcing.

Earlier, we also worked extensively in case-based reasoning and in model-based and qualitative reasoning.