Vassilis Christophides

Presentation - Vassilis Christophides is a Professor of Computer Science at the University of Crete. In 2015 he was appointed to an advanced research position at INRIA-Paris. Previously, he worked as a Distinguished Scientist at the Technicolor R&I Center in Paris. His main research interests include Databases and Web Information Systems, as well as Big Data Processing. His current research work at INRIA focuses on network measurements and IoT data analytics. He has published over 130 articles in high-quality international conferences and journals. He has been the scientific coordinator of a number of research projects funded by the European Union and the Greek State. He received the 2004 SIGMOD Test of Time Award and two Best Paper Awards (ISWC 2003 and 2007). He served as General Chair of the EDBT/ICDT Conference in 2014 and as Area Chair for the ICDE “Semi-structured, Web, and Linked Data Management” track in 2016. He has also co-authored a book on entity resolution in the Web of data.

Research project 

Big data management promises to bring a significant improvement in people’s lives, accelerating knowledge discovery, research and innovation. However, in the last few years, there has been increasing concern regarding the lack of fairness (leading to bias), diversity (leading to exclusion), and transparency (leading to opacity) of data-driven algorithms supporting decision making, raising a call for responsible data-driven decision making by design. So far, efforts for responsible decision making have mostly focused on Machine Learning algorithms, assuming that they have been trained on high-quality data and ignoring the underlying complex pipelines that may have produced such data. A core data pipeline for producing such data is entity resolution (ER), which discovers and unifies descriptions that correspond to the same real-world entities.

In this project, we target ER systems that are responsible by design, in particular when decisions about which entity descriptions should be resolved first need to be made with respect to a given budget. The objectives of ResponsibleER are: (a) to enrich the diversity of resolved entities, (b) to ensure fairness of resolved entities, and (c) to enhance the transparency of ER systems. For (a), we are interested in formalizing progressive ER as an optimization problem whose goal is to maximize not simply the number of resolved entities for a given budget, but also the diversity of the entity graph that results from merging the matching descriptions. For (b), we are interested in centrality measures that capture the popularity of matching candidates in an entity graph processed by a progressive ER algorithm, and in ensuring that all popularity groups are fairly represented in the results. For (c), we need to provide meaningful explanations of the intermediate decisions taken throughout an ER process (e.g., indexing, matching). Currently, no existing work targets responsible ER by design.