Linking Sensitive Data

Methods and Techniques for Practical Privacy-Preserving Information Sharing

Authors: Prof. Peter Christen, Dr. Thilina Ranbaduge, Prof. Dr. Rainer Schnell

Publisher: Springer International Publishing

Included in: Springer Professional "Wirtschaft+Technik", Springer Professional "Technik", Springer Professional "Wirtschaft"


About this book

This book provides modern technical answers to the legal requirements of pseudonymisation as recommended by privacy legislation. It covers topics such as modern regulatory frameworks for sharing and linking sensitive information, concepts and algorithms for privacy-preserving record linkage and their computational aspects, practical considerations such as dealing with dirty and missing data, as well as privacy, risk, and performance assessment measures. Existing techniques for privacy-preserving record linkage are evaluated empirically and real-world application examples that scale to population sizes are described. The book also includes pointers to freely available software tools, benchmark data sets, and tools to generate synthetic data that can be used to test and evaluate linkage techniques.

This book consists of fourteen chapters grouped into four parts, and two appendices. The first part introduces the reader to the topic of linking sensitive data, the second part covers methods and techniques to link such data, the third part discusses aspects of practical importance, and the fourth part provides an outlook on future challenges and open research problems relevant to linking sensitive databases. The appendices provide pointers to, and describe, freely available open-source software systems that allow the linkage of sensitive data, and provide further details about the evaluations presented. A companion Web site at https://dmm.anu.edu.au/lsdbook2020 provides additional material and Python programs used in the book.

This book is mainly written for applied scientists, researchers, and advanced practitioners in governments, industry, and universities who are concerned with developing, implementing, and deploying systems and tools to share sensitive information in administrative, commercial, or medical databases.

"The book describes how linkage methods work and how to evaluate their performance. It covers all the major concepts and methods and also discusses practical matters such as computational efficiency, which are critical if the methods are to be used in practice - and it does all this in a highly accessible way!"

David J. Hand, Imperial College, London

Table of contents

Frontmatter

Introduction

Frontmatter

Chapter 1. Introduction

Abstract

In this chapter, we show that linking individual records from different databases is indispensable for many research purposes and for data usage in practical applications. Almost all analyses of big data sources require linking several databases containing information about the same or similar populations. We discuss examples of applications from medicine, economics, and official statistics. Since the GDPR and other legal restrictions usually require pseudonymisation, the use of error-tolerant pseudonymisation methods becomes necessary. Based on the increasing amount of research published in diverse areas, we show that the need for the techniques presented in this book is growing.