2 Federated analytics
Where we explain the basic concepts and principles of federated analytics
2.1 Introduction
A survey on federated analytics (FA) defines it as “a paradigm for collaboratively extracting insights from distributed data that is owned by multiple parties (e.g., individual mobile devices or institutional organizations) under the coordination of a central entity (e.g., a service provider) without any of the raw data leaving their local parties or revealing information beyond the targeted insights. The core principles of this paradigm allow breaking the limitations for deriving analytics from limited centralized data, in terms of privacy concerns and operational costs.”1
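As a minimal sketch of the pattern in this definition, the example below computes a global mean from per-party summaries, so only aggregates (a count and a sum) reach the coordinator. The names `LocalSummary`, `local_summary` and `federated_mean`, as well as the data, are hypothetical and chosen for illustration; they are not taken from the survey or from any particular FA framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocalSummary:
    """Aggregate statistics a party shares instead of its raw rows."""
    count: int
    total: float


def local_summary(values):
    """Runs at the data holder: reduce raw values to a count and a sum."""
    return LocalSummary(count=len(values), total=float(sum(values)))


def federated_mean(summaries):
    """Runs at the coordinator: combine party summaries into a global mean."""
    n = sum(s.count for s in summaries)
    if n == 0:
        raise ValueError("no observations reported by any party")
    return sum(s.total for s in summaries) / n


# Hypothetical data held by three separate parties (assumption for the demo).
party_data = [[4.0, 6.0], [10.0], [2.0, 2.0, 6.0]]

# Each party computes its summary locally; only the summaries are sent.
summaries = [local_summary(values) for values in party_data]
global_mean = federated_mean(summaries)
print(global_mean)
```

Note that the second party holds a single record, so its sum equals its raw value. Sharing aggregates alone is therefore not sufficient, which is why some check on what leaves each party is still needed.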
Distinguishing characteristics:
- Apply some form of statistical disclosure control (SDC) and/or privacy-enhancing technologies (PETs). Note that SDC can include differential privacy (DP), but we take a more generic approach: we want to protect the raw data as such, for example to provide guarantees to data holders with respect to commercial interests, trade secrets and the like. Data privacy and security are addressed through PETs. In practice, all approaches to FA should have output control.
- Governance applies to the computation, not to data access as such.
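Output control can be illustrated with one common SDC rule, small-cell suppression: a count is released only if it meets a minimum cell size. The sketch below is an assumption-laden example, not a prescribed method; the threshold of 5, the function name `release_counts` and the category labels are all hypothetical, and real thresholds are set by the data holder's policy.

```python
# Assumed minimum cell size; in practice this is a policy decision.
MIN_CELL_SIZE = 5


def release_counts(counts, min_cell_size=MIN_CELL_SIZE):
    """Runs at the data holder before anything leaves the site.

    Counts below the threshold are replaced by None (suppressed), so that
    small groups cannot be singled out from the released output.
    """
    return {
        category: (n if n >= min_cell_size else None)
        for category, n in counts.items()
    }


# Hypothetical local frequency table.
local_counts = {"A": 12, "B": 3, "C": 0, "D": 5}
released = release_counts(local_counts)
print(released)
```

In this sketch zero counts are suppressed as well, since they fall below the threshold; whether zeros may be released is itself a policy choice.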

2.2 Key resources
Monographs, very detailed:
- Federated Learning Systems (2021) by Rehman and Gaber2
- Federated Learning (2022) by Ludwig and Baracaldo3
Review papers
Applications in healthcare
- Rieke, N. et al. (2020)5
- Joshi, M. et al. (2022)6; Fig. 2 is a nice diagram explaining horizontal partitioning, vertical partitioning and transfer learning
Hands-on - vantage6 workshop (link)