Data Science at Scale in Organizations: Enabling Safe Access to Sensitive Data

The data revolution is transforming how people live, how executives manage operations, and how businesses deliver goods and services. A key part of this transformation is the sharing of data and knowledge. Yet researchers working with sensitive data are limited in both their access and their ability to collaborate, largely because of organizational and technical impediments. Our goal is to build computational research infrastructure around Jupyter that enables organizations to provide safe and secure access to sensitive data at scale.

The ultimate goal of this project is to build on successful existing infrastructure projects such as JupyterHub to address the core privacy problems organizations face. These can be summarized as ensuring that the Five Safes are addressed in computational research with sensitive data: safe people, working on safe projects, in safe settings, accessing safe data, and releasing safe outputs.


Alfred P. Sloan Foundation


Cal Poly, NYU, UC Berkeley



Scientific Computing


September 2019-May 2021