Tutorial on Privacy and Security Implications of Open HealthCare Data
The tutorial on Privacy and Security Implications of Open HealthCare Data took place on 18 May 2022 at the Hotel Topaz, Kandy.
Many government agencies worldwide are adopting open data as a one-size-fits-all approach to releasing their data. With open data, a government entity can release its data after sufficient anonymization and let other stakeholders use it. In most settings, data released in this manner is macroscopic and contains little information pertaining to individuals. When open data meets healthcare (for example, COVID research), the data by nature deals with people.
Data is considered open when it is both legally and technically open. For data to be legally open, it must be available for access under an open licence. The licence under which the data can be distributed and reused may be constrained by the conditions under which the data was collected. For data to be technically open, it needs a well-defined sharing format that is machine-manipulable and suitable for different stakeholders to interact with the data.
One of the major challenges regarding open data is privacy. One approach is to rely on legal frameworks. Because law by nature lags behind technological progress, the safeguards offered by legal frameworks may not be sufficient. Another approach is risk minimization, where the impact of data exposure is evaluated in a multi-faceted manner, with legal ramifications being one of the facets. One important dimension to consider is the societal impact of data dissemination. Quantifying the privacy impact is quite challenging. One favored approach is to maintain transparency regarding data use. It is helpful to develop visualization tools not only for the data but also for data privacy. Such visualization can be supported by providing accountable access to open data. That is, instead of publishing the open data on the web and making it available without any tracking, the data can be provided with proper follow-through regarding its use by the partner stakeholders.
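The idea of accountable access can be illustrated with a minimal sketch. It is not taken from the tutorial: the class name `AccessLedger`, its fields, and the hash-chaining scheme are all assumptions chosen for illustration. The sketch keeps an append-only record of which stakeholder accessed which dataset and for what purpose, links each entry to the previous one by hash so that later tampering can be detected, and summarizes usage per stakeholder, which is the kind of figure a privacy visualization tool might plot.

```python
import hashlib
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone


def _digest(entry: dict) -> str:
    """SHA-256 over a canonical JSON form of the entry (without its own hash)."""
    return hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()


@dataclass
class AccessLedger:
    """Hypothetical append-only ledger of data accesses."""
    entries: list = field(default_factory=list)

    def record(self, stakeholder: str, dataset: str, purpose: str) -> dict:
        """Append one access record, chained to the previous entry's hash."""
        entry = {
            "stakeholder": stakeholder,
            "dataset": dataset,
            "purpose": purpose,
            "time": datetime.now(timezone.utc).isoformat(),
            "prev": self.entries[-1]["hash"] if self.entries else "",
        }
        entry["hash"] = _digest(entry)
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Return True only if no entry was altered, removed, or reordered."""
        prev = ""
        for entry in self.entries:
            unhashed = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev"] != prev or entry["hash"] != _digest(unhashed):
                return False
            prev = entry["hash"]
        return True

    def usage_by_stakeholder(self) -> dict:
        """Count recorded accesses per stakeholder, e.g. as input to a chart."""
        counts: dict = {}
        for entry in self.entries:
            counts[entry["stakeholder"]] = counts.get(entry["stakeholder"], 0) + 1
        return counts
```

A ledger of this kind only accounts for accesses that pass through the collector's own service; it cannot follow a copy of the data once it has left that service.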
With accountable use of data, security becomes an important concern. Although the data is open, the collector of the data is still responsible for tracking its usage. The problem is re-sharing. For instance, if the data is available for download in an open format such as CSV, XML, or JSON, all tracking is lost once the data is downloaded by a third party. This form of data release is therefore not well suited to continued usage tracking.
With the advancements in cloud computing, it is possible to set up data portals with sophisticated capabilities. For instance, a software-as-a-service model can be used to expose computations that can be performed using the data. This approach would not release the data to a third party, but would allow the third party to access it through software hosted in the collector’s cloud. This facility can be further improved by allowing the third party to upload their programs to the cloud servers so that they run against the data within the collector’s cloud.
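The hosted-computation model described above might look roughly like the following sketch. Everything here is an illustrative assumption rather than a design from the tutorial: the class `HostedDataset`, the single `mean_by` computation, and in particular the `MIN_GROUP_SIZE` threshold, which adds small-group suppression as one possible safeguard. The point it shows is that the third party calls a computation and receives aggregates, while the raw rows stay inside the collector's service and every call is logged.

```python
from statistics import mean

# Assumed threshold: groups with fewer records than this are withheld.
MIN_GROUP_SIZE = 5


class HostedDataset:
    """Hypothetical service-side wrapper: rows stay private, only aggregates leave."""

    def __init__(self, rows):
        self._rows = list(rows)   # raw records, never returned to callers
        self.query_log = []       # one tuple per computation request

    def mean_by(self, requester, group_field, value_field):
        """Return the mean of value_field per group, omitting small groups."""
        self.query_log.append((requester, "mean_by", group_field, value_field))
        groups = {}
        for row in self._rows:
            groups.setdefault(row[group_field], []).append(row[value_field])
        return {
            key: round(mean(values), 2)
            for key, values in groups.items()
            if len(values) >= MIN_GROUP_SIZE
        }
```

Allowing third parties to upload their own programs, as the paragraph goes on to suggest, would need considerably more than this, such as sandboxing and checks on what the uploaded program returns.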
The tutorial focused on many recent advancements on this topic and discussed possible implementation scenarios for some of them.
The speaker, Muthucumaru Maheswaran, is an associate professor in the School of Computer Science and the Department of Electrical and Computer Engineering at McGill University. Previously, he was an assistant professor at the University of Manitoba and a Scientist at TRLabs, Winnipeg. He received a PhD in Electrical and Computer Engineering from Purdue University, West Lafayette, USA, and a BSc in Electrical and Electronic Engineering from the University of Peradeniya. He has researched various issues in mapping workloads onto Grids and utility computing (Cloud) systems, such as task scheduling, trust management, resource discovery, and security. Many papers he co-authored on these topics have been highly cited by other researchers in this area. He has supervised the completion of 12 PhD theses and 40 MSc theses. He has published more than 130 technical papers in major journals, conferences, and workshops. He holds several US patents in wide-area content routing, synchronization, and task scheduling. Recently, his research has focused on the development of novel programming models and frameworks for edge-oriented Internet of Things and real-time AI applications at the edge. As part of this work, he has developed an open-source programming language called JAMScript and is researching many issues including fault tolerance, scheduling, and synchronization within this framework. Currently, he is focusing on the development of Space OS, which will be built using JAMScript. His research has been supported by many funding agencies and companies such as NSERC, FRQNT, Mitacs, Ericsson, Ciena, and Ultra Electronics.
Tutorial Slides: Link
A report on Privacy and Modelling of COVID Data, which was submitted as part of the consultancy activity to the COVID project at the University of Peradeniya, Sri Lanka - Link