A Painful Business

In a recent interview with Vinod Khosla, Sergey Brin bemoaned the regulatory burden of health care in the United States (video):

Imagine you had the ability to search people’s medical records in the U.S. […] Any medical researcher can do it. Maybe they have the names removed. Maybe when the medical researcher searches your data, you get to see which researcher searched it and why. I imagine that would save 10,000 lives in the first year. Just that. That’s almost impossible to do because of HIPAA. I do worry that we regulate ourselves out of some really great possibilities that are certainly on the data-mining end. […] Generally, health is just so heavily regulated. It’s just a painful business to be in.

This view is in keeping with an earlier interview Larry Page gave to Farhad Manjoo at the New York Times, but Brin has toned down the numbers by an order of magnitude. Apparently at some point between 25 June and 3 July, Google discovered that some 90,000 people annually would not, in fact, be saved by data mining.

I believe that Page and Brin have a good point — HIPAA does create a chilling effect for data sharing, as we argued at a workshop in 2011. We attributed the meager flow of information to the lack of infrastructure for persistent governance of healthcare information; absent such infrastructure, HIPAA makes sharing data with a third party a risky proposition.

Of course, one way to address privacy issues is to keep the scale limited, focusing on a smaller set of individuals who have consented to radical transparency of their data. This is the approach that Google appears to be taking with the Baseline Project, an ambitious initiative to extensively quantify healthy human beings. The Baseline Project will collect a variety of healthcare information on 175 individuals, including their genomic data. Assuming that the participants have given blanket consent, this will be a tremendously useful resource, but I predict that problems will begin to emerge as the project scales up and its scope expands. Statistical power requires large numbers of participants, and it is not obvious that so many people will be willing to grant carte blanche access to their data.

I believe that the designers of the study recognize the risk; according to the Wall Street Journal article:

[…] the idea that Google would know the structure of thousands of people’s bodies—down to the molecules inside their cells—raises significant issues of privacy and fairness. In the future, this kind of data would be invaluable to insurers, who are always looking to reduce their risks. And more prosaic but chilling uses, such as prior to job interviews or marriage proposals, lurk in the background. Baseline will be monitored by institutional review boards, which oversee all medical research involving humans. Once the full study gets going, boards run by the medical schools at Duke University and Stanford University will control how the information is used. “That’s certainly an issue that’s been discussed,” said Dr. Gambhir. “Google will not be allowed free rein to do whatever it wants with this data.”

…no doubt to the frustration of Brin and Page. Even in this relatively small study, with fully consented individuals, the researchers recognize that there must be policy constraints on the data miners. How will Google square this circle? Can data-driven hypothesis generation and privacy coexist?

We believe that they can. The first step is to stop thinking about the choice between privacy and access as binary — that one can only be attained at the expense of the other. As we have argued before — and are demonstrating with Genecloud — it is possible to reconcile these two seemingly contradictory requirements by governing the analytical programs through which users interact with sensitive information. When analytics and data interact in a governed environment, it is possible to minimize collateral disclosure of information and audit that which ultimately is disclosed.
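To make the idea concrete, here is a minimal sketch of what a governed environment might look like. This is purely illustrative — it is not Genecloud's actual architecture or API, and all the names (`GovernedDataset`, `MIN_COHORT`, the sample analytic) are hypothetical. The point is that researchers submit analytics to a gatekeeper rather than touching records directly; the gatekeeper suppresses results computed over small cohorts to limit collateral disclosure, and logs who ran what and why — exactly the kind of audit trail Brin imagines.

```python
from datetime import datetime, timezone

class GovernedDataset:
    """Hypothetical gatekeeper: analytics run against the data here,
    never on raw records handed to the researcher."""

    MIN_COHORT = 5  # suppress results computed over fewer individuals

    def __init__(self, records):
        self._records = records  # sensitive rows, never exposed directly
        self.audit_log = []      # who ran what, when, and why

    def run(self, researcher, purpose, analytic):
        # Every access is recorded before the analytic executes.
        self.audit_log.append({
            "researcher": researcher,
            "purpose": purpose,
            "analytic": analytic.__name__,
            "time": datetime.now(timezone.utc).isoformat(),
        })
        if len(self._records) < self.MIN_COHORT:
            raise PermissionError("cohort too small; result suppressed")
        return analytic(self._records)

def mean_glucose(rows):
    # An aggregate-only analytic: returns a summary, not individual values.
    return sum(r["glucose"] for r in rows) / len(rows)

records = [{"glucose": g} for g in (90, 95, 100, 105, 110)]
ds = GovernedDataset(records)
result = ds.run("dr_smith", "diabetes screening study", mean_glucose)
print(result)                          # 100.0
print(ds.audit_log[0]["researcher"])   # dr_smith
```

In a real system the gatekeeper would also vet the analytic itself before execution — the crucial design choice is that policy is enforced at the point where code meets data, not at the point where data is copied out.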

Google would do well to acknowledge that privacy is a legitimate concern in healthcare, not just an inconvenient roadblock to be circumvented or eliminated. Privacy regulations like HIPAA, as imperfect as they may be, at least recognize these concerns and attempt to address them. Organizations that fail to take privacy seriously will continue to find healthcare nothing but a painful business.
