Data, Ownership & Plankton

“Data is the new oil” proclaimed an Economist cover a few months ago, and if data is oil, then healthcare data – often already partly codified and structured – should be the Brent Crude – light, sweet and profitable – of all data. Recent revelations about Facebook and data sharing have made people more aware of how valuable their data might be, and how widely it is shared. The search for more, and better, data continues, and so we now have Data Fracking.

Healthcare data is often seen as being “special” – there are legal restrictions on sharing it, healthcare providers have whole departments devoted to Information Governance, and most of us accept that information about your health is different to that held about your shopping habits. One of the ways that we try and make people feel comfortable about this is by saying that “patients own their own data”. This feels comfortable, but I think it is wrong, and as a consequence we need to rethink how we control the use of data, and more importantly, the benefits that accrue from use of that data.

I need to start by saying that this is not meant as a normative view (i.e. I am not saying what should be the case); I am just saying what is the case. I would also stress that my view is a partial one – informed by my practice and by my experience working in both the NHS and academia. As a result, it is a fairly UK-centric view, although many of the points apply more widely.

Let’s start with some basics. Do patients own their data? As I suggested above, I don’t think they do. I think patients have the right to see their data; I think they have the right to ask for the data to be correct, up to date, etc., but I don’t think they own it. This is based on two things:

  1. Patients don’t maintain that data. If patients truly owned their data, then it would be their responsibility to maintain that data – to check it for errors, to make back-ups, to bring it to appointments. Current practice is a long way from this, and even tools that do allow for patients holding data and recording their own data (e.g. Patients Know Best and My Clinical Outcomes) are far from being a complete medical record.
  2. Quite a lot of the data in the record isn’t generated by the patient. Although some of it is either directly patient-generated (e.g. pain scores) or is a simple recording (e.g. weight/height), much of the data is interpreted (“Cardiac-sounding chest pain”), much was acquired by the provider from the patient at some expense (MRI scans) and some was actually generated by the provider (radiotherapy plans). It is not clear in what sense patients “own” these data – patients don’t have the tools to hold some of the data, and some of it clearly came from the provider. A simple answer of purchase (i.e. patients “buy” their radiotherapy plan from me) doesn’t make much sense – the plan is of little value without the associated infrastructure.

I think the main reason that the “patients own their data” mantra has taken hold is that it provides a convenient shorthand for how we use and respect patient data; but in this case the shorthand hides the reality. So, if patients don’t own their data, who does? The obvious answer would be to say that the provider does. They meet some of the requirements above – hospitals invest a lot of money in IT, information governance, backing-up data systems, etc. This is better, but is still flawed – at best, providers “own” the data subject to very large restrictions on its use. In fact, the appeal of ownership stems from the idea of “ownership as protection”. Although this is attractive – this is my bicycle, and that is what gives me the right to stop you from taking it – in fact ownership is at best a partial protection. The state has the ability to overcome those rights, either through the police, or through compulsory purchase (eminent domain). Indeed, the government has a very clear right to issue a compulsory purchase order, buy your land and then charge you for access to it – train passengers don’t get a discount if the land they cross on the train used to be theirs. The other consideration is that as these are digital data, we can both own them – if you take my bicycle, I can’t use it; if you take my MRI scan, I can still look at it.

Most of the time this problem has been ignored. We haven’t had the ability to aggregate large amounts of data, and when we have done, it has been done by large governmental bodies. But that is changing – the major data brokers are now all private sector companies, and there is a clear route to financial gain from data aggregation. Data is only really of value when aggregated and refined into information and knowledge. The same is true in the oil industry – the real value of oil lies in the refined product, not the raw black gloopy stuff. And that refining needs aggregation. Whereas we can justify ownership/access models based on single patient benefit – you clearly benefit from your medical records being accurate – much of the recent work on building predictive models needs aggregated data – and lots of it. This is where ownership becomes important – because patients don’t own enough data, whereas providers do. So, the obvious place for the tech firms to go is the healthcare providers, and this is what they have done – in their droves, while patient-focused tech initiatives (such as Google Health) have folded.

But these tech firms have grown large by exploiting the network effect – if you are on Facebook, your friends are more likely to join; if you use Gmail, you are more likely to use Google search, etc. The problem with the network effect is that it creates huge monopolies – Google is responsible for 75% of all search, and >90% of search done on mobile devices – with attendant asymmetries: we can’t really expect individual healthcare providers to negotiate with tech companies on an even basis, and, arguably, it wouldn’t be a good use of their resources to do so. The fate of healthcare providers is fairly benign compared to that of governments of low- and middle-income countries, where tech firms can operate with relative impunity.

So, we are in a quandary. Healthcare providers “sort-of” own the healthcare data from their patients, but under a set of restrictions which mean that they can’t share them. Even if they could share the data with the tech firms, the tech firms will reap the benefits, not the providers. Individual patients don’t have enough data to interest the tech firms, and are no more likely to be able to negotiate with the firms than providers are. But underneath all of this, we can see that there could be a “good” outcome: Sharing data to build better medicine could benefit us all, but we don’t seem to be able to get there.

This is the “tragedy of the anti-commons”. The tragedy of the commons is well described – shared resources being over-utilised and spoiled – but its younger twin is of interest here. It occurs when a socially desirable outcome is blocked from happening due to problems with permissions and licensing. Although there are various solutions, a historic approach has been through compulsory purchase. These aren’t just for railways – they were used to establish sewers in the 1400s – but they provide one way of bridging the gap in the anti-commons scenario.

I would stress that I am NOT advocating compulsory purchase of medical data. But we need to find a way to ensure that we can share data, but that the data access is controlled, and the benefits accrue to those who collected and hold the data – whether that be patients, providers or both. And we need a way of recalibrating the relationship between patients and healthcare providers and tech firms. In part, I am not advocating for compulsory purchase of patient data because it already happens. The current rules we have around data release allow the tobacco industry to access data on 179 000 patients with lung cancer, but don’t allow brain tumour patients to share their details.

I think there is a way. We could use a form of smart electronic contract for patients to give permission to share their (anonymised, secured) data with tech firms. The contract would lie between the healthcare provider and the tech firm, with patients giving their permission (not their data, which in many cases they don’t hold) for the provider to share their data. That permission might be given more generally (“share as you like”) or more narrowly (“share if it’s to do with research into my disease”). However, that permission could be revoked, or changed, and it would be relatively easy to specify that my data could only be used if the resulting models were made available to all NHS organisations – or perhaps to all NHS organisations and to healthcare providers in low- and middle-income countries (LMICs).
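To make the idea a little more concrete, here is a minimal sketch in Python of the permission logic only. It is not a real smart contract, and everything in it is an assumption made for illustration: the names `Scope`, `Request`, `Permission` and `allows`, the example firm and diseases, and the idea of representing "who must receive the resulting models" as a set of labels. It simply shows a permission that can be broad or narrow, can carry a condition, and can be revoked.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class Scope(Enum):
    ANY_USE = "share as you like"
    OWN_DISEASE_RESEARCH = "share only for research into my disease"


@dataclass(frozen=True)
class Request:
    """A (hypothetical) request from a tech firm to use shared data."""
    firm: str
    disease: str
    # Who the firm commits to making the resulting models available to.
    models_shared_with: frozenset = frozenset()


@dataclass
class Permission:
    """A patient's permission for the provider to share their data."""
    patient_id: str
    scope: Scope
    disease: Optional[str] = None
    # Condition on use, e.g. models must be made available to the NHS.
    require_models_shared_with: frozenset = frozenset()
    revoked_at: Optional[datetime] = None

    def revoke(self) -> None:
        # Record when the patient withdrew permission.
        self.revoked_at = datetime.now(timezone.utc)

    def allows(self, request: Request) -> bool:
        if self.revoked_at is not None:
            return False
        if (self.scope is Scope.OWN_DISEASE_RESEARCH
                and request.disease != self.disease):
            return False
        # Every required recipient must be among the firm's commitments.
        return self.require_models_shared_with <= request.models_shared_with


# Illustrative usage with made-up values.
perm = Permission(
    patient_id="patient-001",
    scope=Scope.OWN_DISEASE_RESEARCH,
    disease="brain tumour",
    require_models_shared_with=frozenset({"NHS"}),
)
req = Request("ExampleTech", "brain tumour", frozenset({"NHS", "LMIC"}))
print(perm.allows(req))  # True
perm.revoke()
print(perm.allows(req))  # False
```

The point of the sketch is that the patient never hands over data here; they hand over a rule, and the provider checks each request against it before sharing anything.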

This option may seem far-fetched to some, but there is some infrastructure that might make this work (it might actually be a sensible use case for b****chain), although I am open-minded as to whether that would be the ideal solution. But we need to find some way of ensuring that we can aggregate and share data, while making sure that the benefits flow to patients and providers.

There are no easy answers here. Perhaps we should reconsider healthcare providers as being data aggregators – a form of unionisation for the 21st Century. The model of patients owning data doesn’t really make sense, and data isn’t oil. This isn’t a bad thing – plankton didn’t get much benefit from the oil industry.

The analogies we use can trap us as much as they enlighten, so perhaps we need a better analogy. Perhaps a better analogy is music – my data is only a few notes, but together our data can be symphonic.

Note: A version of this was delivered as a talk at the RCR “Grand Challenges in AI” meeting on the 16th May 2018


