Data Without Borders

Guest blog post by Dr Austin Tanney, Analytics Engines

Please repeat after me…

“I believe in a world where data has no borders, where data can be shared between people and between regions for the good of humanity”

Does this sound reasonable? Theoretically, sure, of course. In reality? Probably not so much. This title and opening paragraph are my first venture into writing clickbait. Did it work? Someone needs to send me the Google Analytics info on the clicks of this post vs the others.

This concept, while noble and admirable, is probably a long way from reality for a very wide range of reasons. When I step outside of my idealistic fantasy world and into the disappointing real world, I find that the situation with data is complex and convoluted. Never mind sharing on a global scale for the greater good; data is siloed even within organisations.

Many, many years ago I was involved in a large-scale genomic sequencing project, where I was personally responsible for the data management and analytics. When I look back on this, I’m pretty sure that within the company where I worked there were probably about 4 people who could access this data. Not for security reasons, nor for technical reasons, but because no one else knew where it was saved. Or could access the system. Or knew it existed.

Actually, now that I look back at this through the mists of memory, I can remember exactly where I stored and analysed this data. It was on a little Linux cluster that I built and yes, there were, I believe, 4 people who could access it.

This was, as I said, a long time ago. Things have clearly changed. Data is managed much better now. No one works with important data on small, limited-access servers or on their laptops… right?

In reality, I have found that this is still fairly common among bioinformaticians and data scientists. Of course, in an enterprise environment, data management is much more stable. There are lots of data management solutions. Lots of them. In the world of healthcare, there are many different systems and it’s not uncommon for a single healthcare organisation to use a number of these. I was in a meeting recently where the team we were talking to had five different data capture and management systems in place. So the data was well managed, stored, backed up and easy to find.

So is there still a problem?

Most definitely. In this particular case, none of these systems talked to each other.

Data integration is a clear and fundamental challenge in many industries, and healthcare is no exception. It’s one thing when a lack of data integration is impacting a company’s bottom line or its ability to successfully target customers with marketing campaigns. It’s a different matter when it’s impacting the optimal delivery of healthcare.

I could write another entire blog post about the use of data for healthcare improvement, but for today let me focus on MIDAS.

In the MIDAS project we are focusing on data integration for policy makers. Across the board, globally, healthcare organisations and governments are moving towards outcomes-based accountability (possibly another future blog post). To do this, data is critical. It’s impossible to say whether what you are doing is having a positive impact if you can’t measure it. Of course, one of the core challenges here is that we are not just talking about integrating multiple datasets within the health provider, but also linking to external datasets. It’s pretty easy to show from datasets within the healthcare system that you have reduced waiting lists or treated more people, but when a government sets a priority, as we have in Northern Ireland, to reduce health inequalities and tackle issues with mental health, data needs to be gathered from other areas.

In the MIDAS project we will be integrating data in a meaningful way (hence the name) and ensuring that the policy areas we focus on can be properly evaluated using broad and diverse datasets from a range of sources that can give really valuable insight. We at Analytics Engines are providing the data analytics platform and tools that will enable the highly skilled team of partners in the project to work with the data and actually answer the questions being asked by the policy board.

It’s been a really enjoyable and rewarding experience so far working with the various teams and hearing the challenges faced by the policy board members. I think that over the next few years we have the potential to do something hugely important that will really move us in the right direction, towards data-driven health policy.