The healthcare industry holds vast amounts of structured and unstructured data that must be brought together to reap the full benefits of data-driven decision-making. But did you know that almost 80% of healthcare data today is unstructured?
Structured data typically involves values like blood test results, name, age, height and weight. Unstructured data has no predefined format (free text and images, for example) and needs to be processed before machines can interpret it.
For example, let’s say a patient has taken an MRI scan. While the MRI report includes an interpretation of the results, it is usually the specialists who draw the final conclusions from it, and these may differ slightly from the lab interpretation. This information is usually not formally documented and thus ends up as unstructured data. Other examples of unstructured data are clinical notes and discharge summaries.
Electronic health records (EHRs) post COVID
One might argue that combining both data sets should not be a challenge given the rapid adoption of electronic health records (EHRs). And adoption has indeed been rapid. One report indicates that the global EHR market grew to USD 26 billion between 2015 and 2021, and accelerated digitisation after COVID-19 has continued to fuel it since.
What is less visible is that EHR adoption has been most prevalent among pharmacies, labs, clinics and small healthcare units, where implementation is largely straightforward.
Implementing an EHR in large hospitals or healthcare units, on the other hand, requires far more planning, process and skilled staff to operate it. Moreover, the goal is not merely to bring together all formats of data but to have the requisite NLP technologies to translate it into a format that can be read, interpreted and turned into insight.
Connecting structured and unstructured healthcare data
To start with, there is a straightforward process that healthcare companies can adopt to bring together structured and unstructured data:
- The first step is to identify the different data sources. These can range from straightforward patient information (like name, age and height) to more abstract information from doctors’ notes, images, lab reports, and even socioeconomic data.
- The second step is to bring all this data together in a centralised source. This requires greater connectivity between the different departments in the organisation, from hospital units to administrative teams. It also requires process standardisation with external parties such as insurance companies.
- The third step is to connect and interpret all of these data points to derive holistic and meaningful insights.
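The three steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a description of any particular system: the field names (`patient_id`, `text`), the `PatientRecord` class and the keyword-based "insight" are all assumptions made for the example, and a real pipeline would use trained NLP models in place of keyword matching.

```python
from dataclasses import dataclass, field


@dataclass
class PatientRecord:
    """One centralised record per patient (hypothetical schema)."""
    patient_id: str
    structured: dict = field(default_factory=dict)  # e.g. age, height
    notes: list = field(default_factory=list)       # free-text notes


def centralise(structured_rows, clinical_notes):
    """Step 2: merge rows and notes from separate sources by patient_id."""
    records = {}
    for row in structured_rows:
        rec = records.setdefault(row["patient_id"], PatientRecord(row["patient_id"]))
        rec.structured.update({k: v for k, v in row.items() if k != "patient_id"})
    for note in clinical_notes:
        rec = records.setdefault(note["patient_id"], PatientRecord(note["patient_id"]))
        rec.notes.append(note["text"])
    return records


def flag_for_review(record, keywords=("follow-up", "abnormal")):
    """Step 3 (toy version): return the keywords found in a patient's notes."""
    text = " ".join(record.notes).lower()
    return [kw for kw in keywords if kw in text]


# Step 1: identify the sources (illustrative, made-up sample data).
structured_rows = [
    {"patient_id": "p1", "age": 54, "height_cm": 170},
    {"patient_id": "p2", "age": 31},
]
clinical_notes = [
    {"patient_id": "p1", "text": "MRI reviewed; abnormal signal, recommend follow-up in 6 weeks."},
    {"patient_id": "p3", "text": "Discharged in stable condition."},
]

records = centralise(structured_rows, clinical_notes)
for patient_id, record in sorted(records.items()):
    print(patient_id, record.structured, flag_for_review(record))
```

Keying everything on a shared patient identifier is the design choice that matters here: once structured values and free-text notes sit in the same record, any interpretation step can see both.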
AI and NLP are the core technologies that bind these three steps together. They can be trained to read, listen to and interpret medical records, documents and reports, and convert them into actionable insights.
A Merit expert says, “But the trick lies in ensuring that the technologies are trained to understand and interpret the same concepts in different languages and jargon. For example, let’s say a patient has consulted a doctor about a specific illness. The terms and jargon the doctor uses may be different from the ones that a medical insurance company may have laid out for claim eligibility. So, it becomes important to train the technology to understand a wide range of usage and application of the same terms, to function effectively.”
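To make the point about differing jargon concrete, here is a minimal sketch of term normalisation in Python. The mapping table and the `extract_concepts` function are assumptions made for this example only. A production system would rely on a trained model or a standard clinical vocabulary rather than a hand-written dictionary.

```python
import re

# Hypothetical mapping from the wording found in different documents
# to one canonical term. Entries are illustrative, not exhaustive.
CANONICAL_TERMS = {
    "heart attack": "myocardial infarction",
    "mi": "myocardial infarction",
    "myocardial infarction": "myocardial infarction",
    "high blood pressure": "hypertension",
    "htn": "hypertension",
    "hypertension": "hypertension",
}


def extract_concepts(text):
    """Return the set of canonical terms mentioned in a piece of text."""
    lowered = text.lower()
    found = set()
    for variant, canonical in CANONICAL_TERMS.items():
        # Word boundaries stop short forms like "mi" matching inside other words.
        if re.search(r"\b" + re.escape(variant) + r"\b", lowered):
            found.add(canonical)
    return found


doctor_note = "Pt with HTN, prior MI in 2019."
insurer_policy = "Claims are eligible for heart attack and high blood pressure."

# Both documents resolve to the same concepts despite the different wording.
print(extract_concepts(doctor_note) == extract_concepts(insurer_policy))
```

Once both the doctor's note and the insurer's policy are expressed in the same canonical terms, checking one against the other becomes a simple set comparison.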
Case Study: Streamlining Document Collection & Metadata Management with AI & Intuitive Technologies
Let’s look at a case study of how Merit Data and Technology worked with a UK-based pharmaceutical company to solve their data challenge.
The company was a leading provider of data and insights to the healthcare community in the UK, and their analysis, visualisation and service modelling techniques were being used by companies (including the NHS) to drive better decisions.
But it faced a number of challenges in bringing together structured and unstructured data seamlessly:
- It relied on an external service provider whose manual download process often missed documents
- It couldn’t provide quick and easy access to the numerous documents and information being published in the UK healthcare community every day
- It couldn’t segregate, structure and index the vast array of documents and information, such as annual reports and newsletters
Merit Data and Technology undertook a series of steps to resolve these data challenges:
- It developed automated, scalable technologies to replace the manual document download process. The first was a Python-based scraping engine that scouted and collected documents from 1,600+ sources. The second was an index API to structure and index documents, paired with an elastic search layer to accelerate document retrieval.
- It also developed an intuitive user interface to speed up the search process. The interface also made it easy to upload documents, track changes and re-categorise information based on user needs.
The result? Using these technologies, the client was able to increase the volume of documents fivefold, from 2,000+ to 10,000+ documents a week. It was also able to increase its source coverage by 60%, from 1,000 to 1,600 sources.
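The index-and-retrieve part of the case study can be illustrated with a small, self-contained Python sketch. This is not Merit's implementation: the `DocumentIndex` class, its method names and the sample documents are all hypothetical, and a plain in-memory inverted index stands in for the search technology mentioned above so that the example runs without any external service.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class DocumentIndex:
    """Toy inverted index: maps each token to the documents containing it."""

    def __init__(self):
        self._docs = {}                    # doc_id -> metadata
        self._postings = defaultdict(set)  # token -> set of doc_ids

    def add(self, doc_id, title, category, text):
        self._docs[doc_id] = {"title": title, "category": category}
        for token in set(tokenize(f"{title} {text}")):
            self._postings[token].add(doc_id)

    def recategorise(self, doc_id, category):
        """Change a document's category, as a user might through a UI."""
        self._docs[doc_id]["category"] = category

    def search(self, query, category=None):
        """Return sorted ids of documents containing every query token."""
        tokens = tokenize(query)
        if not tokens:
            return []
        ids = set.intersection(*(self._postings.get(t, set()) for t in tokens))
        if category is not None:
            ids = {i for i in ids if self._docs[i]["category"] == category}
        return sorted(ids)


# Illustrative, made-up documents standing in for collected files.
index = DocumentIndex()
index.add("d1", "Trust annual report", "annual report", "Financial summary and outcomes.")
index.add("d2", "Spring newsletter", "newsletter", "Updates on the annual staff survey.")
index.add("d3", "Board minutes", "uncategorised", "Discussion of the annual report draft.")

print(index.search("annual report"))
print(index.search("annual", category="newsletter"))
```

Looking up a token in the postings map avoids scanning every document on each query, which is the basic reason an index speeds up retrieval as the document volume grows.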
In this blog, we have mostly discussed combining and interpreting data from multiple sources from the doctor and patient-care perspective. But in the larger picture, as we saw in the case study above, vast amounts of data can be used to bring efficiency and profitability to different departments in a healthcare organisation: from building more targeted marketing campaigns and improving CRM functionality in administration, to channelling financial spend more efficiently and using data insights to forecast and drive better business strategy.
Merit’s Expertise in Healthcare Data Harvesting
Our state-of-the-art data harvesting engine collects high-volume, industry-specific data at 4 times the speed, with 30% more accuracy than normal scrapers, and at a lower cost.
Our solutions help some of the world’s largest healthcare brands seamlessly deliver data and insights to their end customers, including:
- Delivering curated content from thousands of online documents or PDFs
- Aggregating millions of specialised, industry-specific data points
To know more, visit: https://www.meritdata-tech.com/service/code/data-harvesting-aggregation/