Extracting Value from Electronic Health Records By Processing Unstructured Data

Author: Verana Health
July 2019

Verana Health has the unique ability to generate insights from some of the largest and most comprehensive EHR-based specialty databases in medicine through our partnerships with medical associations.

Structured fields in EHRs hold information that adheres consistently to a specific format, including ICD-9 and ICD-10 codes, CPT codes, and an array of medication codes (SNOMED, NDC, etc.). However, these data alone fail to capture the entire clinical narrative. Approximately 80% of clinical data in electronic health records is found in unstructured physician notes, including data needed to generate insights on clinical outcomes and determine longitudinal trends in patient care. Because unstructured notes are free text, robust processing capabilities are required to extract and curate the information they contain in a way that is structured, standardized, and reliable.

Verana has developed proprietary algorithms to clean and model data specifically from unstructured notes to derive insights. These algorithms allow Verana to extract, process, or tag phrasing while taking the context of the note into account. As we derive insights from these data, Verana ensures all data have been de-identified in compliance with HIPAA and in line with Verana’s commitment to patient security and privacy.

Some data necessary for clinical care are captured in unstructured notes, such as eye laterality in ophthalmology. While in some cases laterality can be determined from ICD-10 codes, there are circumstances where it can only be found in unstructured notes, such as procedure laterality. Because each eye of a given patient can have a different clinical status, it is vital to differentiate between the two eyes when analyzing eye health, treatments, and outcomes. To accurately extract this information from unstructured notes, Verana has developed algorithms that consider context when categorizing laterality, so that irrelevant uses of the terms “right” and “left” are excluded (e.g., “the patient is right” or “the patient has left”).
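
To make the idea of context-aware laterality concrete, here is a minimal sketch in Python. It is not Verana's proprietary algorithm; it assumes a single hypothetical rule in which "right" or "left" counts as laterality only when an anatomical term immediately follows. The term list and the function name `extract_laterality` are illustrative assumptions.

```python
import re

# Candidate side words anywhere in the note.
SIDE = re.compile(r"\b(right|left)\b", re.IGNORECASE)

# Hypothetical list of anatomical cue words (an assumption for this sketch).
ANATOMY_AFTER = re.compile(r"\s+(?:eye|cornea|retina|macula|lens)\b", re.IGNORECASE)


def extract_laterality(note):
    """Return the set of sides ('right', 'left') used anatomically in the note."""
    found = set()
    for match in SIDE.finditer(note):
        following_text = note[match.end():]
        # Keep the mention only if an anatomical term comes right after it;
        # this drops phrases such as "the patient is right" or "has left".
        if ANATOMY_AFTER.match(following_text):
            found.add(match.group(1).lower())
    return found


print(extract_laterality("Injection given in the right eye. The patient has left."))
```

In this sketch "The patient has left eye pain" still yields the left eye, while "The patient has left" yields nothing. A production system would need far richer context handling, for example abbreviations, negation, and sentence boundaries.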

Outcomes data, such as intraocular pressure (IOP) or visual acuity (VA), are necessary for evaluating patient treatment and for research, and they too often reside in unstructured notes. Verana’s data processing capabilities extract outcomes values by identifying and standardizing outcomes measures across EHRs and by distinguishing clinically significant values, such as IOP or VA, from other numerical data found in the notes.
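
As a rough illustration of separating outcome values from other numbers in a note, the sketch below uses regular expressions anchored on the measure name. It is an assumption-laden example, not Verana's method: the patterns, the function names, and the choice of logMAR (the base-10 logarithm of the Snellen denominator divided by the numerator) as the standardized form of visual acuity are all illustrative.

```python
import math
import re

# A number counts as IOP only if it closely follows the measure name (hypothetical rule).
IOP_PATTERN = re.compile(
    r"\b(?:IOP|intraocular pressure)\b\D{0,15}?(\d{1,2})(?!\d)", re.IGNORECASE
)
# Snellen fractions such as 20/40 that closely follow "VA" or "visual acuity".
VA_PATTERN = re.compile(
    r"\b(?:VA|visual acuity)\b\D{0,15}?(\d{2})/(\d{2,3})\b", re.IGNORECASE
)


def snellen_to_logmar(numerator, denominator):
    """Convert a Snellen fraction to logMAR, rounded to two decimals."""
    return round(math.log10(denominator / numerator), 2)


def extract_outcomes(note):
    """Return IOP readings (mmHg) and VA readings (logMAR) found in the note."""
    iop = [int(m.group(1)) for m in IOP_PATTERN.finditer(note)]
    va = [
        snellen_to_logmar(int(m.group(1)), int(m.group(2)))
        for m in VA_PATTERN.finditer(note)
    ]
    return {"iop_mmhg": iop, "va_logmar": va}


print(extract_outcomes("Patient is 67 years old. IOP: 18 mmHg. VA 20/40."))
```

The patient's age is ignored because no measure name precedes it. Real notes vary widely between EHR systems, so fixed patterns like these would only be a starting point.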

Verana uses tailored functions to extract information for clinical care from unstructured text.

Verana is able to extract clinical insights from raw EHR data held in comprehensive clinical data registries. By using data processing techniques to characterize and model text, Verana can unlock information from unstructured notes and derive insights for clinical care and research that were not previously available.