
Combines EMR and NLP Data to Deliver the Industry’s Most Comprehensive Data Set for Cohort Analysis and Site Selection

Cambridge, MA, December 13, 2017 — TriNetX, the global health research network for healthcare organizations, biopharmaceutical companies, and Contract Research Organizations (CROs), today announced the general availability of its Natural Language Processing (NLP) service. TriNetX’s NLP service utilizes sophisticated algorithms to extract clinical facts from physician notes and clinical reports, links them with other Electronic Medical Record (EMR) data, and makes the combined data available for assessing study feasibility, protocol design, site selection, and subsequent identification of patients for clinical trials.

Members access TriNetX Live to analyze patient populations and perform “what-if” analyses in real time. Users of the platform are presented with aggregate views, but each data point in the TriNetX network can be traced to healthcare organizations that have the ability to identify individual patients, allowing clinical researchers to develop virtual patient cohorts that can then be re-identified for potential recruitment into a clinical trial.

In addition to the structured data already available in TriNetX, such as demographics, diagnoses, procedures, medications, labs, genomics, and deep oncology data, NLP provides access to data derived from clinical documentation, including discharge summaries, radiology reports, pathology reports, and others, which contains critical information needed to identify candidates for a clinical trial more accurately. TriNetX’s NLP service mines unstructured data, such as measurements and observations on ECOG performance status for oncology, and NYHA classification, Ejection Fraction, and Corrected QT Interval for cardiac studies. NLP also collects information from clinician notes for patients whose hospital medical records may be incomplete due to visits to multiple healthcare facilities. The extracted data is subsequently mapped to standardized clinical terminologies that can be easily analyzed by researchers using TriNetX Live.

“With the TriNetX NLP service, we are able to access data that our researchers have been very interested in for some time,” said Jack London, PhD, Informatics Core Director, Sidney Kimmel Cancer Center at Jefferson and Professor of Cancer Biology, Thomas Jefferson University. “Extracting this data from our clinical text reports helps our investigators better define and identify patient cohorts, and provides a larger data set for our academic and industry clinical research collaborations.”

“The combination of structured EMR data and NLP-extracted data is more powerful than either data set alone,” said Alex Eastman, Senior Director of Product Management at TriNetX. “NLP helps fill gaps in structured EMR data, and vice versa. You end up with a richer data set that supports advanced analyses.”

The TriNetX NLP service is based on technology from Averbis, a text-mining and machine-learning company headquartered in Germany. TriNetX chose Averbis as its NLP partner because of Averbis’ experience applying NLP within healthcare, the accuracy of their solution across various data domains including oncology, and the ability to work with multiple languages.


About TriNetX
TriNetX is the global health research network enabling healthcare organizations, biopharma and contract research organizations (CROs) to collaborate, enhance trial design, accelerate recruitment and bring new therapies to market faster. Each member of our community shares in the consolidated value of our global, federated health research network that connects clinical researchers in real-time to the patient populations which they are attempting to study. For more information, visit

Media Contacts:
Julia Weber
Racepoint Global
(617) 624-3234