Harness the Power of FDA Labeling

How ThinkTrends Rx is being applied to FDA labeling to extract insights
Jyotiska Biswas
July 1st, 2023
FDA drug labeling is the most authoritative source of drug information in the United States and contains a vast wealth of information. Although the primary audience of FDA drug labeling has traditionally been healthcare professionals, recently there has been a great deal of interest in harnessing this information for research purposes.

PubMed search for term "FDA Labeling"

A quick PubMed search for “FDA labeling” retrieved over 9,000 articles as of October 2022, and the volume is growing: in 2001, only 109 articles related to “FDA labeling” were published, but by 2021 that number had risen to 653, a roughly sixfold increase.

The FDA has specifically expressed interest in applying natural language processing (NLP) to automatically extract information from FDA labeling. In 2020, ThinkTrends was awarded the FDA “Lactation Labeling Project,” which was carried out on the ThinkTrends Rx platform. The objective of this project was to systematically analyze the lactation information contained in the labeling of all FDA-approved drugs and to summarize the findings. Using NLP technology, ThinkTrends Rx systematically identified and retrieved entities (e.g., adverse reactions and use in specific populations) and relations in labeling, as needed to create training, test, and ground-truth datasets for various AI models.
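To make the idea of retrieving one part of a label concrete, here is a minimal Python sketch. It is not the ThinkTrends Rx method: it swaps NLP for a simple regular expression, and it assumes the label is available as plain text with numbered headings (for example "8.2 Lactation"). The sample text, `split_sections`, and `find_section` are all hypothetical names and placeholder content invented for this illustration.

```python
import re

# Hypothetical placeholder excerpt, invented for illustration only.
SAMPLE_LABEL = """\
8 USE IN SPECIFIC POPULATIONS
8.1 Pregnancy
Placeholder text about pregnancy.
8.2 Lactation
Placeholder text about presence of the drug in human milk.
8.4 Pediatric Use
Placeholder text about pediatric use.
"""

# Assumption: a heading is a line that starts with a section number
# such as "8" or "8.2", followed by a title.
HEADING = re.compile(r"^(\d+(?:\.\d+)?)\s+(\S.*)$")


def split_sections(label_text):
    """Return {section_number: (title, body)} for each numbered heading."""
    sections = {}
    current = None
    for line in label_text.splitlines():
        match = HEADING.match(line)
        if match:
            current = match.group(1)
            sections[current] = (match.group(2).strip(), [])
        elif current is not None:
            sections[current][1].append(line)
    return {
        number: (title, "\n".join(body).strip())
        for number, (title, body) in sections.items()
    }


def find_section(label_text, title_keyword):
    """Return the body of the first section whose title contains the keyword."""
    for title, body in split_sections(label_text).values():
        if title_keyword.lower() in title.lower():
            return body
    return None


lactation_text = find_section(SAMPLE_LABEL, "lactation")
print(lactation_text)
```

A rule like this breaks as soon as a body line happens to start with a number (a dose, for instance), which is one reason real labeling work tends to rely on trained NLP models and structured label formats rather than pattern matching alone.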

"The technology in ThinkTrends RX is impressive. We were able to efficiently extract and catalog useful lactation information and document clinical studies for over 2,000 drugs. This project has the potential for significant public health impact for nursing mothers and infants. Analyzing lactation information in labeling is just the tip of the iceberg!"
-Dr. Joseph Tonning (former Senior Medical Officer, FDA CDER)

The Lactation Labeling Project is just one example of how FDA labeling can be mined for crucial data. ThinkTrends Rx provides the tools that make this data usable. It is customizable to an organization’s needs and can apply mappings such as MedDRA or UMLS when processing the label data. These tools help scientists and reviewers process, categorize, and analyze images of drug container and carton packaging as well as complex forms and reports (PDF and XML).

The wealth of information contained in labeling cannot be overstated. The following is just a partial list of information contained in labeling:

  • Chemical structure
  • Efficacy information
  • Warnings and precautions
  • Adverse reactions
  • Drug interactions
  • Potential effects on fertility
  • Use during pregnancy and lactation
  • Use in pediatric and geriatric populations
  • Use in chronic disease
  • Potential for drug abuse/dependence
  • Mechanism of action
  • Pharmacodynamics
  • Pharmacokinetics
  • Potential for carcinogenesis and mutagenesis
  • Pharmacogenomics


Extracting and collating this information for subsequent analysis is important not only for the FDA but also for the pharmaceutical industry and clinical research organizations. Thus far, this information has largely been left untapped. A tool like ThinkTrends Rx rapidly synthesizes this immense store of information, so reviewers can answer complex questions without wading through mountains of unstructured data.