Navigating the regulatory frameworks that ensure the safety and efficacy of medical devices can be challenging, especially across regions. These frameworks often require redundant testing, slowing the delivery of innovations to patients. This study applies Natural Language Processing (NLP) to 664 regulations and guidelines issued in the USA, the EU, and China over the past decade, comprising more than 200 million tokens (the words and sub-word units produced by the tokenizer of Bidirectional Encoder Representations from Transformers (BERT)). We categorize regulations into key phases, such as animal studies, clinical trials, and other testing stages, and use BERT to perform Named Entity Recognition (NER), identifying key regulatory terms and entities. By converting these texts into numerical representations and segmenting them by phase, country, and year, we compare jurisdictional requirements and assess their alignment. Additionally, we apply Latent Dirichlet Allocation (LDA) to track changes in regulatory themes over time, reflecting evolving priorities and challenges. Our analysis reveals notable semantic similarities and differences across countries and phases. For instance, animal study regulations align most closely between China and the USA, with a mean cosine distance of 0.33. These findings highlight the potential of computational methods in regulatory science, offering valuable insights for researchers, policymakers, and industry professionals.
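The alignment comparison described above can be illustrated with a minimal sketch of the cosine-distance step. The embeddings below are random placeholders standing in for document vectors (in the study, numerical representations derived from BERT); the vector names and dimensionality are assumptions for illustration only, not the authors' actual pipeline.

```python
import numpy as np

def cosine_distance(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine distance = 1 - cosine similarity between two vectors."""
    return 1.0 - float(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Hypothetical 768-dimensional document embeddings (BERT-base hidden size);
# real values would come from encoding each regulation text.
rng = np.random.default_rng(0)
usa_animal_study = rng.normal(size=768)
china_animal_study = rng.normal(size=768)

d = cosine_distance(usa_animal_study, china_animal_study)
# Identical texts give a distance near 0; unrelated ones approach 1,
# so a mean distance of 0.33 indicates comparatively close alignment.
```

Averaging such pairwise distances over all documents in a phase/country segment yields the mean cosine distances reported in the study.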
Journal article
Association for Computing Machinery (ACM)
2026-04-30T00:00:00+00:00
7
1 - 34
33