Transforming Medical Regulations into Numbers: Vectorizing a Decade of Medical Device Regulatory Shifts in the USA, EU, and China

Han Y., Bergmann JHM.

Navigating the regulatory frameworks that ensure the safety and efficacy of medical devices can be challenging, especially across different regions. These frameworks often require redundant testing, slowing down the process of getting innovations to patients. This study leverages Natural Language Processing (NLP) to analyze 664 regulations and guidelines issued in the USA, EU, and China over the past decade, covering over 200 million tokens (individual words and sub-word units produced by the Bidirectional Encoder Representations from Transformers (BERT) tokenizer). We categorize regulations into key phases (such as animal studies, clinical trials, and other testing stages) and use BERT to perform Named Entity Recognition (NER), identifying key regulatory terms and entities. By converting these texts into numerical representations and segmenting them by phase, jurisdiction, and year, we compare jurisdictional requirements and assess their alignment. Additionally, we apply Latent Dirichlet Allocation (LDA) for theme analysis to observe changes in regulatory focus over time, reflecting evolving priorities and challenges. Our analysis reveals notable semantic similarities and differences across jurisdictions and phases. For instance, the closest alignment in animal study regulations is between China and the USA, with a mean cosine distance of 0.33. These findings highlight the potential of computational methods in regulatory science, offering valuable insights for researchers, policymakers, and industry professionals.
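To make the reported metric concrete, the following is a minimal sketch of how a mean cosine distance between two groups of document vectors could be computed. It is an illustration only, not the authors' pipeline: the function names and the toy vectors are hypothetical, and averaging over all cross-group pairs is an assumption, since the abstract does not state how the mean is taken.

```python
import math
from itertools import product


def cosine_distance(u, v):
    """Return 1 - cosine similarity for two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def mean_cosine_distance(group_a, group_b):
    """Average cosine distance over every cross-group pair of vectors.

    Assumption: all-pairs averaging; the source does not specify this.
    """
    pairs = list(product(group_a, group_b))
    if not pairs:
        raise ValueError("both groups must contain at least one vector")
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)


# Hypothetical toy vectors standing in for document embeddings
# from two jurisdictions within the same regulatory phase.
jurisdiction_a = [[1.0, 0.0], [1.0, 1.0]]
jurisdiction_b = [[0.0, 1.0], [1.0, 1.0]]
score = mean_cosine_distance(jurisdiction_a, jurisdiction_b)
```

A value near 0 means the two groups point in nearly the same direction in the embedding space, while larger values indicate weaker semantic alignment.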

DOI

10.1145/3793533

Type

Journal article

Publisher

Association for Computing Machinery (ACM)

Publication Date

2026-04-30T00:00:00+00:00

Volume

7

Pages

1 - 34

Total pages

33
