University of Maryland School of Pharmacy, Maryland, United States
Background: Eighty-one percent of peri- or postmenopausal women in the US experience at least one menopause-related symptom, such as vasomotor disturbances, depression, or insomnia, which can affect overall health and medical treatment decisions. The identification and characterization of menopause symptoms in clinical and epidemiological research remain underdeveloped and often rely on self-reported questionnaires. The emergence of real-world data (RWD) offers an opportunity to improve the characterization of menopause symptom clusters at the population level.
Objectives: To identify and evaluate algorithmic approaches used to capture peri- or postmenopause symptom clusters in RWD sources.
Methods: We performed a systematic review according to PRISMA and TRIPOD+AI guidelines. We searched PubMed, EMBASE, and Ovid MEDLINE for full-length, original articles in English published from 2000 to 2025. Studies employing algorithmic methods to identify peri- and postmenopause symptom clusters were included, with a focus on machine learning, natural language processing (NLP), and data mining techniques. Covidence software was used for de-duplication, screening, and data extraction. We classified studies by predicted symptom cluster type, data source used for algorithm development, algorithmic approach used for prediction, and methodological rigor.
Results: Of 43 screened papers, five studies were included. Three primarily targeted identification of menopause status (peri- or postmenopause), while two emphasized detecting specific symptoms or conditions, such as vulvovaginal atrophy or problematic menopause. Most studies (n=4) used electronic health records as the data source. Two studies used data mining techniques, two applied machine learning (classification and regression trees), and one used NLP. Only three studies internally validated their findings against a prepared dataset, while the other two relied on self-defined metrics.
Conclusions: Algorithmic approaches show promise in capturing menopause symptom clusters within RWD, providing valuable insights into symptom prediction, diagnosis, and patient outcomes. Next steps include expanding our search strategy to include symptom-specific algorithms. Further algorithm refinement and validation are needed to optimize their effectiveness in pharmacoepidemiologic, clinical, and policy applications.