Background: Because rare diseases have low prevalence and high clinical heterogeneity, integrating research data across multiple centers is challenging. As a structured data governance tool, the minimum data set is expected to reconcile the tension between the standardization of research data and the individualized collection of clinical data in the field of rare diseases.
Objectives: This paper reviews the current status and development approaches of minimum data sets in rare diseases and explores the related challenges and opportunities.
Methods: We searched PubMed, Embase, Cochrane, CNKI, WANFANG and VIP using "rare disease" and "minimum data set (MDS)" as keywords.
Results: A total of 3,923 records were retrieved. After deduplication and screening, 49 articles were included: 28 on minimum data sets related to rare diseases and 21 on minimum data sets in general. In the field of rare diseases, the minimum data set is mainly used for the standardized collection and use of research data. Drawing on the experience reported to date, this paper summarizes the process and key points of developing minimum data sets related to rare diseases.
Conclusions: Challenges remain in the development of rare disease-related minimum data sets and in the collection and processing of their data, including difficulty coordinating variables across projects, controlling data quality, and guaranteeing data security. There are also opportunities: governments and society are paying increasing attention to rare diseases, big data is contributing to rare disease research, and demand for closer links between industry and research is growing. In the future, minimum data sets hold promise for advancing rare disease research by drawing on the characteristics of the big data era and through collaboration with government, enterprises and other parties.