Machine Learning-Derived Prenatal Predictive Risk Model to Guide Intervention and Prevent the Progression of Gestational Diabetes Mellitus to Type 2 Diabetes: Prediction Model Development Study

Mukkesh Kumar; Li Ting Ang; Cindy Ho; Shu E Soh; Kok Hian Tan; Jerry Kok Yen Chan; Keith M Godfrey; Shiao-Yng Chan; Yap Seng Chong; Johan G Eriksson; Mengling Feng; Neerja Karnani

doi:10.2196/32366

JMIR Diabetes. 2022 Jul 5;7(3):e32366. doi: 10.2196/32366.

Authors

Mukkesh Kumar^{1,2,3}, Li Ting Ang^{1,2}, Cindy Ho^{1,2}, Shu E Soh^{4}, Kok Hian Tan^{5,6}, Jerry Kok Yen Chan^{7,8,9}, Keith M Godfrey^{10,11}, Shiao-Yng Chan^{1,7}, Yap Seng Chong^{1,7}, Johan G Eriksson^#^{1,7,12,13}, Mengling Feng^#^{3,14}, Neerja Karnani^#^{1,2,15}

Affiliations

¹ Singapore Institute for Clinical Sciences, Agency for Science Technology and Research, Singapore, Singapore.
² Bioinformatics Institute, Agency for Science Technology and Research, Singapore, Singapore.
³ Saw Swee Hock School of Public Health, National University of Singapore, National University Health System, Singapore, Singapore.
⁴ Department of Paediatrics, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore.
⁵ Division of Obstetrics and Gynecology, KK Women's and Children's Hospital, Singapore, Singapore.
⁶ Obstetrics and Gynecology Academic Clinical Programme, Duke-NUS Graduate Medical School, Singapore, Singapore.
⁷ Department of Obstetrics and Gynaecology and Human Potential Translational Research Programme, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore.
⁸ Department of Reproductive Medicine, KK Women's and Children's Hospital, Singapore, Singapore.
⁹ Cancer and Stem Cell Biology, Duke-NUS Medical School, Singapore, Singapore.
¹⁰ MRC Lifecourse Epidemiology Unit, University of Southampton, Southampton, United Kingdom.
¹¹ National Institute for Health and Care Research Southampton Biomedical Research Centre, University Hospital Southampton NHS Foundation Trust, Southampton, United Kingdom.
¹² Department of General Practice and Primary Health Care, University of Helsinki, Helsinki, Finland.
¹³ Folkhälsan Research Center, Helsinki, Finland.
¹⁴ Institute of Data Science, National University of Singapore, Singapore, Singapore.
¹⁵ Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore.

^# Contributed equally.

PMID: 35788016
PMCID: PMC9297138
DOI: 10.2196/32366

Abstract

Background: The increasing prevalence of gestational diabetes mellitus (GDM) is concerning as women with GDM are at high risk of type 2 diabetes (T2D) later in life. The magnitude of this risk highlights the importance of early intervention to prevent the progression of GDM to T2D. Rates of postpartum screening are suboptimal, often as low as 13% in Asian countries. The lack of preventive care through structured postpartum screening in several health care systems and low public awareness are key barriers to postpartum diabetes screening.

Objective: In this study, we developed a machine learning model for early prediction of postpartum T2D following routine antenatal GDM screening. Predicting postpartum T2D during prenatal care would allow effective diabetes prevention interventions to be implemented early. To the best of our knowledge, this is the first study that uses machine learning for postpartum T2D risk assessment in antenatal populations of Asian origin.

Methods: Prospective multiethnic data (Chinese, Malay, and Indian ethnicities) from 561 pregnancies in Singapore's most deeply phenotyped mother-offspring cohort study, Growing Up in Singapore Towards healthy Outcomes, were used for predictive modeling. The feature variables were demographics, medical or obstetric history, physical measures, lifestyle information, and GDM diagnosis. Shapley values were combined with CatBoost tree ensembles to perform feature selection. Our game theoretical approach for predictive analytics enables population subtyping and pattern discovery for data-driven precision care. The predictive models were trained using 4 machine learning algorithms: logistic regression, support vector machine, CatBoost gradient boosting, and artificial neural network. We used 5-fold stratified cross-validation to preserve the same proportion of T2D cases in each fold. Grid search pipelines were built to identify the best-performing hyperparameters.
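
As a rough illustration of the stratified cross-validation step in the Methods, the following Python sketch splits a toy label vector into 5 folds while keeping the proportion of positive cases the same in each fold. The label counts (10 cases among 100 samples), the function name stratified_folds, and the round-robin assignment are illustrative assumptions, not the study's data or code.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=5, seed=0):
    """Assign sample indices to folds so each class is spread evenly across folds."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle within each class, then deal indices to folds round-robin.
    folds = [[] for _ in range(n_folds)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % n_folds].append(idx)
    return folds


# Hypothetical toy outcome vector: 10 positive cases among 100 samples.
labels = [1] * 10 + [0] * 90
folds = stratified_folds(labels)

# Number of positive cases and total samples in each fold.
case_counts = [sum(labels[i] for i in fold) for fold in folds]
fold_sizes = [len(fold) for fold in folds]
print(case_counts, fold_sizes)
```

In each round, one fold would serve as the held-out test set and the remaining 4 as the training set. Because the outcome is rare, a plain random split could leave a fold with very few or no cases, which is the problem stratification avoids.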

Results: A high-performance prediction model for postpartum T2D comprising 2 midgestation features (midpregnancy BMI after gestational weight gain and diagnosis of GDM) was developed (BMI_GDM CatBoost model: AUC=0.86, 95% CI 0.72-0.99). Prepregnancy BMI alone was inadequate in predicting postpartum T2D risk (ppBMI CatBoost model: AUC=0.62, 95% CI 0.39-0.86). A 2-hour postprandial glucose test (BMI_2hour CatBoost model: AUC=0.86, 95% CI 0.76-0.96) predicted postpartum T2D risk better than a fasting glucose test (BMI_Fasting CatBoost model: AUC=0.76, 95% CI 0.61-0.91). The BMI_GDM model was also robust when using the modified 2-point International Association of the Diabetes and Pregnancy Study Groups (IADPSG) 2018 criteria for GDM diagnosis (BMI_GDM2 CatBoost model: AUC=0.84, 95% CI 0.72-0.97). Total gestational weight gain was inversely associated with postpartum T2D outcome, independent of prepregnancy BMI and diagnosis of GDM (P=.02; OR 0.88, 95% CI 0.79-0.98).
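
The reported odds ratio and P value for total gestational weight gain can be cross-checked with a short calculation. Assuming the interval is a Wald interval on the log-odds scale (an assumption; the abstract does not state how it was derived), the sketch below recovers an approximate standard error from the rounded 95% CI and computes the two-sided P value.

```python
import math

# Values reported in the Results for total gestational weight gain.
odds_ratio, ci_low, ci_high = 0.88, 0.79, 0.98

# Assumption: Wald 95% CI, i.e. log(OR) +/- 1.96 * SE on the log-odds scale.
log_or = math.log(odds_ratio)
std_err = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)

# Two-sided P value from the standard normal distribution.
z = log_or / std_err
p_value = math.erfc(abs(z) / math.sqrt(2))

print(f"z = {z:.2f}, P = {p_value:.3f}")
```

Under this assumption the P value comes out close to 0.02, in line with the reported P=.02. Because the inputs are rounded to 2 decimal places, the result is only approximate.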

Conclusions: Midgestation weight gain effects, combined with the metabolic derangements underlying GDM during pregnancy, signal future T2D risk in Singaporean women. Further studies will be required to examine the influence of metabolic adaptations in pregnancy on postpartum maternal metabolic health outcomes. The state-of-the-art machine learning model can be leveraged as a rapid risk stratification tool during prenatal care.

Trial registration: ClinicalTrials.gov NCT01174875; https://clinicaltrials.gov/ct2/show/NCT01174875.

Keywords: Asian populations; diabetes management; digital health; gestational diabetes mellitus; machine learning; prediction models; prenatal care; public health; risk factors; type 2 diabetes.

©Mukkesh Kumar, Li Ting Ang, Cindy Ho, Shu E Soh, Kok Hian Tan, Jerry Kok Yen Chan, Keith M Godfrey, Shiao-Yng Chan, Yap Seng Chong, Johan G Eriksson, Mengling Feng, Neerja Karnani. Originally published in JMIR Diabetes (https://diabetes.jmir.org), 05.07.2022.

Associated data

ClinicalTrials.gov/NCT01174875

Grants and funding

MC_UU_12011/4/MRC_/Medical Research Council/United Kingdom