Abstract
Background
Identification of mother-infant pairs predisposed to early cessation of exclusive breastfeeding is important for delivering targeted support. Machine learning techniques enable development of transparent prediction models that enhance clinical applicability. We aimed to develop and validate two models to predict cessation of exclusive breastfeeding within one month among infants born after 35 weeks gestation using machine learning techniques.
Methods
Utilizing a nationwide dataset from Statistics Denmark, including infants born between the 1st of January 2014 and the 31st of December 2015, we employed random forest machine learning to develop two predictive models. The first model included 11 well-established factors associated with cessation of exclusive breastfeeding within one month. The second model was expanded to include 21 additional factors associated with complications during pregnancy and delivery that potentially impede breastfeeding. Feature importance was applied to elucidate the factors driving model predictions.
Results
The dataset comprised 110,206 infants and 106,835 mothers. The first model predicted cessation of exclusive breastfeeding within one month with an area under the receiver operating curve of 62.0% (95% confidence interval 61.3% - 62.7%) and an accuracy of 60.4% (95% confidence interval 59.8% - 61.0%). The second model predicted cessation of exclusive breastfeeding within one month with an area under the receiver operating curve of 62.2% (95% confidence interval 61.5% - 62.9%) and an accuracy of 60.0% (95% confidence interval 59.3% - 60.6%). In both models, birthplace, maternal education, delivery mode, and maternal body mass index were the most important factors influencing the overall model performance.
Citation: Nejsum FM, Wiingreen R, Jensen AK, Løkkegaard ECL, Mølholm Hansen B (2025) Predicting early cessation of exclusive breastfeeding using machine learning techniques. PLoS ONE 20(1): e0312238. https://doi.org/10.1371/journal.pone.0312238
Editor: Astrid M. Kamperman, Erasmus Medical Center, NETHERLANDS, KINGDOM OF THE
Received: April 28, 2024; Accepted: October 4, 2024; Published: January 9, 2025
Copyright: © 2025 Nejsum et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: Data cannot be shared publicly because of Danish law and protection of patient privacy. Data are available through secure online access to Statistics Denmark. Further information regarding data access can be found on Statistics Denmark’s website http://dst.dk/en/TilSalg/Forskningsservice or by contacting Statistics Denmark on e-mail forskningsservice@dst.dk or phone +4539173130. We cannot guarantee access to data, but we will gladly assist interested institutions upon any requests.
Funding: The author(s) received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Breastfeeding confers multiple health benefits for infants and mothers that extend beyond the neonatal period [1, 2]. Accordingly, the World Health Organization recommends exclusive breastfeeding for the first six months after birth [3]. In Denmark, most expectant mothers express an intention to breastfeed [4]. Despite these intentions, 40% of mothers initiating breastfeeding encounter early breastfeeding problems [5]. Consequently, adherence to the World Health Organization’s recommended six-month exclusive breastfeeding duration is low, with approximately 11% compliance in Denmark [6–8].
The first step toward exclusive breastfeeding for six months after birth is to establish exclusive breastfeeding. Breastfeeding establishment is a multifaceted process. Many mother-infant pairs establish breastfeeding without any complications. Existing evidence consistently affirms that several factors influence breastfeeding establishment, including maternal age, maternal smoking, maternal body mass index, socioeconomic status, parity, delivery mode, infant sex, gestational age, and being small-for-gestational-age [9–13]. A recent nationwide cohort study, encompassing more than 100,000 infants, reaffirms the significance of these associations [14]. Nevertheless, it remains plausible that additional unexplored factors also influence breastfeeding establishment. Complications during pregnancy and delivery can disrupt practices that are important for breastfeeding establishment, e.g., immediate skin-to-skin contact and early initiation of breastfeeding [15]. Many mothers experience complications during pregnancy and delivery and, despite compelling physiological rationales, the influence of such complications on breastfeeding establishment remains insufficiently explored.
Several studies have attempted to develop prediction models aimed at enhancing breastfeeding interventions [16–20]. Nevertheless, a validated model to accurately predict well-established breastfeeding that persists beyond hospital discharge remains absent. Existing models include limited predictors and are constructed using relatively small datasets. Many breastfeeding problems can be remedied by timely support [15] and this underscores the need for further research aimed at targeting quality breastfeeding interventions.
In recent decades, machine learning has increasingly been used in prediction models and has the potential to increase model performance [21]. Machine learning is a subset of artificial intelligence that employs a data-driven approach to model development [21, 22]. It is increasingly applied in various medical fields [22]. A previous study found that machine learning techniques produced more accurate model predictions of in-hospital breastfeeding compared to traditional statistics [16]. Recent advances in explainable artificial intelligence enable transparent explanations of model predictions that enhance clinical applicability [23].
With this study, we aimed to develop and validate two models to predict cessation of exclusive breastfeeding within one month among infants born after 35 weeks gestation using machine learning techniques, with potential for application in the hospital immediately after birth to target support interventions. We hypothesized that including additional predictors in the model would produce more accurate predictions.
Methods
Using a retrospective nationwide cohort of infants born in Denmark between the 1st of January 2014 and the 31st of December 2015, we developed and validated two models predicting cessation of exclusive breastfeeding within one month. The cohort has been thoroughly characterized in a previous study [14].
Data source
Our dataset was obtained from multiple nationwide registers held by Statistics Denmark and The Danish Health Data Authority including The Danish National Child Health Register, The Danish National Patient Register, The Danish Medical Birth Register, The Danish Education Registers, The Danish Register of Causes of Death, and The Danish Civil Registration System. In Denmark, all individuals are assigned distinctive Central Personal Register numbers upon birth or immigration, which enables consistent linkage of data across the registers [24].
Participants
The study population included mother-infant pairs whose infants were born in Denmark between the 1st of January 2014 and the 31st of December 2015. Infants meeting the following criteria were excluded from the study population: gestational age below 35 weeks and 0 days, missing data on gestational age or birth weight, gestational age or birth weight outliers, and death or migration of the infant or mother within the first month after birth. The 35 weeks cutoff was chosen because, in Denmark, most infants born at gestational ages below 35 weeks and 0 days are routinely admitted to neonatal wards where they receive additional support to establish breastfeeding. Outliers were excluded under the presumption that they stemmed from errors in data coding. Gestational age outliers were defined as gestational ages at birth above 44 weeks and 0 days. Birth weight outliers were defined as birth weights deviating more than five standard deviations from the mean of the study population calculated as described by Marsál et al. [25].
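The exclusion criteria can be read as a simple record filter. The following is a minimal Python sketch for illustration only, not the authors' code (their analyses were run in R). The field names `ga_days`, `bw_sd`, and `died_or_migrated` are hypothetical, and `bw_sd` is assumed to already hold the birth weight deviation in standard deviation units.

```python
from typing import Optional, TypedDict


class Birth(TypedDict):
    ga_days: Optional[int]    # gestational age at birth in days (hypothetical field)
    bw_sd: Optional[float]    # birth weight deviation in SD units (hypothetical field)
    died_or_migrated: bool    # infant or mother died or migrated within one month


MIN_GA_DAYS = 35 * 7    # 35 weeks and 0 days
MAX_GA_DAYS = 44 * 7    # 44 weeks and 0 days
MAX_ABS_BW_SD = 5.0     # birth weight outlier threshold


def is_eligible(birth: Birth) -> bool:
    """Return True if the record passes every exclusion criterion."""
    if birth["ga_days"] is None or birth["bw_sd"] is None:
        return False                          # missing perinatal data
    if birth["ga_days"] < MIN_GA_DAYS:
        return False                          # born before 35+0 weeks
    if birth["ga_days"] > MAX_GA_DAYS:
        return False                          # gestational age outlier
    if abs(birth["bw_sd"]) > MAX_ABS_BW_SD:
        return False                          # birth weight outlier
    return not birth["died_or_migrated"]
```

Under these assumptions, an infant born at 40 weeks with a birth weight deviation of 0.3 standard deviations is kept, whereas an infant born at 34 weeks and 6 days (244 days) is excluded.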
Outcome
The outcome was cessation of exclusive breastfeeding within one month, as exclusive breastfeeding is usually well established at this point and infants born at gestational ages above 34 weeks and 6 days are routinely discharged from the hospital beforehand.
Data on cessation of exclusive breastfeeding were retrieved from The Danish National Child Health Register [26]. In this register, exclusive breastfeeding is defined as feeding the infant solely with breast milk, except for water and a maximum of one formula feeding per week after hospital discharge, as described by The Danish Health Authority [27]. This definition was therefore applied in the current study. It is an adaptation of The World Health Organization’s definition of exclusive breastfeeding to suit Danish conditions [28].
In Denmark, health visitors routinely conduct free home visits during the infant’s first year, with over 95% of parents utilizing the services [8]. In the first month, health visitors conduct at least one home visit in the first week and one home visit between the second and fourth week. It is possible to receive extra visits. During these visits, the health visitors collect information on the date of exclusive breastfeeding cessation and subsequently report it to The Danish National Child Health Register. This practice has been mandatory since 2011, but data are only considered complete from the 1st of January 2014 [29]. The reporting is conducted via municipalities, which are local administrative divisions responsible for public services within specific geographic areas of Denmark. This leads to considerable delay in reporting of data to The Danish National Child Health Register. Further, post-registrations dating several years back are possible; thus, data can only be considered complete after several years [26].
Infants, who did not initiate exclusive breastfeeding, were not registered with a record on cessation of exclusive breastfeeding in The Danish National Child Health Register [29]. Consequently, infants without a record on cessation of exclusive breastfeeding in the register were classified as having ceased exclusive breastfeeding within the first month after birth.
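The outcome rule can be sketched as follows. This is an illustrative Python fragment rather than the authors' implementation; the 30-day window and the inclusive boundary are assumptions, since the text only states "within one month".

```python
from datetime import date
from typing import Optional


def ceased_within_one_month(
    birth_date: date,
    cessation_date: Optional[date],
    window_days: int = 30,   # assumption: "one month" taken as 30 days
) -> bool:
    """Binary outcome: cessation of exclusive breastfeeding within one month."""
    if cessation_date is None:
        # No record in the register: classified as having ceased,
        # on the presumption that exclusive breastfeeding was never initiated.
        return True
    return (cessation_date - birth_date).days <= window_days
```

Note that the `None` branch is where the misclassification risk discussed later enters: a missing record is treated as cessation regardless of why the record is missing.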
Predictors
We developed two models to predict cessation of exclusive breastfeeding within one month. The first model included 11 well-established risk factors for ceasing exclusive breastfeeding within one month: maternal age, maternal pre-pregnancy body mass index, maternal smoking, maternal education, birthplace, parity, multiple birth, delivery mode, infant’s sex, gestational age, and birth weight [9–14]. Maternal education was considered an indicator of socioeconomic status and divided into four levels: level one (lowest) comprised International Standard Classification of Education 2011 (ISCED) levels 1–2, level two comprised ISCED level 3, level three comprised ISCED levels 5–6, and level four (highest) comprised ISCED levels 7–8 [30]. Birthplace was divided into five regions (Region A-E) corresponding to the healthcare regions of Denmark. Delivery mode was divided into vaginal delivery and cesarean section, with the latter further stratified into emergency and elective cesarean section. Gestational age was defined as completed weeks. In Denmark, gestational age is typically determined through first-trimester ultrasonography performed in approximately 92% of pregnancies [31]. Birth weight deviation was calculated as described by Marsál et al. [25] and divided into three levels: small-for-gestational-age (below -2 standard deviations from the reference mean), appropriate-for-gestational-age (-2 to 2 standard deviations from the reference mean), and large-for-gestational-age (above 2 standard deviations from the reference mean).
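The categorical recodings described above can be illustrated with a short Python sketch. It assumes the birth weight deviation is already expressed in standard deviations from the reference mean; the function and dictionary names are hypothetical. ISCED level 4 is deliberately left unmapped because the text does not assign it to a level.

```python
from typing import Optional


def birth_weight_category(deviation_sd: float) -> str:
    """Classify birth weight deviation (in SD from the reference mean)."""
    if deviation_sd < -2:
        return "small-for-gestational-age"
    if deviation_sd > 2:
        return "large-for-gestational-age"
    return "appropriate-for-gestational-age"   # -2 to 2 SD, inclusive


# ISCED 2011 level -> education level used in the models (1 = lowest, 4 = highest).
ISCED_TO_EDUCATION_LEVEL = {1: 1, 2: 1, 3: 2, 5: 3, 6: 3, 7: 4, 8: 4}


def education_level(isced: int) -> Optional[int]:
    """Return the four-level education category, or None if not covered by the text."""
    return ISCED_TO_EDUCATION_LEVEL.get(isced)
```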
The second model was expanded to include 21 additional factors. In addition to the factors included in model 1, model 2 further incorporated: Ethnicity, maternal psychiatric disease, maternal somatic chronic disease, preeclampsia and eclampsia, hemorrhage in early pregnancy (before gestational age 12 weeks and 0 days), gestational diabetes mellitus, liver disease, hemorrhage in late pregnancy (after gestational age 12 weeks and 0 days), preterm premature rupture of membranes, placenta previa, abruptio placenta, abnormal forces of labor, uterine rupture, postpartum hemorrhage, retention of placenta or membranes, perineal tear, labor induction, forceps or vacuum extraction, regional anesthesia, general anesthesia, and Apgar score at five minutes after birth. The International Classification of Diseases 10th revision (ICD-10) and NOMESCO Classification of Surgical Procedure codes used to define maternal psychiatric disease (within two years preceding birth), maternal somatic chronic disease (within ten years preceding birth), and the factors associated with complications during pregnancy and delivery can be found in S1–S3 Tables.
Missing data
Missing data were handled using multiple imputation. We generated ten imputed datasets using the R-package ‘mice’ [32]. Numeric variables were imputed using predictive mean matching, unordered categorical variables were imputed using logistic regression, and ordered categorical variables were imputed using proportional odds [32].
Statistical analysis methods
Statistical analyses were performed on each of the ten imputed datasets and subsequently combined into one estimate using Rubin’s rules [33].
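As an illustration of the pooling step, the sketch below applies Rubin's rules to a list of per-dataset estimates and their variances: the pooled estimate is the mean of the estimates, and the total variance is the mean within-imputation variance plus (1 + 1/m) times the between-imputation variance. It is a generic Python sketch, not the authors' R code; which quantities they pooled and whether any transformation was applied beforehand is not stated in the text.

```python
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(
    estimates: Sequence[float],
    variances: Sequence[float],
) -> Tuple[float, float]:
    """Combine m per-imputation estimates into one estimate and total variance."""
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # sample variance across imputations (divides by m - 1)
    total_variance = within + (1 + 1 / m) * between
    return pooled_estimate, total_variance
```

For example, two estimates of 1.0 and 3.0, each with variance 0.5, pool to an estimate of 2.0 with total variance 0.5 + 1.5 × 2.0 = 3.5.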
Data allocation for model development and validation.
The data were divided into one dataset for model development and one dataset for model validation based on the infants’ birth month to prevent bias from time-related changes. The dataset for model development comprised all infants born in January, February, March, May, June, July, September, October, and November. The dataset for model validation comprised all infants born in April, August, and December.
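The month-based allocation amounts to a lookup on the birth month. A minimal Python sketch, with hypothetical names:

```python
from datetime import date

# April, August, and December go to validation; all other months to development.
VALIDATION_MONTHS = {4, 8, 12}


def allocate(birth_date: date) -> str:
    """Assign an infant to the development or validation dataset by birth month."""
    return "validation" if birth_date.month in VALIDATION_MONTHS else "development"
```

Because one month from each third of the year is held out in both 2014 and 2015, the validation data span the same calendar period as the development data.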
Model development.
To build the two prediction models, we employed Breiman’s random forest algorithm using the R-package ‘randomForest’ [34]. The random forest algorithm is a machine learning technique. It uses bootstrapped samples to construct multiple decision trees, selecting a subset of variables as potential predictors at each split. To tune the models, we adjusted the number of variables sampled at each split and the number of trees to grow in order to minimize the out-of-bag error. The number of variables sampled at each split was set to two and the number of trees was set to 500.
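The two sources of randomness in a random forest, bootstrapped rows and a random subset of candidate variables at each split, can be sketched in a few lines of Python. This is a conceptual illustration with hypothetical helper names, not the internals of the ‘randomForest’ package; tree growing itself is omitted.

```python
import random
from typing import List, Sequence, Tuple


def bootstrap_indices(n_rows: int, rng: random.Random) -> Tuple[List[int], List[int]]:
    """Draw n_rows indices with replacement; rows never drawn are out-of-bag."""
    in_bag = [rng.randrange(n_rows) for _ in range(n_rows)]
    out_of_bag = sorted(set(range(n_rows)) - set(in_bag))
    return in_bag, out_of_bag


def candidate_features(features: Sequence[str], mtry: int, rng: random.Random) -> List[str]:
    """Pick mtry variables, without replacement, to consider at one split."""
    return rng.sample(list(features), mtry)


rng = random.Random(2014)
in_bag, out_of_bag = bootstrap_indices(1000, rng)
split_candidates = candidate_features(
    ["maternal_age", "bmi", "smoking", "education", "birthplace"], mtry=2, rng=rng
)
```

Each tree is evaluated on its own out-of-bag rows, which is what makes the out-of-bag error usable as a tuning criterion without a separate hold-out set. The setting `mtry=2` mirrors the value reported above; the feature names are placeholders.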
The models were trained on the dataset for model development. To train the two models, we applied ten-fold cross validation using the R-package ‘caret’ [35]. In ten-fold cross validation, the dataset is divided into ten equally sized folds. The model is iteratively trained and tested ten times. During each iteration, one distinct fold is used as the test set, while the remaining nine folds serve as the training set. This ensures that each data point is used for both training and testing, thereby enhancing the precision of the performance estimation [36]. We employed multiple metrics to evaluate the performance of the models including the area under the receiver operating curve (AUC), the area under the precision-recall curve, accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and the Brier score.
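The fold rotation in ten-fold cross validation can be sketched as below. This is a generic Python illustration with a hypothetical function name, not the ‘caret’ implementation; when the number of rows is not divisible by ten, fold sizes differ by at most one.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(
    n_rows: int, k: int = 10, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists; each row is in the test set exactly once."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]   # k near-equal folds
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test
```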
Model validation.
The performance of the two models was evaluated using the dataset for model validation. We employed multiple metrics to evaluate the performance of the models including the AUC, the area under the precision-recall curve, accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and the Brier score.
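For readers less familiar with these metrics, the threshold-based ones follow directly from the confusion matrix, and the Brier score is the mean squared difference between predicted probabilities and observed outcomes. The Python sketch below uses the standard definitions, with cessation within one month taken as the positive class (an assumption about the coding); the AUC and precision-recall area are omitted for brevity.

```python
from typing import Dict, Sequence


def confusion_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard metrics from true/false positives and negatives."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # share of true cessations detected
        "specificity": tn / (tn + fp),   # share of continuers correctly identified
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


def brier_score(probabilities: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error of predicted probabilities against 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)
```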
Feature importance analysis.
We employed feature importance analysis to gain insight into the models’ prediction-making processes, which enhances clinical applicability [23]. Feature importance analysis identifies the most important predictors for overall model performance. We used the R-package ‘randomForest’ to calculate feature importance (based on mean decrease in accuracy) [34]. To assess the stability of the feature importance analyses, we employed the method of sequential rank agreement as described by Ekstrøm et al. [37] using the R-package ‘SuperRanker’. All analyses were conducted using R-version 4.2.1 [38].
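The idea behind mean decrease in accuracy is permutation: shuffle one predictor, re-predict, and record how much accuracy drops. The Python sketch below shows that idea for a single fitted model and a single shuffle. It is a simplification with hypothetical names; the ‘randomForest’ package computes the measure per tree on out-of-bag data and averages across trees.

```python
import random
from typing import Callable, Dict, List, Sequence

Row = Dict[str, float]


def accuracy(predict: Callable[[Row], int], rows: Sequence[Row], labels: Sequence[int]) -> float:
    """Share of rows for which the prediction equals the label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(labels)


def permutation_importance(
    predict: Callable[[Row], int],
    rows: Sequence[Row],
    labels: Sequence[int],
    feature: str,
    seed: int = 0,
) -> float:
    """Drop in accuracy after shuffling one feature's values across rows."""
    baseline = accuracy(predict, rows, labels)
    shuffled: List[float] = [r[feature] for r in rows]
    random.Random(seed).shuffle(shuffled)
    permuted = [{**r, feature: v} for r, v in zip(rows, shuffled)]
    return baseline - accuracy(predict, permuted, labels)
```

A predictor the model never uses has an importance of exactly zero under this definition, because shuffling it cannot change any prediction.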
Ethics
In accordance with the General Data Protection Regulation, the study was approved by the data responsible institute (Capital Region of Denmark—Approval number P-2019-280). In Denmark, register-based studies conducted for scientific purposes do not require informed consent from individual study participants or further ethical approvals.
The study was reported in compliance with the guidelines outlined in Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) [39] and Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) [40].
Results
The Danish Medical Birth Register included records on 116,585 infants born between the 1st of January 2014 and the 31st of December 2015. After exclusion of infants with gestational age below 35 weeks and 0 days (3,149; 2.7%), missing perinatal data (3,008; 2.6%), and death or migration within one month of birth (222; 0.2%), the study population comprised 110,206 infants and their 106,835 mothers. This corresponds to 94.5% of the Danish birth cohort in the two-year period (Fig 1).
Fig 1 notes: (1) Gestational age outliers: gestational ages at birth above 44 weeks and 0 days. (2) Birth weight outliers: birth weights deviating more than five standard deviations from the mean of the study population.
In the study population, 48,643 infants (44.1%) ceased exclusive breastfeeding within one month. This group included 31,857 infants (28.9%) who did not initiate exclusive breastfeeding and 16,786 infants (15.2%) who began exclusively breastfeeding but discontinued within the first month of birth.
Table 1 shows the baseline characteristics of the study population. In the study population, 2,810/110,206 (2.5%) had missing data on at least one of the predictors in model 1, while 9,481/110,206 (8.6%) had missing data on at least one of the predictors in model 2.
The dataset for model development comprised 83,385 infants, while the dataset for model validation comprised 26,821 infants.
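The cohort counts reported above are internally consistent, which can be checked with simple arithmetic. The Python sketch below uses only the figures stated in the text; the variable names are hypothetical.

```python
registered = 116_585
excluded = {
    "gestational age below 35+0 weeks": 3_149,
    "missing perinatal data": 3_008,
    "death or migration within one month": 222,
}

study_population = registered - sum(excluded.values())
share_of_birth_cohort = round(100 * study_population / registered, 1)

never_initiated = 31_857
ceased_after_initiating = 16_786
ceased_total = never_initiated + ceased_after_initiating
ceased_share = round(100 * ceased_total / study_population, 1)

development, validation = 83_385, 26_821
```

The exclusions sum to 6,379, leaving 110,206 infants (94.5% of registered births); the two cessation subgroups add up to 48,643 (44.1%); and the development and validation datasets together account for the full study population.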
Table 2 shows the models’ prediction of exclusive breastfeeding in the dataset for model validation.
Fig 2 shows the receiver operating curves for model 1 and model 2. The two receiver operating curves were almost identical, with an AUC of 62.9% for model 1 and 62.8% for model 2 (Fig 2, Table 3).
Model 1 includes 11 maternal and perinatal characteristics. Model 2 includes 32 maternal, obstetrical, and perinatal characteristics.
Table 3 shows the performance metrics for model 1 and model 2 in the dataset for model validation. The models performed similarly across all performance metrics except sensitivity and specificity. The sensitivity of model 1 was 34.6%, whereas the sensitivity of model 2 was 25.5%. The specificity of model 1 was 80.5%, whereas the specificity of model 2 was 86.9%.
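The trade-off between sensitivity and specificity explains why the two models can differ on those metrics yet have nearly the same accuracy, since accuracy is the prevalence-weighted average of the two. The Python sketch below illustrates this under one assumption: that the outcome prevalence in the validation data equals the 44.1% reported for the whole study population (the validation-specific prevalence is not given in the text).

```python
def implied_accuracy(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Accuracy as the prevalence-weighted average of sensitivity and specificity."""
    return prevalence * sensitivity + (1 - prevalence) * specificity


PREVALENCE = 0.441   # assumption: overall cohort prevalence applied to validation data

model_1 = implied_accuracy(0.346, 0.805, PREVALENCE)
model_2 = implied_accuracy(0.255, 0.869, PREVALENCE)
```

Under that assumption the implied accuracies are about 60.3% and 59.8%, both inside the reported 95% confidence intervals (59.8% to 61.0% and 59.3% to 60.6%). Model 2 gains specificity at the cost of sensitivity, and the two effects roughly cancel.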
Fig 3 shows the feature importance plot for model 1. The sequential rank agreement was stable across the four highest ranking predictors (S1 Fig), which were birthplace, maternal education, maternal body mass index, and cesarean section.
Model 1 includes 11 maternal and perinatal characteristics (maternal age, maternal pre-pregnancy body mass index, maternal smoking, maternal education, birthplace, parity, multiple birth, delivery mode, infant’s sex, gestational age, and birthweight). The sequential rank agreement was stable across the 4 highest ranking predictors.
Fig 4 shows the feature importance plot for model 2. The sequential rank agreement was stable across the eleven highest ranking predictors (S2 Fig), which were birthplace, maternal education, maternal body mass index, cesarean section, maternal smoking, regional anesthesia, maternal age, multiple birth, gestational age, labor induction, and parity.
Model 2 includes 32 maternal, obstetrical, and perinatal characteristics (maternal age, maternal pre-pregnancy body mass index, maternal smoking, maternal education, birthplace, parity, multiple birth, delivery mode, infant’s sex, gestational age, birthweight, ethnicity, maternal psychiatric disease, maternal somatic chronic disease, preeclampsia and eclampsia, hemorrhage in early pregnancy, gestational diabetes mellitus, liver disease, hemorrhage in late pregnancy, preterm premature rupture of membranes, placenta previa, abruptio placenta, abnormal forces of labor, uterine rupture, postpartum hemorrhage, retention of placenta or membranes, perineal tear, labor induction, forceps or vacuum extraction, regional anesthesia, general anesthesia, and Apgar score). The sequential rank agreement was stable across the 11 highest ranking predictors.
Discussion
We used machine learning techniques to develop and validate two models to predict cessation of exclusive breastfeeding within one month using a nationwide cohort including more than 110,000 infants. The two models exhibited a relatively low accuracy in prediction of exclusive breastfeeding cessation within one month. Intriguingly, the inclusion of 21 additional factors in the second model did not result in improved predictive performance.
The register-based design presents some general advantages including prospective data collection in a real-world setting [41]. The quality of breastfeeding data is especially susceptible to selection, self-reporting, and recall bias [42, 43]. In the present study, selection bias was minimized by the nationwide study population and ability to handle censoring using the unique Central Personal Registration number. The use of routinely collected exclusive breastfeeding data by health visitors contributes to diminishing self-reporting and recall bias.
We applied random forest machine learning to build the prediction models, which holds several advantages compared to traditional statistics including the ability to take non-linearity and interaction into account [36]. We chose the random forest algorithm because of its widespread acceptance as a traditional machine learning method, recognized for its flexibility and interpretability [44]. We did not compare multiple machine learning algorithms, as the scope of the study was to investigate whether the dataset could be employed to predict exclusive breastfeeding cessation within one month. It is highly unlikely that other algorithms would yield models with a clinically relevant increase in performance.
The main limitation of the study is the potential misclassification of cessation of exclusive breastfeeding within one month. The structure of The Danish National Child Health Register entails that mother-infant pairs who did not initiate exclusive breastfeeding were not included in the register [29]. Accordingly, we classified mother-infant pairs without a record in the register as having ceased exclusive breastfeeding within one month. However, other factors might result in missing records, e.g., rejection of health visitor services or errors in reporting to the register. More than 95% of Danish parents use health visitor services, but we cannot exclude that there might be slight deviations from the recommended five visits in the infant’s first year [8].
We used data from 2014 and 2015 to develop the models because of the data validity in these years. It is important to consider the possibility that the data could be outdated and thereby affect present-day predictive performance. We consider this unlikely, as there have been no substantial changes in obstetric care, neonatal care, or breastfeeding support in Denmark since this period. However, it would be important to consider this in case of an implementation phase.
We expected that the high number of study participants and inclusion of potentially important predictors would enable us to create a valuable prediction model to select mother-infant pairs for targeted breastfeeding supportive interventions. We consider that our results corroborate that breastfeeding success depends on factors that were not encompassed in our dataset. This is a specific limitation of our method and a general limitation of machine learning models used to predict specific outcomes. The register-based design invokes general limitations including data being pre-collected by others [41], which confines us to the predictors that are available in the registers. Our two models both predicted cessation of exclusive breastfeeding with areas under the receiver operating curves of approximately 62%. To our knowledge, the best performing models made in previous studies have predicted exclusive breastfeeding with AUCs between 74% and 78% [16, 17, 20]. In addition to socio-demography, pregnancy and birth-related data, these models comprise information on breastfeeding practices, e.g., breastfeeding intention, previous breastfeeding experience, breastfeeding education, Baby-friendly Hospital Initiative designation, skin-to-skin contact between mother and infant, early breastfeeding initiation, and maternal self-efficacy. In Denmark, nearly all expectant mothers intend to breastfeed, and no hospitals hold a valid Baby-friendly Hospital Initiative designation [4]. Thus, we speculate that data on previous breastfeeding experience, breastfeeding education, skin-to-skin contact between mother and infant, early breastfeeding initiation, and maternal self-efficacy would improve the predictive performance of our models.
Our study verified well-established predictors of breastfeeding. In both models, the most important predictors were birthplace, maternal body mass index, maternal smoking, and cesarean section. Multiple studies have shown that maternal body mass index, maternal smoking, and cesarean section are associated with exclusive breastfeeding [9, 12]. Denmark is divided into five health regions. We were surprised that there were major differences between these. We did not explore this because we focused the analyses on factors that could be generalized to other countries. We speculate that birthplace covers multiple aspects including socioeconomic status, local approach to breastfeeding, and completeness of data reported to The Danish National Child Health Register.
Breastfeeding supportive interventions, e.g., parental breastfeeding education and training of nurses, can increase exclusive breastfeeding rates [45, 46]. Different strategies have been developed to target such interventions including stratification by parity. In Denmark, breastfeeding support is initiated in the hospital. Preterm and sick infants are admitted to the neonatal ward, while healthy infants are discharged directly from the delivery ward (only multiparous) or admitted to the maternity ward. Breastfeeding support continues after discharge guided by health nurses conducting free home visits. To target support interventions on an individual level would require more accurate predictions than the models we have presented can provide. If the models had demonstrated better performance, they could be applied in the hospital after birth to identify mother-infant pairs susceptible to not establishing exclusive breastfeeding. This approach would enable targeted support interventions. In the Danish system, they could include additional home visits by health visitors or additional appointments at the hospital to promote breastfeeding establishment. To evaluate the effect of this approach, the Plan-Do-Study-Act cycle could be employed [47]. We would begin with a localized implementation of the model and targeted interventions. Should the results prove positive, we would consider expanding the initiative nationally.
Contrary to our expectations, the predictive performance was not increased by inclusion of additional predictors associated with complications during pregnancy and birth. It is possible that some of the additional predictors could provide additive value in predicting other breastfeeding outcomes. It is important to underline that prediction models cannot meaningfully be used to infer anything about biology. While complications during pregnancy and birth did not increase the predictive performance, it remains possible that these factors impede exclusive breastfeeding.
Conclusion
The two models developed did not accurately predict cessation of exclusive breastfeeding within one month among infants born after 35 weeks gestation. Contrary to our expectations, including additional factors in the model did not increase model performance. These findings underscore the complexity of predicting breastfeeding outcomes and emphasize the need for further research to target breastfeeding supportive interventions.
Supporting information
S1 Table. Definitions of obstetric outcomes based on International Classification of Diseases 10th revision and NOMESCO Classification of Surgical Procedure codes in The Danish National Patient Register.
https://doi.org/10.1371/journal.pone.0312238.s001
(TIF)
S2 Table. Definition of maternal chronic somatic disease based on International Classification of Diseases 10th revision codes in The Danish National Patient Register.
https://doi.org/10.1371/journal.pone.0312238.s002
(TIF)
S3 Table. Definition of maternal psychiatric disease based on International Classification of Diseases 10th revision codes in The Danish National Patient Register.
https://doi.org/10.1371/journal.pone.0312238.s003
(TIF)
S4 Table. Performance of the models’ prediction of exclusive breastfeeding cessation within one month in the development data.
https://doi.org/10.1371/journal.pone.0312238.s004
(TIF)
S1 Fig. Sequential rank agreement for the feature importance analyses in model 1 to predict cessation of exclusive breastfeeding within one month.
https://doi.org/10.1371/journal.pone.0312238.s005
(TIF)
S2 Fig. Sequential rank agreement for the feature importance analyses in model 2 to predict cessation of exclusive breastfeeding within one month.
https://doi.org/10.1371/journal.pone.0312238.s006
(TIF)
References
- 1. Victora CG, Bahl R, Barros AJ, França GV, Horton S, Krasevec J, et al. Breastfeeding in the 21st century: epidemiology, mechanisms, and lifelong effect. Lancet. 2016;387(10017):475–90. Epub 2016/02/13. pmid:26869575.
- 2. Ip S, Chung M, Raman G, Chew P, Magula N, DeVine D, et al. Breastfeeding and maternal and infant health outcomes in developed countries. Evid Rep Technol Assess (Full Rep). 2007;(153):1–186. Epub 2007/09/04. pmid:17764214; PubMed Central PMCID: PMC4781366.
- 3. World Health Organization. Breastfeeding 2023 [cited 2023 25th of July]. Available from: https://www.who.int/health-topics/breastfeeding#tab=tab_2.
- 4. Sundhedsstyrelsen. Amning - en håndbog for sundhedspersonale. 2021.
- 5. Nilsson I, Busck-Rasmussen M, Kronborg H. National klinisk retningslinje om etablering af amning efter fødsel. 2019.
- 6. Bruun S, Wedderkopp N, Mølgaard C, Kyhl HB, Zachariassen G, Husby S. Using text messaging to obtain weekly data on infant feeding in a Danish birth cohort resulted in high participation rates. Acta Paediatr. 2016;105(6):648–54. Epub 2016/03/02. pmid:26928297.
- 7. Bruun S, Buhl S, Husby S, Jacobsen LN, Michaelsen KF, Sørensen J, et al. Breastfeeding, Infant Formula, and Introduction to Complementary Foods-Comparing Data Obtained by Questionnaires and Health Visitors’ Reports to Weekly Short Message Service Text Messages. Breastfeed Med. 2017;12(9):554–60. Epub 2017/08/24. pmid:28832183.
- 8. Pedersen T, Pant S, Ammitzbøll J. Sundhedsplejerskers bemærkninger til motorisk udvikling i det første leveår. Temarapport. Børn født i 2017. Databasen Børns Sundhed and Statens Institut for Folkesundhed, SDU, 2019.
- 9. Cohen SS, Alexander DD, Krebs NF, Young BE, Cabana MD, Erdmann P, et al. Factors Associated with Breastfeeding Initiation and Continuation: A Meta-Analysis. J Pediatr. 2018;203:190–6.e21. Epub 2018/10/09. pmid:30293638.
- 10. Maastrup R, Hansen BM, Kronborg H, Bojesen SN, Hallum K, Frandsen A, et al. Breastfeeding progression in preterm infants is influenced by factors in infants, mothers and clinical practice: the results of a national cohort study with high breastfeeding initiation rates. PLoS One. 2014;9(9):e108208. Epub 2014/09/25. pmid:25251690; PubMed Central PMCID: PMC4177123.
- 11. Feldman-Winter L, Kellams A, Peter-Wohl S, Taylor JS, Lee KG, Terrell MJ, et al. Evidence-Based Updates on the First Week of Exclusive Breastfeeding Among Infants ≥35 Weeks. Pediatrics. 2020;145(4). Epub 2020/03/13. pmid:32161111.
- 12. Turcksin R, Bel S, Galjaard S, Devlieger R. Maternal obesity and breastfeeding intention, initiation, intensity and duration: a systematic review. Matern Child Nutr. 2014;10(2):166–83. Epub 2012/08/22. pmid:22905677; PubMed Central PMCID: PMC6860286.
- 13. Jones JR, Kogan MD, Singh GK, Dee DL, Grummer-Strawn LM. Factors associated with exclusive breastfeeding in the United States. Pediatrics. 2011;128(6):1117–25. Epub 2011/11/30. pmid:22123898.
- 14. Nejsum FM, Måstrup R, Torp-Pedersen C, Løkkegaard ECL, Wiingreen R, Hansen BM. Exclusive breastfeeding: Relation to gestational age, birth weight, and early neonatal ward admission. A nationwide cohort study of children born after 35 weeks of gestation. PLoS One. 2023;18(5):e0285476. Epub 2023/05/24. pmid:37224110; PubMed Central PMCID: PMC10208505.
- 15. World Health Organization. Ten steps to successful breastfeeding 2023 [cited 2023 January 14]. Available from: https://www.who.int/teams/nutrition-and-food-safety/food-and-nutrition-actions-in-health-systems/ten-steps-to-successful-breastfeeding.
- 16. Oliver-Roig A, Rico-Juan JR, Richart-Martínez M, Cabrero-García J. Predicting exclusive breastfeeding in maternity wards using machine learning techniques. Comput Methods Programs Biomed. 2022;221:106837. Epub 2022/04/26. pmid:35544962.
- 17. Ballesta-Castillejos A, Gómez-Salgado J, Rodríguez-Almagro J, Hernández-Martínez A. Development and validation of a predictive model of exclusive breastfeeding at hospital discharge: Retrospective cohort study. Int J Nurs Stud. 2021;117:103898. Epub 2021/02/27. pmid:33636452.
- 18. Berra S, Rajmil L, Passamonte R, Fernandez E, Sabulsky J. Premature cessation of breastfeeding in infants: development and evaluation of a predictive model in two Argentinian cohorts: the CLACYD study, 1993–1999. Córdoba Lactation, Feeding, Growth and Development study. Acta Paediatr. 2001;90(5):544–51. Epub 2001/06/30. pmid:11430715.
- 19. Wang Y, Shan C, Zhang Y, Ding L, Wen J, Tian Y. Early Recognition of the Preference for Exclusive Breastfeeding in Current China: A Prediction Model based on Decision Trees. Sci Rep. 2020;10(1):6720. Epub 2020/04/23. pmid:32317667; PubMed Central PMCID: PMC7174406.
- 20. Kronborg H, Vaeth M. Validation of the Breastfeeding Score-A Simple Screening Tool to Predict Breastfeeding Duration. Nutrients. 2019;11(12). Epub 2019/11/27. pmid:31766388; PubMed Central PMCID: PMC6950692.
- 21. Juul SE, Wood TR, German K, Law JB, Kolnik SE, Puia-Dumitrescu M, et al. Predicting 2-year neurodevelopmental outcomes in extremely preterm infants using graphical network and machine learning approaches. EClinicalMedicine. 2023;56:101782. Epub 2022/12/26. pmid:36618896; PubMed Central PMCID: PMC9813758.
- 22. Javaid M, Haleem A, Singh R, Suman R, Rab S. Significance of machine learning in healthcare: Features, pillars and applications. 2022.
- 23. Di Martino F, Delmastro F. Explainable AI for clinical and remote health applications: a survey on tabular and time series data. Artificial Intelligence Review. 2023;56(6):5261–315. pmid:36320613.
- 24. Pedersen CB. The Danish Civil Registration System. Scand J Public Health. 2011;39(7 Suppl):22–5. Epub 2011/08/04. pmid:21775345.
- 25. Marsál K, Persson PH, Larsen T, Lilja H, Selbing A, Sultan B. Intrauterine growth curves based on ultrasonically estimated foetal weights. Acta Paediatr. 1996;85(7):843–8. Epub 1996/07/01. pmid:8819552.
- 26. Andersen MP, Wiingreen R, Eroglu TE, Christensen HC, Polcwiartek LB, Blomberg SNF, et al. The Danish National Child Health Register. Clin Epidemiol. 2023;15:1087–94. Epub 2023/11/14. pmid:38025840; PubMed Central PMCID: PMC10656863.
- 27. Sundhedsstyrelsen. Amning—En håndbog for sundhedspersonale. 5th ed. 2021.
- 28. World Health Organization. Indicators for assessing infant and young child feeding practices: conclusions of a Consensus Meeting held 6–8 November 2007 in Washington D.C., USA. 2008.
- 29. Sundhedsdatastyrelsen. Den Nationale Børnedatabase (BDB) 2022 [updated March 15, 2022; cited 2023 January 14]. Available from: https://sundhedsdatastyrelsen.dk/da/registre-og-services/om-de-nationale-sundhedsregistre/graviditet-foedsler-og-boern/boernedatabasen.
- 30. UNESCO Institute for Statistics. International Standard Classification of Education ISCED 2011. 2012.
- 31. Regionernes kliniske kvalitetsudviklingsprogram. Dansk Føtalmedicinsk Database (FØTO-databasen). 2020.
- 32. van Buuren S, Groothuis-Oudshoorn K. mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software. 2011;45(3):1–67.
- 33. Rubin DB. Multiple Imputation for Nonresponse in Surveys: John Wiley & Sons Inc; 1987.
- 34. Breiman L. Random Forests. Machine Learning. 2001;45(1):5–32.
- 35. Kuhn M. Building Predictive Models in R Using the caret Package. Journal of Statistical Software. 2008;28.
- 36. Gerds T, Kattan M. Medical Risk Prediction Models With Ties to Machine Learning. 2022.
- 37. Ekstrøm CT, Gerds TA, Jensen AK. Sequential rank agreement methods for comparison of ranked lists. Biostatistics. 2018;20(4):582–98. pmid:29868883.
- 38. R Core Team. 2022. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL: https://www.R-project.org/.
- 39. Vandenbroucke JP, von Elm E, Altman DG, Gøtzsche PC, Mulrow CD, Pocock SJ, et al. Strengthening the Reporting of Observational Studies in Epidemiology (STROBE): explanation and elaboration. Int J Surg. 2014;12(12):1500–24. Epub 2014/07/22. pmid:25046751.
- 40. Collins GS, Reitsma JB, Altman DG, Moons KGM. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD Statement. BMC Medicine. 2015;13(1):1. pmid:25563062.
- 41. Thygesen LC, Ersbøll AK. When the entire population is the sample: strengths and limitations in register-based epidemiology. Eur J Epidemiol. 2014;29(8):551–8. Epub 2014/01/11. pmid:24407880.
- 42. Flaherman VJ, Chien AT, McCulloch CE, Dudley RA. Breastfeeding rates differ significantly by method used: a cause for concern for public health measurement. Breastfeed Med. 2011;6(1):31–5. Epub 2010/11/20. pmid:21091054.
- 43. Bland RM, Rollins NC, Solarsh G, Van den Broeck J, Coovadia HM. Maternal recall of exclusive breast feeding duration. Arch Dis Child. 2003;88(9):778–83. Epub 2003/08/26. pmid:12937095; PubMed Central PMCID: PMC1719625.
- 44. Boulesteix A-L, Janitza S, Kruppa J, König IR. Overview of random forest methodology and practical guidance with emphasis on computational biology and bioinformatics. WIREs Data Mining and Knowledge Discovery. 2012;2(6):493–507.
- 45. Maastrup R, Rom AL, Walloee S, Sandfeld HB, Kronborg H. Improved exclusive breastfeeding rates in preterm infants after a neonatal nurse training program focusing on six breastfeeding-supportive clinical practices. PLoS One. 2021;16(2):e0245273. Epub 2021/02/03. pmid:33534831; PubMed Central PMCID: PMC7857627.
- 46. Haroon S, Das JK, Salam RA, Imdad A, Bhutta ZA. Breastfeeding promotion interventions and breastfeeding practices: a systematic review. BMC Public Health. 2013;13 Suppl 3(Suppl 3):S20. Epub 2014/02/26. pmid:24564836; PubMed Central PMCID: PMC3847366.
- 47. Langley GJ, Moen R, Nolan KM, Nolan TW, Norman CL, Provost LP. The Improvement Guide: A Practical Approach to Enhancing Organizational Performance: Wiley; 2009.