
Community-engaged artificial intelligence research: A scoping review

  • Tyler J. Loftus,

    Roles Conceptualization, Data curation, Formal analysis, Methodology, Writing – original draft

    Affiliations University of Florida Intelligent Clinical Care Center, Gainesville, Florida, United States of America, Department of Surgery, University of Florida Health, Gainesville, Florida, United States of America

  • Jeremy A. Balch,

    Roles Conceptualization, Methodology, Writing – review & editing

    Affiliations University of Florida Intelligent Clinical Care Center, Gainesville, Florida, United States of America, Department of Surgery, University of Florida Health, Gainesville, Florida, United States of America

  • Kenneth L. Abbott,

    Roles Conceptualization, Methodology, Writing – review & editing

    Affiliation Department of Surgery, University of Florida Health, Gainesville, Florida, United States of America

  • Die Hu,

    Roles Conceptualization, Methodology, Writing – review & editing

    Affiliations University of Florida Intelligent Clinical Care Center, Gainesville, Florida, United States of America, Department of Surgery, University of Florida Health, Gainesville, Florida, United States of America

  • Matthew M. Ruppert,

    Roles Conceptualization, Methodology, Writing – review & editing

    Affiliations University of Florida Intelligent Clinical Care Center, Gainesville, Florida, United States of America, Department of Medicine, University of Florida Health, Gainesville, Florida, United States of America, College of Medicine, University of Central Florida, Orlando, Florida, United States of America

  • Benjamin Shickel,

    Roles Conceptualization, Methodology, Writing – review & editing

    Affiliations University of Florida Intelligent Clinical Care Center, Gainesville, Florida, United States of America, Department of Medicine, University of Florida Health, Gainesville, Florida, United States of America

  • Tezcan Ozrazgat-Baslanti,

    Roles Conceptualization, Methodology, Writing – review & editing

    Affiliations University of Florida Intelligent Clinical Care Center, Gainesville, Florida, United States of America, Department of Medicine, University of Florida Health, Gainesville, Florida, United States of America

  • Philip A. Efron,

    Roles Project administration, Supervision, Writing – review & editing

    Affiliation Department of Surgery, University of Florida Health, Gainesville, Florida, United States of America

  • Patrick J. Tighe,

    Roles Resources, Supervision, Writing – review & editing

    Affiliation Departments of Anesthesiology, Orthopedics, and Information Systems/Operations Management, University of Florida Health, Gainesville, Florida, United States of America

  • William R. Hogan,

    Roles Resources, Supervision, Writing – review & editing

    Affiliation Department of Health Outcomes & Biomedical Informatics, College of Medicine, University of Florida, Gainesville, Florida, United States of America

  • Parisa Rashidi,

    Roles Resources, Supervision, Writing – review & editing

    Affiliations University of Florida Intelligent Clinical Care Center, Gainesville, Florida, United States of America, Departments of Biomedical Engineering, Computer and Information Science and Engineering, and Electrical and Computer Engineering, University of Florida, Gainesville, Florida, United States of America

  • Michelle I. Cardel,

    Roles Conceptualization, Resources, Supervision, Writing – review & editing

    Affiliation Department of Health Outcomes & Biomedical Informatics, College of Medicine, University of Florida, Gainesville, Florida, United States of America

  • Gilbert R. Upchurch Jr.,

    Roles Conceptualization, Resources, Supervision, Writing – review & editing

    Affiliation Department of Surgery, University of Florida Health, Gainesville, Florida, United States of America

  • Azra Bihorac

    Roles Conceptualization, Resources, Supervision, Writing – review & editing

    abihorac@ufl.edu

    Affiliations University of Florida Intelligent Clinical Care Center, Gainesville, Florida, United States of America, Department of Surgery, University of Florida Health, Gainesville, Florida, United States of America, Department of Medicine, University of Florida Health, Gainesville, Florida, United States of America

Abstract

The degree to which artificial intelligence healthcare research is informed by data and stakeholders from community settings has not been previously described. As communities are the principal location of healthcare delivery, engaging them could represent an important opportunity to improve scientific quality. This scoping review systematically maps what is known and unknown about community-engaged artificial intelligence research and identifies opportunities to optimize the generalizability of these applications through involvement of community stakeholders and data throughout model development, validation, and implementation. Embase, PubMed, and MEDLINE databases were searched for articles describing artificial intelligence or machine learning healthcare applications with community involvement in model development, validation, or implementation. Model architecture and performance, the nature of community engagement, and barriers or facilitators to community engagement were reported according to PRISMA extension for Scoping Reviews guidelines. Of approximately 10,880 articles describing artificial intelligence healthcare applications, 21 (0.2%) described community involvement. All articles derived data from community settings, most commonly by leveraging existing datasets and sources that included community subjects, and often bolstered by internet-based data acquisition and subject recruitment. Only one article described inclusion of community stakeholders in designing an application–a natural language processing model that detected cases of likely child abuse with 90% accuracy using harmonized electronic health record notes from both hospital and community practice settings. The primary barrier to including community-derived data was small sample sizes, which may have affected 11 of the 21 studies (52%), introducing substantial risk for overfitting that threatens generalizability. Community engagement in artificial intelligence healthcare application development, validation, or implementation is rare. As healthcare delivery occurs primarily in community settings, investigators should consider engaging community stakeholders in user-centered design, usability, and clinical implementation studies to optimize generalizability.

Author summary

Most healthcare is delivered in community settings, but most healthcare research involving artificial intelligence is performed within academic or university hospital settings. There may be several reasons why most artificial intelligence research does not include data or perspectives from patients, providers, or administrators from community healthcare settings, but those reasons–and potential solutions to address them–are not well understood. Therefore, we systematically searched for relevant journal articles and used information from those articles to summarize the barriers and facilitators to community-engaged artificial intelligence research. We found that of approximately 10,880 articles describing artificial intelligence healthcare applications, only 21 (0.2%) described community involvement. Among those 21 articles, all derived data from community settings, but more than half used datasets that were probably too small for the model to perform at its best. Only one article described inclusion of community stakeholders in designing an application, which was accomplished by engaging community pediatricians and county Child Protective Services in designing an artificial intelligence model to detect child abuse in both hospital and community settings. As healthcare delivery occurs primarily in community settings, investigators should increase engagement with community stakeholders in designing and implementing artificial intelligence tools that serve everyone.

Introduction

Artificial intelligence–computers performing tasks by mimicking human intelligence–is changing healthcare delivery [1]. By discovering complex, nonlinear associations, artificial intelligence algorithms often outperform simple additive models and rule-based inference engines [2,3]. To achieve equal predictive performance across patient groups, algorithms must be trained on datasets that accurately represent the patients to whom the algorithm will be applied; failure to meet this requirement risks performance degradation for rare cases and vulnerable populations [4,5].

Community-engaged research–which involves key stakeholders (e.g., patients, healthcare providers, administrators, and researchers) from community settings (here, community refers to settings outside academic hospitals)–should be allied with artificial intelligence research. Community engagement ensures that artificial intelligence tools are both generalizable to, and reproducible in, the most common site of healthcare delivery. Community engagement also has the potential to mitigate bias against underrepresented groups, which is already present in some artificial intelligence tools. Academic centers performing artificial intelligence research may see patient populations that differ from those in surrounding communities; if those centers do not enroll patients whose socioeconomic and insurance status reflects the general public, dataset bias may result [6–11]. While training algorithms on datasets generated exclusively in academic centers could worsen healthcare disparities, community-engaged research could help to mitigate disparities by anchoring clinical decisions to accurate and objective predictions.

The degree to which contemporary artificial intelligence research involves community stakeholders has not been previously described. Engaging these stakeholders could represent an important opportunity to improve scientific quality and the effectiveness of artificial intelligence-enabled tools, given the effectiveness of community engagement in other domains [12,13]. This scoping review systematically maps what is known and unknown about community-engaged artificial intelligence research and identifies opportunities to optimize the generalizability of these applications through involvement of community stakeholders and data throughout model development, validation, and implementation.

Materials and methods

Embase, PubMed, and MEDLINE databases were searched for articles describing artificial intelligence or machine learning with community involvement published between database inception and January 18th, 2023. Clinically oriented databases of peer-reviewed articles were selected, rather than more technical, non-clinical databases, because this article focuses on healthcare applications that are intended for clinical audiences and clinical use, with the rationale that more technical, non-clinical article databases primarily contain development, validation, and theoretical work rather than community-engaged clinical research. Briefly, articles were included if they: 1) described the development or validation of an artificial intelligence healthcare application (e.g., algorithm, model, or artificial intelligence-enabled decision support tool), 2) described community involvement or engagement in the form of a) accruing data from community settings for algorithm training or testing, or b) inclusion of patients, providers, or administrators from community healthcare settings in user-centered design, usability, or clinical implementation studies, and 3) were published in English as a peer-reviewed journal article. Article search terms were as follows (note that “ab,ti” indicates presence of the search term in the abstract or title; * is a placeholder for any string of characters, such that the “engage*” term is fulfilled by “engaged,” “engagement,” “engage,” etc.): (community:ab,ti OR rural:ab,ti) AND (engage*:ab,ti OR involve*:ab,ti) AND ('artificial intelligence':ab,ti OR 'machine learning':ab,ti) AND [article]/lim AND [humans]/lim AND [english]/lim AND [clinical study]/lim AND ([embase]/lim OR [medline]/lim OR [pubmed-not-medline]/lim). All articles not meeting these criteria were excluded. The search terms identified 86 articles. After removal of duplicates, 45 articles remained. Exclusions at screening and full-text review phases are illustrated in S1 Fig. Institutional Review Board approval and patient consent were not applicable to this review article.
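
For readers who wish to adapt the PubMed portion of this strategy programmatically, the sketch below shows one way a comparable query could be run through NCBI's E-utilities via Biopython. The query translation from Embase syntax, the contact email, and the retmax limit are illustrative assumptions; the published search was executed through the database interfaces described above.

```python
# Minimal sketch (not the published workflow): running a PubMed query comparable to
# the Embase-syntax search above through NCBI E-utilities via Biopython.
# The query translation, email address, and retmax limit are assumptions.
from Bio import Entrez

Entrez.email = "reviewer@example.org"  # placeholder; NCBI requires a contact address

query = (
    '(community[tiab] OR rural[tiab]) AND (engage*[tiab] OR involve*[tiab]) '
    'AND ("artificial intelligence"[tiab] OR "machine learning"[tiab]) '
    'AND english[lang] AND humans[mh]'
)
handle = Entrez.esearch(db="pubmed", term=query, retmax=200)
record = Entrez.read(handle)
handle.close()
print(record["Count"], "records;", len(record["IdList"]), "PMIDs retrieved")
```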

Two reviewers independently screened abstracts for all 45 non-duplicated articles. Screening disagreements were resolved by a third reviewer via arbitration. The two screening reviewers had 78% agreement and a Cohen's kappa statistic for inter-rater reliability of 0.56, indicating moderate beyond-chance agreement [14]. Eighteen articles were excluded during the screening process because they did not meet inclusion criteria. For the remaining 27 articles, quality was rated using validated quality assessment tools [15]. Articles rated “poor” and those for which the full text did not meet inclusion criteria were excluded. Six articles were removed during full-text review for not meeting inclusion criteria. Twenty-one articles remained and were included in the final analysis. Covidence software was used to organize article screening and selection as well as data extraction processes. Results were reported according to Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) guidelines, as listed in S1 Table. Sources of funding and competing interests for each included article are listed in S2 Table.
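
For reference, Cohen's kappa relates observed agreement to the agreement expected by chance. The short check below is our own illustration, not a calculation reported by the authors; it shows that an observed agreement of 78% with a kappa of 0.56 implies chance agreement of roughly 50%.

```python
# Cohen's kappa: kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement
# and p_e is chance agreement. Solving for p_e from the reported screening figures
# (p_o = 0.78, kappa = 0.56); the implied p_e is an inference, not a reported value.
p_o, kappa = 0.78, 0.56
p_e = (p_o - kappa) / (1 - kappa)
print(f"Implied chance agreement: {p_e:.2f}")  # prints 0.50
```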

For each included article, data extraction included the study population, artificial intelligence model architecture and predictive performance, whether model development or validation included data derived from community healthcare settings, whether community stakeholders (patients, providers, or administrators from community healthcare settings) were included in user-centered design, usability, or clinical implementation studies, and any barriers or facilitators to community engagement that were described within the article.

Finally, a separate search was performed to approximate the number of all articles describing artificial intelligence or machine learning healthcare applications, regardless of whether they reported community involvement or engagement, published between database inception and January 18th, 2023. The purpose of this search was to provide the denominator for calculating the proportion of artificial intelligence applications that describe community involvement. In this separate search, articles were included if they: 1) described the development or validation of an artificial intelligence healthcare application (e.g., algorithm, model, or artificial intelligence-enabled decision support tool) and 2) were published in English as a peer-reviewed journal article (i.e., mirroring the search for community-engaged artificial intelligence articles, but without the community engagement elements). This search identified 20,791 articles. Assuming this search yielded duplicates at a frequency comparable to that observed in the search for community-engaged artificial intelligence articles, there were approximately 10,880 non-duplicated articles describing artificial intelligence or machine learning healthcare applications.
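
The arithmetic behind this estimate can be reproduced directly from the counts reported above (86 hits, of which 45 were unique, and 20,791 hits in the broader search); the snippet below simply restates that calculation.

```python
# Reproducing the denominator estimate from the reported counts: the community-engaged
# search returned 86 hits, of which 45 were unique, so the same unique fraction is
# applied to the 20,791 hits from the broader search.
unique_fraction = 45 / 86                       # ~0.52
estimated_unique = 20_791 * unique_fraction     # ~10,880 non-duplicated articles
proportion_engaged = 21 / estimated_unique      # ~0.2% with community engagement
print(round(estimated_unique), f"{proportion_engaged:.1%}")
```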

Results

Approximate proportion of all artificial intelligence healthcare research with community engagement

Based on the frequency of duplicates in the search for community-engaged artificial intelligence articles, we estimated that from inception to January 18th, 2023, Embase, PubMed, and MEDLINE databases contained approximately 10,880 non-duplicated articles describing artificial intelligence healthcare applications. Twenty-one of these articles (0.2%) described community-engaged artificial intelligence healthcare applications. All subsequent analyses refer to these 21 articles, which are summarized in Table 1. Eleven different countries are represented by the primary affiliations of the 21 first authors (Australia, China, Denmark, Germany, Ghana, India, Norway, Singapore, South Korea, Spain, and the United States of America). Community engagement themes and their role in artificial intelligence-enabled healthcare applications are illustrated in Fig 1.

Fig 1. Community engagement themes and their role in artificial intelligence-enabled healthcare applications.

https://doi.org/10.1371/journal.pdig.0000561.g001

Inclusion of data derived from community healthcare settings

All 21 studies included data derived from community healthcare settings. Six of the 21 studies (29%) compared cases or intervention group subjects with healthy control subjects. Of these six studies, 1 (17%) recruited all study subjects from community settings [23]; 4 (67%) recruited control subjects from community settings while recruiting cases or intervention group subjects from other clinical settings [16,21,25,35]; and 1 (17%) recruited both cases and controls from community and non-community settings [34]. Fifteen of the 21 studies (71%) reported primary analyses of all study subjects as a single cohort. Of these 15 studies, 13 (87%) recruited all study subjects from community settings [18–21,24,26–30,32,33,36] and 2 (13%) recruited subjects from community and non-community settings [17,31].

Inclusion of community stakeholders in user-centered design, usability, or clinical implementation studies

Only one study included community stakeholders in user-centered design, usability, or clinical implementation and deserves special mention. Annapragada et al. [17] developed a bag-of-words natural language processing model that detected cases of likely child abuse with accuracy 0.90±0.02 and area under the receiver operating characteristic curve (AUROC) 0.93±0.02. In addition to including cases from both hospital departments and smaller community settings, the prediction framework was developed with community engagement in mind. The authors note, “while large referral hospitals can maintain teams trained in Child Abuse Pediatrics (CAP), smaller community hospitals rarely have such resources, making the consistent detection of and response to subtle signs and symptoms of abuse difficult.” To offer similar protections for children both within and outside of large hospital settings, the authors trained and tested the prediction model on free text from pediatric electronic health records in both settings, using records from first contact to involvement of the child protection team. Community stakeholders included community pediatricians and county Child Protective Services. Although this study did not report user-centered design or usability experiments, it did include community stakeholders in developing a modeling approach used in implementation experiments, and was therefore classified as having stakeholder engagement.
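
For orientation only, the sketch below shows a generic bag-of-words text classification pipeline of the kind this description evokes, built with scikit-learn. The placeholder notes, labels, and hyperparameters are our own assumptions and do not reproduce the model or data of Annapragada et al. [17].

```python
# Generic bag-of-words text classifier (illustrative only; not the model from [17]).
# Placeholder notes and labels stand in for de-identified free text drawn from both
# hospital and community electronic health records.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

notes = [
    "placeholder note text from a community clinic encounter",
    "placeholder note text from a referral hospital encounter",
    "placeholder note text from a community hospital encounter",
    "placeholder note text from an academic hospital encounter",
]
labels = [0, 1, 0, 1]  # 1 = outcome of interest (illustrative labels only)

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),   # unigram/bigram bag-of-words features
    LogisticRegression(max_iter=1000),
)
model.fit(notes, labels)
print(model.predict_proba(["placeholder note text from a new encounter"])[:, 1])
```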

Facilitators and barriers to community engagement

The most common facilitator of including community-derived data was use of an existing dataset that included community subjects. This approach was used in 6 of 21 studies (29%) [20,22,24,28,33,34]. The next most common facilitator was developing a novel dataset from existing data sources that included community subjects. This approach was used in 4 studies (19%) [17,18,31,36]. Internet-based, publicly available sources were used for dataset generation or subject recruitment in 4 studies (19%) [18,21,30,36]. Convenience sampling was used in 3 studies (14%), which improved ease and efficiency but also introduced sampling bias [16,26,30]. To mitigate sampling bias, 3 studies (14%) used random or stratified sampling to identify representative subgroups of larger populations [27,29,32]. Subjects were recruited directly from other ongoing or completed studies in 2 studies (10%) [23,35]. Investigators traveled into communities (e.g., door-to-door visits or community restaurants) in 1 study (5%) [27].
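
To make the stratified sampling strategy concrete, the sketch below draws a fixed fraction from each stratum of a hypothetical community sampling frame so that the sample mirrors the frame's composition. The strata and sampling fraction are assumptions for illustration, not values taken from the cited studies [27,29,32].

```python
# Stratified sampling sketch (illustrative only): sample 10% from each site-type
# stratum of a hypothetical community sampling frame so the sample preserves the
# frame's composition. Strata names and fractions are assumptions.
import pandas as pd

frame = pd.DataFrame({
    "subject_id": range(1000),
    "site_type": ["community clinic"] * 600
               + ["community hospital"] * 300
               + ["mobile outreach"] * 100,
})
sample = frame.groupby("site_type").sample(frac=0.10, random_state=0)
print(sample["site_type"].value_counts())  # 60, 30, and 10 subjects, respectively
```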

The major barrier to performing community-engaged artificial intelligence research was small sample sizes. Eleven of 21 studies (52%) had overall sample sizes less than 2,000, risking overfitting [37,38] (i.e., learning associations or spurious correlations between inputs and outcomes that are not generalizable and are rarely reproduced during external validation) [16,17,19,21,23,25,26,28,30,31,33]. Overfitting can be mitigated by regularization, cross-validation, and a reduction in model complexity. Overfitting is not always problematic, as some artificial intelligence models are intended to describe associations within a study rather than to produce generalizable knowledge. Nevertheless, small sample sizes may have affected more than half of all included studies. Additional challenges included sampling and selection bias imparted by convenience sampling and low survey response rates, as well as a general lack of interoperability, which hinders deployment of artificial intelligence tools across multiple environments without additional effort toward data harmonization.
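
The mitigations named above can be illustrated with a brief, generic example; the synthetic dataset and model below are assumptions standing in for a small community-derived cohort and are not drawn from any included study.

```python
# Illustrative mitigation of overfitting on a small dataset: L2 regularization with
# the penalty strength chosen by cross-validation, using a deliberately simple model.
# The synthetic data stand in for a small community-derived cohort.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegressionCV

X, y = make_classification(n_samples=500, n_features=30, n_informative=5, random_state=0)
model = LogisticRegressionCV(Cs=10, cv=5, penalty="l2", max_iter=2000)
model.fit(X, y)
print(f"Cross-validated inverse regularization strength: {model.C_[0]:.3f}")
```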

Discussion

The major finding from this study was that the incidence of community engagement in developing, validating, or implementing artificial intelligence applications was extraordinarily low. Almost all observed community engagement took the form of including data from community healthcare settings; only one study explicitly included community stakeholders in user-centered design, usability, or clinical implementation. Most artificial intelligence applications focused on primary care, which typically involves longitudinal care provided in communities outside of hospital settings and is therefore more conducive to community engagement than acute or emergency care, which typically involves intermittent care provided in hospitals. The most common facilitators of using community-derived data were leveraging an existing dataset that included community subjects or generating a novel dataset from sources that represent community subjects, especially using internet-based subject recruitment and data acquisition strategies. Many studies in which the investigators generated their own dataset had small sample sizes, risking overfitting. In addition, several studies performed convenience sampling or received low survey response rates, risking sampling and selection bias. As is often seen in contemporary analyses of artificial intelligence and digital health tools, we observed a general lack of interoperability.

We are unaware of any prior reviews on this topic. Although artificial intelligence modeling gained major performance advantages in 2012 and became prominent in healthcare literature over the ensuing decade, the maturation process of incorporating best practices from other fields–like community-engaged research–is ongoing [39]. We hope that our review will encourage community engagement in the future development, validation, and implementation of artificial intelligence healthcare applications.

Although there are inadequate examples in published literature to make evidence-based recommendations for best practices in community-engaged artificial intelligence, several potentially important themes emerge from this review. For model development, to obtain adequately sized training datasets that include community subjects, it seems advantageous to use large, existing datasets or harmonized electronic health record-derived data from multiple institutions [37,38,4042]. Prospectively enrolling individual patients may be useful for validation and implementation studies, but resource requirements may preclude enrolling thousands of subjects during model development stages. Although not represented in the included studies, transfer learning (i.e., source models are trained on large datasets and then fine-tuned on smaller datasets of interest, like smaller community-derived datasets) could also address both sample size and generalizability issues [4346]. Community stakeholders are an underutilized resource in model development and should be engaged early in any design process. Another potential strategy to promote engagement of community stakeholders is citizen science (i.e., scientific analysis of real-world data by members of the general public), which can expand the role of community members to active and equal members of research and technology development teams [4749]. Each of these strategies has the potential to increase health equity by promoting the development, validation, and implementation of artificial intelligence tools that have all users in mind. Finally, community engagement should be encouraged in all healthcare application development, as it contributes to novelty and generalizability of the research product [5053].
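
As one concrete, assumed illustration of the transfer learning strategy described above, the sketch below pretends that a small feed-forward "source" model has already been trained on a large dataset, freezes its feature layers, and fine-tunes a re-initialized output layer on a smaller, community-derived stand-in dataset. The architecture, data, and training settings are placeholders rather than recommendations drawn from the included studies.

```python
# Transfer learning sketch (illustrative only): freeze pretrained feature layers of a
# source model and fine-tune a re-initialized output head on a small target dataset.
# Architecture, data, and hyperparameters are placeholders.
import torch
import torch.nn as nn

source_model = nn.Sequential(
    nn.Linear(20, 64), nn.ReLU(),
    nn.Linear(64, 32), nn.ReLU(),
    nn.Linear(32, 1),
)
# ...assume source_model was previously trained on a large source dataset...

for param in source_model[:4].parameters():  # freeze the pretrained feature layers
    param.requires_grad = False
source_model[4] = nn.Linear(32, 1)           # re-initialize the task-specific head

optimizer = torch.optim.Adam(
    (p for p in source_model.parameters() if p.requires_grad), lr=1e-3
)
loss_fn = nn.BCEWithLogitsLoss()

X_target = torch.randn(200, 20)              # stand-in for a small community cohort
y_target = torch.randint(0, 2, (200, 1)).float()
for _ in range(50):                          # brief fine-tuning loop
    optimizer.zero_grad()
    loss = loss_fn(source_model(X_target), y_target)
    loss.backward()
    optimizer.step()
print(f"Final fine-tuning loss: {loss.item():.3f}")
```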

Despite relatively broad inclusion criteria, this study was limited by the small number of included studies. Although the small number of included studies could indicate that more time must pass before it would be appropriate to review community-engaged artificial intelligence healthcare applications, we see value in an early description of published work that highlights the paucity of evidence and identifies barriers and facilitators to future research. Knowledge of these themes may encourage investigators to accelerate the development, validation, and implementation of community-engaged artificial intelligence research. In addition, this review does not include more technical, non-clinical peer-reviewed journals, given the difficulty in replicating search parameters when surveying non-clinical bibliographic databases and because of our focus on intended clinical use.

Conclusions

Community engagement in artificial intelligence healthcare application development, validation, and implementation is rare. Harmonized electronic health records from community care settings and large, existing datasets that include community subjects offer opportunities to train models on data that accurately represent community settings while reducing the risk of overfitting and loss of generalizability. It may be advantageous not only to represent community subjects in model training, but also to engage community stakeholders–patients, providers, and administrators–in user-centered design, usability, or clinical implementation studies to ensure that artificial intelligence applications perform well not only in academic hospital settings, but also in community hospitals and clinics, where most healthcare is delivered.

Supporting information

S1 Table. Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) checklist.

https://doi.org/10.1371/journal.pdig.0000561.s002

(DOCX)

S2 Table. Sources of funding and competing interests for included studies.

https://doi.org/10.1371/journal.pdig.0000561.s003

(DOCX)

Acknowledgments

The authors thank members of the University of Florida Intelligent Clinical Care Center for providing administrative support for this work.

The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

References

  1. Emanuel EJ, Wachter RM. Artificial Intelligence in Health Care: Will the Value Match the Hype? JAMA. 2019;321(23):2281–2. pmid:31107500
  2. Loftus TJ, Tighe PJ, Filiberto AC, Efron PA, Brakenridge SC, Mohr AM, et al. Artificial Intelligence and Surgical Decision-Making. JAMA Surg. 2019.
  3. Hashimoto DA, Rosman G, Rus D, Meireles OR. Artificial Intelligence in Surgery: Promises and Perils. Ann Surg. 2018;268(1):70–6. pmid:29389679
  4. Angwin J, Larson J, Mattu S, Kirchner L. Machine bias. ProPublica. 2016 May 23. Available from: https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing [accessed 24 Jan 2019].
  5. Loftus TJ, Tighe PJ, Ozrazgat-Baslanti T, Davis JP, Ruppert MM, Ren Y, et al. Ideal algorithms in healthcare: Explainable, dynamic, precise, autonomous, fair, and reproducible. PLOS Digit Health. 2022;1(1):e0000006. pmid:36532301
  6. Adamson AS, Smith A. Machine Learning and Health Care Disparities in Dermatology. JAMA Dermatol. 2018;154(11):1247–8. pmid:30073260
  7. Benjamin R. Assessing risk, automating racism. Science. 2019;366(6464):421–2. pmid:31649182
  8. Obermeyer Z, Powers B, Vogeli C, Mullainathan S. Dissecting racial bias in an algorithm used to manage the health of populations. Science. 2019;366(6464):447–53. pmid:31649194
  9. Popejoy AB, Ritter DI, Crooks K, Currey E, Fullerton SM, Hindorff LA, et al. The clinical imperative for inclusivity: Race, ethnicity, and ancestry (REA) in genomics. Hum Mutat. 2018;39(11):1713–20. pmid:30311373
  10. Criado-Perez C. Invisible women: data bias in a world designed for men. New York: Abrams Press; 2019.
  11. Asch SM, Kerr EA, Keesey J, Adams JL, Setodji CM, Malik S, et al. Who is at greatest risk for receiving poor-quality health care? N Engl J Med. 2006;354(11):1147–56. pmid:16540615
  12. Khazanie P, Skolarus LE, Barnes GD. In Pursuit of Health: Implementation Science and Community-Engaged Research in Cardiovascular Medicine. Circ Cardiovasc Qual Outcomes. 2022;15(11):e009694. pmid:36378769
  13. Gibbons GH, Perez-Stable EJ. Harnessing the Power of Community-Engaged Research. Am J Public Health. 2024;114(S1):S7–S11. pmid:38207272
  14. McHugh ML. Interrater reliability: the kappa statistic. Biochem Med (Zagreb). 2012;22(3):276–82. pmid:23092060
  15. National Heart, Lung, and Blood Institute. Study Quality Assessment Tools. Available from: https://www.nhlbi.nih.gov/health-topics/study-quality-assessment-tools [accessed 25 Sep 2022].
  16. Adua E, Kolog EA, Afrifa-Yamoah E, Amankwah B, Obirikorang C, Anto EO, et al. Predictive model and feature importance for early detection of type II diabetes mellitus. Transl Med Commun. 2021;6(1):1–15.
  17. Annapragada AV, Donaruma-Kwoh MM, Annapragada AV, Starosolski ZA. A natural language processing and deep learning approach to identify child abuse from pediatric electronic medical records. PLoS One. 2021;16(2):e0247404. pmid:33635890
  18. Astley CM, Tuli G, Mc Cord KA, Cohn EL, Rader B, Varrelman TJ, et al. Global monitoring of the impact of the COVID-19 pandemic through online surveys sampled from the Facebook user base. Proc Natl Acad Sci U S A. 2021;118(51). pmid:34903657
  19. Beevers CG, Mullarkey MC, Dainer-Best J, Stewart RA, Labrada J, Allen JJB, et al. Association Between Negative Cognitive Bias and Depression: A Symptom-Level Approach. J Abnorm Psychol. 2019;128(3):212–27. pmid:30652884
  20. Bharat C, Glantz MD, Aguilar-Gaxiola S, Alonso J, Bruffaerts R, Bunting B, et al. Development and evaluation of a risk algorithm predicting alcohol dependence after early onset of regular alcohol use. Addiction. 2023. pmid:36609992
  21. Brink-Kjaer A, Gupta N, Marin E, Zitser J, Sum-Ping O, Hekmat A, et al. Ambulatory Detection of Isolated Rapid-Eye-Movement Sleep Behavior Disorder Combining Actigraphy and Questionnaire. Mov Disord. 2023;38(1):82–91. pmid:36258659
  22. Caballero FF, Soulis G, Engchuan W, Sanchez-Niubo A, Arndt H, Ayuso-Mateos JL, et al. Advanced analytical methodologies for measuring healthy ageing and its determinants, using factor analysis and machine learning techniques: the ATHLOS project. Sci Rep. 2017;7:43955. pmid:28281663
  23. Clausen AN, Aupperle RL, Yeh HW, Waller D, Payne J, Kuplicki R, et al. Machine Learning Analysis of the Relationships Between Gray Matter Volume and Childhood Trauma in a Transdiagnostic Community-Based Sample. Biol Psychiatry Cogn Neurosci Neuroimaging. 2019;4(8):734–42.
  24. Fukaya E, Flores A, Lindholm D, Gustafsson S, Ingelsson E, Leeper N. Clinical and genetic determinants of varicose veins: a prospective, community-based study of ~500,000 individuals. Vasc Med. 2018;23(3):300.
  25. Johannesen JK, Bi J, Jiang R, Kenney JG, Chen CA. Machine learning identification of EEG features predicting working memory performance in schizophrenia and healthy adults. Neuropsychiatr Electrophysiol. 2016;2:3. pmid:27375854
  26. Kim J, Gwak D, Kim S, Gang M. Identifying the suicidal ideation risk group among older adults in rural areas: Developing a predictive model using machine learning methods. J Adv Nurs. 2022. pmid:36534434
  27. Liu CW, Chen LN, Anwar A, Zhao BL, Lai CKY, Ng WH, et al. Comparing organ donation decisions for next-of-kin versus the self: results of a national survey. BMJ Open. 2021;11(11). pmid:34785552
  28. Moberget T, Alnaes D, Kaufmann T, Doan NT, Cordova-Palomera A, Norbom LB, et al. Cerebellar Gray Matter Volume Is Associated With Cognitive Function and Psychopathology in Adolescence. Biol Psychiatry. 2019;86(1):65–75. pmid:30850129
  29. Qian X, Li Y, Zhang X, Guo H, He J, Wang X, et al. A Cardiovascular Disease Prediction Model Based on Routine Physical Examination Indicators Using Machine Learning Methods: A Cohort Study. Front Cardiovasc Med. 2022;9:854287. pmid:35783868
  30. Schwartz AR, Cohen-Zion M, Pham LV, Gal A, Sowho M, Sgambati FP, et al. Brief digital sleep questionnaire powered by machine learning prediction models identifies common sleep disorders. Sleep Med. 2020;71:66–76. pmid:32502852
  31. Shah AA, Karhade AV, Bono CM, Harris MB, Nelson SB, Schwab JH. Development of a machine learning algorithm for prediction of failure of nonoperative management in spinal epidural abscess. Spine J. 2019;19(10):1657–65. pmid:31059819
  32. Tu YY, Hu XM, Zeng CQ, Ye MH, Zhang P, Jin XQ, et al. A machine-learning approach to discerning prevalence and causes of myopia among elementary students in Hubei. Int Ophthalmol. 2022;42(9):2889–902. pmid:35391585
  33. Walambe R, Nayak P, Bhardwaj A, Kotecha K. Employing Multimodal Machine Learning for Stress Detection. J Healthc Eng. 2021;2021:9356452. pmid:34745514
  34. Yan Y, Schaffter T, Bergquist T, Yu T, Prosser J, Aydin Z, et al. A Continuously Benchmarked and Crowdsourced Challenge for Rapid Development and Evaluation of Models to Predict COVID-19 Diagnosis and Hospitalization. JAMA Netw Open. 2021;4(10). pmid:34633425
  35. Zee B, Lee JC, Lai MR, Chee P, Rafferty J, Thomas R, et al. Digital solution for detection of undiagnosed diabetes using machine learning-based retinal image analysis. BMJ Open Diabetes Res Care. 2022;10(6). pmid:36549873
  36. Zhu JM, Sarker A, Gollust S, Merchant R, Grande D. Characteristics of Twitter Use by State Medicaid Programs in the United States: Machine Learning Approach. J Med Internet Res. 2020;22(8). pmid:32804085
  37. Balki I, Amirabadi A, Levman J, Martel AL, Emersic Z, Meden B, et al. Sample-Size Determination Methodologies for Machine Learning in Medical Imaging Research: A Systematic Review. Can Assoc Radiol J. 2019;70(4):344–53. pmid:31522841
  38. Baum EB, Haussler D. What Size Net Gives Valid Generalization? Neural Comput. 1989;1(1):151–60.
  39. Krizhevsky A, Sutskever I, Hinton GE. ImageNet Classification with Deep Convolutional Neural Networks. Commun ACM. 2017;60(6):84–90.
  40. Cohen J. Statistical power analysis for the behavioral sciences. Routledge; 2013.
  41. Figueroa RL, Zeng-Treitler Q, Kandula S, Ngo LH. Predicting sample size required for classification performance. BMC Med Inform Decis Mak. 2012;12. pmid:22336388
  42. Loftus TJ, Shickel B, Ruppert MM, Balch JA, Ozrazgat-Baslanti T, Tighe PJ, et al. Uncertainty-aware deep learning in healthcare: A scoping review. PLOS Digit Health. 2022;1(8). pmid:36590140
  43. Theodoris CV, Xiao L, Chopra A, Chaffin MD, Al Sayed ZR, Hill MC, et al. Transfer learning enables predictions in network biology. Nature. 2023;618(7965):616–24. pmid:37258680
  44. Hu J, Li X, Hu G, Lyu Y, Susztak K, Li M. Iterative transfer learning with neural network for clustering and cell type classification in single-cell RNA-seq analysis. Nat Mach Intell. 2020;2(10):607–18. pmid:33817554
  45. Ebbehoj A, Thunbo MO, Andersen OE, Glindtvad MV, Hulman A. Transfer learning for non-image data in clinical research: A scoping review. PLOS Digit Health. 2022;1(2):e0000014. pmid:36812540
  46. Liu L, Meng Q, Weng C, Lu Q, Wang T, Wen Y. Explainable deep transfer learning model for disease risk prediction using high-dimensional genomic data. PLoS Comput Biol. 2022;18(7):e1010328. pmid:35839250
  47. Fraisl D, Hager G, Bedessem B, Gold M, Hsing PY, Danielsen F, et al. Citizen science in environmental and ecological sciences. Nat Rev Methods Primers. 2022;2(1).
  48. Schafer B, Beck C, Rhys H, Soteriou H, Jennings P, Beechey A, et al. Machine learning approach towards explaining water quality dynamics in an urbanised river. Sci Rep. 2022;12(1):12346. pmid:35854053
  49. Pucino N, Kennedy DM, Carvalho RC, Allan B, Ierodiaconou D. Citizen science for monitoring seasonal-scale beach erosion and behaviour with aerial drones. Sci Rep. 2021;11(1):3935. pmid:33594157
  50. Baker M. 1,500 scientists lift the lid on reproducibility. Nature. 2016;533(7604):452–4. pmid:27225100
  51. Balls-Berry JE, Acosta-Perez E. The Use of Community Engaged Research Principles to Improve Health: Community Academic Partnerships for Research. P R Health Sci J. 2017;36(2):84–5. pmid:28622404
  52. Han HR, Xu A, Mendez KJW, Okoye S, Cudjoe J, Bahouth M, et al. Exploring community engaged research experiences and preferences: a multi-level qualitative investigation. Res Involv Engagem. 2021;7(1):19. pmid:33785074
  53. Jones L, Wells K. Strategies for academic and clinician engagement in community-participatory partnered research. JAMA. 2007;297(4):407–10. pmid:17244838