Ten simple rules for the sharing of bacterial genotype–phenotype data on antimicrobial resistance

Leonid Chindelevitch; Maarten van Dongen; Heather Graz; Antonio Pedrotta; Anita Suresh; Swapna Uplekar; Elita Jauneikaite; Nicole Wheeler

doi:10.1371/journal.pcbi.1011129

Abstract

The increasing availability of high-throughput sequencing (frequently termed next-generation sequencing (NGS)) data has created opportunities to gain deeper insights into the mechanisms of a number of diseases and is already impacting many areas of medicine and public health. The area of infectious diseases stands somewhat apart from other human diseases insofar as the relevant genomic data comes from the microbes rather than their human hosts. A particular concern about the threat of antimicrobial resistance (AMR) has driven the collection and reporting of large-scale datasets containing information from microbial genomes together with antimicrobial susceptibility test (AST) results. Unfortunately, the lack of clear standards or guiding principles for the reporting of such data is hampering the field’s advancement. We therefore present our recommendations for the publication and sharing of genotype and phenotype data on AMR, in the form of 10 simple rules. The adoption of these recommendations will enhance AMR data interoperability and help enable its large-scale analyses using computational biology tools, including mathematical modelling and machine learning. We hope that these rules can shed light on often overlooked but nonetheless very necessary aspects of AMR data sharing and enhance the field’s ability to address the problems of understanding AMR mechanisms, tracking their emergence and spread in populations, and predicting microbial susceptibility to antimicrobials for diagnostic purposes.

Author summary

The growing worldwide threat of antimicrobial resistance (AMR) makes the sharing of resistance data a priority for both researchers and public health practitioners. In particular, the growth of high-throughput sequencing data, in conjunction with AMR phenotypes, has the potential to revolutionise AMR diagnostics and surveillance. However, there is a significant heterogeneity in the ways that this type of data is currently shared, which makes it challenging to perform its analysis at scale. As both producers and users of publicly available genotype–phenotype data on AMR, we propose 10 simple rules that can mitigate this situation and nudge the field towards the adoption of better practices.

Citation: Chindelevitch L, van Dongen M, Graz H, Pedrotta A, Suresh A, Uplekar S, et al. (2023) Ten simple rules for the sharing of bacterial genotype—Phenotype data on antimicrobial resistance. PLoS Comput Biol 19(6): e1011129. https://doi.org/10.1371/journal.pcbi.1011129

Editor: Scott Markel, Dassault Systemes BIOVIA, UNITED STATES

Published: June 22, 2023

Copyright: © 2023 Chindelevitch et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: LC acknowledges funding from the MRC Centre for Global Infectious Disease Analysis (reference MR/R015600/1), jointly funded by the UK Medical Research Council (MRC) and the UK Foreign, Commonwealth & Development Office (FCDO), under the MRC/FCDO Concordat agreement and is also part of the EDCTP2 programme supported by the European Union. LC also acknowledges additional funding from FIND, the global alliance for diagnostics. AP, AS and SU acknowledge additional funding from the German Federal Ministry of Education and Research (BMBF). EJ is an Imperial College Research Fellow jointly supported by the Rosetrees Trust and the Stoneygate Trust (M683) and is affiliated with the National Institute for Health Research Health Protection Research Unit (NIHR HPRU) in Healthcare Associated Infections and Antimicrobial Resistance at Imperial College London in partnership with the UK Health Security Agency (previously PHE), in collaboration with Imperial Healthcare Partners, University of Cambridge and University of Warwick. The views expressed are those of the authors and not necessarily those of the NIHR, UKHSA, the Department of Health and Social Care, or other organisations the authors are affiliated with. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: I have read the journal’s policy and the authors of this manuscript have the following competing interests: AP, AS and SU declare that they are employed by FIND, the global alliance for diagnostics. None of the other authors have anything to declare.

Introduction

Antimicrobial resistance (AMR), the phenomenon whereby microbes successfully evade the action of antimicrobial drugs designed to kill them or stop their growth, is a growing public health threat worldwide, with recent estimates of over 1.27 million deaths directly attributable to it [1].

The emergence of high-throughput sequencing, used in conjunction with computational methods such as bioinformatics, mathematical or statistical modelling, and machine learning, holds the promise of improving surveillance, enabling more accurate diagnosis, and informing treatment decisions within the infectious disease realm [2].

However, the successful use of next-generation sequencing (NGS) and accompanying computational methods, especially those based on machine learning, for the control of AMR is critically reliant on the availability of large-scale, high-quality data that details both genotypes (data on the microbe’s genome) as well as phenotypes (antimicrobial susceptibility tests, or AST) for individual isolates.

Unfortunately, although microbial data are widely considered as falling outside patient privacy considerations [3], barriers remain for collecting, sharing, and using this type of data, which result from the variety of options and standards for publishing genotypic and phenotypic data.

We propose 10 simple rules for the effective collection and sharing of genotype–phenotype AMR data within a publication so as to make it easy to use for downstream studies involving computational analysis and the robust identification of genomic determinants of AMR. We recognise that sharing this type of data is not always a primary motivation for collecting it; however, the benefits of sharing data for both the research group and the field as a whole are well documented [4,5], while the additional effort required can be limited to a small upfront investment. We emphasise that, while following these recommendations will make downstream analysis and data integration easier, the most important contribution from a group collecting such data is to make it fully available via open-access journals, databases, or repositories.

Rule 1: Decide on a well-defined format; provide all data in this format

The lack of data standardisation is a key barrier to useful inference that can inform AMR control [6]. A number of standard formats (such as those by GMI [7], NCBI [8], and WHONET [9]) and guidelines (such as those by the GSC [10] and PHA4GE [11]) already exist for publishing genotype and phenotype data, as well as metadata (relevant information about the isolate, such as the date, location of isolation, and the source of the sample). While the community has yet to reach agreement on which of these formats should be adopted, each group or research consortium should internally agree on a specific format so that, at a minimum, all the data from the group can be easily compared, combined, and analysed together.

Each of the fields in Box 1 should appear in its own column, with some fields potentially subdivided into several columns; for example, genus and species. Further, whenever abbreviations or other ambiguous notation are used, a separate data dictionary should be provided to enable disambiguation in downstream analysis. Note that the last item in Box 1 means that the common approach of reporting either “Sensitive” or a list of the drugs to which the isolate is resistant should be avoided.

Box 1. Recommended format for reporting genotype and AMR phenotype data

Based on a review of the available formats, we recommend the use of a single file in a tabular format, with 1 row per isolate, and the following information made available for each one:

Internal ID (this can be helpful as the key for merging genotype and phenotype tables)
Accession number for the raw genotypic data in databases (NCBI, ENA, and DDBJ [12])
Additional accession numbers specific to the isolate, such as the assembled contigs
Collection date, in a “long format” (e.g., 12 October 2022) to avoid potential confusion
Collection location, ideally in an unambiguous format such as longitude and latitude
Source of isolation (animal, clinical, environmental, etc.)
For clinical isolates, the fluid or tissue the isolate is from (blood, sputum, stool, urine, etc.)
Isolate genus and species
Experimental approach used to measure phenotypic susceptibility (agar dilution, Etest, Vitek2, etc. [13])
For each drug or combination tested for susceptibility, ideally the 3 columns specified in Rule 4, otherwise 1 column with the resistance status (susceptible (S), intermediate susceptibility (I), resistant (R))

While the adoption of a uniform standard may seem onerous, our experience suggests that this pays long-term dividends in the form of easier sharing, less potential for confusion in downstream analysis, and a greater incentive for other groups to use the data, which generates additional credit in exchange for only minimal additional effort.
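As an illustration of how a group might check its own tables against an agreed format, the following is a minimal sketch in Python. The column headers are hypothetical (Box 1 lists the fields but does not prescribe header names), and the checks cover only a few of the fields: unique internal IDs, a long-format collection date, and plausible coordinates.

```python
import csv
import io
from datetime import datetime

# Hypothetical header names; Box 1 names the fields, not the column headers.
REQUIRED_COLUMNS = [
    "internal_id", "raw_accession", "collection_date",
    "latitude", "longitude", "isolation_source",
    "genus", "species", "ast_method",
]


def validate_rows(tsv_text):
    """Return a list of human-readable problems found in a tab-separated table."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return ["missing columns: " + ", ".join(missing)]

    problems = []
    seen_ids = set()
    for line_no, row in enumerate(reader, start=2):  # the header is line 1
        isolate = row["internal_id"]
        if not isolate:
            problems.append(f"line {line_no}: empty internal_id")
        elif isolate in seen_ids:
            problems.append(f"line {line_no}: duplicate internal_id {isolate}")
        seen_ids.add(isolate)

        # Long-format date such as "12 October 2022" (English month names assumed).
        try:
            datetime.strptime(row["collection_date"] or "", "%d %B %Y")
        except ValueError:
            problems.append(f"line {line_no}: date not in long format")

        # Latitude and longitude as decimal degrees.
        try:
            lat, lon = float(row["latitude"]), float(row["longitude"])
            if not (-90 <= lat <= 90 and -180 <= lon <= 180):
                problems.append(f"line {line_no}: coordinates out of range")
        except (TypeError, ValueError):
            problems.append(f"line {line_no}: non-numeric coordinates")
    return problems
```

An empty list means that none of these particular checks failed; it does not by itself show that the table follows any of the published standards cited above.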

Rule 2: Provide relevant contextual sample metadata

Include available relevant clinical information on the samples, such as whether bacterial samples were taken from blood, urine, the environment, etc., as for certain microbes the interpretation of MICs or drugs tested will depend on the source of the isolate. Date and location of collection are also important, with latitude and longitude being pivotal for visualising and contextualising the location of different AMR determinants, patterns, and high-risk strains, as shown on a small example dataset in Fig 1 below.

Fig 1. Microreact visualisation.

An example of the informative use of location and sample collection date for data contextualisation. Left: Map. Right: Phylogenetic tree. Bottom: Timeline. Microreact showcase, Global Staphylococcus aureus ST239 [14,15].

https://doi.org/10.1371/journal.pcbi.1011129.g001

Rule 3: Make all samples identifiable, including those from externally sourced studies

When documenting genotypic and phenotypic data on isolates from other studies, provide individual accession numbers for each sample in the metadata table, alongside an overall reference for each study included. This allows more streamlined retrieval of data, prevents mismatching of samples due to incompatible IDs used in publication metadata and online sequence data repositories, and correctly attributes credit to the data producers.

Rule 4: Provide raw quantitative data for phenotypic AST results

A number of factors can impact the quality and trustworthiness of AST [16], such as the depth and composition of media, spacing and potency of antibiotic sources, and incubation time and temperature. Internal and external quality control (QC) of AST should ideally be performed [16] to ensure that phenotyping results are consistent with those obtained in gold-standard laboratories.

The correct interpretation of a minimum inhibitory concentration (MIC) involves factors such as the pharmacokinetics of an antimicrobial and the body site of an infection. As a result, the categorisation [17] of quantitative MIC data as susceptible (S), intermediate (I), or resistant (R) can depend on the infection context. Furthermore, intermediate results may cover both strains that can be successfully treated with an increased dosage of the antibiotic and cases falling into a well-known buffer zone [17] (also known as an area of technical uncertainty, or ATU [18]) in which results from a test should be interpreted with caution. Breakpoints for categorising isolates also differ between continents (e.g., EUCAST [19] versus CLSI [20]) and change over time, meaning that categorical assignments may not be comparable between Europe and North America, or across different years. To ensure published data can be integrated with other sources, researchers should specify the breakpoints used to categorise isolates as S, I, or R, and provide the raw quantitative measurements (such as minimum inhibitory concentration, disk diffusion zone, or zone of inhibition size) in the following 3 columns:

Measurement value
Measurement sign (= for exact, < if below the minimum, or > if above the maximum)
Measurement unit (typically, mg/L for concentrations or mm for diameters)
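Laboratory exports often hold a measurement as a single string such as ">32 mg/L". The sketch below shows one possible way to split such strings into the 3 columns listed above. The input style (optional sign, plain decimal number, optional unit), the function name, and the default unit are assumptions made for illustration; "<=" and ">=" are kept as recorded rather than mapped onto the three signs above.

```python
import re
from typing import NamedTuple


class Measurement(NamedTuple):
    sign: str     # "=", "<" or ">"; "<=" and ">=" are kept if present in the input
    value: float
    unit: str     # e.g. "mg/L" or "mm"


# Assumed input style: optional sign, a plain decimal number, optional unit.
_PATTERN = re.compile(r"^\s*(<=|>=|<|>|=)?\s*(\d+(?:\.\d+)?)\s*([A-Za-z/]*)\s*$")


def parse_measurement(text, default_unit="mg/L"):
    """Split a combined measurement string into (sign, value, unit)."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"unrecognised measurement: {text!r}")
    sign, number, unit = match.groups()
    # A missing sign is read as an exact value; a missing unit falls back
    # to the caller-supplied default.
    return Measurement(sign or "=", float(number), unit or default_unit)


print(parse_measurement(">32 mg/L"))
print(parse_measurement("18 mm"))
```

Strings that do not fit the assumed pattern raise an error rather than being guessed at, so that ambiguous entries are resolved by someone who knows how the data were generated.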

Different studies may also test a different range of MIC values, making comparison of results challenging. To account for these reporting differences, maximum and minimum MICs tested for each antibiotic should be reported explicitly, either as an additional column or in a separate table. This would allow other researchers to appropriately preprocess their results and compare accuracy within the range covered by both studies. Lastly, the community working on AMR would greatly benefit from the genotype–phenotype data being reported for a standardised panel of antibiotics specified for the bacterial pathogen being studied. As many studies are opportunistic and rely on clinically generated MIC data, this may not always be possible. However, only testing all the isolates of a particular pathogen in a study against the same panel of antibiotics can guarantee a complete, internally comparable dataset.
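To illustrate why the tested range matters, the sketch below finds the concentration range covered by two studies and re-expresses a MIC relative to that range. The function names and the numbers are illustrative only, and the recoding assumes one possible convention, in which a value at or below the lowest shared concentration is written as "<=" that concentration and a value above the highest is written as ">" that concentration.

```python
def shared_range(range_a, range_b):
    """Return the (low, high) MIC range tested by both studies, or None."""
    low = max(range_a[0], range_b[0])
    high = min(range_a[1], range_b[1])
    return (low, high) if low <= high else None


def recode_to_range(value, tested_range):
    """Express a MIC as (sign, value) as it would appear within tested_range."""
    low, high = tested_range
    if value <= low:
        return ("<=", low)    # cannot be resolved below the lowest concentration
    if value > high:
        return (">", high)    # cannot be resolved above the highest concentration
    return ("=", value)


# Illustrative ranges in mg/L; not taken from any real study.
common = shared_range((0.25, 64), (1, 16))
print(common)
print(recode_to_range(0.5, common))
print(recode_to_range(32, common))
```

Without the reported minimum and maximum, this kind of preprocessing is not possible, because a censored value from one study cannot be told apart from an exact value in another.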

Rule 5: Include the phenotyping method

The phenotyping method used should be included as a metadata column in the same data source as the phenotypic data. Each phenotyping method has differing strengths and weaknesses [17] in terms of flexibility in antibiotics and concentrations tested, error rates, scalability, and affordability. Methods may differ in accuracy for specific microbe–drug combinations, and taking this into account is important for comparative studies and for evaluating discrepancies between genotype and phenotype. For example, Vitek2 is associated with higher error rates when measuring cefepime resistance in ESBL-producing Escherichia coli [21], and a range of phenotyping methods have been found to have high error rates compared to agar dilution when measuring fosfomycin resistance in ESBL-producing E. coli [22]. Recording exactly which method has been used, as an additional column in the supplementary material or metadata, is helpful, especially when samples are processed with different methods.

Rule 6: Share tabular data files in machine-readable format

It is best to submit supplementary files with raw data and metadata in a format that can be easily read and combined with other data. Of all such formats, the 2 most easily accessible ones are tab-separated values (.tsv) and comma-separated values (.csv), with multiple open-source tools supporting their use without any licensing requirements. The choice between .tsv and .csv is largely a matter of preference. The .csv format requires careful handling when a comma is also used as a separator within a field (e.g., longitude, latitude in the location data), as well as in some locales and operating systems that use semicolons rather than commas as separators. On the other hand, the .csv format is currently accepted by a larger number of journals than the .tsv format.

Microsoft Excel formats (.xls or .xlsx), while convenient, create several additional challenges caused by automated conversions. Sample IDs such as 8E12 may be interpreted as numbers in scientific notation; some gene names may be interpreted as dates (although this problem is largely limited to human genes) [23], and bacterial datasets may exceed the size limit in Excel [24].
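The two points above (commas inside a field, and IDs that resemble numbers) can be illustrated with Python's standard csv module, which quotes fields containing the delimiter and returns every field as text. This is a small sketch on made-up rows; the column names and values are hypothetical.

```python
import csv
import io

rows = [
    ["internal_id", "location", "mic_value"],
    ["8E12", "51.5, -0.12", "0.25"],  # an ID that a spreadsheet may read as 8e12
]


def round_trip(table, delimiter):
    """Write the table with the given delimiter, then read it back."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerows(table)
    text = buffer.getvalue()
    buffer.seek(0)
    return text, list(csv.reader(buffer, delimiter=delimiter))


csv_text, csv_rows = round_trip(rows, ",")
tsv_text, tsv_rows = round_trip(rows, "\t")

print(csv_text)          # the location field is quoted because it contains a comma
print(tsv_text)          # no quoting is needed with a tab delimiter
print(csv_rows[1][0])    # the ID is still the text "8E12"
```

Other tools may infer column types by default, so when loading such files elsewhere it is worth checking that identifier columns are explicitly read as text.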

Image formats such as .png or .jpg and the Portable Document Format (.pdf) are the least appropriate for sharing data. While phylogenies annotated with isolate IDs and AST results at the tips are common in publications describing AMR genotype and phenotype data, they are challenging to extract into a machine-readable format despite the advances in optical character recognition (OCR) technology [25]. If such a phylogeny is included as an image, the tip data and metadata should also be included as a tabular file, or the phylogeny itself provided in a machine-readable format (e.g., Newick [26]).
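When a phylogeny is shared in Newick format alongside a data table, it is straightforward to check that the two agree. The sketch below extracts tip labels with a regular expression and compares them with the isolate IDs in the table. It assumes simple, unquoted labels and is not a full Newick parser; the function names and the example tree are hypothetical.

```python
import re


def newick_tip_labels(newick):
    """Extract unquoted tip labels: a label is a tip if it follows '(' or ','."""
    labels = re.findall(r"[(,]\s*([^(),:;]+)", newick)
    return [label.strip() for label in labels]


def compare_tips_to_table(newick, table_ids):
    """Report tips lacking a table row and table rows lacking a tip."""
    tips = set(newick_tip_labels(newick))
    ids = set(table_ids)
    return {
        "tips_without_row": sorted(tips - ids),
        "rows_without_tip": sorted(ids - tips),
    }


tree = "((iso1:0.1,iso2:0.2):0.3,iso3:0.4);"
print(compare_tips_to_table(tree, ["iso1", "iso2", "iso4"]))
```

For trees with quoted labels, comments, or other extensions, an established phylogenetics library would be a more reliable choice than this regular expression.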

The file(s) containing the data in tabular format can either be included as supplementary files or, especially for large files, deposited in an openly accessible repository that provides a permanent digital object identifier (DOI), such as figshare, Synapse, or Zenodo. Alternatively, a project can be created on the Microreact platform [27] (also see Fig 1), enabling both an easy visualisation of the data’s accompanying metadata as well as access to the underlying data in tabular format.

Rule 7: Make raw genomic data available

Submit raw reads, not just assemblies, whenever possible, and when submitting data to common databases such as NCBI, ENA, or DDBJ [12], provide experiment numbers, not only accession numbers, in the data table. Specify the machine and software used for base calling, including settings, and specify what, if any, QC steps were taken to process the raw data. When reporting data on isolates included in the study, include isolates that were sequenced and subsequently excluded by QC, with a column distinguishing isolates that were included and those that were not. An ongoing effort by the Data Structures Working Group within the Public Health Alliance for Genomic Epidemiology (PHA4GE) has resulted in the creation of a flexible yet precise ontology of flags that can be used to identify QC issues [28].

The inclusion of all genotypic data allows recovery of excluded isolates in later studies that may use different QC criteria, as well as improved calibration of analytical tools for sequencing data. As some sequencing platforms such as Oxford Nanopore (ONT) refine their chemistries and base calling algorithms, providing traces and chemistry information can also allow better assessment of the data and creates the potential for subsequent correction of biases introduced through different base calling methodologies, thus facilitating the inclusion of older data in an analysis.

Rule 8: Make genotypic resistance calls in a reproducible manner

When reporting the presence of resistance determinants, these genotypic calls should be easily reproducible by using the same workflow. A popular option for annotating resistance genotypes is via an open database of resistance determinants, e.g., CARD [29] for bacteria or MARDy [30] for fungi—see Table 1 in Kaprou and colleagues [31]. As some of these are regularly updated, the name and version of the database must be specified. Furthermore, AMR can be facilitated by the expression of an acquired AMR gene or by a chromosomal mutation in an antibiotic target [32]. It is thus important to specify the type of AMR determinant involved when possible.

Effort should also be made to use established nomenclature to describe the resistance gene, such as the NCBI’s formalised process for naming beta-lactamase genes [33] or CARD’s Antibiotic Resistance Ontology and Short Names conventions [29]; this will ensure that the report is easily findable in relevant searches. The hAMRonization tool [34], published by PHA4GE, can also assist the data producers in annotating the identified genes in a systematic way. If genotype calls were made using a custom genomic determinant database, the database and the analysis code should also be supplied in a format that can easily be used by a reader. When AMR genes have been detected by PCR rather than NGS, this should be made clear in the data table, and the primers used should be provided explicitly, as PCR may miss variants of a gene or incorrectly conflate variants of a gene [35].
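One way to keep genotypic calls traceable is to store, with every call, the database name and version, the type of determinant, and the detection method. The following is a minimal sketch of such a record. The field names and allowed values are hypothetical and do not follow the hAMRonization specification, and the version string in the example is a placeholder.

```python
from dataclasses import asdict, dataclass

# Hypothetical controlled vocabularies for this sketch.
ALLOWED_TYPES = {"acquired gene", "chromosomal mutation", "unknown"}
ALLOWED_METHODS = {"NGS", "PCR"}


@dataclass(frozen=True)
class ResistanceCall:
    internal_id: str
    determinant: str
    determinant_type: str
    database_name: str
    database_version: str
    detection_method: str = "NGS"
    primers: str = ""  # expected to be filled in when detection_method is "PCR"

    def __post_init__(self):
        if self.determinant_type not in ALLOWED_TYPES:
            raise ValueError(f"unknown determinant type: {self.determinant_type}")
        if self.detection_method not in ALLOWED_METHODS:
            raise ValueError(f"unknown detection method: {self.detection_method}")
        if not self.database_name or not self.database_version:
            raise ValueError("database name and version must both be recorded")
        if self.detection_method == "PCR" and not self.primers:
            raise ValueError("primers must be provided for PCR-based calls")


call = ResistanceCall(
    internal_id="iso1",
    determinant="blaCTX-M-15",
    determinant_type="acquired gene",
    database_name="CARD",
    database_version="x.y.z",  # placeholder, not a real release number
)
print(asdict(call))
```

Rejecting incomplete records at creation time means that a missing database version or primer sequence is noticed when the call is made, rather than when someone later tries to reproduce it.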

Rule 9: Report novel resistance determinants in a systematic way

The identification of new resistance determinants such as SNPs, indels, copy numbers, or alleles of specific genes [29] can be a valuable tool for tracking the de novo emergence and spread of resistance mechanisms. But these mechanisms are only detectable in other datasets if, in addition to the variant being described in the main text of a paper, the full relevant sequence information is given in the supplementary materials. Ideally, the determinant’s complete genomic sequence should also be deposited in an open-access database of genetic determinants of AMR and the accession number provided. For example, to meet the current criteria for inclusion into CARD, an AMR determinant must be described in a peer-reviewed scientific publication, its DNA sequence available in GenBank, and there must be clear experimental evidence of elevated MIC over controls [29].

Rule 10: Share the data to the fullest extent possible

While all of the above rules should be followed in an ideal scenario, we recognise that time, cost, and expertise constraints, as well as the time-sensitive nature of data from ongoing or emerging outbreaks, or of metadata that contains protected clinical or public health information, may create substantial practical barriers to their implementation. For this reason, we recommend that researchers share their genotype and phenotype AMR data and metadata to the full extent they can do so without adverse consequences, in the format and database of their choice, because the only thing worse than having data in an unusual format or with inconsistent annotations is not having any data at all.

Thankfully, a variety of initiatives are underway to facilitate AMR data sharing, from software solutions such as WHONET [9] and Microreact [27], to community efforts such as the Pathogenwatch resource for Neisseria gonorrhoeae [36]. The experience of the GISAID platform during the COVID-19 pandemic, with over 5 million genotypes deposited with accompanying metadata in less than 2 years [37], suggests that scalable pathogen genomic data sharing is possible, and it is our hope that the field of AMR will take inspiration from this building momentum.

References

1. Murray CJ, Ikuta KS, Sharara F, Swetschinski L, Robles Aguilar G, Gray A, et al. Global burden of bacterial antimicrobial resistance in 2019: a systematic analysis. Lancet. 2022;399(10325):629–655. pmid:35065702
2. Chindelevitch L, Jauneikaite E, Wheeler NE, Allel K, Ansiri-Asafoakaa BY, Awuah WA, et al. Applying data technologies to combat AMR: current status, challenges, and opportunities on the way forward; 2022. Available from: https://arxiv.org/abs/2208.04683.
3. Ribeiro CS, van Roode MY, Haringhuizen GB, Koopmans MP, Claassen E, van de Burgwal LHM. How ownership rights over microorganisms affect infectious disease control and innovation: A root-cause analysis of barriers to data sharing as experienced by key stakeholders. PLoS ONE. 2018;13(5):e0195885. pmid:29718947
4. Cochrane G, Lauer K, Blomberg N, Apweiler R, Birney E. Pathogen genomics data sharing: public health meets research; 2022.
5. Timme RE, Wolfgang WJ, Balkey M, Venkata SLG, Randolph R, Allard M, et al. Optimizing open data to support one health: best practices to ensure interoperability of genomic data from bacterial pathogens. One Health Outlook. 2020;2(1). pmid:33103064
6. Pettengill JB, Beal J, Balkey M, Allard M, Rand H, Timme R. Interpretative Labor and the Bane of Nonstandardized Metadata in Public Health Surveillance and Food Safety. Clin Infect Dis. 2021;73(8):1537–1539. pmid:34240118
7. Wielinga PR, Hendriksen RS, Aarestrup FM, Lund O, Smits SL, Koopmans MPG, et al. Global Microbial Identifier. Applied Genomics of Foodborne Pathogens. 2017:13–31.
8. NCBI. BioSample Antibiograms. 2022. Available from: https://www.ncbi.nlm.nih.gov/biosample/docs/antibiogram/.
9. World Health Organization. WHONET 5: microbiology laboratory database software. 1999.
10. Field D, Sterk P, Kottmann R, De Smet JW, Amaral-Zettler L, Cochrane G, et al. Genomic Standards Consortium Projects. Stand Genomic Sci. 2014;9(3):599–601. pmid:25197446
11. Timme R. Guidance for populating GenomeTrakr metadata templates (BioSample and SRA) v2. 2022. Available from:
12. Blaxter M, Danchin A, Savakis B, Fukami-Kobayashi K, Kurokawa K, Sugano S, et al. Reminder to deposit DNA sequences. Science. 2016;352(6287):780–780. pmid:27169596
13. Marroki A, Bousmaha-Marroki L. Antibiotic Resistance Diagnostic Methods for Pathogenic Bacteria. Encyclopedia of Infect Immun. 2022:320–341.
14. Microreact. Global Staphylococcus aureus ST239. 2022. Available from: https://microreact.org/project/NJ-zAij8.
15. Harris SR, Feil EJ, Holden MTG, Quail MA, Nickerson EK, Chantratita N, et al. Evolution of MRSA During Hospital Transmission and Intercontinental Spread. Science. 2010;327(5964):469–474. pmid:20093474
16. Gajic I, Kabic J, Kekic D, Jovicevic M, Milenkovic M, Mitic Culafic D, et al. Antimicrobial Susceptibility Testing: A Comprehensive Review of Currently Used Methods. Antibiotics. 2022;11(4):427. pmid:35453179
17. Jorgensen J, Ferraro M. Antimicrobial Susceptibility Testing: A Review of General Principles and Contemporary Practices. Clin Infect Dis. 2009;49(11):1749–1755. pmid:19857164
18. Soares A, Pestel-Caron M, Leysour de Rohello F, Bourgoin G, Boyer S, Caron F. Area of technical uncertainty for susceptibility testing of amoxicillin/clavulanate against Escherichia coli: analysis of automated system, Etest and disk diffusion methods compared to the broth microdilution reference. Clin Microbiol Infect. 2020;26(12):1685.e1–1685.e6.
19. Brown D, Cantón R, Dubreuil L, Gatermann S, Giske C, MacGowan A, et al. Widespread implementation of EUCAST breakpoints for antibacterial susceptibility testing in Europe. Eurosurveillance. 2015;20(2). pmid:25613780
20. Weinstein MP. Performance Standards for Antimicrobial Susceptibility Testing. Clinical and Laboratory Standards Institute; 2022.
21. Rhodes NJ, Richardson CL, Heraty R, Liu J, Malczynski M, Qi C, et al. Unacceptably High Error Rates in Vitek 2 Testing of Cefepime Susceptibility in Extended-Spectrum-β-Lactamase-Producing Escherichia coli. Antimicrob Agents Chemother. 2014;58(7):3757–3761.
22. van den Bijllaardt W, Schijffelen MJ, Bosboom RW, Cohen Stuart J, Diederen B, Kampinga G, et al. Susceptibility of ESBL Escherichia coli and Klebsiella pneumoniae to fosfomycin in the Netherlands and comparison of several testing methods including Etest, MIC test strip, Vitek2, Phoenix and disc diffusion. J Antimicrob Chemother. 2018;73(9):2380–2387. pmid:29982660
23. Lewis D. Autocorrect errors in Excel still creating genomics headache. Nature. 2021. pmid:34389840
24. Lariviere D, Mei H, Freeberg M, Taylor J, Nekrutenko A. Understanding trivial challenges of microbial genomics: An assembly example. 2018. Available from: https://www.biorxiv.org/content/10.1101/347625v1.
25. Huang J, Pang G, Kovvuri R, Toh M, Liang KJ, Krishnan P, et al. A Multiplexed Network for End-to-End, Multilingual OCR. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2021.
26. Cardona G, Rosselló F, Valiente G. Extended Newick: it is time for a standard representation of phylogenetic networks. BMC Bioinformatics. 2008;9(1). pmid:19077301
27. Argimón S, Abudahab K, Goater RJE, Fedosejev A, Bhai J, Glasner C, et al. Microreact: visualizing and sharing data for genomic epidemiology and phylogeography. Microbial Genomics. 2016;2(11). pmid:28348833
28. PHA4GE. Contextual data quality control tags. 2022. Available from: https://github.com/pha4ge/contextual-data-QC-tags.
29. Alcock BP, Huynh W, Chalil R, Smith KW, Raphenya A, Wlodarski MA, et al. CARD 2023: expanded curation, support for machine learning, and resistome prediction at the Comprehensive Antibiotic Resistance Database. Nucleic Acids Res. 2022;51:D690–D699.
30. Nash A, Sewell T, Farrer RA, Abdolrasouli A, Shelton JMG, Fisher MC, et al. MARDy: Mycology Antifungal Resistance Database. Bioinformatics. 2018;34(18):3233–3234. pmid:29897419
31. Kaprou GD, Bergšpica I, Alexa EA, Alvarez-Ordóñez A, Prieto M. Rapid Methods for Antimicrobial Resistance Diagnostics. Antibiotics. 2021;10(2):209. pmid:33672677
32. Munita JM, Arias CA. Mechanisms of Antibiotic Resistance. Microbiology Spectrum. 2016;4:2. pmid:27227291
33. Bradford PA, Bonomo RA, Bush K, Carattoli A, Feldgarden M, Haft DH, et al. Consensus on β-Lactamase Nomenclature. Antimicrob Agents Chemother. 2022;66(4):e0033322.
34. PHA4GE. hAMRonization AMR detection specification scheme. 2022. Available from: https://github.com/pha4ge/hAMRonization.
35. Smiljanic M, Kaase M, Ahmad-Nejad P, Ghebremedhin B. Comparison of in-house and commercial real time-PCR based carbapenemase gene detection methods in Enterobacteriaceae and non-fermenting gram-negative bacterial isolates. Ann Clin Microbiol Antimicrob. 2017;16(1). pmid:28693493
36. Sánchez-Busó L, Yeats CA, Taylor B, Goater RJ, Underwood A, Abudahab K, et al. A community-driven resource for genomic epidemiology and antimicrobial resistance prediction of Neisseria gonorrhoeae at Pathogenwatch. Genome Med. 2021;13(1). pmid:33875000
37. Focosi D, Maggi F, McConnell S, Casadevall A. Very low levels of remdesivir resistance in SARS-COV-2 genomes after 18 months of massive usage during the COVID19 pandemic: A GISAID exploratory analysis. Antiviral Res. 2022;198:105247. pmid:35033572

