Abstract
There is growing interest in using computer-assisted models for the detection of macular conditions from optical coherence tomography (OCT) data. Because the quantity of clinical scan data for specific conditions is limited, these models are typically developed by fine-tuning a generalized network to classify specific macular conditions of interest. Full thickness macular holes (FTMH) are a condition requiring urgent surgical repair to prevent vision loss. Prior work on automated FTMH classification has tended to use supervised ImageNet pre-trained networks with good results that nonetheless leave room for improvement. In this paper, we develop a model for FTMH classification that uses OCT B-scans around the central foveal region to pre-train a naïve network with contrastive self-supervised learning. We found that self-supervised pre-trained networks outperform ImageNet pre-trained networks despite a small training set size (284 eyes total, 51 FTMH+ eyes, 3 B-scans from each eye). On three replicate data splits, 3D spatial contrast pre-training yields a model with an average F1-score of 1.0 on holdout data (50 eyes total, 10 FTMH+), compared to an average F1-score of 0.831 for FTMH detection by ImageNet pre-trained models. These results demonstrate that even limited data can be applied toward self-supervised pre-training to substantially improve performance for FTMH classification, suggesting applicability to other OCT-based problems.
Author summary
Full thickness macular holes (FTMH) are a sight-threatening condition involving the fovea, the area of the retina responsible for central vision. Timely diagnosis is paramount because of the risk of permanent vision loss with delayed surgical correction. In clinical practice, FTMH are commonly diagnosed with the aid of optical coherence tomography (OCT) images of the fovea. However, certain conditions such as pseudoholes and epiretinal membranes may complicate the diagnosis of full thickness macular holes on imaging. Here, we employ artificial intelligence and present a machine-learning model that distinguishes full thickness macular holes from conditions that may present similarly on image review. Despite being trained on a smaller data set, our model outperformed traditional models from previous works. We provide a strong framework for a self-supervised pre-trained model that can accurately distinguish full thickness macular holes from epiretinal membranes and pseudoholes. Overall, our study provides evidence of the benefit and efficacy of utilizing artificial intelligence for this image classification task.
Citation: Wheeler TW, Hunter K, Garcia PA, Li H, Thomson AC, Hunter A, et al. (2024) Self-supervised contrastive learning improves machine learning discrimination of full thickness macular holes from epiretinal membranes in retinal OCT scans. PLOS Digit Health 3(8): e0000411. https://doi.org/10.1371/journal.pdig.0000411
Editor: Martin G. Frasch, University of Washington, UNITED STATES OF AMERICA
Received: November 13, 2023; Accepted: July 8, 2024; Published: August 26, 2024
Copyright: © 2024 Wheeler et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The data that support the findings of this study are available in Harvard Dataverse at https://doi.org/10.7910/DVN/X3L0XD. All code for model training and analysis is available on GitHub at https://github.com/twheele3/rascl.
Funding: The author(s) received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Optical coherence tomography (OCT) allows detailed anatomical imaging via high resolution scans through several hundred microns of retinal tissue layers [1]. The macula is a 5.5 mm wide area of the posterior retina and includes a smaller 1.5 mm wide area known as the fovea which correlates with the eye’s central and highest acuity vision [2,3]. OCT scanning of the macula generates a wealth of data on possible pathologies present within the retina, but to extract clinically relevant information requires subspecialty review. This imposes an increased workload on ophthalmologists, taking time away from higher level decision-making and patient care.
One condition of interest is the full thickness macular hole (FTMH), wherein all layers of the neurosensory retina (between the internal limiting membrane and the retinal pigment epithelium (RPE)) in the fovea are disrupted [4]. The vitreous is a clear gel substance which fills the eye’s posterior chamber and has attachments to the macula. Macular holes often develop from vitreous traction at the vitreofoveal interface and are most commonly idiopathic but can result from traumatic injury to the eye or secondary to other ocular pathologies [4,5]. Gass and Johnson were the first to suggest that this phenomenon occurs due to focal shrinkage of the vitreous that results in traction on the fovea [4,5]. In Gass’ original classification [4,6], the natural progression of macular holes proceeds through stages associated with worsening visual acuity: foveolar detachment in stage 1, formation of a foveal hole with associated subretinal fluid and cystoid macular edema in stage 2, formation of an FTMH in stage 3, followed by an FTMH with a complete posterior vitreous detachment from the vitreofoveal interface in stage 4. Because the fovea is involved, FTMH impairs central visual acuity [4].
A confounding comorbidity of FTMH is the epiretinal membrane (ERM), in which a fibrotic contractile plaque grows over the retina and may produce anatomic changes similar in clinical appearance to FTMH (e.g., pseudoholes or lamellar holes). ERMs are sometimes associated with macular edema, which further distorts retinal anatomy and complicates FTMH diagnosis [7]. Furthermore, FTMHs and ERMs are both common conditions referred to retina practices, and both are treated with pars plana vitrectomy (PPV), in which the vitreous is surgically removed, and internal limiting membrane peel (ILMx), which removes the most superficial layer along the macula. These procedures eliminate the deforming forces on the fovea so that the retina can re-approximate a more normal anatomic appearance and function. The clinical demand and common treatments for ERM and FTMH highlight the importance of machine learning classification.
Although the appropriate timing of ILMx surgery for FTMH is debated, there is evidence that extended delay in surgical repair leads to worse visual outcomes [8]. In contrast, many ERM patients can be managed with conservative, non-surgical medical care [7]. Accurate and timely diagnosis of FTMH is therefore critical for restoring vision, motivating rapid assessment of OCT scans. Automated classification can significantly reduce referral processing time, allowing expeditious patient triage and care [9,10].
Deep learning has emerged as a promising avenue of research for creating automated expert reader models to assess medical scans in-line with the acquisition process. Briefly, deep learning refers to (artificial) neural networks that are many layers deep [11]. Neural networks, in turn, are defined as layers of nonlinear signal processing units (artificial neurons) connected by weights (artificial synapses). The effect of training a neural network is to adjust the weights such that the system approximates a function mapping a given input to a desired output. A multi-layered network of nonlinear processing units can in theory approximate any function to arbitrary precision with sufficient training data [12]. Convolutional neural networks (CNN) are particularly well-suited to image recognition tasks because the shared-weight architecture of convolutional kernels learns translation-covariant visual features automatically [13].
Developing an accurate deep learning model requires large quantities of well-annotated ground truth data, and a training regimen that allows the model to learn discriminative features for the problem at hand. A frequent issue arising in training deep learning models with medical data is low dataset size. This makes it impractical to train a naïve network from scratch and has typically been circumvented by fine-tuning a pre-trained network (transfer learning). This can also be problematic, as pre-training is usually based on images from a non-medical domain, typically ImageNet, which means that features extracted by the network may not be relevant to the specific nuances of the tissue and imaging method.
Self-supervised learning is an alternative to transfer learning for handling low data volume. In this approach, a suitable image representation is learned by performing a pretext task on unlabeled data, which is usually available in larger quantities than labeled data. Various approaches to self-supervised learning have been introduced, many involving an encoder, which is a feedforward neural network that represents an image by a vector (also known as an embedding). Contrastive learning pushes positive pairs closer and negative pairs farther apart in the embedding space. A primary example of this contrastive learning approach is SimCLR [14,15].
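To make the contrastive objective concrete, the following is a minimal sketch of the NT-Xent loss at the core of SimCLR [14], written in TensorFlow (the framework used in this study; see Materials and methods). The temperature value and implementation details here are illustrative assumptions rather than the study's exact code.

```python
import tensorflow as tf

def nt_xent_loss(z_a, z_b, temperature=0.5):
    """NT-Xent (normalized temperature-scaled cross-entropy) loss as in
    SimCLR. z_a, z_b: (N, D) projection-head outputs of two views; row i
    of z_a and row i of z_b form a positive pair."""
    z_a = tf.math.l2_normalize(z_a, axis=1)
    z_b = tf.math.l2_normalize(z_b, axis=1)
    z = tf.concat([z_a, z_b], axis=0)                       # (2N, D)
    sim = tf.matmul(z, z, transpose_b=True) / temperature   # cosine similarities
    n = tf.shape(z_a)[0]
    # Mask out self-similarity so an embedding is never its own positive.
    sim += tf.eye(2 * n) * -1e9
    # For row i in [0, N), the positive is row i + N, and vice versa.
    labels = tf.concat([tf.range(n) + n, tf.range(n)], axis=0)
    loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=labels, logits=sim)
    return tf.reduce_mean(loss)
```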
The last several years have seen an explosion of deep learning models applied to ophthalmic clinical technologies including OCT and fundus imaging. These applications may be divided into broad areas of classification/diagnosis [16–26], segmentation [27–34], image quality [35], and demographics prediction [36]. The current ophthalmic deep learning models focus primarily on diabetic retinopathy, age-related macular degeneration, retinopathy of prematurity, and glaucoma [9,37,38].
Deep learning for OCT image analysis of FTMH has also received attention lately, with models for classification [24,25], segmentation [31–34], and prognosis of success for FTMH corrective surgery [26,39,40]. A review is also available [41]. Owing to the paucity of labeled FTMH data, the majority of these models are pre-trained on ImageNet and subsequently fine-tuned on a small amount of labeled FTMH OCT images. This transfer learning scheme was adopted in developing the classification model by Pace et al. [24], which achieves 95% accuracy in distinguishing between normal, Drusen, and FTMH images. Carvalho et al. [25] demonstrated an accuracy of 90.6% for FTMH identification, although modeling specifics are not provided. While these results are good, there is room for improvement.
We employ a pre-training method based on a variant of SimCLR that leverages 3D information, and which is tailored to small datasets with multi-slice imaging modalities such as OCT, described here as Random Slice Contrastive Learning (RaSCL). Our model achieves robust feature recognition in OCT scans for assessing FTMH that can outperform traditional ImageNet pre-trained models.
Materials and methods
Patient data
This Institutional Review Board (IRB)-approved, single-center retrospective study reviewed OCT images of patients prior to PPV and ILMx surgery. Three researchers reviewed the Oregon Eye Consultants, LLC database using the ILMx procedure code (67042). Data used in this study was acquired from patients prior to surgical intervention [42]. OCT images featuring idiopathic FTMH (Gass stage 3 or 4) or ERMs (without macular holes) were included in the working dataset. Patients with non-idiopathic macular holes (traumatic, pseudohole), history of ocular trauma, amblyopia, recent ocular surgery (within three months), severe ocular myopia, diabetic macular retinopathy, or retinal pathology associated with systemic conditions (e.g., uncontrolled hypertension) were excluded. Variables including age, gender, lens status (pre- or post-cataract surgery), pre- and post-operative visual acuity, medical and ophthalmic comorbidities, and surgical history were documented for patients who met the inclusion and exclusion criteria, and pre-operative OCT B-scans were exported for development of the model. OCT images were reviewed by three trained readers to confirm diagnoses (FTMH (Gass stage 3 or 4) and ERM) and assess image quality.
A B-scan is a horizontal linear OCT scan producing a 2D image of a macular tissue section. Typically, multiple uniformly spaced B-scans are acquired, with the central scan passing through the fovea. B-scans were individually labeled as having FTMH by consensus of three expert readers. Following diagnosis confirmation, the central B-scan for each patient was determined and documented by the same trained readers. The B-scans were exported from Heidelberg Spectralis OCT instruments (Heidelberg, Germany) as TIFF files at 496 × 512 resolution, with a pixel size of 3.87 × 11.38 μm at the retina. Available macular OCT scan protocols included B-scans spaced either 243 μm or 121 μm apart, and both protocol types were included.
Test splits
Test sets for model evaluation were assembled from 15% of total eyes, comprising 10 FTMH and 40 control eyes, using only the central B-scan to ensure consistency between eyes with differing numbers of B-scans that show evidence of FTMH. Three random split replicates with disjoint test sets were generated for statistical validation. Within each replicate, self-supervised pre-training and fine-tuning were performed using only the training split. For each replicate, training sets were further divided into 8-fold cross-validation splits.
Data stratification
Data was stratified based primarily on diagnosis (FTMH or ERM). Secondary features of age, sex, and pre-operative vision were used to stratify data by optimizing the mean and standard deviation of each subset relative to the overall dataset mean and standard deviation for each feature (treating sex as ordinal).
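As an illustration, a candidate split can be scored by how closely its per-feature statistics match the full dataset; a hypothetical scoring function is sketched below, assuming features are stored as numeric arrays with sex encoded as ordinal. The actual optimizer used for stratification is not specified here.

```python
import numpy as np

def stratification_score(subset, full, features=("age", "sex", "vision")):
    """Score a candidate subset by the deviation of its per-feature mean
    and standard deviation from the overall dataset (lower is better).
    `subset` and `full` are assumed to be dicts mapping feature names to
    numeric arrays."""
    score = 0.0
    for f in features:
        score += abs(np.mean(subset[f]) - np.mean(full[f]))
        score += abs(np.std(subset[f]) - np.std(full[f]))
    return score
```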
Preprocessing
B-scans were cropped and resized to 224 × 224 resolution, and then augmented with random noise, brightness, contrast, cropping, and horizontal flips. Additive noise was generated by sampling a normal distribution with randomly drawn parameters μ ∈ [-0.1, 0.15] and σ² ∈ [0, 0.2], clipping pixel values to the interval [0, 1], to mimic noise patterns typical of B-scans. Random crops comprised 50–100% of the original scan area.
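A minimal TensorFlow sketch of this augmentation pipeline is shown below, assuming grayscale images scaled to [0, 1]; the brightness and contrast jitter magnitudes are assumptions, while the noise and crop parameters follow the ranges stated above.

```python
import tensorflow as tf

def augment(image):
    """Augment a (224, 224, 1) float B-scan in [0, 1]."""
    # Additive Gaussian noise with randomly drawn mean and variance.
    mu = tf.random.uniform([], -0.1, 0.15)
    var = tf.random.uniform([], 0.0, 0.2)
    image = image + tf.random.normal(tf.shape(image),
                                     mean=mu, stddev=tf.sqrt(var))
    image = tf.clip_by_value(image, 0.0, 1.0)
    # Random crop covering 50-100% of the original area, resized back.
    area_frac = tf.random.uniform([], 0.5, 1.0)
    side = tf.cast(tf.sqrt(area_frac) * 224.0, tf.int32)
    image = tf.image.random_crop(image, [side, side, 1])
    image = tf.image.resize(image, [224, 224])
    # Brightness/contrast jitter and horizontal flip.
    image = tf.image.random_brightness(image, 0.2)
    image = tf.image.random_contrast(image, 0.8, 1.2)
    image = tf.image.random_flip_left_right(image)
    return tf.clip_by_value(image, 0.0, 1.0)
```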
Architecture
The model architecture consists of a CNN followed by a multi-layered perceptron (MLP) based on the SimCLR approach [14,15]. The CNN uses a naïve ResNet50 framework at standard width [43]. The MLP comprises three fully connected layers of 512 nodes each. The final layer is used as a projection head during pre-training, which is fed to a classifier head during fine-tuning. Models were trained for 800 epochs during pre-training and 500 epochs during fine-tuning and were validated every 10 epochs. An 8-fold training set split was used for cross-validation training and unweighted ensemble averaging.
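A minimal Keras sketch of this architecture follows; the ReLU activations and the three-channel input (grayscale B-scans replicated across channels) are assumptions, while the randomly initialized (naïve) ResNet50 encoder and 3 × 512-node MLP follow the description above.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_pretrain_model():
    """Naive ResNet50 encoder followed by a 3-layer, 512-node MLP;
    the final layer serves as the projection head during pre-training."""
    encoder = tf.keras.applications.ResNet50(
        include_top=False, weights=None,  # naive: no pre-trained weights
        input_shape=(224, 224, 3), pooling="avg")
    inputs = layers.Input((224, 224, 3))
    x = encoder(inputs)
    for _ in range(3):
        x = layers.Dense(512, activation="relu")(x)
    return models.Model(inputs, x)
```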
Training was performed on a workstation with an NVIDIA RTX A6000 GPU, a 24-core Intel Xeon W 3345 CPU, under Ubuntu 20.04, using Python 3.8.10 and TensorFlow 2.9.0. Taken together, pre-training and fine-tuning ran for 7.5 hours per ensemble replicate in this computational framework.
Self-supervised pre-training
Pre-training was performed using contrastive self-supervised learning. Image slices from the same eye constituted positive pairs, while images from unrelated eyes constituted negative pairs. In more detail, for each batch, a B-scan was randomly selected from an eye, then a neighboring slice at a distance of up to 2 slices away on either side was selected uniformly at random as its positive pair. Scans from unrelated eyes constituted negative pairs (Fig 1). We refer to our scheme as 3D spatial contrast or Random Slice Contrastive Learning (RaSCL), which is similar in spirit to the SimCLR adaptation in Gomariz et al. [34]. However, in the latter, the positive-pair slice distance is drawn from a normal distribution centered on the anchor slice, and since their B-scans are separated by 111 μm, their scheme is effectively the same as standard SimCLR.
Fig 1. Pairs of scans from the same eye and a different eye are randomly chosen from around the central foveal B-scan within a selection margin. Image pairs are then augmented independently and encoded as embeddings by a CNN (e.g., ResNet). Contrastive learning trains the CNN to push embeddings from positive pairs closer together while pushing negative pairs farther apart. The resulting CNN encoder generates features that are generally discriminative in the domain of the training images.
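A sketch of the positive-pair sampling described above is given below; the handling of edge slices and the assumption of at least two slices per eye are illustrative choices.

```python
import random

def sample_positive_pair(eye_scans, max_offset=2):
    """Draw an anchor B-scan and a positive partner from the same eye.
    `eye_scans` is a list of B-scan arrays ordered by slice position; the
    positive is a neighbor up to `max_offset` slices away on either side,
    chosen uniformly (assumes the eye has at least two slices)."""
    anchor = random.randrange(len(eye_scans))
    offsets = [d for d in range(-max_offset, max_offset + 1)
               if d != 0 and 0 <= anchor + d < len(eye_scans)]
    positive = anchor + random.choice(offsets)
    return eye_scans[anchor], eye_scans[positive]
```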
The model transferred for fine-tuning was selected based on the training epoch with maximal accuracy and minimal loss, i.e., min(loss/accuracy), on the validation set. Pre-training data included three B-scans per eye centered on the fovea.
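This selection rule reduces to a one-line computation over the validation history, sketched here assuming nonzero validation accuracy at each validated epoch:

```python
def select_best_epoch(val_loss, val_accuracy):
    """Return the index of the validated epoch minimizing
    validation loss divided by validation accuracy."""
    scores = [l / a for l, a in zip(val_loss, val_accuracy)]
    return scores.index(min(scores))
```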
Supervised fine-tuning
Fine-tuning was performed using 8-fold cross-validation. Splits were stratified by diagnosis, age, sex, and visual acuity. CNN weights were fixed during fine-tuning; only the MLP was tuned, using a binary cross-entropy loss function with the two classes specified as control and FTMH. The best model per split was selected at the validation epoch with the minimum ratio of validation loss to validation accuracy. The outputs of the eight fold models were combined as an averaged ensemble with equal weights between components. Fine-tuning data included up to three B-scans per eye centered on the fovea.
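A minimal sketch of this fine-tuning setup and the ensemble averaging follows; the optimizer and sigmoid output are assumptions, while the frozen encoder, 512-node MLP head, binary cross-entropy loss, and unweighted 8-fold ensemble follow the description above.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

def build_finetune_model(encoder):
    """Frozen pre-trained encoder with a trainable MLP classifier head."""
    encoder.trainable = False  # CNN weights fixed during fine-tuning
    inputs = layers.Input((224, 224, 3))
    x = encoder(inputs, training=False)
    for _ in range(3):
        x = layers.Dense(512, activation="relu")(x)
    outputs = layers.Dense(1, activation="sigmoid")(x)  # FTMH probability
    model = models.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

def ensemble_predict(fold_models, images):
    """Unweighted average of the 8 cross-validation fold models."""
    return np.mean([m.predict(images) for m in fold_models], axis=0)
```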
ImageNet pre-trained model
A ResNet50 model pre-trained on ImageNet was loaded from the Keras library, with weights kept trainable during fine-tuning. An MLP head was attached, comprising three fully connected layers with 512 nodes per layer. Fine-tuning was performed and evaluated using 8-fold cross-validation as described above.
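For comparison, this baseline can be sketched as follows; it differs from the RaSCL model in loading ImageNet weights and leaving the encoder trainable, with the same assumed activations and output head as in the previous sketch.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# ImageNet-pretrained ResNet50 encoder; unlike the RaSCL model,
# encoder weights remain trainable during fine-tuning.
encoder = tf.keras.applications.ResNet50(
    include_top=False, weights="imagenet",
    input_shape=(224, 224, 3), pooling="avg")
encoder.trainable = True

inputs = layers.Input((224, 224, 3))
x = encoder(inputs)
for _ in range(3):
    x = layers.Dense(512, activation="relu")(x)
outputs = layers.Dense(1, activation="sigmoid")(x)
baseline = models.Model(inputs, outputs)
```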
Challenge data set
After the model was trained to distinguish FTMH from control (i.e., ERM), it was challenged with OCT scans from a separate patient group who met the inclusion and exclusion criteria and were diagnosed with pseudoholes/lamellar holes (n = 34). This diagnosis was confirmed by the same three trained readers who distinguished the FTMH and ERM diagnoses in the pre-training dataset. The central B-scan for each patient was determined by the readers, and the central B-scans were classified by both the ImageNet and RaSCL models as FTMH positive or negative. The challenge dataset was used to assess the generalization of the model to an unseen condition; patient demographics were therefore not considered.
Results
Patient characteristics
The working dataset contains OCT images of 61 eyes from 60 patients with FTMH (46 females, 15 males; ages 52–84 years; mean: 69.6 years; SD: 6.4 years). The FTMH-negative control data consisted of scans from patients diagnosed with ERM, comprising 274 eyes from 264 patients (140 females, 134 males; ages 23–93 years; mean: 70.5 years; SD: 8.6 years).
A summary of baseline demographic characteristics, lens status, and intraocular pressure (IOP) at the time of image acquisition for FTMH and ERM patients is available in Table 1. Characteristics were well distributed within each group; the only significant difference was the higher proportion of females among FTMH eyes (p < 0.05).
Performance
Ensemble models were evaluated on the holdout test set comprising 10 FTMH B-scans and 40 control B-scans, performed in triplicate for different train-test replicates (Fig 2, Table 2).
Fig 2. Receiver operating characteristic (ROC) curves (A) with associated area under curve (B). F1 score for FTMH classification (C). F1 score for control classification (D). Bars indicate mean statistics for each group, and dots indicate individual replicates.
Gradient visualization
Visualization of gradient activation for an FTMH input (Fig 3A) shows that for the RaSCL model, strong activation is localized around the macular hole (Fig 3A′). In contrast, the ImageNet model presents weaker saliency around the middle of the B-scan (Fig 3A″). Gradient activation for a control input (Fig 3B) shows that the RaSCL model highlights the inner limiting membrane on the upper surface of the macula and the undisrupted retinal pigment epithelium below (Fig 3B′). The ImageNet model similarly presents weak saliency for this scan (Fig 3B″).
Fig 3. (A) Gradient visualization of the FTMH output unit using an FTMH B-scan. (B) Gradient visualization of the control output unit using a control B-scan.
Challenge images
A challenge dataset was assembled comprising 34 images presenting lamellar holes, which are partial-thickness defects that can appear similar to FTMH but do not extend through all the neuronal layers to the RPE. Most lamellar holes do not require intervention. Pseudoholes are another example of a visually similar diagnosis that does not require clinical intervention. We challenged all replicates of each model type and found that all replicates of the RaSCL pre-trained models correctly classified all of the challenge data, while the ImageNet pre-trained models misclassified several images, with a mean accuracy of 0.971 (p = 0.158). We visualized the FTMH gradients to explore possible explanatory factors for the misclassifications (Fig 4B–4D). While RaSCL FTMH gradients showed strong activation around the pseudohole (Fig 4C), this activation did not extend down to the RPE and did not lead to misclassification. By contrast, ImageNet FTMH gradients were diffused around the tissue, suggesting poor saliency from the domain transfer model (Fig 4D).
Fig 4. (A) Classification accuracy. Bars indicate mean and dots indicate replicates. (B) Challenge image with a lamellar hole. (C) Image correctly classified by the RaSCL model overlaid with FTMH gradients. (D) Image incorrectly classified by ImageNet pre-trained models with overlaid FTMH gradients.
Discussion
The use of artificial intelligence (AI) in ophthalmology has advanced significantly, owing to the many image-based investigations in the field. This work demonstrates a potentially strong diagnostic model for full thickness macular holes, which are a serious eye care concern due to the significant vision impairment resulting from the characteristic defect in the central fovea and the subsequent loss of central visual acuity [4]. Previous attempts at automated FTMH classification have tended to use supervised ImageNet pre-trained networks with good results that nonetheless leave room for improvement. The present work demonstrates that near-perfect accuracy is achievable for FTMH classification, although the small size of the dataset justifies only modest conclusions about model generalizability on larger datasets.
There are three options when attempting to train deep learning models on small datasets. (i) The first option is transfer learning: pre-training the model on data from an unrelated domain, typically ImageNet. We have shown that this produces models with inferior performance and ambiguous interpretability. (ii) The second option is self-supervised learning: pre-training the model using self-derived labels and contrastive learning on data from the same domain. This is the method of the present article, which we have shown leads to strong models with good performance and interpretability, even when the dataset used for self-supervised learning is modest in size. (iii) The third option is to fine-tune a foundation model trained on a large and diverse dataset in the domain of interest. A retina-specific foundation model was published very recently [44], at around the same time that the present manuscript was in the final stages of submission. We believe that this third option holds promise and leave a comparison between all three options to future work.
The method introduced here, namely random-slice spatial contrastive learning, allows the network to develop a good image representation learned from the data, which provides features that are strongly discriminative for the downstream task of recognizing FTMH. This work demonstrates that strong purpose-specific pre-training is viable for small dataset sizes, and that this method improves performance over transfer learning models. Visualization of gradient activation (Figs 3 and 4) shows sharp saliency in the RaSCL model. The lower discriminative capacity and poor saliency of transfer learning models may also result in brittle classifiers with inferior generalization.
The CNN models in this study present a good starting point for fine-tuning models addressing other OCT-related diagnostic and prognostic questions, in ophthalmology or other fields of inquiry. Prior works addressed whether the outcome of FTMH-corrective surgery can be predicted, with only moderate success [26,39,40]. A natural step would be to use the RaSCL approach to develop models for accurate surgical prognosis, which could have clinical benefit by helping to avoid unnecessary surgeries. This study highlights spatial contrastive learning as a powerful pre-training approach for medical scans, which can be applied to deep learning models for other imaging modalities to enhance their tractability.
Prior to clinical roll-out, further considerations need to be addressed. Our data is limited to one retina practice with four retina surgeons, a predominantly Caucasian patient population, a single device type, and a narrow set of inclusion criteria. Additionally, our data is limited to patients who have undergone ILMx, which may introduce selection bias, as this excludes patients who elected against surgical intervention or were lost to follow-up. To develop higher confidence in the model for deployment, data from other devices (e.g., Zeiss Cirrus and Topcon), a patient population with a broader demographic profile, a balanced sex distribution, and disease-agnostic OCT data would be required.
We can apply the framework presented here to an expanded dataset to address these limitations. We included challenge data showing performance on lamellar holes (Fig 4), a condition that could be conflated with FTMH. Results on the challenge dataset, consisting of lamellar holes and pseudoholes, corroborate the robustness of contrastive pre-training in yielding improved performance over networks pre-trained outside the medical domain.
The findings in this study illustrate that deep learning algorithms can be used for computer-assisted screening of FTMH in optometry and primary care settings, promoting appropriate and timely referrals to retinal specialists. Efficient patient triage streamlines clinical workflow, reduces clinician workload, and expedites referrals so that patients get access to care sooner, improving overall outcomes.
References
- 1. Aumann S, Donner S, Fischer J, Müller F. Optical Coherence Tomography (OCT): Principle and Technical Realization. In: Bille JF, editor. High Resolution Imaging in Microscopy and Ophthalmology: New Frontiers in Biomedical Optics. Cham: Springer International Publishing; 2019. p. 59–85.
- 2. Rehman I, Mahabadi N, Ali T. Anatomy, Head and Neck, Eye Ciliary Muscles. StatPearls [Internet]. 2023 Aug 8 [cited 2024 Apr 5]; Available from: https://www.statpearls.com/point-of-care/19556
- 3. Kolb H, Nelson RF, Ahnelt PK, Ortuno-Lizaran I, Cuenca N. The Architecture of the Human Fovea. In: Kolb H, Fernandez E, Nelson R, editors. Webvision: The Organization of the Retina and Visual System. Salt Lake City (UT); 1995.
- 4. Bikbova G, Oshitari T, Baba T, Yamamoto S, Mori K. Pathogenesis and Management of Macular Hole: Review of Current Advances. J Ophthalmol. 2019;2019:3467381.
- 5. Ezra E. Idiopathic full thickness macular hole: natural history and pathogenesis. Br J Ophthalmol. 2001;85(1):102–8. pmid:11133724
- 6. Gass JD. Reappraisal of biomicroscopic classification of stages of development of a macular hole. Am J Ophthalmol. 1995;119(6):752–9. pmid:7785690
- 7. Fung AT, Galvin J, Tran T. Epiretinal membrane: A review. Clin Exp Ophthalmol. 2021;49(3):289–308.
- 8. Murphy DC, Al-Zubaidy M, Lois N, Scott N, Steel DH; Macular Hole Duration Study Group. The Effect of Macular Hole Duration on Surgical Outcomes: An Individual Participant Data Study of Randomized Controlled Trials. Ophthalmology. 2023;130(2):152–63. pmid:36058348
- 9. Keskinbora K, Guven F. Artificial Intelligence and Ophthalmology. Turk J Ophthalmol. 2020;50(1):37–43. pmid:32167262
- 10. Li JO, Liu H, Ting DSJ, Jeon S, Chan RVP, Kim JE, et al. Digital technology, tele-medicine and artificial intelligence in ophthalmology: A global perspective. Prog Retin Eye Res. 2021;82:100900. pmid:32898686
- 11. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436–44. pmid:26017442
- 12. Hornik K, Stinchcombe M, White H. Multilayer feedforward networks are universal approximators. Neural networks. 1989;2(5):359–66.
- 13. Fukushima K. Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biological cybernetics. 1980;36(4):193–202. pmid:7370364
- 14. Chen T, Kornblith S, Norouzi M, Hinton G, editors. A simple framework for contrastive learning of visual representations. International conference on machine learning; 2020: PMLR.
- 15. Chen T, Kornblith S, Swersky K, Norouzi M, Hinton GE. Big self-supervised models are strong semi-supervised learners. Advances in neural information processing systems. 2020;33:22243–55.
- 16. Li F, Chen H, Liu Z, Zhang XD, Jiang MS, Wu ZZ, et al. Deep learning-based automated detection of retinal diseases using optical coherence tomography images. Biomed Opt Express. 2019;10(12):6204–26.
- 17. Lu W, Tong Y, Yu Y, Xing Y, Chen C, Shen Y. Deep Learning-Based Automated Classification of Multi-Categorical Abnormalities From Optical Coherence Tomography Images. Transl Vis Sci Technol. 2018;7(6):41. pmid:30619661
- 18. Perdomo O, Otálora S, González F, Mériaudeau F, Müller H. OCT-NET: A convolutional network for automatic classification of normal and diabetic macular edema using SD-OCT volumes. In: 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018); Washington, DC; 2018. p. 1423–6.
- 19. Asaoka R, Murata H, Hirasawa K, Fujino Y, Matsuura M, Miki A, et al. Using Deep Learning and Transfer Learning to Accurately Diagnose Early-Onset Glaucoma From Macular Optical Coherence Tomography Images. Am J Ophthalmol. 2019;198:136–45. pmid:30316669
- 20. Li XC, Shen LL, Shen MX, Tan FQ. Deep learning based early stage diabetic retinopathy detection using optical coherence tomography. Neurocomputing. 2019;369:134–44.
- 21. Kermany DS, Goldbaum M, Cai W, Valentim CCS, Liang H, Baxter SL, et al. Identifying Medical Diagnoses and Treatable Diseases by Image-Based Deep Learning. Cell. 2018;172(5):1122–31 e9.
- 22. Saha S, Nassisi M, Wang M, Lindenberg S, Kanagasingam Y, Sadda S, et al. Automated detection and classification of early AMD biomarkers using deep learning. Sci Rep. 2019;9(1):10990.
- 23. Kuwayama S, Ayatsuka Y, Yanagisono D, Uta T, Usui H, Kato A, et al. Automated Detection of Macular Diseases by Optical Coherence Tomography and Artificial Intelligence Machine Learning of Optical Coherence Tomography Images. J Ophthalmol. 2019;2019:6319581. pmid:31093370
- 24. Pace T, Degan N, Giglio R, Tognetto D, Accardo A. A Deep Learning Method for Automatic Identification of Drusen and Macular Hole from Optical Coherence Tomography. Stud Health Technol Inform. 2022;294:565–6. pmid:35612146
- 25. Valentim CCS, Wu AK, Song W, Wang V, Cao JL, Yu S, et al. Validation of an OCT-based deep-learning algorithm for the identification of full-thickness idiopathic macular holes (FTIMH). Investigative Ophthalmology & Visual Science. 2022;63(7):2103–F0092.
- 26. Xiao Y, Hu Y, Quan W, Yang Y, Lai W, Wang X, et al. Development and validation of a deep learning system to classify aetiology and predict anatomical outcomes of macular hole. Br J Ophthalmol. 2023;107(1):109–15. pmid:34348922
- 27. Kugelman J, Alonso-Caneiro D, Read SA, Vincent SJ, Collins MJ. Automatic segmentation of OCT retinal boundaries using recurrent neural networks and graph search. Biomed Opt Express. 2018;9(11):5759–77.
- 28. Pekala M, Joshi N, Liu TYA, Bressler NM, DeBuc DC, Burlina P. Deep learning based retinal OCT segmentation. Comput Biol Med. 2019;114:103445. pmid:31561100
- 29. Hu J, Chen Y, Yi Z. Automated segmentation of macular edema in OCT using deep neural networks. Med Image Anal. 2019;55:216–27. pmid:31096135
- 30. Roy AG, Conjeti S, Karri SPK, Sheet D, Katouzian A, Wachinger C, et al. ReLayNet: retinal layer and fluid segmentation of macular optical coherence tomography using fully convolutional networks. Biomed Opt Express. 2017;8(8):3627–42. pmid:28856040
- 31. Singh VK, Kucukgoz B, Murphy DC, Xiong X, Steel DH, Obara B. Benchmarking automated detection of the retinal external limiting membrane in a 3D spectral domain optical coherence tomography image dataset of full thickness macular holes. Comput Biol Med. 2021;140:105070. pmid:34875408
- 32. Frawley J, Willcocks CG, Habib M, Geenen C, Steel DH, Obara B. Robust 3D U-Net Segmentation of Macular Holes. arXiv preprint arXiv:2103.01299. 2021.
- 33. Seeböck P, Romo-Bucheli D, Waldstein S, Bogunovic H, Orlando JI, Gerendas BS, et al., editors. Using CycleGANs for effectively reducing image variability across oct devices and improving retinal fluid segmentation. 2019 IEEE 16th international symposium on biomedical imaging (ISBI 2019); 2019: IEEE.
- 34. Gomariz A, Lu H, Li YY, Albrecht T, Maunz A, Benmansour F, et al., editors. Unsupervised Domain Adaptation with Contrastive Learning for OCT Segmentation. International Conference on Medical Image Computing and Computer-Assisted Intervention; 2022: Springer.
- 35. Wang J, Deng G, Li W, Chen Y, Gao F, Liu H, et al. Deep learning for quality assessment of retinal OCT images. Biomed Opt Express. 2019;10(12):6057–72. pmid:31853385
- 36. Chueh KM, Hsieh YT, Chen HH, Ma IH, Huang SL. Identification of Sex and Age from Macular Optical Coherence Tomography and Feature Analysis Using Deep Learning. Am J Ophthalmol. 2022;235:221–8.
- 37. Ng WY, Zhang S, Wang Z, Ong CJT, Gunasekeran DV, Lim GYS, et al. Updates in deep learning research in ophthalmology. Clin Sci (Lond). 2021;135(20):2357–76.
- 38. Ting DSW, Pasquale LR, Peng L, Campbell JP, Lee AY, Raman R, et al. Artificial intelligence and deep learning in ophthalmology. Br J Ophthalmol. 2019;103(2):167–75. pmid:30361278
- 39. Godbout M, Lachance A, Antaki F, Dirani A, Durand A. Predicting Visual Improvement after Macular Hole Surgery: a Cautionary Tale on Deep Learning with Very Limited Data. arXiv preprint arXiv:2109.09463. 2021.
- 40. Lachance A, Godbout M, Antaki F, Hebert M, Bourgault S, Caissie M, et al. Predicting Visual Improvement After Macular Hole Surgery: A Combined Model Using Deep Learning and Clinical Features. Transl Vis Sci Technol. 2022;11(4):6. pmid:35385045
- 41. Mendes OLC, Lucena AR, Lucena DR, Cavalcante TS, de Alexandria AR. Automatic segmentation of macular holes in optical coherence tomography images: a review. Journal of Artificial Intelligence and Systems. 2020;1(1):163–85.
- 42. Brown GT, Pugazhendhi S, Beardsley RM, Karth JW, Karth PA, Hunter AA. 25 vs. 27-gauge micro-incision vitrectomy surgery for visually significant macular membranes and full-thickness macular holes: a retrospective study. Int J Retina Vitreous. 2020;6(1):56. pmid:33292716
- 43. He K, Zhang X, Ren S, Sun J, editors. Deep residual learning for image recognition. Proceedings of the IEEE conference on computer vision and pattern recognition; 2016.
- 44. Zhou Y, Chia MA, Wagner SK, Ayhan MS, Williamson DJ, Struyven RR, et al. A foundation model for generalizable disease detection from retinal images. Nature. 2023;622(7981):156–63. pmid:37704728