Figures
Abstract
The formation of metastases is driven by the ability of cancer cells to disseminate from the site of the primary tumour to target organs. The process of dissemination is constrained by anatomical features such as the flow of blood and lymph in the circulatory system. We exploit this fact in a stochastic network model of metastasis formation, in which only anatomically feasible routes of dissemination are considered. By fitting this model to two different clinical datasets (tongue & ovarian cancer) we show that incidence data can be modelled using a small number of biologically meaningful parameters. The fitted models reveal site specific relative rates of dissemination and also allow for patient-specific predictions of metastatic involvement based on primary tumour location and stage. Applied to other data sets this type of model could yield insight about seed-soil effects, and could also be used in a clinical setting to provide personalised predictions about the extent of metastatic spread.
Author summary
For most cancer patients the occurrence of metastases equals incurable disease. Despite this fact our quantitative knowledge about the process of metastatic dissemination is limited. In this manuscript we improve on a previously published mathematical model by incorporating known biological facts about metastatic spread and also consider the temporal dimension of dissemination. The model is fit to two different cancer types with very different patterns of spread, which highlights the versatility of our framework. Properly parametrised this type of model can be used for making personalised predictions about metastatic burden.
Citation: Gerlee P, Johansson M (2019) Inferring rates of metastatic dissemination using stochastic network models. PLoS Comput Biol 15(4): e1006868. https://doi.org/10.1371/journal.pcbi.1006868
Editor: Natalia L. Komarova, University of California Irvine, UNITED STATES
Received: June 20, 2018; Accepted: February 14, 2019; Published: April 1, 2019
Copyright: © 2019 Gerlee, Johansson. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The data on ovarian cancer is publicly available from SEER 18 Regs Research Data + Hurricane Katrina Impacted Louisiana Cases, Nov 2016 Sub (2000-2014). The data on tongue cancer is supplied as Supplementary data.
Funding: The work of PG was funded by Vetenskapsrådet (www.vr.se) grant no. 2014-6095 and Stiftelsen för strategisk forskning (www.stratresearch.se) grant no. AM13-0046. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
For most forms of cancer, occurrence of distant metastases equals incurable disease [1, 2]. Regional metastases, i.e. positive lymph nodes, also implies an inferior prognosis [3, 4]. In order to diminish the risk of dissemination of the primary tumor, patients commonly receive adjuvant treatment with radiotherapy and/or some kind of medical oncological treatment. Yet, in many cases, the patient is later on faced with residual or recurrent disease [5]. On the other hand, many patients receive adjuvant treatment with subsequent side effects, even though their illness would never have disseminated [6, 7]. Hence, increased knowledge about the extent of metastasis at diagnosis would improve the care of many cancer patients.
Metastatic spread is known to follow certain disease specific patterns [8, 9], but despite the current state of knowledge, a quantitative understanding of the process of metastatic spread would improve our ability to optimise therapy and reduce over-treatment. Here will we make use of stochastic modelling to quantitatively assess the importance of different routes of dissemination. In addition, this method also makes it possible to estimate the extent of metastatic spread based on tumour stage and location. Our approach is versatile and we highlight this by applying it to two different types of tumours: carcinoma of the oral tongue, which disseminates primarily through the lymphatic system and ovarian carcinoma, which spreads both intraperitoneally, lymphatically and through the blood circulatory system.
The tendency of a primary tumour to form a metastasis is the hallmark of malignant cancer [10], and the process by which this occurs is a multi-step process [11, 12]. Typically tumour cells detach from the primary tumour, invade the surrounding stroma and find their way to local lymph nodes or blood vessels [13]. Although a late stage tumour can release very large number of circulating tumour cells (CTCs) into the blood stream (up to 4 × 106 cells shed per gram of tumour per day [14]), the low probability of forming metastatic foci [15] coupled with the low probability of passing through capillary beds [16] lead to the conclusion that micrometastases in for example the liver (for gut malignancies) and the lung (for all malignancies) are necessary for further hematogenic dissemination [17–19]. These, often microscopic [20], lesions release CTCs into the arterial side of the circulation (in the case of the lung) [21], and hence amplify CTC numbers in arterial blood, which otherwise would be low due to the filtration occurring in the lung capillary bed. A similar process is at work during lymphatic spread [22], where CTCs get trapped in lymph nodes, where they form micrometastases, which shed CTCs that travel further downstream in the lymphatic system.
This process is known as secondary seeding, and although it is of paramount importance for the metasatic process we still have a limited understanding of the steps involved. Here mathematical modelling can provide a helping hand, since it allows for a quantitative understanding of biological processes that are difficult to measure directly.
Mathematical modelling of metastasis dates back to a series of seminal papers by Liotta and coworkers [23–25]. They considered the release of CTCs from an implanted tumour in mice and the subsequent formation of lung metastases. Using both deterministic and probabilistic methods they could derive predictions of how the number of metastatic foci changes over time and how the probability of being free of metastasis changed [24]. These predictions agreed well with experimental data and highlights the stochastic yet predictable nature of metastatic spread.
Another important contribution was made by Iwata et al. [26] who formulated and analysed a model which accounts for secondary seeding. That model predicts how the size distribution of metastases changes as the disease progresses. Predictions of the model were tested by Baratchart et al. [27] in a murine model of renal carcinoma, and while the model could predict the total metastatic burden, it was unable to describe the size distribution of metastases. However, the model prediction could be improved by assuming interactions between metastatic foci.
The Iwata model has also been used for connecting presurgical primary tumor volume and postsurgical metastatic burden and survival [28]. The model was applied to two datasets from mouse models and one clinical dataset and the analysis revealed a highly nonlinear relationship between resected primary tumor size and metastatic recurrence.
A similar approach has been used by Hanin and coworkers in a model which accounts for the growth rate of the primary tumour, shedding of metastases, their selection, latency and growth in a given secondary site [29]. They proved that the parameters of the model are identifiable in the case of Gompertzian growth of the primary tumour, and apply a maximum likelihood method to identify the model parameters in the case of a single patient with a large number of lung metastases.
Other mathematical models have successfully been applied in order to describe the dynamics of metastatic spread in pancreatic cancer [30] and breast cancer [31].
The importance of secondary seeding was investigated by Scott et al. [19] in the context of self-seeding [32], the process whereby a primary tumour can accelerate its own growth by releasing CTCs that return to the site of origin. A careful mathematical treatment of this hypothesis showed that secondary seeding is indeed required for this pathway to contribute to primary growth.
Secondary seeding also has an impact on estimates of metastatic efficiency, since undetected micrometastatic lesions render the apparent spread directly from the primary to target sites more efficient than it actually is [18].
The idea of metastatic spread occurring on a network, where the nodes represent organs and the links correspond to routes of spread was first described by Scott et al. [33], and later modelled quantitatively by Newton et al. [34, 35]. They considered a stochastic model where the dissemination of cancer cells is modeled as an ensemble of random walkers on the network. The dynamics of the model is determined by a transition matrix, which was obtained by fitting the model to a large autopsy data set [36]. The entries of this transition matrix give information about rates of dissemination between different organs.
Here we build on the work of Newton et al., but with secondary seeding in mind, make modifications which alleviate the problem underdetermination, which plagued their work. In order to parametrise their model they had to assume that the observed patterns of metastasis correspond to a steady-state distribution of the model. We instead make use of primary tumour stage to create a model which is temporal and considerably easier to parametrise.
Focusing on two primary tumours we show that our model is able to estimate the dissemination rates with high confidence, and crucially allows us to estimate rates between sites, which are inaccessible if one simply analyses incidence data.
Results
Our aim is to quantify the rate of metastatic spreading by applying a stochastic network model to clinical data. We will consider two different data sets: (i) A cohort of 141 patients diagnosed with tongue cancer where metastases mainly occur in the head and neck region and (ii) data from patients with ovarian cancer obtained from the SEER-database where metastases occur both regionally and to some extent in distant organs. Regional spread typically occurs via the lymphatic system and metastases appear in lymph nodes (LN), whereas distant metastases are mediated by the blood circulatory system and appear in other organs such as the liver or the lung. Although the two processes are different in some respects they also share many commonalities. Firstly, the spread of the disseminated tumour cells is constrained by anatomical structures, i.e. the lymphatic system and the blood circulatory system. Secondly, the spread is directed since it is subject to the flow in the system, and lastly, the formation of metastases at secondary sites affects the downstream rate of formation since metastases, like the primary tumour, disseminate tumour cells as they grow. These similarities makes it possible to formulate a general mathematical model, which can be tailored to describe both data sets.
The model consists of a network with sites and directed links [33]. The nodes correspond to lymph node stations or organs, and can be in either of two states: negative (i.e. containing no metastases) or positive (containing one or more metastases). The state of each node is a random variable, and the probability of it being positive depends on the state of the other nodes, and the rates of shedding between the nodes.
We assume that a positive site sheds CTCs that flow according to the links of the network. This applies both to the primary tumour and subsequent metastases. For simplicity we do not consider the size of metastases in this model, since this would require more parameters [37], and would make it more difficult to fit the model to clinical data. This implies that the state of each site is a binary variable taking the value 0 if the site is empty and 1 if it contains a metastasis.
We assume that each cancer cell which is disseminated from a site has the same probability of forming a metastasis at a downstream site. Since it is known that filtration rates are high (only 1 in 10 000 cells pass through a capillary bed) [19] we assume that CTCs only flow to neighbouring sites, e.g. CTCs released on the venous side of the circulatory system can only give rise to metastases in the lung, since this is the first capillary bed they encounter. This is also a good approximation for lymphatic spread, where the occurrence of skip metastases, in which intermediary LNs are negative, is rare [38], and has been suggested to be the results of undetectable micrometastases [38]. We aggregate the rate of release of CTCs, the survival probability in the circulatory system and the probability of forming a metastasis in a downstream site into a single rate parameter, which we call λ, when flow is from the primary tumour, and ϕ, when flow is from metastatic sites. This parameter then corresponds to the rate at which a downstream site becomes positive given a positive site upstream. We assume that the primary tumour (and secondary lesions) start shedding CTCs upon formation, which means that we assume a parallel progression model [39], as opposed to a linear model where shedding from the primary is delayed and occurs after malignant progression.
We include a temporal dimension into the model by using the primary tumour stage as a proxy for time. This means that we consider the stage as informative about the total number of CTCs that have been disseminated from the initiation of the tumour up until diagnosis. This is of course a crude approximation since it is known that tumours progress at different rates and preferentially disseminate to different sites depending on for example which mutations they harbour [40]. Although tumours might progress at different rates they still have to pass all the intermediary stages, and very rarely regress to a lower stage. If we knew the time since tumour initiation for each patient we would able to infer the rates in absolute terms (e.g. units per year). However, since we only have information about the tumour stage, and in addition do not have a mapping from real time to tumour stage, the rates that we infer from clinical data are only relative, but given a primary tumour of a certain stage the model can still predict the probability of metastases at different sites. The relative magnitude of the rates also informs us about the risk of developing new metastases given a certain metastatic burden. This approach was also taken by Benson et al. who used stage as a proxy for time in a model of metastatic spread in head and neck cancers [41]. In order to account for the fact that tumour volume, and therefore stage, typically changes non-linearly with time we also infer the flow rates under the assumption of exponential tumour growth (see Methods).
We now move on to discuss the models specific to tongue cancer and ovarian cancer.
Lymphatic spread of tongue cancer
The drainage of the lymphatic system in the head and neck area can be described by a network where the nodes correspond to lymph node stations and the links represent flow between the stations. The location of the primary tumour determines how much it drains into the different stations, Tumours of the tongue mainly drain into station I and II, and to some extent to station III. Station I drains into II, which in turn drains into III, which drains into IV [38]. The dissemination network is shown in Fig 1, where the directed links show how CTCs flow in the system.
The cancer cells flow from the primary tumour to lymph node station I,II and III with rates λI, λII and λIII respectively. The flow between lymph node stations is defined by the rates ϕI, ϕII and ϕIII.
Given a primary tumour metastases will eventually form in downstream stations, but the dynamics of the model crucially depend on the parameters (λI, λII, λIII, ϕI, ϕII, ϕIII). The aim is therefore to estimate these from clinical data. Our data set contains 141 patients with carcinoma of the oral tongue diagnosed between 2004 and 2014 and treated at the department of Oncology at Sahlgrenska University Hospital in Gothenburg (see Methods and S1 Data). For each patient we have information about the stage of the primary tumour and the presence/absence of metastasis in LN station I-IV.
In our model we assume that secondary seeding is responsible for all metastases that occur in sites not directly connected to the primary tumour. This implies that we expect all patients with station IV positive to also be positive for station III. This is true for all but 2% of the cases (3 patients), and we therefore exclude these cases from further analysis. Another option would be correct for potentially undetected micrometastases in station III, but we decided for the more cautious option of exclusion.
Given the assumption outlined above about a constant downstream flow of cancer cells and the necessity of secondary seeding the state of the system evolves according to a continuous-time Markov chain with state space that corresponds to all possible states of metastatic spread (see Methods for details). The flow rates in the network dictate transition rates for the Markov chain and the probability distribution over the states changes according to a master equation. Since we know the initial state of the system (no metastasis at tumour initiation) we can, given a set of parameter values, numerically solve the master equation and obtain the probabilities of all the metastatic states for all future times. This allows us to estimate the model parameters by comparing the metastatic state of each patient with solutions to the master equation and computing the likelihood of the data given certain parameter values (see Methods).
In order to visually compare the parametrised model outcome to the clinical data we transform the data in the following way. Let us first focus on station I and let denote the number of patients with stage t primary tumours that are positive in station I. Here t can take the values 1, 2, 3, 4 corresponding to primary tumour stage T1, T2, T3 and T4. Let Nt denote the total number of patients with stage t disease. The fraction of patients with stage t tumour with positive lymph node at station I is then given by
(1)
The same procedure is applied to stations II-IV yielding stage-dependent fractions.
The fraction can be interpreted as the probability of finding a patient with stage t disease with a metastasis in LN station i. This quantity can readily be calculated from the model (see Methods for details).
A comparison of the clinical data and the model fit is shown in Fig 2, where each panel correspond to a LN station (I-IV). The estimated dissemination rates are shown in Table 1 together with 95% confidence intervals obtained using parametric bootstrap (see Methods). It is worth noting that the confidence intervals for ϕI and ϕIII obtained from bootstrapping shows that the variability in flow from the station I to station II is large, ranging from practically zero to 0.42, the largest of all rates. Also the flow from station III to station IV exhibits a large variability. With the current amount of the data we can conclude that flow from station II to III has a large impact on the metastatic process, whereas we are unable to ascertain the importance of flow from station I to II and from III to IV.
Each panel corresponds to a lymph node station and shows the probability of finding a metastases in the lymph node as a function of the primary tumour stage. The parameters of the model are given in Table 1.
Metastasis in ovarian cancer
Ovarian cancer has the highest mortality rate of the gynecological cancers and a majority of patients are diagnosed in an advanced stage [42]. Ovarian cancer predominantly metastasises within the peritoneal cavity and through the pelvic lymph nodes [43]. In the peritoneal cavity cancer cells metastasize through a process commonly described as transcoelomic dissemination, where the cancer cells loose cell-cell contact and exfoliate into the peritoneal cavity. They float in the peritoneal fluid and are spread across the peritoneal cavity, where they attach to the peritoneal organs and form a metastatic tumour [44]. Ascites produced in the peritoneal cavity is drained through lymph vessels in the diaphragm [45], enabling cancer cells to enter into the blood circulation. Historically, hematogenous metastasis has been regarded as occurring only in late stages of ovarian cancer. Recent work however, suggest that this mode of dissemination may be more common than previously thought [46]. In the case of distant metastases, the most common sites are liver, lung, brain and skin [47].
The data set on ovarian cancer was obtained from the SEER-database (see Methods). As of 2010 SEER contains information about the presence or absence of metastases at diagnosis in liver, lung, brain and bone. The status of regional LNs (including the pelvis and diaphragm) is also available.
Given the available data we did not try to model transcoelomic dissemination, and instead focused on dissemination to local LNs and hematogenic spread to the organs represented in the SEER database. From a primary tumour located in the ovaries cancer cells can thus either spread to regional LNs or via the venous blood vessels to the lungs (see Fig 3). Metastases in regional LNs also allow for dissemination to the lungs, and from there metastatic lesions shed CTCs to all organs of the body, including the liver, bone and brain (for which we have data).
The tumour cells spread either via regional lymph nodes or directly to the lung where they form metastases. From there further dissemination to the liver, bone and brain occurs.
Again our aim is to estimate the dissemination parameters (λ1, λ2, ϕ1, ϕ2, ϕ3, ϕ4) from clinical data. In this case the data set is considerably larger containing 16 055 patients diagnosed with ovarian cancer. Again we exclude patients that exhibit skip metastases, leaving us with 15 536 cases (3% of patients are excluded).
The primary tumour stage for this data is more refined and each T-stage is divided into three substages a,b and c, giving us in total 9 different stages (T1a-T3c). We estimate the parameters of this model in precisely the same way as for the tongue cancer model (see Methods). Again we compare the parametrised model to the clinical data by calculating the probability of finding a metastasis in each of the sites considered (regional LN, lung, liver, brain and bone) as a function of the primary tumour stage, using Eq (1). A comparison between the data and the model is shown in Fig 4, which shows that the model is able to recapitulate the overall behaviour of the data. Due to the low incidence for bone and brain (in total 22 and 5 cases respectively) the data is quite noisy and the model represents a poor fit for those sites. The parameter values are collected in Table 2 together with 95% confidence intervals obtained using parametric bootstrap (see Methods). For this data set the uncertainty in the inferred parameter values is considerably smaller, which can be attributed to the almost hundred-fold larger data set.
Each panel corresponds to a site/organ and shows the probability of finding a metastases at the site as a function of the primary tumour stage. The parameters of the model are given in Table 2.
Discussion
We have presented a novel method for inferring the rates of metastatic dissemination, and shown that one can obtain reliable estimates (with small CIs) for both lymphatic and hematogeneous spread. Our work builds on previous mathematical models [34], but with additional biological knowledge we simplify the model structure and make parameter estimates more reliable. Firstly, we make use of the fact that high filtration rates imply that secondary seeding is responsible for metastatic spread beyond the first capillary bed/lymph node. This implies that we do not need to consider all possible links between the sites (e.g. no direct link between ovary and liver). Secondly, we make use of known anatomical structures and flow directions to further prune the network (e.g. the flow in the lymphatic system dictates the topology of the tongue cancer network, see Fig 1). Lastly, we make use of primary tumour stage as a proxy for time, which means that we can resolve the data temporally. This implies that we do not have to rely on assumptions about stationarity of the underlying metastatic process, and instead fit the parameters for the time-dependent problem. We assumed that tumour stage maps linearly to an arbitrary time scale, which implies that the inferred dissemination rates are estimated with arbitrary and unknown units. In order to investigate how this assumption affected the inferred parameter values we also considered a model with exponential growth of the primary tumour (see Methods). That model resulted in similar parameter values, which suggest that our approach is robust to assumptions about tumour growth dynamics.
Our method makes it possible to infer dissemination rates between sites and also their confidence intervals. The model does not fit the data perfectly, and although this may have to do with the scarcity of the data set, we cannot exclude that this may be due to some limitations of the assumptions used in the model. Traditionally this type of data is analysed by looking at incidence rates [48]. Our analysis goes beyond this by disentangling incidence rates into dissemination rates between different lymph node stations/organs. This means that we can quantify processes that at a first glance seem inaccessible, but which appropriate assumptions and modelling techniques can help us reveal.
It could be argued that we have used an excessively complex model to fit a data set, which could be described with four (and five) straight lines (see Figs 2 and 4), and hence four (and five) parameters corresponding to the slopes of the lines. However, such a model would have no connection to the underlying biology and would be unable to say anything about the dynamics that generated the data. In our model on the other hand the parameters have an immediate biological interpretation. They correspond to the (relative) flow rates of cancer cells between different anatomical sites, numbers that could be of interest to clinicians when deciding on the extent of surgical resection or radiotherapy. The magnitudes of the flow rates also reveal the importance of different routes of spread. For example for ovarian cancer we can conclude that the rate of metastasis formation in the regional lymph nodes is ten times higher than the direct spread to the lungs.
The flow rates in our model factor in not only physical flow, but also the ability of cancer cells to survive during transport to the target site, and their ability to form metastases in the target site. Assuming that the survival in the circulatory system is independent of target site, knowledge of physical flow would make it possible to estimate the relative rate of metastasis formation in different target sites. This would correspond to effect of the “soil” in the well-established seed-soil hypothesis [49, 50]. Unfortunately, lymphatic flow is difficult to measure [51], and there are currently no estimates of lymphatic flow in the head and neck region. Values of relative blood flow are however readily available [52], and we can use these to calculate soil-effect for ovarian cancer with respect to liver, bone and brain. The relative blood flow to these organs is given by 6.5%, 5% and 1.2% and by dividing these flow rates with ϕ2, ϕ3 and ϕ4, we get relative metastasis formation rates of 1.42, 0.3 and 0.33. From this we can conclude that the rate of metastasis formation in the liver is roughly five fold higher compared to bone and brain. This suggests that ovarian cancer cells are considerably better at colonising the liver compared to bone and brain, which is in agreement with previous data [53].
We did encounter a couple of issues when it comes to parameter identifiability. In the case of tongue cancer our model was unable to accurately identify the rates ϕI and ϕIII. In the case of ϕI the problem arises because we are dealing with a site (station II) which has flow from both the primary tumour and an upstream site (i.e. station I). With current size of the dataset (n = 141 patients) we are unable to obtain accurate estimates of the flow into this station. With a larger dataset we would most likely find more patients who are only positive for station II making it possible to obtain better estimates for λII and consequently ϕI. The large confidence interval obtained for ϕIII is due to the low number of cases with metastases in station IV. In the bootstrap procedure we generate synthetic data based on the estimated parameter values, and since the actual number of patients with station IV positive is small (only two cases) we sometimes generate data without any prevalence of metastasis in station IV. This results in estimating the flow rate to ϕIII = 0, which in turn leads to a large confidence interval. This problem does not appear for the ovarian model since, although the inferred parameter values are small, the large number of patients in the data set imply that we almost surely generate synthetic data with some patients positive for metastases in the brain or bone.
From a clinical point of view, this work could be of importance, by contributing to an increased possibility to predict the risk of future regional and/or distant metastases. Especially so in the current era, with new treatment modalities emerging and a current development towards more individualised treatment programs. Although the models are parametrised with population level data they might still be used in order to make predictions on the individual level. For example if a patient is diagnosed with tongue cancer of a specific stage the parametrised model could provide probabilities of different metastatic states, which could be factored in with other clinical data to guide treatment. It would also be possible to include the effect of treatment in the model (e.g. radiation), which would reveal potential benefits of treating metastatic sites and making the model better suited to describe clinical procedure.
In conclusion we believe that this framework for analysing metastatic spread, which incorporates known anatomical constraints and a temporal dimension, allows for novel insights and will hopefully be of assistance to both cancer biologists and clinicians in the future.
Methods
Ethics statement
For the tongue cancer data approval has been granted by Regional ethical review board in Gothenburg.
Data
Tongue cancer.
After approval from the Regional ethical review board in Gothenburg, data was obtained from the register of the department of Oncology at Sahlgrenska University Hospital in Gothenburg. Data for all patients diagnosed with carcinoma of the oral tongue between 2004 and 2014 were extracted, a total number of 141 cases. Information about primary tumour, LN metastases and distant metastases was registered according to the TNM classification system (AJCC 6th edition until 2009 and 7th edition from 2010 on). T-stage varied from T1 to T4, 78 patients had regional LN metastes (N1—N3) and only two patients had distant metastasis (M1), in the lung. For patients with positive nodal status, we collected information about in which lymph node levels (I-V) metastases were present. Only one patient exhibited metastasis in lymph node level V, and therefore this level was excluded from the analysis. Information about involved LN levels was in approximately 50% of the cases retrieved from radiological examination (CT scans or magnetic resonance imaging), why patients received primary radiotherapy or surgery was performed without neck dissection. For the remaining part this information was obtained from the pathology reports performed after primary surgery including supraomohyoid neck dissection. For each patient we had the following information:
- T-stage (derived AJCC 6th and 7th edition)
- Presence/Absence of regional LN metastases (calculated from N-stage)
- Localization of regional LN metastases (lymph node levels I-VI)
For further information about the data set please see S2 Data.
Ovarian cancer.
The data was obtained from the SEER*Stat case listing database Incidence—SEER 18 Regs Research Data + Hurricane Katrina Impacted Louisiana Cases, Nov 2016 Sub (2000-2014), which is available at https://seer.cancer.gov/seerstat/. We extracted all cases of ovarian cancer from 2010-2014 with recorded metastases in liver, lung, brain and bone. From this set of patients we selected those with a T-stage (derived AJCC 7th edition) which was in the range T1a to T3c. Information about the presence or absence of regional lymph node metastases was obtained by looking at the N-stage (derived AJCC 7th edition). We considered patients with N0 to be free of regional LN metastases while all other stages to be indicative of positive LN. The total number of cases that met the above critera was 16 055 and for each case we had the following information:
- T-stage (derived AJCC 7th edition)
- Presence/Absence of regional LN metastases (calculated from N-stage)
- Presence/Absence of metastases in lung (CS-mets at DX-lung)
- Presence/Absence of metastases in liver (CS-mets at DX-liver)
- Presence/Absence of metastases in bone (CS-mets at DX-bone)
- Presence/Absence of metastases in brain (CS-mets at DX-brain)
Mathematical model
The model consists of N nodes each representing a specific site/organ (excluding the primary site). A node takes the value 0 at time t if the site is void of metastases and 1 if the site contains one or more metastases. We assume that once a site has become positive it will remain so for all future times. Since we have N nodes and each can be in two states we have a state space for the entire network that contains 2N states. Each state is denoted by a binary string (e.g. 0100 correspond to site 1,3 and 4 being negative and site 2 being positive), but for notational simplicity we enumerate them with integers from 1 to 2N.
The flow rates in the anatomical network dictate transition rates between states. We assume that a site becomes positive at a rate which equals the sum of flow rates from all positive upstream sites. Since the transition rates only depend on the current state (and not the history) the system can be described as a continuous-time Markov chain with state space {0, 1}N. If we let Pi(t), where i = 1, …, 2N, denote the probability of being in state i at time t, then these probabilities evolve according to the master equation [54]:
where Qii ≤ 0 is the rate of leaving state i and Qij is the rate of moving from state j to i. Since we know that the patient is free of metastasis at tumour intitiation (t = 0), we know that the initial condition for the master equation is given by P1(t) = 1 and Pi(t) = 0 for 1 < i ≤ 2N. Numerical solutions of this equation will be used in order to parametrise the two models.
Since we only have access to the primary tumour stage of the patients in the two data sets we make the following assumption about the mapping from tumour stage at diagnosis to time from tumour intitiation to diagnosis. For simplicity we assume that the tumour radius grows linearly with time, which is a simplification, since it is known that volume and hence radius grows non-linearily with time [37].
Tumour stage is informed by the linear size of the lesion [55], e.g. for tongue cancer T1 corresponds to a primary tumour less than 1 cm in diameter and T2 is a tumour larger than 2 cm, but less than 4 cm. This means that there is a strong correlation between tumour stage and thickness [56]. We therefore assume that time from tumour initiation depends linearly on the stage. Since we have no exact information about the details of this mapping we will for simplicity consider an arbitrary time scale and simply let the stage correspond directly to time since initiation. Below we challenge this assumption by considering a model in which the primary tumour volume grows exponentially.
Tongue cancer.
We let lymph node station I-IV correspond to node 1-4 in the network. Since we have four nodes we will have a state space of 24 = 16 metastatic states, but 4 of these will be inaccessible due to topology of the anatomical network (e.g. 1001 is not attainable) and we are left with 12 states. The transition rates are shown in Fig 5, where the nodes now represent states and the arrows show possible transitions and their corresponding ates.
Each node corresponds to a possible metastatic state and is represented by a binary string which signifies which lymph node station is positive or negative. For convenience the nodes are named from 1 to 12. The rates are obtained from the flow rates in the anatomic network (Fig 1) by considering the rate at which individual sites become positive based on the state of the other sites.
The transition matrix is given by
(2)
With P = [P1, P2…, P12]T we can write the master equation as
(3)
which we solve numerically using a first-order Euler forward scheme with initial condition P1(t) = 1 and Pi(t) = 0 for 1 < i ≤ 12, which corresponds to no metastasis at tumour initiation.
To estimate the parameters we make use of a maximum likelihood method. Preferably we would like to have the time until onset of metastasis for each station in each patient, but we only have access to data obtained at diagnosis. This information can be used in the following way: Assume that we observe a patient with a stage τ tumour that has a positive lymph node at station I and all other stations negative, i.e. the patient is in state 1000. The probability of this occurring is according to the model given by P2(τ). Now the value of P2(τ) will depend on the parameters of the model and we therefore write P2(τ, θ), where θ = [λI, λII, λIII, ϕI, ϕII, ϕIII]. The likelihood of the entire clinical data set can now be written
where sj is the metastatic state of patient j, which can take values 1,2,…,12, and τj ∈ {1, 2, 3, 4} is the primary tumour stage of that patient. In order to find the parameters θ that best fit the data we minimise the negative likelihood −L(θ) using the MATLAB function fminsearch applied to a numerical solution of (3). This gives us a maximum likelihood estimate
.
A solution to Eq (3) with parameter values obtained from the maximum likelihood method is shown in Fig 6. We note that the probability of the metastasis free state (0000) decays over time while all other probabilities increase over the considered time frame. For large (and unrealistic) times all probabilities will tend to zero expect the state corresponding to metastases in all sites (1111), which will tend to unity.
Each curve correspond to specific state of the system.
Confidence intervals for are calculated using parametric bootstrapping with 100 samples [57]. This is a numerical method in which the parameter estimates obtained from the maximum likelihood method are used for generating synthetic data, from which new maximum likelihood estimates are obtained. Repeating this process results in an empirical distribution for each parameter, and the confidence intervals are calculated as the 5% and 95% quantiles of the empirical distributions.
The probability that station I is positive (the solid curve in Fig 2 top-left panel) can be calculated from the model by summing over the probability of all metastatic states which have 1 in the first position of the binary string, i.e.
Ovarian cancer.
We let regional lymph nodes correspond to node 1 in the network, lung to node 2, liver to node 3, bone to node 4 and brain to node 5. Since we have five nodes we will have a state space of 25 = 32 metastatic states, but 14 of these are inaccessible due to topology of the anatomical network and we are left with 18 states. The transition rates are shown in Fig 7, where the nodes now represent states and the arrows show possible transitions and their corresponding rates.
Each node corresponds to a possible metastatic state and is represented by a binary string which signifies which organ is positive or negative for metastases. For convenience the nodes are named from 1 to 18. The rates are obtained from the flow rates in the anatomic network (Fig 3) by considering the rate at which individual sites become positive based on the state of the other sites.
We now have an 18-dimensional probability vector P = [P1, P2…, P18]T and we can again write down the master equation
(4)
where the matrix Q is defined as above and can be obtained from the state transition diagram Fig 7. We proceed as above and formulate the maximum likelihood problem that we solve numerically using MATLAB.
A solution to Eq (4) with parameter values obtained from the maximum likelihood method is shown in Fig 8. We note that the dynamics are dominated by the metastasis free state (00000) and the state with only regional LNs positive (10000), while all other states have probabilities less than 0.02. For large (and unrealistic) times all probabilities will tend to zero expect the state corresponding to metastases in all sites (11111), which will tend to unity.
Each curve correspond to specific state of the system.
The probability of finding a patient with metastasis in the regional lymph nodes can be expressed by summing the probability over all states in which node 1 is positive
Exponential tumour growth.
In order to assess the impact of our assumption that primary tumour stage can be used as a proxy for time we also consider a model of tongue cancer metastasis in which the times corresponding to the different tumour stages are calculated assuming an exponential growth in tumour volume.
If we assume that tumour volume grows according to
Assuming the tumour to be spherical, the radius, which is used for tumour staging [55], grows as
with
. If we let Ri denote the radius of a stage i tumour we get that the time at which stage i is reached is given by
(5)
The initial radius of the tumour is set to be equal to the volume of a single cell R0 = 25 μm and the growth rate is estimated from clinical studies to year−1 [58].
The clinical guidelines for determining tumour stage of tongue cancer involve the size of the lesion but also other factors such as degree of invasion of surrounding tissue [55]. We simplify this by considering the following mapping from tumour stage to tumour size. We let a T1 tumour have a radius of 2 cm, a T2 tumour have radius 4 cm, a T3 tumour have radius 6 cm and a T4 tumour have radius 8 cm. Of course this is highly simplistic but at least provides us with a reasonable quantification.
This leads us to a mapping from tumour stage to time from initiation which is shown in Table 3.
We now use these values (instead of the identity mapping from stage to time) in our maximum likelihood estimation and obtain the parameter values shown in Table 4. The parameters obtained with the exponential growth model are generally slightly smaller, although ϕI is an exception with a larger value compared to the linear model. But more importantly the order of their magnitudes is the same for both models, i.e. ϕII > λII > ϕI > λI > ϕIII > λIII for both models.
This suggests that our model of metastatic spread is robust to different assumption about tumour growth dynamics.
Supporting information
S1 Data. Information about the clinical cohort of tongue cancer patients.
https://doi.org/10.1371/journal.pcbi.1006868.s001
(DOCX)
S2 Data. The data used for fitting the model of metastatic spread of tongue cancer.
https://doi.org/10.1371/journal.pcbi.1006868.s002
(DOCX)
References
- 1. Price KA, Cohen EE. Current treatment options for metastatic head and neck cancer. Current treatment options in oncology. 2012;13(1):35–46. pmid:22252884
- 2. Deng K, Yang C, Tan Q, Song W, Lu M, Zhao W, et al. Sites of distant metastases and overall survival in ovarian cancer: A study of 1481 patients. Gynecologic oncology. 2018;150(3):460–465. pmid:30001833
- 3. Bello IO, Soini Y, Salo T. Prognostic evaluation of oral tongue cancer: means, markers and perspectives (II). Oral oncology. 2010;46(9):636–643. pmid:20637679
- 4. Dahm-Kähler P, Borgfeldt C, Holmberg E, Staf C, Falconer H, Bjurberg M, et al. Population-based study of survival for women with serous cancer of the ovary, fallopian tube, peritoneum or undesignated origin-on behalf of the Swedish gynecological cancer group (SweGCG). Gynecologic oncology. 2017;144(1):167–173. pmid:27817932
- 5. Du Bois A, Reuss A, Pujade-Lauraine E, Harter P, Ray-Coquard I, Pfisterer J. Role of surgical outcome as prognostic factor in advanced epithelial ovarian cancer: a combined exploratory analysis of 3 prospectively randomized phase 3 multicenter trials: by the Arbeitsgemeinschaft Gynaekologische Onkologie Studiengruppe Ovarialkarzinom (AGO-OVAR) and the Groupe d’Investigateurs Nationaux Pour les Etudes des Cancers de l’Ovaire (GINECO). Cancer. 2009;115(6):1234–1244. pmid:19189349
- 6. Curigliano G, Criscitiello C, Esposito A, Pruneri G. Over-using chemotherapy in the adjuvant setting. The Breast. 2017;31:303–308. pmid:27866835
- 7. Trifiletti DM, Smith A, Mitra N, Grover S, Lukens JN, Cohen RB, et al. Beyond positive margins and extracapsular extension: evaluating the utilization and clinical impact of postoperative chemoradiotherapy in resected locally advanced head and neck cancer. Journal of Clinical Oncology. 2017;35(14):1550–1560. pmid:28475848
- 8. Weiss L, Haydock K, Pickren J, Lane W. Organ vascularity and metastatic frequency. The American journal of pathology. 1980;101(1):101. pmid:7446696
- 9. Mukherji SK, Armao D, Joshi VM. Cervical nodal metastases in squamous cell carcinoma of the head and neck: What to expect. Head & Neck. 2001;23(11):995–1005.
- 10. Hanahan D, Weinberg RA. Hallmarks of cancer: the next generation. cell. 2011;144(5):646–674. pmid:21376230
- 11.
Govindan R, DeVita VT. DeVita, Hellman, and Rosenberg’s Cancer: Principles & Practice of Oncology Review. Lippincott Williams & Wilkins; 2009.
- 12. Gupta GP, Massagué J. Cancer metastasis: building a framework. Cell. 2006;127(4):679–695. pmid:17110329
- 13. Valastyan S, Weinberg RA. Tumor metastasis: molecular insights and evolving paradigms. Cell. 2011;147(2):275–292. pmid:22000009
- 14. Butler TP, Gullino PM. Quantitation of cell shedding into efferent blood of mammary adenocarcinoma. Cancer research. 1975;35(3):512–516. pmid:1090362
- 15. Al-Hajj M, Wicha MS, Benito-Hernandez A, Morrison SJ, Clarke MF. Prospective identification of tumorigenic breast cancer cells. Proceedings of the National Academy of Sciences. 2003;100(7):3983–3988.
- 16. Okumura Y, Tanaka F, Yoneda K, Hashimoto M, Takuwa T, Kondo N, et al. Circulating tumor cells in pulmonary venous blood of primary lung cancer patients. The Annals of thoracic surgery. 2009;87(6):1669–1675. pmid:19463575
- 17. Weiss L. Comments on hematogenous metastatic patterns in humans as revealed by autopsy. Clinical and Experimental Metastasis. 1992;10(3):191–199. pmid:1582089
- 18. Scott JG, Fletcher AG, Maini PK, Anderson AR, Gerlee P. A filter-flow perspective of haematogenous metastasis offers a non-genetic paradigm for personalised cancer therapy. European journal of cancer. 2014;50(17):3068–3075. pmid:25306188
- 19. Scott , Basanta D, Anderson A, Gerlee P. A mathematical model of tumour self-seeding reveals secondary metastatic deposits as drivers of primary tumour growth. Journal of The Royal Society Interface. 2013;10:1–10.
- 20. Meng S, Tripathy D, Frenkel EP, Shete S, Naftalis EZ, Huth JF, et al. Circulating tumor cells in patients with breast cancer dormancy. Clinical cancer research. 2004;10(24):8152–8162. pmid:15623589
- 21. Bross I, Viadana E, Pickren J. Do generalized metastases occur directly from the primary? Journal of chronic diseases. 1975;28(3):149–159. pmid:1123421
- 22. McMullen KP, Chalmers JJ, Lang JC, Kumar P, Jatana KR. Circulating tumor cells in head and neck cancer: A review. World journal of otorhinolaryngology-head and neck surgery. 2016;2(2):109–116. pmid:29204555
- 23. Liotta L, Saidel G, Kleinerman J. Stochastic model of metastases formation. Biometrics. 1976; p. 535–550. pmid:963169
- 24. Liotta L, Delisi C, Saidel G, Kleinerman J. Micrometastases formation: a probabilistic model. Cancer Letters. 1977;3:203–208. pmid:902256
- 25. Saidel G, Liotta L, Kleinerman J. System dynamics of a metastatic process from an implanted tumor. Journal Of Theoretical Biology. 1976;56(2):417–434. pmid:1271828
- 26. Iwata K, Kawasaki K, Shigesada N. A dynamical model for the growth and size distribution of multiple metastatic tumors. Journal Of Theoretical Biology. 2000;203(2):177–186. pmid:10704301
- 27. Baratchart E, Benzekry S, Bikfalvi A, Colin T, Cooley LS, Pineau R, et al. Computational Modelling of Metastasis Development in Renal Cell Carcinoma. PLoS Computational Biology. 2015;11(11):e1004626. pmid:26599078
- 28. Benzekry S, Tracz A, Mastri M, Corbelli R, Barbolosi D, Ebos JML. Modeling Spontaneous Metastasis following Surgery: An In Vivo-In Silico Approach. Cancer Research. 2016;76(3):535–547. pmid:26511632
- 29. Hanin L, Seidel K, Stoevesandt D. A “universal” model of metastatic cancer, its parametric forms and their identification: what can be learned from site-specific volumes of metastases. Journal of Mathematical Biology. 2015;72(6):1633–1662. pmid:26307099
- 30. Haeno H, Gonen M, Davis MB, Herman JM, Iacobuzio-Donahue CA, Michor F. Computational Modeling of Pancreatic Cancer Reveals Kinetics of Metastasis Suggesting Optimum Treatment Strategies. Cell. 2012;148(1-2):362–375. pmid:22265421
- 31. Demicheli R, Retsky MW, Swartzendruber DE, Bonadonna G. Proposal for a new model of breast cancer metastatic development. Annals of oncology: official journal of the European Society for Medical Oncology. 1997;8(11):1075–1080.
- 32. Norton L, Massagué J. Is cancer a disease of self-seeding? Nature medicine. 2006;12(8):875.
- 33. Scott J, Kuhn P, Anderson ARA. Unifying metastasis—integrating intravasation, circulation and end-organ colonization. Nature Reviews Cancer. 2012; p. 1–2.
- 34. Newton PK, Mason J, Bethel K, Bazhenova LA, Nieva J, Kuhn P. A Stochastic Markov Chain Model to Describe Lung Cancer Growth and Metastasis. PLoS ONE. 2012;7(4):e34637. pmid:22558094
- 35. Newton PK, Mason J, Bethel K, Bazhenova L, Nieva J, Norton L, et al. Spreaders and Sponges Define Metastasis in Lung Cancer: A Markov Chain Monte Carlo Mathematical Model. Cancer Research. 2013;73(9):2760–2769. pmid:23447576
- 36. Disibio G, French SW. Metastatic patterns of cancers: results from a large autopsy study. Archives of pathology & laboratory medicine. 2008;132(6):931–939.
- 37. Gerlee P. The model muddle: in search of tumor growth laws. Cancer research. 2013;73(8):2407–2411. pmid:23393201
- 38. Byers RM, Weber RS, Andrews T, McGill D, Kare R, Wolf P. Frequency and therapeutic implications of “skip metastases” in the neck from squamous carcinoma of the oral tongue. Head & neck. 1997;19(1):14–19.
- 39. Klein C. Parallel progression of primary tumours and metastases. Nature Reviews Cancer. 2009;9(4):302–312. pmid:19308069
- 40. Wang H, Zhang C, Zhang J, Kong L, Zhu H, Yu J. The prognosis analysis of different metastasis pattern in patients with different breast cancer subtypes: a SEER based study. Oncotarget. 2017;8(16):26368. pmid:28038448
- 41.
Benson N, Whipple M, Kalet IJ. A Markov model approach to predicting regional tumor spread in the lymphatic system of the head and neck. AMIA Annual Symposium proceedings AMIA Symposium. 2006; p. 31–35.
- 42. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2015. CA: a cancer journal for clinicians. 2015;65(1):5–29.
- 43. Tsuruchi N, Kamura T, Tsukamoto N, Akazawa K, Saito T, Kaku T, et al. Relationship between paraaortic lymph node involvement and intraperitoneal spread in patients with ovarian cancer—a multivariate analysis. Gynecologic oncology. 1993;49(1):51–55. pmid:8482561
- 44.
Mitra AK. Ovarian Cancer Metastasis: a Unique Mechanism of Dissemination. In: Tumor Metastasis. InTech; 2016.
- 45. Tan DS, Agarwal R, Kaye SB. Mechanisms of transcoelomic metastasis in ovarian cancer. The lancet oncology. 2006;7(11):925–934. pmid:17081918
- 46. Pradeep S, Kim SW, Wu SY, Nishimura M, Chaluvally-Raghavan P, Miyake T, et al. Hematogenous metastasis of ovarian cancer: rethinking mode of spread. Cancer cell. 2014;26(1):77–91. pmid:25026212
- 47. Cormio G, Rossi C, Cazzolla A, Resta L, Loverro G, Greco P, et al. Distant metastases in ovarian carcinoma. International Journal of Gynecological Cancer. 2003;13(2):125–129. pmid:12657111
- 48. Qiu M, Hu J, Yang D, Cosgrove DP, Xu R. Pattern of distant metastases in colorectal cancer: a SEER based study. Oncotarget. 2015;6(36):38658. pmid:26484417
- 49. Paget S. The distribution of secondary growths in cancer of the breast. 1889. Cancer metastasis reviews. 1989;8(2):98. pmid:2673568
- 50. Fidler IJ. The pathogenesis of cancer metastasis: the’seed and soil’hypothesis revisited. Nature Reviews Cancer. 2003;3(6):453. pmid:12778135
- 51. Blatter C, Meijer EFJ, Nam AS, Jones D, Bouma BE, Padera TP, et al. In vivo label-free measurement of lymph flow velocity and volumetric flow rates using Doppler optical coherence tomography. Scientific Reports. 2016; p. 1–10.
- 52. Leggett RW, Williams LR. A proposed blood circulation model for Reference Man. Health physics. 1995;69(2):187–201. pmid:7622365
- 53. Disibio G, French SW. Metastatic patterns of cancers: results from a large autopsy study. Archives of pathology & laboratory medicine. 2008;132(6):931–939.
- 54.
Gardiner C. Stochastic methods. vol. 4. Springer-Verlag Berlin Heidelberg; 2009.
- 55. Edge SB, Compton CC. The American Joint Committee on Cancer: the 7th edition of the AJCC cancer staging manual and the future of TNM. Annals of surgical oncology. 2010;17(6):1471–1474. pmid:20180029
- 56. Zia S, Naqvi SU, Adel H, Adil SO, Hussain M, et al. Relationship of Oral Tumor Thickness with the rate of lymph node metastasis in Neck based on CT Scan. Pakistan journal of medical sciences. 2017;33(2):353. pmid:28523036
- 57.
Mooney CZ, Duval RD, Duvall R. Bootstrapping: A nonparametric approach to statistical inference. 94-95. Sage; 1993.
- 58. Komatsubara H, Umeda M, Minamikawa T, Ojima Y, Yanagita M, Shigeta T, et al. A study of tumor doubling time and onset of distant metastasis from squamous cell carcinoma of the oral cavity. Journal of Japan Society for Oral Tumors. 2005;17(4):232–238.