Shedding light on development: Leveraging the new nightlights data to measure economic progress

  • Prachi Jhamb ,

    Contributed equally to this work with: Prachi Jhamb, Susana Ferreira

    Roles Data curation, Formal analysis, Investigation, Methodology, Project administration, Resources, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

    pj40553@uga.edu

    Affiliation Department of Applied Economics, University of Georgia, Athens, Georgia, United States of America

  • Susana Ferreira ,

    Contributed equally to this work with: Prachi Jhamb, Susana Ferreira

    Roles Conceptualization, Funding acquisition, Methodology, Project administration, Resources, Supervision, Writing – review & editing

    Affiliation Department of Applied Economics, University of Georgia, Athens, Georgia, United States of America

  • Patrick Stephens ,

    Roles Conceptualization, Funding acquisition, Methodology, Project administration, Resources, Writing – review & editing

    ‡ PS, MS and JW also contributed equally to this work.

    Affiliation Integrative Biology, Oklahoma State University, Stillwater, Oklahoma, United States of America

  • Mekala Sundaram ,

    Roles Conceptualization, Funding acquisition, Methodology, Project administration, Resources, Writing – review & editing

    ‡ PS, MS and JW also contributed equally to this work.

    Affiliation Department of Infectious Diseases, University of Georgia, Athens, Georgia, United States of America

  • Jonathan Wilson

    Roles Conceptualization, Resources, Software

    ‡ PS, MS and JW also contributed equally to this work.

    Affiliation Department of Pathology, University of Georgia, Athens, Georgia, United States of America

Abstract

Nightlights (NTL) have been widely used as a proxy for economic activity, despite known limitations in accuracy and comparability, particularly with outdated Defense Meteorological Satellite Program (DMSP) data. The emergence of newer and more precise Visible Infrared Imaging Radiometer Suite (VIIRS) data offers potential, yet challenges persist due to temporal and spatial disparities between the two datasets. Addressing this, we employ a novel harmonized NTL dataset (VIIRS + DMSP), which provides the longest and most consistent database available to date. We evaluate the association between newly available harmonized NTL data and various indicators of economic activity at the subnational level across 34 countries in sub-Saharan Africa from 2004 to 2019. Specifically, we analyze the accuracy of the new NTL data in predicting socio-economic outcomes obtained from two sources: 1) nationally representative surveys, i.e., the household Wealth Index published by Demographic and Health Surveys, and 2) indicators derived from administrative records such as the gridded Human Development Index and Gross Domestic Product per capita. Our findings suggest that even after controlling for population density, the harmonized NTL remain a strong predictor of the wealth index. However, while urban areas show a notable association between harmonized NTL and the wealth index, this relationship is less pronounced in rural areas. Furthermore, we observe that NTL can also significantly explain variations in both GDP per capita and HDI at subnational levels.

Introduction

National-level indicators of economic and social progress may provide a bird’s eye view of a country’s performance, but they often mask the heterogeneity that exists within countries. Subnational variation in resource endowments and development outcomes can have a significant impact on people’s well-being, but their analysis is frequently overlooked due to the lack of reliable disaggregated administrative data [1].

The need for subnational data in developing countries is particularly critical in the context of infectious disease spillover and climate change. The spatial heterogeneity in socio-economic conditions, disease transmission patterns, and healthcare infrastructure within a country can have a significant impact on the effectiveness of public health response measures to control disease outbreaks [2, 3]. To advance knowledge of disease transmission and control, spatially referenced demographic information, including distinctions based on cohorts and gender, is essential [4]. Unfortunately, such data are frequently inaccessible and available only in countries that undertake comprehensive census surveys [5]. Similarly, in the context of climate change, whose impacts can be highly localized and vary across regions within the same country due to factors such as topography, climate variability, or land use, subnational data can help identify areas that are particularly vulnerable to climate impacts and inform resilience strategies [6]. Without subnational data, researchers have relied on national-level development indicators, which is problematic because such indicators can conceal significant intra-national variation, particularly in large countries [7].

The average interval between nationally representative economic surveys in half of African nations is above 6.5 years, compared to a sub-annual frequency in most wealthy countries [8]. Household surveys, such as those from the Demographic and Health Surveys (DHS) program, have limited repeated observations of the same location (and even the same country) over time [9, 10]. It is estimated that a given African household would appear in a household survey once in 1,000 years, making it difficult to measure changes in well-being over time at a local level [11]. For this reason, researchers have turned to alternative and non-traditional sources of data such as nightlights (NTL), daytime satellite imagery, or mobile phone call detail records [12–15].

NTL have been used as a proxy for economic activity in various studies based on the assumption that the amount of light at night is closely tied to the level of economic activity [16–19]. The reliance on NTL intensity extends beyond measuring economic activity, with researchers employing it in various applications such as studying inequality and assessing the accuracy of official statistics in different political contexts [20]. However, concerns about the reliability of NTL data have emerged, particularly due to the extensive use of outdated and inaccurate data from the now-discontinued Defense Meteorological Satellite Program (DMSP), which ceased production in 2013 [21].

DMSP data suffer from several limitations, including blurred images, geo-location errors, and top-coding, which result in misattributed light sources and an inability to distinguish between areas of low and high light intensity. These issues undermine the accuracy of analyses, as DMSP data often aggregate light intensities, masking important details [22]. This has highlighted the need for a transition to newer and more precise data sources, such as the Visible Infrared Imaging Radiometer Suite (VIIRS).

VIIRS data, available since 2012, offers substantial improvements over DMSP, including 45 times higher spatial resolution and the elimination of blurring and geo-location errors, making it a more reliable alternative for economic analyses [21, 23]. Despite these advantages, the adoption of VIIRS in the economics literature has been limited. Two primary factors contribute to this: Firstly, DMSP data still offers a longer time series spanning from 1992 to 2013, which is advantageous for time series analysis. Secondly, comparing DMSP and VIIRS data directly presents challenges due to differences in their temporal coverage and spatial resolutions [24]. While DMSP data is available annually, ending in 2013, VIIRS offers annual data only for 2015 and 2016. Monthly VIIRS data is available from 2012 onwards, but comparing monthly VIIRS data with annual DMSP data introduces potential inaccuracies due to differing data processing methodologies [24].

To address these challenges, this paper utilizes a novel harmonized NTL dataset developed to bridge the gap between DMSP and VIIRS datasets [25]. This dataset creates a continuous and consistent NTL time series from 1992 to 2021 with high spatial resolution (see S1 Appendix), enabling direct comparisons across global regions.

Our first contribution lies in testing the accuracy of the harmonized NTL dataset in measuring economic activity at a small spatial scale for countries in sub-Saharan Africa. Through this, we aim to provide insights into the effectiveness of the harmonized NTL dataset for economic analyses at fine spatial resolutions in developing regions. Previously, researchers tested the dataset’s accuracy for predicting regional GDP in 2012 for Colombia [26]. We build on this work by focusing on subnational socio-economic activity in sub-Saharan Africa. We rely on household surveys conducted by the DHS as our primary source of information on subnational socio-economic activity. These surveys assess living conditions and household possessions and are available for multiple years and countries. It is worth noting that most previous studies evaluating the accuracy of satellite data compare them against the wealth index derived from DHS data [11, 12, 27, 28].

Our second contribution is that we control for population density. Criticisms of using NTL as a proxy for economic activity highlight the likelihood of high correlation between the density of light and population density [9]. However, few studies so far have evaluated the extent to which NTL vary independently of population density.

Our third contribution is an exploration of whether NTL can also serve as a proxy for human development outcomes, an area that remains largely underexplored in the existing literature. While there are notable exceptions, such as [28, 29], these studies did not use the new harmonized NTL dataset. Moreover, their analysis of HDI relied on data constructed using indicators from the DHS itself, such as the wealth index, education, and health. In contrast, our analysis employs datasets that provide annual gridded data for GDP per capita and the Human Development Index (HDI), which encompasses health, education, and standard of living [7]. Importantly, these subnational indicators are derived entirely from macroeconomic indicators and are not computed using satellite imagery or household surveys.

The efficacy of NTL as a proxy for economic activity, particularly in rural areas, has been a subject of debate in the literature. While some studies found NTL to be unreliable proxies for GDP in rural settings in Indonesia [30], others argue for their reliability in rural areas of Colombia [26]. Finally, our study contributes to this ongoing debate by investigating the heterogeneity in the variation captured by the new harmonized NTL separately in urban versus rural areas and examining the usefulness of NTL as a proxy for economic activities at subnational levels in rural areas.

By quantifying the degree to which the newly available harmonized NTL data correlate with several indicators of economic activity at the subnational level (i.e. household wealth index derived from the DHS, and gridded HDI and GDP per capita) across 34 countries in sub-Saharan Africa, we aim to provide insights into the effectiveness of NTL as a proxy for economic activity. We also consider the degree to which NTL captures additional information beyond simple population density. Table 1 summarizes the datasets utilized in this study, along with their sources and temporal coverage.

Our focus on sub-Saharan Africa is motivated by its unique challenges in obtaining accurate subnational socioeconomic data and its inherent significance as a region experiencing rapid population growth and ongoing economic changes [33], along with a high risk of emerging human diseases [34].

Materials and methods

Data

Demographic and Health Surveys (DHS) wealth index.

The primary dependent variable in our study is derived from the DHS, which are a series of nationally representative and standardized household surveys conducted in low-income countries in Africa and other regions since the 1980s to monitor and evaluate population, health, and nutrition programs. Our sample includes DHS surveys that have geo-coordinates to match them to NTL data. For a comprehensive list of the countries included in our sample, along with the corresponding years of data collection, please refer to S1 Table.

The DHS data are repeated cross-sections rather than a panel survey, since a new sample of clusters is drawn for each round for the same country. The geographic information is available at the level of survey clusters or Primary Sampling Units (PSUs) rather than at the level of households. The clusters are categorized into urban and rural groups, and the cluster location is reported with latitude/longitude coordinates [9]. Most cluster locations are measured using GPS, while a few are derived from gazetteers [35]. However, to ensure anonymity, cluster location points are randomly displaced by up to 2 km for urban clusters and by up to 5 km for rural clusters, with less than 1 percent of rural clusters displaced by up to 10 km [28].

Our primary variable of interest is the DHS wealth index, which is derived from data on households’ ownership of a selected set of assets such as a phone, radio, car, TV, or motorbike; dwelling characteristics like the number of rooms occupied in a home, flooring material, access to electricity, type of drinking water source, as well as other characteristics related to the wealth status. Specifically, we use the Household Recode (HR) survey data for each country, which contains all household attributes, alongside the corresponding GPS dataset for the same year and phase (for details on construction of DHS wealth index, see [31]).

The DHS wealth index is the most widely used variable to capture poverty in studies analyzing predictors of economic well-being [11, 27]. Given the limited availability of comprehensive socioeconomic indicators with high spatial resolution across a wide range of developing countries, it is considered the most suitable option [35]. However, for the reasons listed in the previous section, its coverage is patchy. S1 Table shows that in our sample of 34 countries between 2004 and 2019, Rwanda was surveyed most often (5 times), followed by Ethiopia (4 times), with most countries surveyed only once.

The household-level wealth index factor score is calculated by applying principal component analysis to each household's asset ownership. A categorical household wealth index variable ranging from 1 to 5 is derived from this factor score, where 1 represents the lowest asset levels or “poorest” households and 5 represents the highest asset levels or “richest” households. However, as stated above, the DHS does not provide household locations but rather the average location of a group of households, which is referred to as a household cluster. Therefore, we extract the wealth index across all surveys in our sample and compute the average wealth index across all households within each cluster.
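The cluster-level averaging described above can be sketched as follows. This is a minimal illustration on toy data, not the authors' code; the function name and the `(cluster_id, wealth_index)` record layout are hypothetical and do not correspond to actual DHS variable names.

```python
from collections import defaultdict

def cluster_mean_wealth(households):
    """Average the 1-5 wealth index over the households in each cluster.

    `households` is an iterable of (cluster_id, wealth_index) pairs.
    Both field names are hypothetical, not DHS variable names.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for cluster_id, wealth in households:
        totals[cluster_id] += wealth
        counts[cluster_id] += 1
    return {cid: totals[cid] / counts[cid] for cid in totals}

# Toy data: cluster "A" has three households, cluster "B" has two.
toy = [("A", 1), ("A", 2), ("A", 3), ("B", 5), ("B", 4)]
cluster_means = cluster_mean_wealth(toy)
```

Because the categorical index is averaged, the cluster-level value is continuous between 1 and 5 even though each household value is an integer.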

Fig 1 displays the geographical distribution of DHS clusters included in our sample. Each point on the map represents a cluster, and its color indicates the wealth category of the households surveyed. The countries shaded in gray represent those included in our sample. It is important to note that the DHS wealth index is a relative measure of wealth. It is constructed using country-specific methodologies, which limits its applicability for cross-country comparisons. Therefore, interpretations of the wealth index should be confined to within-country comparisons.

Fig 1. Geographic distribution of DHS survey clusters in Africa.

Base map data from the spData package in R, derived from Natural Earth (public domain). Household Wealth Index data is derived from DHS. The figure depicts geographical distribution of DHS clusters in the sample, with points colored by household wealth category (Poorest to Richest). Countries shaded in gray represent those included in the study. The figure was created by the author.

https://doi.org/10.1371/journal.pone.0318482.g001

Nightlights data.

We use a harmonized NTL dataset [25], which allows us to expand the time-period for our study from 2004 to 2019. The dataset is available yearly, downloadable as rasters at a spatial resolution of 30 arc-seconds (~1 km). Unlike the raw data from the two primary sources—DMSP (1992–2013) and VIIRS (2013–2019)—which are not directly comparable due to limited temporal overlap and differing spatial resolutions, this harmonized dataset bridges these gaps by simulating DMSP-like data from VIIRS. For details on the harmonization process and data sources, please refer to S1 Appendix.

By using this dataset, we not only extend the temporal coverage but also enhance the sensitivity of our nighttime light indicator. The harmonization approach, particularly the simulation of DMSP-like data from VIIRS, improves the capture of medium-to-low light emissions, making the dataset valuable for investigating local development impacts and economic output in Africa [36].

We aggregate these rasters to a coarser resolution of 10 km² to match them with DHS household wealth data using the geographic coordinates at the cluster level in the DHS surveys. To provide additional context, we overlay country boundaries on the NTL data in Fig 2. This overlay allows us to visualize the within-country variation in NTL and compare it with the within-country variation of the Wealth Index as well as HDI and GDP per capita.
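One common way to aggregate a fine raster to a coarser grid is to average non-overlapping blocks of cells. The sketch below illustrates that idea with NumPy. The text does not state whether cell values were averaged or summed, nor the exact aggregation factor, so the use of the mean, the function name, and the factor in the example are all assumptions.

```python
import numpy as np

def block_mean(raster, factor):
    """Aggregate a 2-D raster by averaging non-overlapping factor x factor blocks."""
    rows, cols = raster.shape
    # Trim trailing rows/columns so the shape divides evenly by `factor`.
    rows -= rows % factor
    cols -= cols % factor
    trimmed = raster[:rows, :cols]
    # Split each axis into (number of blocks, cells per block), then average per block.
    blocks = trimmed.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))

# Toy 4 x 4 raster aggregated by a factor of 2 (hypothetical values).
fine = np.arange(16, dtype=float).reshape(4, 4)
coarse = block_mean(fine, 2)
```

In practice a geospatial library would also carry the georeferencing through the aggregation; this sketch operates on the pixel array only.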

Fig 2. Geolocated DHS clusters in Africa, colored by NTL.

Base map data from the spData package in R, derived from Natural Earth (public domain). Nighttime lights data (NTL) is derived from Li et al. (2020). The figure depicts geolocated DHS clusters across Africa, overlaid with harmonized NTL data. Clusters are colored by NTL intensity (low to high), with darker shades representing higher light emissions. Countries included in the study are shaded in gray. The figure was created by the author.

https://doi.org/10.1371/journal.pone.0318482.g002

Population density data.

Population density data were sourced from the CIESIN Gridded Population of the World (GPWv4) [32] at a spatial resolution of 2.5 arc-minutes (approximately 5 km), available at 5-year intervals. For the years between these intervals, we used linear interpolation to estimate population density, following previous research [37]. As with NTL, these data were then aggregated to 10 km² and combined with DHS data for the years 2004 to 2019.
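Linear interpolation between 5-year benchmarks can be sketched with `numpy.interp`. The density values and the function name below are hypothetical and serve only to show the calculation for a single grid cell.

```python
import numpy as np

def interpolate_density(known_years, known_values, target_years):
    """Linearly interpolate population density between benchmark years."""
    return np.interp(target_years, known_years, known_values)

# Hypothetical densities for one grid cell at 5-year benchmarks.
known_years = [2005, 2010, 2015]
known_density = [100.0, 150.0, 250.0]

years = np.arange(2005, 2016)
density = interpolate_density(known_years, known_density, years)
# 2007 lies two fifths of the way from 2005 to 2010: 100 + 0.4 * 50 = 120.
```

Note that `numpy.interp` holds the end values constant outside the range of benchmark years, so years beyond the last benchmark would need a separate extrapolation rule.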

Alternative dependent variables: HDI and GDP per capita.

Along with the DHS wealth index, we use two alternative dependent variables: GDP per capita and the HDI. GDP per capita stands as a core metric for gauging economic performance and is widely employed as an indicator of average living standards or economic prosperity. HDI is often utilized to categorize countries based on their human development levels (health, educational attainment and standard of living) [38], and is a more holistic measure compared to just income or wealth alone. However, the annual release of official global HDI estimates by the Human Development Report Office of the United Nations Development Programme (UNDP) is limited to highly aggregated national level estimates, hindering its application in scenarios requiring sub-national details. Despite being considered a more meaningful metric than income alone, HDI has not replaced income measures for assessing development progress within countries due to lack of availability at sub-national levels. Additionally, the reliance on slow, infrequent, and costly global-scale ground-based data collection for all current HDI estimates severely limits their practical usability beyond cross-national rankings [39].

Recognizing this limitation, subnational annual gridded datasets for GDP per capita and HDI have been produced for the whole world at a spatial resolution of 5 arc-min level (10 km) [7]. For the GDP per capita, these datasets combine both sub-national (based on [40]) and national datasets (from the World Bank dataset and CIA’s World Factbook). Priority is given to reported sub-national data, followed by the utilization of interpolated and extrapolated sub-national data, in conjunction with national averages. For HDI, scaling factors are devised to integrate information from both sub-national and national sources, using national-level data (from UNDP) and subnational-level (from UNDP and census reports where available).

We extracted both these datasets (from [7]) for the years for which they were available for our sample (2004 to 2015). Below, we plot two maps showing the variation in GDP per capita (Fig 3) and HDI data (Fig 4) for the DHS cluster locations. We divide the data into 5 bins of equal intervals (similar to 5 bins of the wealth index) to plot them. Note that there is very little variation within countries for both datasets. This goes against the motivation of creating these datasets to explain sub-national variation.

Fig 3. Geolocated DHS clusters in Africa, colored by GDP per capita.

Base map data from the spData package in R, derived from Natural Earth (public domain). Gross Domestic Product per capita (GDPpc) data is from Kummu et al. (2018). The figure depicts geolocated DHS clusters across Africa, categorized by GDPpc quintiles. Darker colors indicate higher income levels. Countries shaded in gray represent those included in the study. The figure was created by the author.

https://doi.org/10.1371/journal.pone.0318482.g003

Fig 4. Geolocated DHS clusters in Africa, colored by HDI.

Base map data from the spData package in R, derived from Natural Earth (public domain). Human Development Index (HDI) is sourced from Kummu et al. (2018). The figure depicts geolocated DHS clusters in Africa, categorized by HDI quintiles. Darker colors indicate higher levels of human development. Countries shaded in gray represent those included in the analysis. The figure was created by the author.

https://doi.org/10.1371/journal.pone.0318482.g004

Descriptive statistics.

Table 2 presents summary statistics for the full sample in panel (a), for the urban sample in panel (b), and for the rural sample in panel (c). On average, urban areas emit roughly four times as much light (23) as rural areas (5.7). NTL are also less variable across rural areas, with a standard deviation of 13 compared to 21 in urban areas. Similarly, urban areas are, on average, about nine times more densely populated (3,250) than rural areas (363). The standard deviation of 984 in rural areas is also smaller than that in urban areas.

In Fig 5, we present the Spearman correlation coefficient matrix for our variables. As anticipated, there is a strong positive correlation between NTL and population density (0.7). Similarly, we find a high correlation between the HDI and GDP per capita (0.8). When examining the wealth index, NTL display a stronger correlation (0.7) compared to population density (0.5). Additionally, both HDI and GDP per capita show a positive correlation of 0.5 with the wealth index. Notably, while NTL exhibit a relatively high correlation (0.5) with HDI and GDP per capita, population density shows a lower correlation of 0.2 with both indicators.
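For readers unfamiliar with the Spearman coefficient, it is the Pearson correlation computed on ranks, which makes it insensitive to monotone transformations of the variables. The following is a minimal NumPy sketch on toy values; the function names and data are illustrative and are not taken from the paper.

```python
import numpy as np

def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    i = 0
    while i < len(values):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(values) and values[order[j + 1]] == values[order[i]]:
            j += 1
        ranks[order[i:j + 1]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return float(np.corrcoef(average_ranks(x), average_ranks(y))[0, 1])

# Toy values: wealth rises monotonically (but not linearly) with NTL.
ntl = [0.0, 1.0, 4.0, 9.0, 100.0]
wealth = [1.0, 1.5, 2.0, 4.0, 5.0]
rho_monotone = spearman(ntl, wealth)
rho_reversed = spearman(ntl, wealth[::-1])
```

Because only ranks matter, the coefficient is the same whether NTL enter in levels or after a monotone transformation such as the IHS.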

Fig 5. Correlation plot.

Author’s calculations for pairwise correlations among key variables: Wealth Index, Nighttime Light (NTL), Population Density (PopDen), Human Development Index (HDI), and Gross Domestic Product per capita (GDPpc). Wealth Index data is derived from DHS. NTL data is derived from Li et al. (2020). PopDen is sourced from the GPWv4; HDI and GDPpc data are both from Kummu et al. (2018). For more details, see Table 1. The size and color intensity of the circles indicate the strength and direction of the correlations, with darker blue representing stronger positive correlations and lighter shades or red (if present) indicating weaker or negative correlations.

https://doi.org/10.1371/journal.pone.0318482.g005

Methods

Our objective is to evaluate the extent to which harmonized NTL data serve as a proxy for socio-economic outcomes at small spatial levels. To achieve this objective, we employ repeated k-fold cross-validation to assess the predictive accuracy of our models. This method involves randomly splitting the data into k = 10 equally sized folds, and the process is repeated 10 times.

The models are trained on k-1 folds and tested on the remaining fold for each iteration of the cross-validation process. Predictions are then made on the test data, and performance metrics, such as the R-squared, are computed. Within each repetition, this iterative process ensures that each fold is used exactly once as the test dataset. Finally, the out-of-sample R-squared is calculated as the average of the performance metric across all folds and repetitions. This metric represents the proportion of the variation in the response variable that is explained by the model when applied to unseen data, thereby serving as an estimate of the model’s predictive accuracy on new observations.
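The procedure above can be sketched in a few lines of NumPy. This is an illustrative implementation on simulated data, not the authors' code. The function name is hypothetical, and the per-fold R-squared is defined here as one minus the ratio of the residual sum of squares to the total sum of squares around the test-fold mean, which is an assumption since the text does not spell out the formula.

```python
import numpy as np

def oos_r2_repeated_kfold(X, y, k=10, repeats=10, seed=0):
    """Average out-of-sample R-squared of OLS over repeated k-fold splits."""
    rng = np.random.default_rng(seed)
    n = len(y)
    X1 = np.column_stack([np.ones(n), X])  # prepend an intercept column
    scores = []
    for _ in range(repeats):
        # A fresh random partition into k folds for each repetition.
        folds = np.array_split(rng.permutation(n), k)
        for test_idx in folds:
            train_mask = np.ones(n, dtype=bool)
            train_mask[test_idx] = False
            beta, *_ = np.linalg.lstsq(X1[train_mask], y[train_mask], rcond=None)
            pred = X1[test_idx] @ beta
            ss_res = np.sum((y[test_idx] - pred) ** 2)
            ss_tot = np.sum((y[test_idx] - y[test_idx].mean()) ** 2)
            scores.append(1.0 - ss_res / ss_tot)
    return float(np.mean(scores)), len(scores)

# Simulated data: one predictor with a known linear signal plus noise.
rng = np.random.default_rng(42)
x = rng.normal(size=500)
y = 2.0 * x + rng.normal(scale=0.5, size=500)
r2, n_scores = oos_r2_repeated_kfold(x.reshape(-1, 1), y)
```

With k = 10 and 10 repetitions, the reported figure is an average over 100 train/test splits, which reduces the dependence of the estimate on any single random partition.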

We begin by estimating a relationship using Ordinary Least Squares regression, which enables us to examine key associations, including the relationship between NTL with socio-economic outcomes for the pooled sample. Socio-economic outcomes are defined using the three dependent variables measured in this paper: the DHS wealth index, HDI, and GDP per capita. We estimate the models with and without controlling for population density to evaluate the contribution of NTL independently of population density.

Additionally, we examine these relationships within two distinct subsamples: urban and rural. However, we only present results for urban versus rural subsamples in the case of the DHS wealth index, as we found very little difference in the variation captured by NTL for HDI and GDP per capita between urban and rural areas.

Furthermore, to account for the unique characteristics specific to each country and year over the study period, we incorporate country and time fixed effects in all regressions, particularly in those where the DHS wealth index serves as the dependent variable. This is essential due to the lack of direct comparability of the wealth index across different countries or even repeated rounds of survey for the same country. For example, DHS surveys often employ different methodologies over time, making results from different survey rounds potentially inconsistent. For a detailed list of the survey rounds used for each country, refer to S1 Table.

We estimate the following relationship:

$$Y_{cjt} = \beta_0 + \beta_1\,\mathrm{NTL}_{cjt} + \beta_2\,\mathrm{populationdensity}_{cjt} + \Gamma_j + \Gamma_t + e_{cjt} \tag{1}$$

where $Y_{cjt}$ represents the socio-economic outcome in cluster c, country j, and year t, which is measured by three dependent variables: the DHS wealth index, HDI, and GDP per capita. The right-hand-side variables include $\mathrm{NTL}_{cjt}$, which is night light intensity, and $\mathrm{populationdensity}_{cjt}$, which is population density; $\beta_0$ is the intercept and $\beta_1$ and $\beta_2$ are the slope coefficients; $\Gamma_j$ and $\Gamma_t$ are country and year fixed effects, respectively; and $e_{cjt}$ is the error term. Adding country and year fixed effects allows us to explore the relationship of NTL with socio-economic outcomes within specific country and year groups, effectively examining variations within a single survey. Standard errors are clustered at the same level as the fixed effects (country and year), which adjusts for potential correlation between error terms within the same country or year.
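A specification like Eq. (1) can be estimated by including a dummy variable for each country and each year. The sketch below does this with NumPy on simulated data. It is a minimal illustration, not the authors' estimation code: the function name, the simulated coefficients, and the data are hypothetical, and it returns point estimates only, without the clustered standard errors described in the text.

```python
import numpy as np

def ols_with_fixed_effects(y, regressors, country, year):
    """OLS of y on regressors plus country and year dummies.

    Returns only the slope coefficients on `regressors`.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(regressors, dtype=float)

    def dummies(labels):
        levels, codes = np.unique(labels, return_inverse=True)
        # Drop the first level to avoid collinearity with the intercept.
        return np.eye(len(levels))[codes][:, 1:]

    design = np.column_stack([np.ones(len(y)), X, dummies(country), dummies(year)])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1:1 + X.shape[1]]

# Simulated data with known slopes (0.3 on NTL, 0.1 on population density).
rng = np.random.default_rng(0)
n = 600
country = rng.integers(0, 3, size=n)
year = rng.integers(2004, 2006, size=n)
ntl = rng.normal(size=n)
pop = 0.7 * ntl + rng.normal(size=n)  # correlated with NTL, as in the data
y = (0.3 * ntl + 0.1 * pop
     + np.array([0.0, 1.0, -2.0])[country]   # country effects
     + 0.5 * (year == 2005)                  # year effect
     + rng.normal(scale=0.1, size=n))
slopes = ols_with_fixed_effects(y, np.column_stack([ntl, pop]), country, year)
```

Dropping one dummy per group keeps the design matrix full rank, and the estimated slopes then reflect only variation within country and year groups.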

In the main analysis, all regressions with HDI and GDP per capita as the dependent variables include both country and year fixed effects. However, for robustness, we also present results in S2–S7 Tables for models estimated under alternative specifications: (i) without fixed effects; (ii) with only country fixed effects; and (iii) with only year fixed effects.

We observe that the NTL, population density, and GDP per capita data exhibit right-skewed distributions. To address this skewness, we initially considered taking the logarithm of their values. However, a considerable proportion of observations are zero, for which the logarithm is undefined, and dropping these observations from our samples is not feasible. We therefore opt for the Inverse Hyperbolic Sine (IHS) transformation, which preserves the properties of the log transformation while allowing us to retain zero-valued observations [41].
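The IHS transformation is defined as log(x + sqrt(x² + 1)). It equals zero at x = 0 and approaches log(2x) as x grows, which is why it behaves like a log transformation away from zero. The short sketch below checks both properties numerically; the function name is illustrative.

```python
import math

def ihs(x):
    """Inverse hyperbolic sine: log(x + sqrt(x**2 + 1))."""
    return math.asinh(x)

# Zero is retained rather than dropped, unlike with log(x).
zero_value = ihs(0.0)

# For large x, IHS is nearly identical to log(2x), i.e. log(x) plus a constant.
large = 1000.0
gap = ihs(large) - math.log(2 * large)
```

The constant shift of log(2) does not affect slope coefficients, but the approximation to the log is poor for values close to zero, so percentage interpretations are most reliable for larger values.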

Results

Table 3 displays the outcomes of regressing the DHS wealth index on harmonized NTL, both without and with population density (Columns 1 and 3, respectively). Additionally, Column (2) shows the results of regressing the DHS wealth index only on population density. Since NTL and population density have been transformed using the IHS transformation, which behaves like the logarithm for values that are not too small, their coefficients can be interpreted approximately in terms of percentage changes in these variables.

Table 3. OLS regression results with country and year fixed effects.

https://doi.org/10.1371/journal.pone.0318482.t003

The findings in Table 3 reveal that variations in NTL alone significantly explain variations in the wealth index, yielding an Out-of-Sample R-Squared (OOS-R Squared) of 48% (Column 1). When population density is included as a control variable (Column 3), the OOS-R Squared increases slightly to 49%, suggesting that models incorporating both NTL and population density marginally outperform those without population density.

It is important to note that the relationships estimated do not imply causation; the findings only indicate that higher levels of NTL are associated with higher wealth, holding all other factors constant. All specifications in the analysis include country and year fixed effects, allowing for within-country comparisons. This is essential because the wealth index is not comparable across countries.

To address potential concerns about spatial autocorrelation in the residuals, we conducted supplementary analyses using spatial fixed effects models. Due to computational constraints, spatial models were estimated on a subset of the data. Importantly, the results from these spatial models (see S8 Table) were consistent with the primary analysis presented in Table 3, further reinforcing the robustness of our findings.

It is crucial to conduct separate analyses for urban and rural areas due to differences in their economic activity types, their population density, and lighting characteristics. Previous research [30] observed that satellite-detected NTL mainly represent urban economic activity, consisting of concentrated street lamps and industrial facilities typical of urban settings, while such lights are rarely found in rural villages. To address this, we divided our sample into urban and rural regions and presented the results separately in Table 4. The table includes the effects of NTL, population density, and their combined influence on wealth in urban versus rural areas. While previous research [30] only used VIIRS data for two years and focused exclusively on Indonesia, our analysis utilizes the harmonized NTL dataset (DMSP + VIIRS) for 34 countries, spanning 2004 to 2019. This harmonized dataset enables us to study long-term changes in nighttime light intensity, addressing inconsistencies between the two sources.

Table 4. OLS regression for wealth index in urban vs. rural areas.

https://doi.org/10.1371/journal.pone.0318482.t004

Our findings (Table 4) show that the share of variation in the wealth index explained by NTL in urban areas (43%, Column 3) is more than twice that in rural areas (21%, Column 6). Similarly, the explanatory power of models with population density is more than twice as high in urban areas (36%, Column 4) as in rural areas (16%, Column 7). Furthermore, both NTL and population density are positively correlated with the wealth index, with NTL consistently showing a significantly stronger correlation across all models.
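A split-sample comparison of this kind can be sketched as below. The example assumes, purely for illustration, that lights are brighter and vary more across urban clusters than across rural ones while the slope is the same in both; under that assumption the rural R-squared is lower simply because there is less variation in lights to exploit. The data and parameter values are hypothetical.

```python
import numpy as np

def r_squared(x, y):
    """In-sample R-squared of an OLS regression of y on x with an intercept."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid.var() / y.var()

rng = np.random.default_rng(7)
n = 3000
urban = rng.random(n) < 0.4                      # hypothetical urban indicator

# Assumed: urban lights are bright and dispersed, rural lights dim and tightly clustered.
ntl = np.where(urban, rng.normal(2.0, 1.0, n), rng.normal(0.2, 0.3, n))
wealth = 0.8 * ntl + rng.normal(0.0, 0.8, n)     # same slope in both groups

r2_urban = r_squared(ntl[urban], wealth[urban])
r2_rural = r_squared(ntl[~urban], wealth[~urban])
print(f"urban R2: {r2_urban:.2f}   rural R2: {r2_rural:.2f}")
```

The sketch shows only one possible mechanism; a gap in explained variation could also arise from different slopes or different noise levels in the two samples.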

In both urban and rural settings, including population density as a control variable reduces the magnitude of the coefficient on NTL, but the coefficient remains significant. Although part of the variation in NTL is likely absorbed by variation in population density within the DHS clusters, NTL still retain substantial information about the wealth index beyond what population density captures. The estimates in Table 4 are consistent with the notion that lights serve as a useful proxy for urban economic activity. In rural areas, variation in lights explains a more modest proportion (21%) of the variation in the wealth index.

Overall, our findings using the harmonized NTL data are broadly consistent with previous research [30] for both urban and rural areas. That study of Indonesia [30] found that, for the urban sector, the relationship between NTL and economic activity remains positive regardless of the type of NTL data used. In rural areas, however, economic activity was negatively related to DMSP NTL and positively but imprecisely related to VIIRS NTL. The authors reported very small R-squared values for rural areas (3% using DMSP and 1% using VIIRS), with the VIIRS results statistically insignificant. In contrast, they found much higher R-squared values for urban areas (36% using DMSP and 68% using VIIRS), suggesting that NTL is a better proxy for urban than rural economic activity.

In rural areas, our explained share of variation (21%) exceeds theirs for both DMSP and VIIRS; in urban areas, ours (43%) exceeds their DMSP figure (36%) but not their VIIRS figure (68%). The differences may stem from our use of harmonized NTL data, which addresses inconsistencies between the two sources. However, the dependent variable in the earlier research [30] is regional GDP, whereas our analysis focuses on the DHS wealth index, so the results are not directly comparable.

Next, we estimated Eq 1 using GDP per capita and the Human Development Index (HDI) as the dependent variables, reported in Tables 5 and 6, respectively. In Table 5, we use the new harmonized NTL dataset alongside population density to analyze the variation in GDP per capita within a specific country and year group, incorporating country and year fixed effects.

Table 5. OLS regression for GDP per capita with country and year fixed effects.

https://doi.org/10.1371/journal.pone.0318482.t005

Table 6. OLS regression for HDI with country and year fixed effects.

https://doi.org/10.1371/journal.pone.0318482.t006

Because NTL, population density, and GDP per capita are transformed using the Inverse Hyperbolic Sine (IHS), the coefficient estimates are interpreted as elasticities. The model in column 1 yields a precisely estimated NTL elasticity of 0.035, accompanied by a high OOS R-square of 93 percent. These results should not be interpreted as causal. Instead, read as an elasticity, the estimate suggests that a 1% increase in NTL is associated with roughly 0.035% higher GDP per capita, all else being equal. Similarly, a 1% increase in population density is associated with roughly 0.02% higher GDP per capita.
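The elasticity reading of an IHS-transformed regression can be checked with a short sketch. It uses NumPy's `arcsinh` as the IHS, shows that the transformation approaches log(2x) for large values, and recovers an assumed elasticity of 0.035 from simulated data. The simulated series, the scale constant 800, and the noise level are hypothetical.

```python
import numpy as np

def ihs(x):
    """Inverse hyperbolic sine, log(x + sqrt(x**2 + 1)); defined at zero, unlike log."""
    return np.arcsinh(x)

# For large x the IHS is close to log(2x), which is why IHS-IHS slopes
# are read approximately as elasticities.
x = np.array([1.0, 10.0, 1000.0])
gap = ihs(x) - np.log(2 * x)          # shrinks toward zero as x grows

# Simulated check (hypothetical data) with a true elasticity of 0.035.
rng = np.random.default_rng(1)
ntl = rng.lognormal(mean=3.0, sigma=1.0, size=5000)
gdp = 800.0 * ntl ** 0.035 * rng.lognormal(0.0, 0.05, size=5000)

X = np.column_stack([np.ones_like(ntl), ihs(ntl)])
beta, *_ = np.linalg.lstsq(X, ihs(gdp), rcond=None)
elasticity = beta[1]

# Percentage change in GDP per capita implied by a 1% rise in NTL.
pct_change = (1.01 ** elasticity - 1.0) * 100.0
print(f"estimated elasticity: {elasticity:.4f}")
print(f"implied change for a 1% rise in NTL: {pct_change:.4f}%")
```

The approximation is weakest for values near zero, where the IHS departs from the logarithm, so the elasticity interpretation is most reliable when most observations are well above one.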

The observed variation in GDP per capita is primarily driven by between-country differences rather than within-country disparities. Specifically, controlling for baseline differences between countries through country fixed effects alone explains 91% of the observed variation, as detailed in S2 Table. Furthermore, the consistent OOS R-square across models, regardless of the inclusion of population density, indicates that NTL possesses significant predictive power on its own, independent of population density. In fact, when both NTL and population density are included in the regression (Column 3), population density is not statistically significant and even exhibits a sign inconsistent with expectations.
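The share of variation attributable to between-country differences is the R-squared of a regression on country dummies alone, which equals one minus the ratio of within-country to total sum of squares. A minimal sketch on synthetic data follows; the number of regions per country and the assumed spreads between and within countries are hypothetical.

```python
import numpy as np

def between_share(y, group):
    """R-squared of y regressed on group dummies only (between-group share of variance)."""
    total = np.sum((y - y.mean()) ** 2)
    within = 0.0
    for g in np.unique(group):
        yg = y[group == g]
        within += np.sum((yg - yg.mean()) ** 2)
    return 1.0 - within / total

rng = np.random.default_rng(3)
n_countries, n_regions = 34, 40
country = np.repeat(np.arange(n_countries), n_regions)

country_level = rng.normal(7.0, 1.0, n_countries)                  # assumed wide gaps between countries
y = country_level[country] + rng.normal(0.0, 0.3, country.size)    # assumed narrow gaps within countries

share = between_share(y, country)
print(f"share of variance explained by country dummies: {share:.2f}")
```

When this share is high, a large overall R-squared says little about how well any additional regressor explains differences among regions of the same country.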

We also use an alternative indicator of sub-national economic development. The HDI is often considered a more meaningful metric than income or wealth alone. However, few studies [28, 29] have examined how well NTL predict HDI. Using the new sub-national global HDI dataset [7], we study the variation in HDI explained by NTL and population density within a specific country and year group, employing country and year fixed effects (Table 6).

The model in column 1 explains a substantial proportion of the variation in HDI, with a high OOS R-square of 98%. This suggests that nightlights serve as a suitable proxy for HDI, given significant correlations across all specifications. As with GDP per capita in Table 5, after including population density in the regression (column 3), NTL remain positively and significantly correlated with HDI, while population density has an unexpected sign (and is now statistically significant).

A previous study [28] examined potential channels behind the positive association between NTL and HDI, highlighting improved schooling outcomes, lower infant mortality rates, and increased local economic activity in areas with higher NTL. In our models, country fixed effects alone account for up to 92% of the variation in HDI, as detailed in S3 Table. Thus, between-country differences, rather than within-country disparities, explain most of the observed variation in HDI, and the limited within-country variation is primarily driven by geographical heterogeneity.

Conclusion

Access to consistent and accurate data over long time periods is crucial for understanding trends and formulating informed policy decisions at regional or local levels. For example, analyzing local economic growth patterns or assessing the effects of climate change on regional ecosystems necessitates spatially disaggregated, historical data. NTL data can serve as valuable proxies for measuring development at local levels where alternative indicators may be lacking. The DMSP (discontinued in 2013) and the VIIRS are two widely utilized series of NTL data. While VIIRS data offer improvements over DMSP, including higher precision and updated technology, DMSP is still widely used in economics literature due to its longer time series spanning from 1992 to 2013. Newly available harmonized NTL data are therefore essential to bridging the gap between DMSP and VIIRS datasets.

We utilized a harmonized dataset [25], which integrates DMSP and VIIRS data to offer a continuous NTL time series from 1992 to 2021. This dataset, characterized by its relatively new processing technology and higher precision [42], provides a consistent framework for analyzing economic trends with high spatial resolution. To the best of our knowledge, our research is the first to test the accuracy of the harmonized NTL dataset in measuring economic activity using the household wealth index from the DHS and, alternatively, other indicators of subnational economic development, namely gridded HDI and GDP per capita.

Through our analysis, we make four key contributions to the literature. First, we test the accuracy of the new harmonized NTL dataset in measuring economic activity at a small spatial scale in developing country municipalities. Second, we assess the extent to which NTL vary independently of population density. Third, we explore whether NTL can serve as a proxy for human development (HDI) beyond wealth or income, an area largely unexplored in previous research. Lastly, our study adds to the ongoing debate on the utility of NTL as a proxy for economic activity at subnational levels, specifically focusing on urban versus rural areas.

To evaluate the predictive performance of our models on unseen data (new observations), we use repeated k-fold cross-validation to calculate the out-of-sample R-squared. The results demonstrate that the harmonized NTL data can serve as valuable indicators for studying the wealth index at local levels, with strong explanatory power. Moreover, models that control for population density explain slightly more of the variation in the wealth index than models with NTL alone. Our analysis of the relationship between NTL and the wealth index across urban and rural settings reveals that the share of variation explained by NTL is about twice as high in urban areas as in rural areas. This finding is in line with the literature that finds NTL to be a reliable proxy only for urban areas. Furthermore, using a unique global gridded dataset for both GDP per capita and HDI, our study demonstrates that NTL serve as reliable proxies for both GDP per capita and the HDI at subnational levels. Importantly, both the estimated coefficients and the predictive power of NTL remain significant even after controlling for population density. Additionally, we note that the high share of variation explained by our GDP per capita and HDI models is primarily driven by between-country differences rather than within-country differences.

The implications of this research extend beyond the field of economics to various disciplines requiring sub-national data on economic prosperity. For instance, in the context of climate change, extreme weather events have highly localized impacts that are often obscured when using aggregated GDP data. Using the new harmonized NTL data, which allows studying long time-periods and captures localized economic activity, can help researchers gain a better understanding of the effects of climate change at sub-national levels. Overall, the insights provided in this study can guide researchers in selecting appropriate indicators for their specific contexts and research objectives. This, in turn, contributes to more informed decision-making and policy formulation.

Supporting information

S1 Table. List of countries and DHS waves in the sample.

https://doi.org/10.1371/journal.pone.0318482.s003

(DOCX)

S2 Table. OLS regression for GDP per capita using country fixed effects.

https://doi.org/10.1371/journal.pone.0318482.s004

(DOCX)

S3 Table. OLS regression for HDI using country fixed effects.

https://doi.org/10.1371/journal.pone.0318482.s005

(DOCX)

S4 Table. OLS regression for GDP per capita with year fixed effects.

https://doi.org/10.1371/journal.pone.0318482.s006

(DOCX)

S5 Table. OLS regression for HDI with year fixed effects.

https://doi.org/10.1371/journal.pone.0318482.s007

(DOCX)

S6 Table. OLS regression for GDP with no fixed effects.

https://doi.org/10.1371/journal.pone.0318482.s008

(DOCX)

S7 Table. OLS regression for HDI with no fixed effects.

https://doi.org/10.1371/journal.pone.0318482.s009

(DOCX)

S8 Table. Results from the spatial model with country and year fixed effects.

https://doi.org/10.1371/journal.pone.0318482.s010

(DOCX)

Acknowledgments

The authors sincerely thank Dr. Shanjukta Nath for her guidance in selecting the methodology for this study. We are also deeply grateful to Dr. Nicole Gottdenker for her invaluable comments and constructive feedback, which greatly improved this work. Additionally, we would like to extend our heartfelt thanks to Dr. Aditi Kadam for her generous assistance during the initial phases of this project.

References

  1. Avendano R, Culley C, Balitrand C. Data and diagnostics to leave-no-one-behind. Development Cooperation Report. 2018. https://www.oecd-ilibrary.org/development/development-co-operation-report-2018/data-and-diagnostics-to-leave-no-one-behind_dcr-2018-10-en
  2. Anbarci N, Escaleras M, Register CA. From cholera outbreaks to pandemics: the role of poverty and inequality. The American Economist. 2012 May;57(1):21–31.
  3. Mee P, Alexander N, Mayaud P, Gonzalez FD, Abbott S, de Souza Santos AA, et al. Tracking the emergence of disparities in the subnational spread of COVID-19 in Brazil using an online application for real-time data visualisation: A longitudinal analysis. The Lancet Regional Health–Americas. 2022 Jan 1;5. pmid:35098203
  4. Tatem AJ, Adamo S, Bharti N, Burgert CR, Castro M, Dorelien A, et al. Mapping populations at risk: improving spatial demographic data for infectious disease modeling and metric derivation. Population health metrics. 2012 Dec;10:1–4. pmid:22591595
  5. Noy I, Doan N, Ferrarini B, Park D. Measuring the economic risk of epidemics.
  6. Hallegatte S, Rozenberg J. Climate change through a poverty lens. Nature Climate Change. 2017 Apr;7(4):250–6.
  7. Kummu M, Taka M, Guillaume JH. Gridded global datasets for gross domestic product and Human Development Index over 1990–2015. Scientific data. 2018 Feb 6;5(1):1–5. pmid:29406518
  8. Burke M, Driscoll A, Lobell DB, Ermon S. Using satellite imagery to understand and promote sustainable development. Science. 2021 Mar 19;371(6535):eabe8628. pmid:33737462
  9. Weidmann NB, Theunissen G. Estimating local inequality from nighttime lights. Remote Sensing. 2021 Nov 17;13(22):4624.
  10. World Bank. World development report 2021: Data for better lives. https://openknowledge.worldbank.org/entities/publication/7a8f3bf4-c1ca-5512-bb16-7dcd5eb71007
  11. Yeh C, Perez A, Driscoll A, Azzari G, Tang Z, Lobell D, et al. Using publicly available satellite imagery and deep learning to understand economic well-being in Africa. Nature communications. 2020 May 22;11(1):2583. pmid:32444658
  12. Blumenstock J, Cadamuro G, On R. Predicting poverty and wealth from mobile phone metadata. Science. 2015 Nov 27;350(6264):1073–6. pmid:26612950
  13. Blumenstock JE. Fighting poverty with data. Science. 2016 Aug 19;353(6301):753–4. pmid:27540154
  14. Steele JE, Sundsøy PR, Pezzulo C, Alegana VA, Bird TJ, Blumenstock J, et al. Mapping poverty using mobile phone and satellite data. Journal of The Royal Society Interface. 2017 Feb 28;14(127):20160690. pmid:28148765
  15. Steele JE, Pezzulo C, Albert M, Brooks CJ, zu Erbach-Schoenberg E, O’Connor SB, et al. Mobility and phone call behavior explain patterns in poverty at high-resolution across multiple settings. Humanities and Social Sciences Communications. 2021 Nov 22;8(1):1–2.
  16. Chen X, Nordhaus WD. Using luminosity data as a proxy for economic statistics. Proceedings of the National Academy of Sciences. 2011 May 24;108(21):8589–94. pmid:21576474
  17. Henderson JV, Storeygard A, Weil DN. Measuring economic growth from outer space. American economic review. 2012 Apr 1;102(2):994–1028. pmid:25067841
  18. Keola S, Andersson M, Hall O. Monitoring economic development from space: using nighttime light and land cover data to measure economic growth. World Development. 2015 Feb 1;66:322–34.
  19. Hodler R, Raschky PA. Regional favoritism. The Quarterly Journal of Economics. 2014 May 1;129(2):995–1033.
  20. Bundervoet T, Maiyo L, Sanghi A. Bright lights, big cities: measuring national and subnational economic growth in Africa from outer space, with an application to Kenya and Rwanda. World Bank Policy Research Working Paper. 2015 Oct 28(7461).
  21. Gibson J. Better night lights data, for longer. Oxford Bulletin of Economics and Statistics. 2021 Jun;83(3):770–91.
  22. Gibson J, Olivia S, Boe-Gibson G. Night lights in economics: Sources and uses 1. Journal of Economic Surveys. 2020 Dec;34(5):955–80.
  23. Elvidge CD, Zhizhin M, Hsu FC, Baugh K. What is so great about nighttime VIIRS data for the detection and characterization of combustion sources. Proceedings of the Asia-Pacific Advanced Network. 2013 Jun 10;35(0):33.
  24. Yong Z, Li K, Xiong J, Cheng W, Wang Z, Sun H, et al. Integrating DMSP-OLS and NPP-VIIRS nighttime light data to evaluate poverty in Southwestern China. Remote Sensing. 2022 Jan 26;14(3):600.
  25. Li X, Zhou Y, Zhao M, Zhao X. Harmonization of DMSP and VIIRS nighttime light data from 1992–2021 at the global scale. Figshare. Scientific Data. 2020;7:168. Available from:
  26. Pérez-Sindín XS, Chen TH, Prishchepov AV. Are night-time lights a good proxy of economic activity in rural areas in middle and low-income countries? Examining the empirical evidence from Colombia. Remote Sensing Applications: Society and Environment. 2021 Nov 1;24:100647.
  27. Jean N, Burke M, Xie M, Davis WM, Lobell DB, Ermon S. Combining satellite imagery and machine learning to predict poverty. Science. 2016 Aug 19;353(6301):790–4. pmid:27540167
  28. Bruederle A, Hodler R. Nighttime lights as a proxy for human development at the local level. PloS one. 2018 Sep 5;13(9):e0202231. pmid:30183707
  29. Head A, Manguin M, Tran N, Blumenstock JE. Can human development be measured with satellite imagery?. Ictd. 2017 Nov 16;17:16–9.
  30. Gibson J, Olivia S, Boe-Gibson G, Li C. Which night lights data should we use in economics, and where?. Journal of Development Economics. 2021 Mar 1;149:102602.
  31. Rutstein SO, Johnson K. The DHS Wealth Index. DHS Comparative Reports No. 6. Calverton (MD): ORC Macro; 2004.
  32. Center For International Earth Science Information Network-CIESIN-Columbia University (2016). Gridded Population of the World, Version 4 (GPWv4): Population Count [Data set]. Palisades, NY: NASA Socioeconomic Data and Applications Center (SEDAC). https://doi.org/10.7927/H4X63JVC. Accessed 15 June 2023.
  33. Akintunde TS, Oladeji PA. Population dynamics and economic growth in Sub-Saharan Africa. Population. 2013;4(13):148–57.
  34. Jones KE, Patel NG, Levy MA, Storeygard A, Balk D, Gittleman JL, et al. Global trends in emerging infectious diseases. Nature. 2008 Feb;451(7181):990–3. pmid:18288193
  35. Weidmann NB, Schutte S. Using night light emissions for the prediction of local wealth. Journal of Peace Research. 2017 Mar;54(2):125–40.
  36. Schon J, Koren O. Introducing AfroGrid, a unified framework for environmental conflict research in Africa. Scientific data. 2022 Mar 29;9(1):116. pmid:35351878
  37. Schmidt JP, Park AW, Kramer AM, Han BA, Alexander LW, Drake JM. Spatiotemporal fluctuations and triggers of Ebola virus spillover. Emerging Infectious Diseases. 2017 Mar;23(3):415. pmid:28221131
  38. UNDP (United Nations Development Programme). Human Development Report 2023–24: Breaking the Gridlock: Reimagining Cooperation in a Polarized World. New York: UNDP; 2024. https://hdr.undp.org/content/human-development-report-2023-24
  39. Sherman L, Proctor J, Druckenmiller H, Tapia H, Hsiang SM. Global high-resolution estimates of the United Nations Human Development Index using satellite imagery and machine-learning. National Bureau of Economic Research; 2023 Mar 20.
  40. Gennaioli N, La Porta R, Lopez-de-Silanes F, Shleifer A. Human capital and regional development. The Quarterly journal of economics. 2013 Feb 1;128(1):105–64.
  41. Bellemare MF, Barrett CB, Just DR. The welfare impacts of commodity price volatility: Evidence from rural Ethiopia. Duke University and Cornell University. 2011.
  42. Sono D, Wei Y, Chen Z, Jin Y. Spatiotemporal evolution of West Africa’s urban landscape characteristics applying harmonized DMSP-OLS and NPP-VIIRS nighttime light (NTL) data. Chinese Geographical Science. 2022 Dec;32(6):933–45. pmid:36408473