Date of Award
January 2025
Document Type
Open Access Thesis
Degree Name
Master of Public Health (MPH)
Department
School of Public Health
First Advisor
Bhramar Mukherjee
Second Advisor
Nathan Grubaugh
Abstract
Dengue fever is a growing global public health threat, especially in tropical and subtropical regions such as Sri Lanka, where seasonal outbreaks impose recurring burdens on health systems. Accurate and timely forecasting of dengue incidence is essential for public health preparedness, yet most existing models suffer from key limitations: they rely on historical case trends without accounting for real-time ecological dynamics, and they often require entomological surveillance data that are not consistently available. This thesis addresses these gaps by proposing a machine learning forecasting framework that integrates remote sensing-derived environmental proxies to estimate latent mosquito vector abundance.
Using Principal Component Analysis (PCA), this study constructs a univariate environmental suitability index z(t) from four remotely sensed covariates: NDVI, NDWI, rainfall, and temperature. First differences Δz(t) are also computed to capture ecological shifts preceding outbreaks. These indices, combined with lagged dengue case counts, serve as input features for a Random Forest regressor trained to predict weekly dengue incidence in three climatically distinct districts of Sri Lanka: Colombo, Kandy, and Trincomalee. Evaluated against seasonal ARIMA baselines, the Random Forest model consistently outperforms traditional time series approaches, particularly in capturing peak magnitude and outbreak timing. Wavelet analysis confirms the seasonality structure used in model design. Feature importance analysis further reveals that both recent case trends and lagged environmental changes contribute meaningfully to forecast accuracy. These results validate the hypothesis that remote sensing-derived proxies can enhance dengue prediction in the absence of entomological data.
The findings support the development of generalizable, ecologically grounded early warning systems for dengue and underscore the value of combining machine learning with biologically informed feature engineering. This approach is particularly well-suited for low-resource settings where direct vector surveillance is infeasible but satellite data are readily available.
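The feature construction described in the abstract can be illustrated with a minimal sketch. It is not the thesis code: the data are synthetic, the function names (`standardize`, `first_pc_loadings`, `suitability_index`, `build_features`) are hypothetical, the one-week lag is an assumed value, and the first principal component is obtained by plain power iteration rather than a library PCA routine. The sketch builds z(t) from four standardized covariates, takes first differences Δz(t), and assembles lagged feature rows of the kind a Random Forest regressor could be trained on; the regressor itself is omitted.

```python
import math
import random


def standardize(columns):
    """Z-score each covariate column (mean 0, population std 1)."""
    out = []
    for col in columns:
        mean = sum(col) / len(col)
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / len(col))
        out.append([(v - mean) / std for v in col])
    return out


def first_pc_loadings(columns, iters=500):
    """Leading eigenvector of the covariance matrix, via power iteration."""
    k, n = len(columns), len(columns[0])
    cov = [[sum(columns[i][t] * columns[j][t] for t in range(n)) / n
            for j in range(k)] for i in range(k)]
    w = [1.0] * k
    for _ in range(iters):
        w = [sum(cov[i][j] * w[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    return w


def suitability_index(columns):
    """Project standardized covariates onto the first principal component."""
    z_cols = standardize(columns)
    w = first_pc_loadings(z_cols)
    n = len(z_cols[0])
    return [sum(w[i] * z_cols[i][t] for i in range(len(w))) for t in range(n)]


def build_features(z, cases, lag=1):
    """Rows of [z(t-lag), dz(t-lag), cases(t-lag)] with target cases(t)."""
    dz = [None] + [z[t] - z[t - 1] for t in range(1, len(z))]
    X, y = [], []
    for t in range(lag + 1, len(z)):  # start where dz(t-lag) is defined
        X.append([z[t - lag], dz[t - lag], cases[t - lag]])
        y.append(cases[t])
    return X, y


# Synthetic weekly series (assumption: two years with a shared annual cycle).
random.seed(0)
weeks = 104
season = [math.sin(2 * math.pi * t / 52) for t in range(weeks)]
ndvi = [0.5 + 0.2 * s + random.gauss(0, 0.05) for s in season]
ndwi = [0.1 + 0.1 * s + random.gauss(0, 0.05) for s in season]
rain = [100 + 60 * s + random.gauss(0, 20) for s in season]
temp = [27 + 2 * s + random.gauss(0, 0.5) for s in season]
cases = [max(0, round(50 + 30 * s + random.gauss(0, 5))) for s in season]

z = suitability_index([ndvi, ndwi, rain, temp])
X, y = build_features(z, cases, lag=1)
```

Here `X` and `y` could be passed to any regressor with a fit/predict interface. In a real forecasting setting the standardization and loadings would need to be estimated on the training window only, to avoid leaking future information into z(t).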
Recommended Citation
Ye, Shuyu, "Enhancing Dengue Forecasting With Remotely Sensing-Derived Vector Proxies And Random Forest Models: A Case Study For Sri Lanka" (2025). Public Health Theses. 2570.
https://elischolar.library.yale.edu/ysphtdl/2570

Comments
This is an Open Access Thesis.