Student Research Reports
A predictive model for West Nile Virus surveillance by detecting inland water eutrophication using Sentinel-2 imagery
Country: United States of America
Student(s): Sarah Blackett, Ishaan Verma, Salil Khare, Benjamin Kwait-Gonchar, and Daisy Li
Grade Level: Secondary School (grades 9-12, ages 14-18)
GLOBE Educator(s): Cassie Soeffing
Contributors: Dr. Rusty Low, IGES, scientist
Peder Nelson, OSU, SME
Dr. Erika Podest, NASA JPL, scientist
Andrew Clark, IGES, EO Researcher and Data Analyst
Peer Mentor: Ria Jain
Report Type(s): International Virtual Science Symposium Report, Mission Mosquito Report
Protocols: Earth As a System, Mosquitoes
Language(s): English
Date Submitted: 01/25/2023
Monitoring mosquito abundance and the factors that contribute to it is crucial for controlling West Nile Virus (WNV), the most prevalent vector-borne disease in the contiguous United States. Because WNV is transmitted primarily through the bites of infected mosquitoes, reported WNV cases are a good indicator of mosquito prevalence in an area. Empirical data from our field research showed a direct correlation between fertilizer concentration and mosquito larvae population in a small trap experiment. However, detecting fertilizer in water bodies across counties and states requires site sampling, which is time-consuming and expensive. This research builds on the well-documented correlation between significant fertilizer presence in a water system and algae blooms: detecting algae in inland water could provide an early warning signal for controlling vector-borne diseases such as WNV. Remote sensing and satellite imagery provide a cost-effective alternative for monitoring inland water bodies such as rivers, lakes, reservoirs, and ponds. We developed a supervised machine learning model using the Naïve Bayes algorithm to predict WNV outbreaks by detecting algae in Sentinel-2 MSI images. The model was trained using high-spatial-resolution (20 m) Sentinel-2 products over Sacramento, California. Algae blooms were extracted from the Sentinel-2 MSI images by estimating chlorophyll-a (Chl-a) with the Normalized Difference Chlorophyll Index (NDCI), which is widely used for ocean color data. To suppress the chlorophyll signal from land vegetation, NDCI was combined with the Normalized Difference Water Index (NDWI) to measure algae presence in water bodies. A time series dataset of algae bloom information was developed from Sentinel-2 images from 2017 to 2021.
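The NDCI and NDWI combination described above could be computed roughly as in the following minimal sketch. It is not the report's actual code. It assumes the usual Sentinel-2 band choices, NDCI = (B5 - B4) / (B5 + B4) and NDWI = (B3 - NIR) / (B3 + NIR), with all bands resampled to a common grid and flattened to equal-length sequences. The function names and both thresholds are hypothetical and would need calibration against field observations.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def algae_fraction(green, red, red_edge, nir, ndwi_min=0.0, ndci_min=0.0):
    """Fraction of water pixels flagged as algae.

    Inputs are equal-length sequences of reflectance for the green (B3),
    red (B4), red-edge (B5), and NIR (B8 or B8A) bands.
    ndwi_min and ndci_min are illustrative assumptions, not values
    taken from the report.
    """
    water = 0
    algae = 0
    for g, r, re, n in zip(green, red, red_edge, nir):
        ndwi = normalized_difference(g, n)   # high over open water
        if ndwi <= ndwi_min:
            continue                         # skip land and vegetation pixels
        water += 1
        ndci = normalized_difference(re, r)  # proxy for chlorophyll-a
        if ndci > ndci_min:
            algae += 1
    return algae / water if water else 0.0


# Three made-up pixels: clear water, algal water, land vegetation.
green = [0.06, 0.08, 0.05]
red = [0.04, 0.04, 0.04]
red_edge = [0.03, 0.07, 0.12]
nir = [0.02, 0.05, 0.40]
fraction = algae_fraction(green, red, red_edge, nir)
```

Masking with NDWI first means the vegetation pixel, which has a strong red-edge signal, never reaches the NDCI test. Only the two water pixels are counted, and one of them is flagged as algae.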
The training dataset was further enriched with features such as water percentage, vegetation percentage, publicly reported algae observations, and the California Department of Public Health’s West Nile Virus 2006-Present dataset. The accuracy of the predictive model ranges from 0.7 to 0.95, depending on the algorithm and the length of the time series used for training. Our research also validated the time lag between an algae bloom and the detection of WNV reported by public health departments. With additional training data, the model could be extended to predict potential WNV outbreaks for any given county from satellite images.
(Keywords: Machine Learning, Mosquito Abundance, Algae, Eutrophication, Fertilizer Runoff, Remote Sensing, West Nile Virus, Sentinel-2, SVM, Support Vector Machine, Naïve Bayes, Prediction Model)
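To illustrate the kind of classifier the abstract describes, the sketch below implements a small Gaussian Naïve Bayes model in plain Python. The report does not state which Naïve Bayes variant was used, so the Gaussian form is an assumption. The class name, the feature order (lagged algae fraction, water %, vegetation %), and the toy training rows are all hypothetical and are not data from the study.

```python
import math
from collections import defaultdict


class GaussianNaiveBayes:
    """Minimal Gaussian Naive Bayes for small numeric feature tables."""

    def fit(self, rows, labels):
        grouped = defaultdict(list)
        for row, label in zip(rows, labels):
            grouped[label].append(row)
        self.stats = {}
        for label, members in grouped.items():
            prior = len(members) / len(rows)
            columns = list(zip(*members))
            means = [sum(col) / len(col) for col in columns]
            # A small floor keeps the variance positive for constant columns.
            variances = [
                sum((x - m) ** 2 for x in col) / len(col) + 1e-9
                for col, m in zip(columns, means)
            ]
            self.stats[label] = (prior, means, variances)
        return self

    def _log_posterior(self, row, label):
        prior, means, variances = self.stats[label]
        score = math.log(prior)
        for x, m, v in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        return score

    def predict(self, row):
        return max(self.stats, key=lambda label: self._log_posterior(row, label))


# Hypothetical rows: [algae fraction from an earlier period, water %, vegetation %].
# Label 1 means WNV was reported in the following period, 0 means it was not.
rows = [
    [0.05, 12.0, 40.0],
    [0.10, 15.0, 35.0],
    [0.08, 10.0, 42.0],
    [0.45, 14.0, 38.0],
    [0.60, 11.0, 36.0],
    [0.55, 13.0, 41.0],
]
labels = [0, 0, 0, 1, 1, 1]

model = GaussianNaiveBayes().fit(rows, labels)
high_algae = model.predict([0.50, 12.0, 39.0])
low_algae = model.predict([0.07, 12.0, 39.0])
```

Using the algae fraction from an earlier period as a feature is one simple way to encode the time lag between a bloom and reported WNV detections. The appropriate lag length would have to be estimated from the time series.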