
Technical Notes

The following data sources were used to train the model:

Dengue case reports: Philippine Integrated Disease Surveillance and Response (PIDSR) data for 12 Philippine cities, covering 2008-2022, provided by the Department of Health (DOH).

Weather data: Manila Observatory provided data from the CAMS (PM2.5), ERA5 (temperature, relative humidity, and wind speed), CHIRPS (rainfall), and MCD19A3CMG (NDVI) datasets. Documentation for all variables is available at https://thinkingmachines.github.io/project-cchain/. For details on the specific data processing and transformations conducted for this dataset, you may contact Manila Observatory.

  • Weekly Maximum Temperature (°C) - Temperature data was derived from the 2m-temperature data from ERA5, where Tmax was determined by taking the maximum temperature per day (00:00 PHT - 23:00 PHT).

  • Weekly Rainfall (mm/day) - Precipitation (PR) was extracted from CHIRPS, which provides total precipitation accumulated from 00:00Z to 23:59Z. PR was extracted per location and no further processing was performed. Note that CHIRPS is a land-only dataset. CHIRPS was selected as the precipitation source after validation against observed station data.

  • Weekly Wind Speed (km/h) - The wind speed (WS) is derived from the eastward and northward components of surface winds. ERA5 provided the hourly u- and v-components of 10m winds used to calculate WS following this formula: $WS = \sqrt{WS_x^2 + WS_y^2}$, where $WS_x$ and $WS_y$ are the u- and v-components, respectively. UTC was converted to PHT by adding 8 hours to the time. The daily values of $WS_x$ and $WS_y$ were determined by taking the average per day (00:00 PHT - 23:00 PHT). Afterwards, the daily WS was calculated. Weekly wind speed was calculated as the barangay land area-weighted average of daily wind speed values over a week.
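
The wind speed steps described above can be sketched in Python as follows. This is a minimal illustration, not Manila Observatory's actual processing code: the function names are hypothetical, timestamps are assumed to be naive UTC datetimes, and the hourly components are assumed to be already grouped by PHT day. Unit conversion to km/h is not covered.

```python
import math
from datetime import datetime, timedelta
from statistics import mean

PHT_OFFSET = timedelta(hours=8)  # PHT = UTC + 8 hours, as stated in the notes


def utc_to_pht(timestamp_utc):
    """Shift a naive UTC timestamp to Philippine time (hypothetical helper)."""
    return timestamp_utc + PHT_OFFSET


def daily_wind_speed(u_hourly, v_hourly):
    """Average the hourly u- and v-components over one PHT day, then combine them."""
    ws_x = mean(u_hourly)  # daily mean eastward component
    ws_y = mean(v_hourly)  # daily mean northward component
    return math.hypot(ws_x, ws_y)  # sqrt(ws_x**2 + ws_y**2)


def area_weighted_mean(values, areas):
    """Average of values weighted by land area (hypothetical helper)."""
    return sum(v * a for v, a in zip(values, areas)) / sum(areas)
```

Because the components are averaged before the magnitude is taken, the daily value is never larger than the average of the hourly speeds, and it is smaller whenever the wind direction changes during the day.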


Model Type:

A Generalized Linear Mixed Model (GLMM), which considers both fixed effects (weather) and random effects (location and time variations).

Outcome:

The probability of a dengue outbreak (yes/no).

Method for Defining an Outbreak:

1. An outbreak is a period of increased disease activity within a specific population. It is identified by analyzing Incidence Rate (IR) data over time.

2. Here's how we determined it:

  • Median Epidemic Curve: For a particular city and disease, we calculated the median IR for each week across several years of data. This creates a central tendency line representing the typical pattern of disease incidence over time in that location.

  • 75th Percentile Threshold: We then calculated the 75th percentile of the values on the median epidemic curve. This threshold is a level of IR that is higher than 75% of the weekly median IRs for that disease in that city.

  • Outbreak Definition: A week is considered an outbreak week if the observed IR for that week is equal to or greater than the 75th percentile threshold established from the median epidemic curve.

  • In simpler terms, an outbreak signifies a period where the disease incidence exceeds what's typically expected based on historical data for that location and disease, considering the population size through the use of IR. This method helps account for seasonal variations and population changes over time.

Weather Variables:

The model uses maximum temperature, rainfall, and wind speed, all lagged by 4 weeks (i.e., weather data from 4 weeks prior).
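
The outbreak definition given under "Method for Defining an Outbreak" can be sketched in Python as follows. This is an illustration under stated assumptions, not the R script of the prediction tool: the record layout `(year, week, incidence_rate)` and the function names are hypothetical, and the percentile interpolation method (`"inclusive"`) is an assumption, since the notes do not say which one was used.

```python
from collections import defaultdict
from statistics import median, quantiles


def outbreak_threshold(records):
    """Build the median epidemic curve and its 75th percentile for one city.

    records: iterable of (year, week, incidence_rate) tuples (assumed layout).
    """
    by_week = defaultdict(list)
    for _year, week, ir in records:
        by_week[week].append(ir)

    # Median IR for each week of the year, taken across all years.
    median_curve = {week: median(values) for week, values in by_week.items()}

    # 75th percentile of the curve values; the interpolation method is an assumption.
    threshold = quantiles(list(median_curve.values()), n=4, method="inclusive")[2]
    return median_curve, threshold


def is_outbreak_week(observed_ir, threshold):
    """A week is an outbreak week if its IR is at or above the threshold."""
    return observed_ir >= threshold
```

Note that the threshold is a single value per city, so every week of the year is compared against the same level.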

Interpretation:

1. Lower maximum temperature 4 weeks before is linked to a higher chance of an outbreak.

2. Higher rainfall 4 weeks before is linked to a slightly higher chance of an outbreak.

3. Wind speed 4 weeks before doesn't significantly influence outbreak prediction.
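
One way to write down a model of this kind is shown below. This is a sketch under assumptions: the notes do not state the link function or the exact random-effects structure, so the logit link, the random intercepts $u_c$ (city) and $v_t$ (time), and all symbols are illustrative. Here $Y_{c,t} = 1$ means week $t$ in city $c$ is an outbreak week, and the subscript $t-4$ marks the 4-week lag.

```latex
\operatorname{logit} P(Y_{c,t} = 1)
  = \beta_0
  + \beta_1 \, T^{\max}_{c,\,t-4}
  + \beta_2 \, PR_{c,\,t-4}
  + \beta_3 \, WS_{c,\,t-4}
  + u_c + v_t,
\qquad
u_c \sim \mathcal{N}(0, \sigma_u^2), \quad
v_t \sim \mathcal{N}(0, \sigma_v^2)
```

In this notation, the three interpretation points correspond to a negative $\beta_1$, a small positive $\beta_2$, and a $\beta_3$ that is not statistically distinguishable from zero.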

Model Performance:

The model discriminates moderately well between outbreak and non-outbreak weeks (AUC = 0.74). Its accuracy is 82%, i.e., 82% of the predictions made by the model are correct. The model is good at identifying actual outbreaks (97% sensitivity). However, it correctly identifies only about half of non-outbreak weeks (50% specificity), so it often predicts outbreaks that don't happen. When it predicts an outbreak, there's a good chance it's real (80% PPV). When it predicts no outbreak, there's still about a 10% chance that an outbreak occurs (90% NPV).
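
For readers unfamiliar with these metrics, the sketch below shows how accuracy, sensitivity, specificity, PPV, and NPV follow from the four cells of a confusion matrix. The function name and the example counts are hypothetical and chosen only for illustration; they are not the model's actual results. AUC is not included because it is computed from predicted probabilities rather than from a single confusion matrix.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard metrics from confusion-matrix counts.

    tp: outbreak weeks predicted as outbreaks
    fp: non-outbreak weeks predicted as outbreaks
    tn: non-outbreak weeks predicted as non-outbreaks
    fn: outbreak weeks predicted as non-outbreaks
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,     # share of all predictions that are correct
        "sensitivity": tp / (tp + fn),     # share of real outbreaks that are caught
        "specificity": tn / (tn + fp),     # share of non-outbreaks correctly cleared
        "ppv": tp / (tp + fp),             # chance a predicted outbreak is real
        "npv": tn / (tn + fn),             # chance a predicted non-outbreak is real
    }


# Hypothetical counts for illustration only.
example = classification_metrics(tp=8, fp=1, tn=9, fn=2)
```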

Disclaimer:

The model is a guide and other factors can influence outbreaks.

Click the hyperlink for the R Script of the Prediction Tool.
