We scraped typhoon data from Weather Underground. Analyzing 3,102 typhoon and tropical depression paths, we found a general downward trend in average latitude (north-south position) from 1949 to 2016. Maximum wind speeds peaked in the 1950s and 2010s. Finally, we developed a machine learning model that forecasts typhoon paths and wind speeds.

## Data Exploration

Data on typhoons and tropical depressions in the Pacific from 1949 to 2016 were scraped from Weather Underground (https://www.wunderground.com/). The dataset includes latitudes, longitudes, wind speeds, and pressures, and was cleaned and preprocessed prior to analysis. The final dataset contains 3,102 typhoons and tropical depressions.

The plot below shows the paths of typhoons and tropical depressions in the Pacific from 1949 to 2016. The color indicates the year, from blue in 1949 to red in 2016.

The charts below show the changes in average latitude, longitude, and wind speed over time. Average latitude shows an observable downtrend, meaning that typhoons and tropical depressions have tended to track at lower latitudes in recent years.
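The yearly averaging behind a trend like this can be sketched as follows. This is a minimal illustration, not the original analysis code: the record layout (`year`, `lat`), the helper names, and the toy values are all assumptions, and the least-squares slope is just one simple way to quantify a downtrend.

```python
from collections import defaultdict
from statistics import mean


def yearly_mean(records, field):
    """Average `field` over all track points, grouped by year."""
    by_year = defaultdict(list)
    for rec in records:
        by_year[rec["year"]].append(rec[field])
    return {year: mean(vals) for year, vals in sorted(by_year.items())}


def linear_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


# Toy records with made-up values (hypothetical, not from the dataset).
records = [
    {"year": 1949, "lat": 22.0},
    {"year": 1949, "lat": 20.0},
    {"year": 1950, "lat": 20.0},
    {"year": 1951, "lat": 19.0},
]

lat_by_year = yearly_mean(records, "lat")
slope = linear_slope(list(lat_by_year.keys()), list(lat_by_year.values()))
print(lat_by_year, slope)
```

A negative slope (degrees of latitude per year) corresponds to the downtrend described above. The same `yearly_mean` helper could be applied to a longitude or wind speed field.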

The chart below shows the average longitude, which has no discernible trend over the years.

Maximum wind speeds (mph) follow a U-shaped curve, peaking in the 1950s to 1960s and again in the 2010s.
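One way to look for such a U-shape is to bucket wind readings by decade. The sketch below assumes a hypothetical record layout (`year`, `wind_mph`) and takes the single highest reading per decade. The original analysis may have aggregated differently, for example by averaging per-storm maxima per year.

```python
def max_by_decade(records, field="wind_mph"):
    """Return the highest value of `field` seen in each decade."""
    peak = {}
    for rec in records:
        decade = rec["year"] // 10 * 10  # e.g. 1957 -> 1950
        peak[decade] = max(peak.get(decade, float("-inf")), rec[field])
    return dict(sorted(peak.items()))


# Toy records with made-up values (hypothetical, not from the dataset).
wind_records = [
    {"year": 1957, "wind_mph": 180},
    {"year": 1952, "wind_mph": 150},
    {"year": 1985, "wind_mph": 140},
    {"year": 2013, "wind_mph": 175},
]

decade_peaks = max_by_decade(wind_records)
print(decade_peaks)
```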

## Typhoon Path Prediction Using Machine Learning

Using a filtered dataset containing typhoons from 1975 to 2012, a Gradient Boosting Model (GBM) was developed to predict the paths and wind speeds of typhoons given the first three days since inception. The model predicted typhoon paths with 95% accuracy over the next 24 hours and 90% accuracy over the next 48 hours, where accuracy is measured as $r^2$ on aggregated typhoon locations. Wind speeds were predicted with 77% accuracy over the next 48 hours. Sample predictions (yellow) superimposed on actual paths (blue) are shown below.
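For reference, the $r^2$ score (coefficient of determination) used as the accuracy measure can be computed as in the sketch below. The function name and the toy values are illustrative assumptions, and how the original analysis aggregated typhoon locations before scoring is not specified here.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Toy latitudes with made-up values (hypothetical, not model output).
actual_lat = [10.0, 12.0, 14.0, 16.0]
predicted_lat = [11.0, 12.0, 14.0, 15.0]

score = r_squared(actual_lat, predicted_lat)
print(score)
```

A score of 1 means the predictions match the observations exactly, while a score of 0 means they do no better than always predicting the mean of the observed values.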

## Summary

Data on typhoons and tropical depressions in the Pacific from 1949 to 2016 were scraped from Weather Underground. Key findings are: 1) there is a visible downward trend in the average latitude of the typhoons over the years, and 2) maximum wind speeds follow a U-shape over time, peaking in the 1950s and from 2010 onwards.

Finally, a gradient boosting model was trained on historical data to predict typhoon paths and wind speeds. Further research could include collecting data from other reliable sources for validation, improving accuracy, and using other evaluation metrics such as mean absolute percentage error (MAPE). Other models, such as recurrent neural networks, may also be explored.
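As a pointer for that follow-up work, MAPE can be computed as sketched below. The function name and toy values are assumptions for illustration. Skipping zero-valued observations is one common convention for avoiding division by zero, not a choice taken from the original analysis.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Observations equal to zero are skipped, since the percentage
    error is undefined for them.
    """
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every actual value is zero")
    return 100 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)


# Toy wind speeds in mph with made-up values (hypothetical, not model output).
actual_wind = [100.0, 80.0, 50.0]
predicted_wind = [90.0, 88.0, 50.0]

wind_mape = mape(actual_wind, predicted_wind)
print(wind_mape)
```

Unlike $r^2$, MAPE is expressed in the same relative terms for every storm, which can make errors easier to compare across weak and strong typhoons.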

Note that the results of this analysis and forecasting model are based on data from Weather Underground, which may be incomplete. Data from the Philippine weather bureau may be explored in future studies.

## Contributors

Javier, P. J. E., Yodico, J. Yap, Sashmir