An analysis on Hidden Markov Model prediction factors using California wildfire and meteorological records

Stochastic prediction methods such as the Hidden Markov Model (HMM) are complex systems. The HMM is well suited to analyzing and predicting time-dependent phenomena and has been widely used in applications such as speech recognition, weather forecasting, and stock market forecasting. In California, forest wildfires are a recurring natural calamity. Wildfires have many contributing causes, and research on prediction methods is much needed. We believe that the occurrence of wildfires can be treated as a time-dependent phenomenon and that the HMM is a good candidate for predicting wildfire probabilities. California wildfire and weather datasets are used with HMMs to generate wildfire predictions, which are then compared to historical records. This project focuses on two approaches. The first approach trains two HMMs: one model identifies patterns associated with a high number of wildfires, and the other identifies patterns associated with a low number of wildfires. Using the trained HMM parameters and the observations in the test data, a likelihood value is computed for each of the two models. The prediction is made by comparing the two likelihood values and is then checked against the actual historical data. The second approach predicts trends in future wildfires. In this approach, an HMM is trained on "n" years of data. This training produces hidden state sequences inside the model. To predict year "n+1", likelihood values are computed over a range of possible observation values, and the value with the maximum likelihood is predicted to be the most likely "next" observation.
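The two approaches can be illustrated with a minimal Python sketch. It is not the implementation used in this project: it assumes a discrete HMM with two hidden states and three observation symbols (0 = cool/wet, 1 = mild, 2 = hot/dry), and the parameter values in HIGH and LOW are invented for illustration rather than trained on the California data. The function names log_likelihood, classify, and predict_next are hypothetical. Likelihoods are computed with the standard scaled forward algorithm.

```python
import numpy as np


def log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) for a discrete HMM (scaled forward algorithm)."""
    alpha = start * emit[:, obs[0]]          # joint prob. of state and first symbol
    scale = alpha.sum()
    log_lik = np.log(scale)
    alpha = alpha / scale                    # rescale to avoid numerical underflow
    for symbol in obs[1:]:
        alpha = (alpha @ trans) * emit[:, symbol]  # propagate states, then emit
        scale = alpha.sum()
        log_lik += np.log(scale)
        alpha = alpha / scale
    return log_lik


# Hypothetical, hand-set parameters (NOT trained values): start, transition, emission.
START = np.array([0.5, 0.5])
TRANS = np.array([[0.8, 0.2],
                  [0.2, 0.8]])
HIGH = (START, TRANS, np.array([[0.10, 0.30, 0.60],
                                [0.05, 0.15, 0.80]]))  # favors hot/dry symbols
LOW = (START, TRANS, np.array([[0.60, 0.30, 0.10],
                               [0.80, 0.15, 0.05]]))   # favors cool/wet symbols


def classify(obs):
    """Approach 1: pick the model under which the test sequence is more likely."""
    return "high" if log_likelihood(obs, *HIGH) > log_likelihood(obs, *LOW) else "low"


def predict_next(history, model, n_symbols=3):
    """Approach 2: try every candidate next symbol and keep the most likely one."""
    scores = [log_likelihood(list(history) + [c], *model) for c in range(n_symbols)]
    return int(np.argmax(scores))


if __name__ == "__main__":
    print(classify([2, 2, 2, 1, 2]))         # mostly hot/dry observations
    print(classify([0, 0, 1, 0, 0]))         # mostly cool/wet observations
    print(predict_next([2, 2, 1, 2], HIGH))  # most likely next symbol under HIGH
```

In practice the parameters would be estimated from the training years, for example with the Baum-Welch algorithm, and the candidate observation values in the second approach would span the discretized range of the wildfire or weather variable being predicted.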