In this tutorial, you will discover performance measures for evaluating time series forecasts with Python. Time series forecasting generally involves predicting real values, which makes it a regression problem. The performance measures in this tutorial therefore focus on methods for evaluating real-valued predictions.
After completing this tutorial, you will know:
- Basic measures of forecast performance, including residual forecast error and forecast bias.
- Time series forecast error calculations that have the same units as the expected outcomes, such as mean absolute error.
- Widely used error calculations that punish large errors, such as mean squared error and root mean squared error.
A. Forecast Error (or Residual Forecast Error)
The forecast error is calculated as the expected value minus the predicted value. This is called the residual error of the prediction.
forecast error = expected value − predicted value
The forecast error can be calculated for each prediction, providing a time series of forecast errors. The example below demonstrates how the forecast error can be calculated for a series of 5 predictions compared to 5 expected values.
# calculate forecast error
expected = [0.0, 0.5, 0.0, 0.5, 0.0]
predictions = [0.2, 0.4, 0.1, 0.6, 0.2]
forecast_errors = [expected[i]-predictions[i] for i in range(len(expected))]
print('Forecast Errors: %s' % forecast_errors)
-----Result-----
Forecast Errors: [-0.2, 0.09999999999999998, -0.1, -0.09999999999999998, -0.2]
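Values such as 0.09999999999999998 in the result are binary floating-point artifacts, not forecasting mistakes. The sketch below is one way to tidy them for display, assuming that rounding to a fixed number of decimals is acceptable. The helper name forecast_errors_rounded and its ndigits parameter are illustrative and not part of the original example.

```python
# illustrative helper (hypothetical name): compute forecast errors and
# round them for display only
def forecast_errors_rounded(expected, predictions, ndigits=6):
    # both series must line up one-to-one
    if len(expected) != len(predictions):
        raise ValueError('expected and predictions must have the same length')
    # error = expected value - predicted value, paired with zip
    return [round(e - p, ndigits) for e, p in zip(expected, predictions)]

expected = [0.0, 0.5, 0.0, 0.5, 0.0]
predictions = [0.2, 0.4, 0.1, 0.6, 0.2]
print('Forecast Errors: %s' % forecast_errors_rounded(expected, predictions))
```

Rounding here is only for presentation. For any further calculation it is usually safer to keep the unrounded errors.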
B. Mean Forecast Error (or Forecast Bias)
Mean forecast error is calculated as the average of the forecast error values.
mean forecast error = mean(forecast error)
Forecast errors can be positive or negative, so they can cancel each other out when averaged. An ideal mean forecast error is zero. A mean forecast error other than zero suggests a tendency of the model to over-forecast (negative error) or under-forecast (positive error).
As such, the mean forecast error is also called the forecast bias. It can be calculated directly as the mean of the forecast error values.
# calculate mean forecast error
expected = [0.0, 0.5, 0.0, 0.5, 0.0]
predictions = [0.2, 0.4, 0.1, 0.6, 0.2]
forecast_errors = [expected[i]-predictions[i] for i in range(len(expected))]
bias = sum(forecast_errors) / len(forecast_errors)
print('Bias: %f' % bias)
-----Result-----
Bias: -0.100000
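To make the sign convention concrete, the sketch below computes the bias for three made-up forecasts: one that is always too high, one that is always too low, and one whose errors cancel out. The data and the helper name forecast_bias are hypothetical and chosen only for illustration.

```python
# illustrative helper (hypothetical name): mean of (expected - predicted)
def forecast_bias(expected, predictions):
    errors = [e - p for e, p in zip(expected, predictions)]
    return sum(errors) / len(errors)

# made-up values for illustration only
expected = [1.0, 2.0, 3.0, 4.0]
over = [e + 0.5 for e in expected]    # every prediction is too high
under = [e - 0.5 for e in expected]   # every prediction is too low
mixed = [2.0, 1.0, 4.0, 3.0]          # errors of -1 and +1 cancel out

print('Over-forecast bias: %f' % forecast_bias(expected, over))
print('Under-forecast bias: %f' % forecast_bias(expected, under))
print('Mixed bias: %f' % forecast_bias(expected, mixed))
```

The over-forecast gives a bias of -0.5 and the under-forecast gives 0.5. The mixed forecast has a bias of zero even though every prediction is off by 1.0, so a bias of zero does not by itself mean the forecasts are accurate.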
C. Mean Absolute Error
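The mean absolute error, or MAE, is calculated as the average of the forecast error values, where all of the forecast error values are forced to be positive by taking their absolute value. Because the errors are not squared, the MAE is in the same units as the predictions.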
mean absolute error = mean(abs(forecast error))
# calculate mean absolute error
from sklearn.metrics import mean_absolute_error
expected = [0.0, 0.5, 0.0, 0.5, 0.0]
predictions = [0.2, 0.4, 0.1, 0.6, 0.2]
mae = mean_absolute_error(expected, predictions)
print('MAE: %f' % mae)
-----Result-----
MAE: 0.140000
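The same value can be reproduced without scikit-learn by following the formula mean(abs(forecast error)) directly. This is a minimal sketch, and the function name mean_absolute_error_manual is illustrative rather than a library function.

```python
# illustrative helper (hypothetical name): MAE computed from the formula
def mean_absolute_error_manual(expected, predictions):
    # absolute value of each forecast error
    abs_errors = [abs(e - p) for e, p in zip(expected, predictions)]
    # average of the absolute errors
    return sum(abs_errors) / len(abs_errors)

expected = [0.0, 0.5, 0.0, 0.5, 0.0]
predictions = [0.2, 0.4, 0.1, 0.6, 0.2]
mae = mean_absolute_error_manual(expected, predictions)
print('MAE: %f' % mae)
```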
D. Mean Squared Error
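The mean squared error, or MSE, is calculated as the average of the squared forecast error values. Squaring forces the error values to be positive and also puts more weight on large errors.
mean squared error = mean(forecast error^2)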
# calculate mean squared error
from sklearn.metrics import mean_squared_error
expected = [0.0, 0.5, 0.0, 0.5, 0.0]
predictions = [0.2, 0.4, 0.1, 0.6, 0.2]
mse = mean_squared_error(expected, predictions)
print('MSE: %f' % mse)
-----Result-----
MSE: 0.022000
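The effect of squaring can be seen by comparing two made-up forecasts that have the same mean absolute error but spread the error differently. The data and helper names below are hypothetical and serve only as an illustration.

```python
# illustrative helpers (hypothetical names)
def mae_manual(expected, predictions):
    return sum(abs(e - p) for e, p in zip(expected, predictions)) / len(expected)

def mse_manual(expected, predictions):
    return sum((e - p) ** 2 for e, p in zip(expected, predictions)) / len(expected)

# made-up values for illustration only
expected = [0.0, 0.0, 0.0, 0.0]
spread = [1.0, 1.0, 1.0, 1.0]   # four small errors of 1.0
spike = [0.0, 0.0, 0.0, 4.0]    # one large error of 4.0

print('Spread: MAE=%f MSE=%f' % (mae_manual(expected, spread), mse_manual(expected, spread)))
print('Spike:  MAE=%f MSE=%f' % (mae_manual(expected, spike), mse_manual(expected, spike)))
```

Both forecasts have an MAE of 1.0, but the MSE is 1.0 for the spread errors and 4.0 for the single large error.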
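E. Root Mean Squared Error
The root mean squared error, or RMSE, is calculated as the square root of the mean squared error. Taking the square root transforms the error back into the original units of the predictions.
rmse = sqrt(mean squared error)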
# calculate root mean squared error
from sklearn.metrics import mean_squared_error
from math import sqrt
expected = [0.0, 0.5, 0.0, 0.5, 0.0]
predictions = [0.2, 0.4, 0.1, 0.6, 0.2]
mse = mean_squared_error(expected, predictions)
rmse = sqrt(mse)
print('RMSE: %f' % rmse)
-----Result-----
RMSE: 0.148324
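For reference, the RMSE can also be computed from the forecast errors using only the standard library. This is a minimal sketch, and the function name root_mean_squared_error_manual is illustrative.

```python
from math import sqrt

# illustrative helper (hypothetical name): square root of the mean squared error
def root_mean_squared_error_manual(expected, predictions):
    squared_errors = [(e - p) ** 2 for e, p in zip(expected, predictions)]
    mse = sum(squared_errors) / len(squared_errors)
    return sqrt(mse)

expected = [0.0, 0.5, 0.0, 0.5, 0.0]
predictions = [0.2, 0.4, 0.1, 0.6, 0.2]
rmse = root_mean_squared_error_manual(expected, predictions)
print('RMSE: %f' % rmse)
```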