Effects of Missing Data Patterns and Rates on the Accuracy of ARIMA Model for Electricity Consumption Forecasting

Nassamon Bootwisas
Uparittha Intarasat

Abstract

This study investigates, from a methodological perspective, how missing data patterns and missing rates affect the forecasting accuracy of the ARIMA model, using monthly electricity consumption data for Thailand from 2002 to 2025. A full factorial design was used to simulate missing data under the Missing Completely at Random (MCAR) and Missing at Random (MAR) mechanisms at missing rates of 10%, 20%, and 30%, with 50 replications for each experimental condition. Three imputation methods widely used in time series forecasting were applied: Last Observation Carried Forward (LOCF), linear interpolation, and Kalman filtering. The results show that the imputation method is the most influential factor affecting forecasting accuracy, followed by the missing data rate. Kalman filtering consistently produced the lowest Root Mean Squared Error (RMSE) and remained stable across experimental conditions, whereas linear interpolation consistently yielded the highest RMSE. In addition, the performance of some methods, particularly LOCF, varied substantially with the proportion of missing data. These findings suggest that Kalman filtering is a robust and appropriate approach for handling incomplete energy time series and can support long-term energy consumption planning under data uncertainty.
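The simulation design described in the abstract can be illustrated with a minimal sketch. It is not the authors' code, and it rests on several assumptions: the series is synthetic (trend plus 12-month seasonality plus noise) rather than the Thai consumption data, only the MCAR mechanism is simulated, only LOCF and linear interpolation are implemented, and the RMSE is computed on the imputed points themselves rather than on ARIMA forecasts. Kalman filtering, the MAR mechanism, and ARIMA fitting are left out to keep the example dependency-free. All function names are hypothetical, and the numbers it produces are not comparable to the study's results.

```python
import math
import random


def make_series(n=120, seed=0):
    """Synthetic monthly series (assumption): trend + 12-month seasonality + noise."""
    rng = random.Random(seed)
    return [100 + 0.5 * t + 10 * math.sin(2 * math.pi * t / 12) + rng.gauss(0, 1)
            for t in range(n)]


def mask_mcar(series, rate, rng):
    """Replace a fixed share of points with None, chosen uniformly at random (MCAR).
    The first and last points are always kept so both imputers are well defined."""
    n = len(series)
    k = round(rate * n)
    missing = set(rng.sample(range(1, n - 1), k))
    return [None if i in missing else x for i, x in enumerate(series)]


def locf(masked):
    """Last Observation Carried Forward: repeat the most recent observed value."""
    out, last = [], None
    for x in masked:
        if x is not None:
            last = x
        out.append(last)
    return out


def linear_interp(masked):
    """Fill each gap on the straight line between its two observed neighbours."""
    out = list(masked)
    known = [i for i, x in enumerate(masked) if x is not None]
    for a, b in zip(known, known[1:]):
        for i in range(a + 1, b):
            w = (i - a) / (b - a)
            out[i] = masked[a] + w * (masked[b] - masked[a])
    return out


def rmse_on_missing(truth, filled, masked):
    """RMSE over the positions that were masked (imputation error, not forecast error)."""
    errs = [(t - f) ** 2 for t, f, m in zip(truth, filled, masked) if m is None]
    return math.sqrt(sum(errs) / len(errs))


def run_experiment(rates=(0.10, 0.20, 0.30), replications=50, seed=42):
    """Mean imputation RMSE per (missing rate, method) over repeated MCAR masks."""
    truth = make_series()
    rng = random.Random(seed)
    results = {}
    for rate in rates:
        scores = {"LOCF": [], "linear": []}
        for _ in range(replications):
            masked = mask_mcar(truth, rate, rng)  # same mask for both methods
            scores["LOCF"].append(rmse_on_missing(truth, locf(masked), masked))
            scores["linear"].append(rmse_on_missing(truth, linear_interp(masked), masked))
        for name, vals in scores.items():
            results[(rate, name)] = sum(vals) / len(vals)
    return results


if __name__ == "__main__":
    for (rate, name), value in sorted(run_experiment().items()):
        print(f"rate={rate:.0%}  method={name:<6}  mean RMSE={value:.3f}")
```

Each replication applies the same mask to both methods, so differences in RMSE come from the imputation method and not from which points happened to be removed. Extending the sketch toward the study's design would mean adding a MAR mask, a state-space (Kalman) imputer, and an ARIMA fit on each imputed series, scored against held-out observations.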

How to Cite
Bootwisas, N., & Intarasat, U. (2026). Effects of Missing Data Patterns and Rates on the Accuracy of ARIMA Model for Electricity Consumption Forecasting. Journal of Science Ladkrabang, 35(1), 35–54. Retrieved from https://li01.tci-thaijo.org/index.php/science_kmitl/article/view/269931
Section
Research article