Effects of Missing Data Patterns and Rates on the Accuracy of ARIMA Model for Electricity Consumption Forecasting
Abstract
This methodological study investigates how missing data patterns and missing rates affect the forecasting accuracy of the ARIMA model, using monthly electricity consumption data for Thailand from 2002 to 2025. A full factorial design was used to simulate missing data under the Missing Completely at Random (MCAR) and Missing at Random (MAR) mechanisms at missing rates of 10%, 20%, and 30%, with 50 replications per experimental condition. Three imputation methods widely used in time series forecasting were applied: Last Observation Carried Forward (LOCF), linear interpolation, and Kalman filtering. The results show that the imputation method is the most influential factor affecting forecasting accuracy, followed by the missing data rate. Kalman filtering consistently produced the lowest Root Mean Squared Error (RMSE) and remained stable across experimental conditions, whereas linear interpolation consistently yielded the highest RMSE. The performance of some methods, particularly LOCF, varied substantially with the proportion of missing data. These findings suggest that Kalman filtering is a robust and appropriate approach for handling incomplete energy time series and can support long-term energy consumption planning under data uncertainty.
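The factorial design described in the abstract can be illustrated with a small sketch. It is not the authors' code, and it makes several assumptions: the series is synthetic (trend, annual seasonality, and noise) rather than the Thai consumption data; the MAR mechanism is assumed to depend on the calendar month, which the abstract does not specify; only LOCF and linear interpolation are implemented; and the score is the RMSE of the imputed values against the true values, not the ARIMA forecast RMSE reported in the study. All function names are hypothetical.

```python
import math
import random


def make_series(n=120, seed=0):
    """Synthetic monthly series: trend + annual seasonality + noise (illustrative only)."""
    rng = random.Random(seed)
    return [100 + 0.5 * t + 10 * math.sin(2 * math.pi * t / 12) + rng.gauss(0, 2)
            for t in range(n)]


def mcar_mask(n, rate, rng):
    """MCAR: every interior point is equally likely to be missing."""
    k = round(rate * n)
    mask = [False] * n
    # First and last points stay observed so both imputers are well defined.
    for i in rng.sample(range(1, n - 1), k):
        mask[i] = True
    return mask


def mar_mask(n, rate, rng):
    """MAR (assumed form): missingness depends on the fully observed calendar month."""
    k = round(rate * n)
    candidates = list(range(1, n - 1))
    # Assumption: three months of each year are three times as likely to be missing.
    weights = [3.0 if t % 12 in (3, 4, 5) else 1.0 for t in candidates]
    chosen = set()
    while len(chosen) < k:
        chosen.add(rng.choices(candidates, weights=weights)[0])
    return [i in chosen for i in range(n)]


def locf(values):
    """Last Observation Carried Forward."""
    out = list(values)
    for i in range(1, len(out)):
        if out[i] is None:
            out[i] = out[i - 1]
    return out


def linear_interp(values):
    """Fill each gap on the straight line between its observed neighbours."""
    out = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for a, b in zip(known, known[1:]):
        for i in range(a + 1, b):
            w = (i - a) / (b - a)
            out[i] = values[a] + w * (values[b] - values[a])
    return out


def rmse(truth, filled, mask):
    """RMSE over the positions that were made missing."""
    errs = [(t - f) ** 2 for t, f, m in zip(truth, filled, mask) if m]
    return math.sqrt(sum(errs) / len(errs))


def run_experiment(reps=50, rates=(0.1, 0.2, 0.3)):
    """Full factorial loop: mechanism x rate x method, averaged over replications."""
    truth = make_series()
    mechanisms = (("MCAR", mcar_mask), ("MAR", mar_mask))
    methods = (("LOCF", locf), ("Linear", linear_interp))
    results = {}
    for mech_name, mech in mechanisms:
        for rate in rates:
            for meth_name, meth in methods:
                scores = []
                for rep in range(reps):
                    # Same seed per replication, so both methods see the same gaps.
                    mask = mech(len(truth), rate, random.Random(rep))
                    holed = [None if m else v for v, m in zip(truth, mask)]
                    scores.append(rmse(truth, meth(holed), mask))
                results[(mech_name, rate, meth_name)] = sum(scores) / reps
    return results


if __name__ == "__main__":
    for key, value in sorted(run_experiment().items()):
        print(key, round(value, 3))
```

Because this sketch scores imputation error on synthetic data, its ranking of methods need not match the forecast-based ranking reported above. Reproducing the study's comparison would additionally require a Kalman-based imputer and an ARIMA fit on each imputed series, for example through a state space modelling library.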
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.