Bug reports identification using multiclassification method


Jantima Polpinij
Khanista Namee
Bancha Luaphol

Abstract

When software defects (bugs) are detected, they must be fixed promptly so that the software performs properly. Bug report classification covers not only binary classification but also multiclassification, and the latter was chosen as the challenge in this study. The proposed method classifies bug reports into three classes: real-bug, enhancement, and task. The method begins with bug report pre-processing, after which the vector representations of the bug reports are used to develop the multiclassifier models. Eight machine learning algorithms were compared: multinomial naïve Bayes, logistic regression, random forest, support vector machines, k-nearest neighbor, extreme gradient boosting, neural networks, and decision trees. Finally, the best-performing classifier was selected as the model for the proposed method and compared with the baseline. Its Matthews correlation coefficient, area under the curve, F1, and accuracy scores improved on the baseline by 4.09%, 2.71%, 1.83%, and 1.69%, respectively.
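As a rough illustration of three of the evaluation measures named in the abstract, the sketch below computes accuracy, F1, and the multiclass Matthews correlation coefficient for the three classes. It is not the authors' implementation: the toy labels are hypothetical, macro averaging for F1 is an assumption (the abstract does not state the averaging scheme), and the MCC uses the standard multiclass generalization computed from per-class true and predicted counts.

```python
import math
from collections import Counter

# The three classes named in the abstract.
CLASSES = ("real-bug", "enhancement", "task")


def accuracy(y_true, y_pred):
    """Fraction of reports whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, classes=CLASSES):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def multiclass_mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient from class counts."""
    s = len(y_true)                                    # total samples
    c = sum(t == p for t, p in zip(y_true, y_pred))    # correct predictions
    t_k = Counter(y_true)                              # true count per class
    p_k = Counter(y_pred)                              # predicted count per class
    labels = set(t_k) | set(p_k)
    numerator = c * s - sum(t_k[k] * p_k[k] for k in labels)
    spread_true = s * s - sum(t_k[k] ** 2 for k in labels)
    spread_pred = s * s - sum(p_k[k] ** 2 for k in labels)
    denom = math.sqrt(spread_true * spread_pred)
    return numerator / denom if denom else 0.0


# Hypothetical toy labels, for illustration only (not data from the study).
y_true = ["real-bug", "real-bug", "real-bug", "enhancement", "enhancement", "task"]
y_pred = ["real-bug", "real-bug", "enhancement", "enhancement", "enhancement", "task"]

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.4f}")
print(f"MCC      = {multiclass_mcc(y_true, y_pred):.4f}")
```

Unlike accuracy, the MCC takes the full distribution of true and predicted classes into account, which makes it a useful complement when the three report types are unevenly represented.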



How to Cite
Polpinij, J., Namee, K., & Luaphol, B. (2022). Bug reports identification using multiclassification method. Science, Engineering and Health Studies, 16, 22020009. https://doi.org/10.14456/sehs.2022.46
Section
Physical sciences
