Automated classification of unstructured bilingual software bug reports: An industrial case study research

Ömer Köksal*, Bedir Tekinerdogan*

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

8 Citations (Scopus)

Abstract

Software bug report classification is critical for understanding the nature, implications, and causes of software failures, and it enables a fast and appropriate reaction to software bugs. In large-scale projects, however, one must deal with a broad set of bugs of multiple types, and manually classifying them becomes cumbersome and time-consuming. Although several studies have addressed automated bug classification using machine learning techniques, they have mainly focused on academic case studies, open-source software, and unilingual text input. This paper presents our automated bug classification approach, applied and validated in an industrial case study. In contrast to earlier studies, our study is applied to a commercial software system based on unstructured bilingual bug reports written in English and Turkish. The presented approach adopts and integrates machine learning (ML), text mining, and natural language processing (NLP) techniques to support the classification of software bugs. Our results show that bug classification can be automated and that the automated classification even performs better than manual bug classification. Our study shows that the presented approach and the corresponding tools effectively reduce the manual classification time and effort.

Original language: English
Article number: 338
Journal: Applied Sciences (Switzerland)
Volume: 12
Issue number: 1
Early online date: 30 Dec 2021
Publication status: Published - Jan 2022

Keywords

  • Machine learning
  • Natural language processing
  • Software bug classification
  • Text categorization
  • Text mining
