TY - JOUR
T1 - Selecting Source Code Generation Tools Based on Bandit Algorithms
AU - Tsunoda, Masateru
AU - Shima, Ryoto
AU - Tahir, Amjed
AU - Bennin, Kwabena Ebo
AU - Monden, Akito
AU - Toda, Koji
AU - Nakasai, Keitaro
PY - 2024/11/11
Y1 - 2024/11/11
N2 - Background: Code generation tools such as GitHub Copilot have received attention due to their performance in generating code. Generally, a prior analysis of their performance is needed to select new code-generation tools from a list of candidates. Without such analysis, there is a higher risk of selecting an ineffective tool, which would negatively affect software development productivity. Additionally, conducting prior analysis of new code generation tools is often time-consuming. Aim: To use a new code generation tool without prior analysis but with low risk, we propose to evaluate the new tools during software development (i.e., online optimization). Method: We apply the bandit algorithm (BA) approach to help select the best code suggestion or generation tool among a list of candidates. Developers evaluate whether the result of the tool is correct or not. When code generation and evaluation are repeated, the evaluation results are saved. We utilize the stored evaluation results to select the best tool based on the BA approach. In our preliminary analysis, we evaluated five tools with 164 code-generation cases using BA. Result: BA approach selected ChatGPT as the best tool as the evaluation proceeded, and during the evaluation, the average accuracy by BA approach outperformed the second-best performing tool. Our results reveal the feasibility and effectiveness of BA in assisting the selection of best-performing code suggestion or generation tools.
AB - Background: Code generation tools such as GitHub Copilot have received attention due to their performance in generating code. Generally, a prior analysis of their performance is needed to select new code-generation tools from a list of candidates. Without such analysis, there is a higher risk of selecting an ineffective tool, which would negatively affect software development productivity. Additionally, conducting prior analysis of new code generation tools is often time-consuming. Aim: To use a new code generation tool without prior analysis but with low risk, we propose to evaluate the new tools during software development (i.e., online optimization). Method: We apply the bandit algorithm (BA) approach to help select the best code suggestion or generation tool among a list of candidates. Developers evaluate whether the result of the tool is correct or not. When code generation and evaluation are repeated, the evaluation results are saved. We utilize the stored evaluation results to select the best tool based on the BA approach. In our preliminary analysis, we evaluated five tools with 164 code-generation cases using BA. Result: BA approach selected ChatGPT as the best tool as the evaluation proceeded, and during the evaluation, the average accuracy by BA approach outperformed the second-best performing tool. Our results reveal the feasibility and effectiveness of BA in assisting the selection of best-performing code suggestion or generation tools.
U2 - 10.1587/transinf.2024IIL0001
DO - 10.1587/transinf.2024IIL0001
M3 - Letter
SN - 0916-8532
JO - IEICE Transactions on Information and Systems
JF - IEICE Transactions on Information and Systems
ER -