Aftab Hussain
University of Houston

From Query to Usable Code - An Analysis of Stack Overflow Code Snippets

Di Yang, Aftab Hussain, Cristina Lopes
Mondego Group
Bren School of Information and Computer Sciences
University of California, Irvine

Supported by the National Science Foundation
Accepted at the 13th International Conference on Mining Software Repositories (MSR 2016), Austin, Texas

Skills used: Java

100+ citations on Google Scholar

Besides being useful to software developers, annotated Stack Overflow snippets could serve as the basis for automated tools that return working code in response to specific natural language queries. Towards this goal, we investigated the compilability of Stack Overflow code snippets. We analyzed a total of 3 million code snippets across four languages: C#, Java, JavaScript, and Python. Python and JavaScript had the largest share of usable snippets, while Java and C# had the lowest usability rates.

● Compiled 300,000+ Stack Overflow Java snippets. Designed and implemented automatic repair heuristics that improved their parse rate from 6.22% to 19.24%.
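
The measure-then-repair idea can be illustrated with a small sketch. It is not the study's tooling: it checks Python snippets (one of the four languages analyzed) with Python's built-in `ast` module, and the repair step, `strip_repl_prompts`, is a hypothetical heuristic invented for illustration. The names `parses`, `parse_rate`, and `SAMPLE` are likewise assumptions, and the sample snippets are made up.

```python
import ast


def parses(snippet: str) -> bool:
    """Return True if the snippet is syntactically valid Python."""
    try:
        ast.parse(snippet)
        return True
    except (SyntaxError, ValueError):
        return False


def strip_repl_prompts(snippet: str) -> str:
    """Hypothetical repair: drop '>>> ' and '... ' prompts pasted from a REPL."""
    repaired = []
    for line in snippet.splitlines():
        if line.startswith(">>> ") or line.startswith("... "):
            repaired.append(line[4:])
        else:
            repaired.append(line)
    return "\n".join(repaired)


def parse_rate(snippets, repair=None) -> float:
    """Fraction of snippets that parse, optionally retrying after a repair."""
    if not snippets:
        return 0.0
    ok = 0
    for snippet in snippets:
        if parses(snippet) or (repair is not None and parses(repair(snippet))):
            ok += 1
    return ok / len(snippets)


# Made-up sample, not data from the study.
SAMPLE = [
    "x = 1\nprint(x)",                  # parses as-is
    ">>> x = 1\n>>> print(x)",          # parses only after prompts are stripped
    "for i in range(3)\n    print(i)",  # missing colon, not repaired
    "def f(:",                          # malformed, not repaired
]

if __name__ == "__main__":
    print(parse_rate(SAMPLE))
    print(parse_rate(SAMPLE, repair=strip_repl_prompts))
```

A snippet counts as parsable if it parses either unchanged or after the repair, so the repaired rate can never be lower than the raw rate. A Java version would need a Java parser in place of `ast` and repairs suited to Java.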


Paper - MSR 2016
Paper - arXiv 2016
