CodeSimilarity: an approach for clustering introductory programming assignments
Osei-Owusu, Jonathan
Permalink
https://hdl.handle.net/2142/108644
Description
- Title
- CodeSimilarity: an approach for clustering introductory programming assignments
- Author(s)
- Osei-Owusu, Jonathan
- Issue Date
- 2020-07-24
- Director of Research (if dissertation) or Advisor (if thesis)
- Xie, Tao
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- M.S.
- Degree Level
- Thesis
- Keyword(s)
- clustering
- programming education
- testing
- Abstract
- Enrollment in introductory programming (CS1) courses continues to surge, and hundreds of CS1 students can produce thousands of submissions for a single problem, all requiring timely feedback and accurate grading. Although the challenge is not exclusive to CS1 courses, instructors of such courses must provide feedback at scale (e.g., to hundreds of students). Because these students have a diverse range of skills and backgrounds, it is essential to differentiate the common strategies and shortcomings of student submissions to a given problem. There is a strong need to cluster submissions by the similarity of their strategies so that instructors can provide customized feedback to students. To fill this need, in this thesis, we present the CodeSimilarity approach, which first automatically generates test data for correct student submissions and then uses semantic program features (i.e., path conditions) to cluster correct student submissions by their strategies. We define the strategy employed by a student submission as the way that the problem space is partitioned into sub-spaces and how the problem is uniquely addressed within each sub-space. In particular, CodeSimilarity leverages automated test generation based on symbolic execution to determine the path conditions for a given submission; comparing the submissions’ path conditions allows us to establish behavioral equivalence relationships with respect to the strategies employed by these submissions. We evaluate CodeSimilarity on four datasets to assess the effectiveness of our approach. The evaluation results show that, by using semantic program features (i.e., path conditions), CodeSimilarity can effectively cluster submissions that employ the same strategy.
- Graduation Semester
- 2020-08
- Type of Resource
- Thesis
- Permalink
- http://hdl.handle.net/2142/108644
- Copyright and License Information
- Copyright 2020 Jonathan Osei-Owusu
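The abstract defines a strategy as the way a submission partitions the problem space into sub-spaces. The sketch below is only an illustration of that idea, not the thesis's implementation. Where CodeSimilarity derives path conditions through symbolic execution, this sketch substitutes brute-force enumeration of a small finite input domain and records the executed lines as a stand-in for a path condition. The three submissions (`sub_a`, `sub_b`, `sub_c`), the absolute-value problem, and all helper names are hypothetical.

```python
import sys
from collections import defaultdict


def trace_path(func, arg):
    """Return the line offsets executed inside func for one input.

    Assumption: the executed-line sequence approximates a path condition.
    """
    path = []
    code = func.__code__

    def tracer(frame, event, _arg):
        if frame.f_code is code:
            if event == "line":
                path.append(frame.f_lineno - code.co_firstlineno)
            return tracer  # keep tracing lines inside func
        return None  # ignore every other frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(arg)
    finally:
        sys.settrace(previous)
    return tuple(path)


def partition(func, domain):
    """Group the inputs in domain by the path they take through func."""
    blocks = defaultdict(set)
    for x in domain:
        blocks[trace_path(func, x)].add(x)
    return frozenset(frozenset(block) for block in blocks.values())


def cluster(submissions, domain):
    """Cluster submissions that induce the same partition of domain."""
    clusters = defaultdict(list)
    for name, func in submissions.items():
        clusters[partition(func, domain)].append(name)
    return sorted(sorted(names) for names in clusters.values())


# Three hypothetical correct submissions for "return the absolute value".
def sub_a(x):
    if x < 0:
        return -x
    return x


def sub_b(x):
    if x >= 0:
        return x
    else:
        return -x


def sub_c(x):
    if x < 0:
        return -x
    if x == 0:
        return 0
    return x


SUBMISSIONS = {"sub_a": sub_a, "sub_b": sub_b, "sub_c": sub_c}
print(cluster(SUBMISSIONS, range(-3, 4)))
# prints [['sub_a', 'sub_b'], ['sub_c']]
```

Here `sub_a` and `sub_b` split the inputs into negative and non-negative values, so they land in one cluster even though their branches are written differently. `sub_c` treats zero as a separate case and forms its own cluster. Line-level tracing cannot see branches packed into a single line (for example a conditional expression), which is one reason a real tool would rely on path conditions from symbolic execution instead.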
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois
Dissertations and Theses - Computer Science
Dissertations and Theses from the Dept. of Computer Science