Exploring efficiency improvements to distillation of language models for code
Karthikeyan, Ajaykrishna
Permalink
https://hdl.handle.net/2142/124446
Description
- Title
- Exploring efficiency improvements to distillation of language models for code
- Author(s)
- Karthikeyan, Ajaykrishna
- Issue Date
- 2024-05-01
- Director of Research (if dissertation) or Advisor (if thesis)
- Zhang, Lingming
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- M.S.
- Degree Level
- Thesis
- Keyword(s)
- machine learning
- language models for code
- distillation
- fine-tuning
- efficiency
- subsampling
- Abstract
- Large Language Models (LLMs) trained on code have demonstrated great success across a variety of code-domain tasks, including code generation, program repair, and test generation. These models are increasingly adopted both in industry for practical software development and in academia to address software engineering challenges that previously relied on heuristic-based methods or conventional machine learning techniques. While prompting can be an easy way to deploy these models for certain tasks, fine-tuning them for specific tasks often yields more reliable results. Fine-tuning is therefore widely prevalent in industry and is still used for several tasks in academia. However, as these models grow in size, a major drawback of fine-tuning is its high cost, arising primarily from the extensive GPU compute time required; this time is especially large when the full development and debugging cycle of fine-tuning is considered. In this study, we aim to reduce these costs by shortening fine-tuning time while minimally impacting predictive performance. We primarily explore data subsampling techniques that shrink the fine-tuning dataset, and thereby its associated costs, focusing specifically on fine-tuning in the context of model distillation. Previous research has used functional-correctness-based methods for data subsampling, which can be effective for standalone programming tasks; however, many real-world coding tasks involve complexities such as setting up environments or integrating external resources, making functional-correctness evaluation infeasible. Our work explores alternative sampling techniques that rely only on the probabilistic outputs of models rather than on code execution. Through extensive empirical evaluation on an MBPP-based dataset, we show the feasibility of reducing fine-tuning time while maintaining performance. More specifically, we demonstrate that our methodology can achieve a 37% reduction in fine-tuning time with negligible impact on predictive performance, and in some instances, up to a 75% reduction without any loss in performance.
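The abstract's core idea — subsampling a distillation dataset using only the teacher's probabilistic outputs, with no code execution — can be sketched as follows. This is an illustrative heuristic only, not the thesis's actual method: the function names, the choice of mean token log-probability as the score, and the "keep the least-confident examples" criterion are assumptions made for the sake of the example.

```python
import math

def mean_log_prob(token_probs):
    """Average per-token log-probability of a teacher-generated sequence."""
    return sum(math.log(p) for p in token_probs) / len(token_probs)

def subsample_by_confidence(examples, keep_fraction=0.25):
    """Shrink a distillation dataset using only model probabilities.

    `examples` is a list of (sample, token_probs) pairs, where `token_probs`
    holds the teacher's per-token probabilities for its own output. Here,
    low average log-probability is treated as a proxy for "hard" examples,
    and only the hardest `keep_fraction` of the data is retained for
    fine-tuning the student -- no execution or correctness check needed.
    """
    scored = sorted(examples, key=lambda ex: mean_log_prob(ex[1]))
    k = max(1, int(len(scored) * keep_fraction))
    return [sample for sample, _ in scored[:k]]

# Hypothetical usage: three samples with teacher token probabilities.
data = [
    ("easy_sample", [0.90, 0.95]),
    ("hard_sample", [0.20, 0.30]),
    ("mid_sample", [0.50, 0.60]),
]
kept = subsample_by_confidence(data, keep_fraction=1 / 3)
```

With a third of the data kept, only the lowest-confidence sample survives, so the fine-tuning set (and its GPU cost) shrinks proportionally.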
- Graduation Semester
- 2024-05
- Type of Resource
- Thesis
- Copyright and License Information
- Copyright 2024 Ajaykrishna Karthikeyan
Owning Collections
Graduate Dissertations and Theses at Illinois (PRIMARY)
Graduate Theses and Dissertations at Illinois