Exploring efficiency improvements to distillation of language models for code
Karthikeyan, Ajaykrishna
Permalink
https://hdl.handle.net/2142/124446
Description
- Title
- Exploring efficiency improvements to distillation of language models for code
- Author(s)
- Karthikeyan, Ajaykrishna
- Issue Date
- 2024-05-01
- Director of Research (if dissertation) or Advisor (if thesis)
- Zhang, Lingming
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- M.S.
- Degree Level
- Thesis
- Keyword(s)
- machine learning
- language models for code
- distillation
- fine-tuning
- efficiency
- subsampling
- Abstract
- Large Language Models (LLMs) trained on code have demonstrated great success across a variety of code-domain tasks, including code generation, program repair, and test generation. These models are increasingly adopted both in industry for practical software development and in academia to address software engineering challenges that previously relied on heuristic-based methods or conventional machine learning techniques. While prompting can be an easy way to deploy these models for certain tasks, fine-tuning them for specific tasks often yields more reliable results. Fine-tuning is therefore widely prevalent in industry and is still used for several tasks in academia. However, as these models grow in size, a major drawback of fine-tuning is its high cost, arising primarily from the extensive GPU compute time required; this time is especially large when the full development and debugging cycle of fine-tuning is considered. In this study, we aim to reduce these costs by shortening fine-tuning time while minimally impacting predictive performance. We primarily explore data subsampling techniques that shrink the fine-tuning dataset, and thereby its associated costs, focusing specifically on fine-tuning in the context of model distillation. Previous research has used functional-correctness-based methods for data subsampling, which can be effective for standalone programming tasks; however, many real-world coding tasks involve complexities such as setting up environments or integrating external resources, making functional-correctness evaluation infeasible. Our work explores alternative sampling techniques that rely only on the probabilistic outputs of models rather than on code execution. Through extensive empirical evaluation on an MBPP-based dataset, we show the feasibility of reducing fine-tuning time while maintaining performance. More specifically, we demonstrate that our methodology can achieve a 37% reduction in fine-tuning time with negligible impact on predictive performance, and in some instances, up to a 75% reduction without any loss in performance.
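The abstract's core idea — subsampling a distillation dataset using only the teacher's probabilistic outputs, with no code execution — can be sketched as follows. This is an illustrative heuristic only, not the thesis's actual method: the function names, the choice of mean token log-probability as the score, and the "keep the least-confident examples" criterion are assumptions made for the sake of the example.

```python
import math

def mean_log_prob(token_probs):
    """Average per-token log-probability of a teacher-generated sequence."""
    return sum(math.log(p) for p in token_probs) / len(token_probs)

def subsample_by_confidence(examples, keep_fraction=0.25):
    """Shrink a distillation dataset using only model probabilities.

    `examples` is a list of (sample, token_probs) pairs, where `token_probs`
    holds the teacher's per-token probabilities for its own output. Here,
    low average log-probability is treated as a proxy for "hard" examples,
    and only the hardest `keep_fraction` of the data is retained for
    fine-tuning the student -- no execution or correctness check needed.
    """
    scored = sorted(examples, key=lambda ex: mean_log_prob(ex[1]))
    k = max(1, int(len(scored) * keep_fraction))
    return [sample for sample, _ in scored[:k]]

# Hypothetical usage: three samples with teacher token probabilities.
data = [
    ("easy_sample", [0.90, 0.95]),
    ("hard_sample", [0.20, 0.30]),
    ("mid_sample", [0.50, 0.60]),
]
kept = subsample_by_confidence(data, keep_fraction=1 / 3)
```

With a third of the data kept, only the lowest-confidence sample survives, so the fine-tuning set (and its GPU cost) shrinks proportionally.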
- Graduation Semester
- 2024-05
- Type of Resource
- Thesis
- Copyright and License Information
- Copyright 2024 Ajaykrishna Karthikeyan
Owning Collections
Graduate Dissertations and Theses at Illinois (PRIMARY)
Graduate Theses and Dissertations at Illinois