Exploring large language models as configuration validators: Techniques, challenges, and opportunities
Lian, Xinyu
Permalink
https://hdl.handle.net/2142/124552
Description
Issue Date
2024-04-29
Advisor(s)
Xu, Tianyin
Marinov, Darko
Department of Study
Computer Science
Discipline
Computer Science
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Keyword(s)
Software Engineering
Machine learning
Abstract
Misconfigurations are major causes of software failures. Existing practices rely on developer-written rules or test cases to validate configurations; these are expensive to implement and maintain, and hard to make comprehensive. Machine learning (ML) for configuration validation is considered a promising direction, but it faces challenges such as the need for large-scale field data and for system-specific models that are hard to generalize. Recent advances in Large Language Models (LLMs) show promise in addressing some of these long-standing limitations of ML-based configuration validation.
This thesis presents a first analysis of the feasibility and effectiveness of using LLMs for configuration validation. We empirically evaluate LLMs as configuration validators by developing a generic LLM-based configuration validation framework, named Ciri. Ciri employs effective prompt engineering with few-shot learning based on both valid configuration data and misconfiguration data. Ciri also checks the outputs of LLMs when producing validation results, coping with the hallucination and nondeterminism of LLMs. We evaluate Ciri’s validation effectiveness on eight popular LLMs, using configuration data from ten widely deployed open-source systems.
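As a minimal illustration of this design, the Python sketch below shows one way few-shot prompting and output checking via repeated sampling could be wired together. The function names (build_prompt, validate), the verdict vocabulary, the sample count, and the query_llm callable are assumptions for illustration, not Ciri’s actual interface.

    from collections import Counter

    def build_prompt(config_text, shots):
        """Assemble a validation prompt from few-shot examples.

        Each shot pairs a configuration snippet with its known verdict,
        so the LLM sees both valid and misconfigured examples.
        """
        parts = []
        for example_config, verdict in shots:
            parts.append(f"Configuration:\n{example_config}\nVerdict: {verdict}\n")
        parts.append(f"Configuration:\n{config_text}\nVerdict:")
        return "\n".join(parts)

    def validate(config_text, shots, query_llm, n_samples=5):
        """Query the LLM several times and take a majority vote,
        discarding malformed answers to cope with hallucination
        and nondeterminism."""
        votes = []
        for _ in range(n_samples):
            answer = query_llm(build_prompt(config_text, shots)).strip().lower()
            if answer in ("valid", "misconfigured"):  # reject malformed output
                votes.append(answer)
        if not votes:
            return "undetermined"
        return Counter(votes).most_common(1)[0][0]

Checking each sampled answer against a fixed verdict vocabulary before voting filters out hallucinated output, while the majority vote smooths over nondeterminism across samples.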
Our analysis (1) confirms the potential of using LLMs for configuration validation: for example, Ciri with Claude-3-Opus detects 45 of 51 real-world misconfigurations, outperforming recent configuration validation techniques; (2) explores the design space of LLM-based validators like Ciri, especially prompt engineering with few-shot learning and voting, and finds that using configuration data as shots enhances validation effectiveness; and (3) reveals open challenges: Ciri struggles with certain types of misconfigurations, such as dependency violations and version-specific misconfigurations, and it is biased toward popular configuration parameters, causing both false positives and false negatives.
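The finding that configuration data makes effective shots could be realized by drawing a balanced shot set from both classes, as in the hypothetical sketch below; the dataset variables and the half-and-half balance are assumptions, and the result plugs into the build_prompt sketch above.

    import random

    def sample_shots(valid_configs, misconfigs, k=4, seed=0):
        """Draw a balanced few-shot set: half valid, half misconfigured.

        valid_configs and misconfigs are lists of configuration snippets;
        each shot is labeled with its ground-truth verdict.
        """
        rng = random.Random(seed)
        half = k // 2
        shots = [(c, "valid") for c in rng.sample(valid_configs, half)]
        shots += [(c, "misconfigured") for c in rng.sample(misconfigs, k - half)]
        rng.shuffle(shots)  # avoid positional bias in the prompt
        return shots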
We discuss promising directions to address these challenges and further improve Ciri. Chain-of-Thought (CoT) prompting can mimic the reasoning process of a human expert, making validation more transparent and potentially more accurate. Additionally, LLMs can generate environment-specific scripts to run in the target environment, which can help identify issues such as misconfigured paths, unreachable addresses, missing packages, and invalid permissions. We also plan to explore extending Ciri into a multi-agent framework, in which Ciri interacts with additional tools such as Ctest and Cdep through agent frameworks.
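To make the second direction concrete, a generated script might perform in-environment probes like the following sketch; the parameter names (data.dir, server.address, server.port, compression.codec.binary) are invented for illustration and not tied to any particular system.

    import os
    import shutil
    import socket

    def check_environment(config):
        """Probe the target environment for the misconfiguration classes
        mentioned above: paths, permissions, addresses, and packages."""
        issues = []
        path = config.get("data.dir")
        if path and not os.path.isdir(path):
            issues.append(f"misconfigured path: {path} does not exist")
        elif path and not os.access(path, os.W_OK):
            issues.append(f"invalid permission: {path} is not writable")
        host = config.get("server.address", "localhost")
        port = int(config.get("server.port", 8080))
        try:
            socket.create_connection((host, port), timeout=2).close()
        except OSError:
            issues.append(f"unreachable address: {host}:{port}")
        tool = config.get("compression.codec.binary")
        if tool and shutil.which(tool) is None:
            issues.append(f"missing package: {tool} not found on PATH")
        return issues

Unlike purely static validation, such probes catch misconfigurations that only manifest in a specific deployment environment.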