Toward managing catastrophic AI risks
Mazeika, Mantas
Permalink
https://hdl.handle.net/2142/124375
Description
- Title
- Toward managing catastrophic AI risks
- Author(s)
- Mazeika, Mantas
- Issue Date
- 2024-04-24
- Director of Research (if dissertation) or Advisor (if thesis)
- Forsyth, David
- Doctoral Committee Chair(s)
- Forsyth, David
- Committee Member(s)
- Li, Bo
- Lazebnik, Svetlana
- Krueger, David
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- AI safety
- AI risk
- robustness
- red teaming
- neural trojans
- trojan detection
- alignment
- model stealing
- Abstract
- Artificial intelligence (AI) has rapidly improved over the past decade, leading to widespread adoption of AI systems and demonstrating the potential for AI to greatly benefit society. However, as with any powerful new technology, AI introduces risks that must be managed to fully realize these benefits. Recent breakthroughs in the generality of AI systems have drawn increased attention to AI risks, including those of a potentially catastrophic nature. To help manage these anticipated risks, we take a defense-in-depth approach, combining different areas of AI safety research to address different aspects of AI risk. We present research on making AI systems more robust to adversarial influence, monitoring AIs for hidden behavior and trojans, enabling AIs to understand and adhere to human values, and finally addressing systemic problems to enable increased transparency.
- Graduation Semester
- 2024-05
- Type of Resource
- Thesis
- Copyright and License Information
- Copyright 2024 Mantas Mazeika
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois