Modeling the winning seed distribution of the NCAA basketball tournament
Khatibi, Arash
Permalink
https://hdl.handle.net/2142/95338
Description
- Title
- Modeling the winning seed distribution of the NCAA basketball tournament
- Author(s)
- Khatibi, Arash
- Issue Date
- 2016-11-23
- Director of Research (if dissertation) or Advisor (if thesis)
- Jacobson, Sheldon H.
- Doctoral Committee Chair(s)
- Jacobson, Sheldon H.
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- M.S.
- Degree Level
- Thesis
- Keyword(s)
- Bracket Challenge
- NCAA Tournament
- Abstract
- The National Collegiate Athletic Association's (NCAA) men's Division I college basketball tournament is an annual competition that draws widespread attention in the United States. Estimating the outcome of each game is a popular activity undertaken by numerous websites, fans, and, more recently, academic researchers. There has been a surge of interest in proposing mathematical methods to model the tournament's results and pick the winners of future games. This thesis analyzes the results of the NCAA basketball tournament since 1985 and proposes several models to capture the winning seed distribution in each round. The Exponential Model estimates the winning probability of each team by modeling the time between a team's successive wins in a round as an exponential random variable. The Exponential Model estimates a zero probability for events that have not occurred in the training data set. The Markov Model addresses this limitation by defining a Markov chain that incorporates each team's wins in prior rounds to estimate its winning probability. Results of these two models are validated using a chi-squared goodness-of-fit test. The Power Model, which is an intelligent tool for generating brackets of winners, quantifies the relative strength of each match-up in a round as a power function of the teams' seed numbers, with the exponent estimated using the historical results. The main problem with the Power Model is the data complications generally caused by the small size of the training data set, especially in later rounds. The Position and Upset Models solve this problem by representing the tournament's games as a binary sequence and estimating the outcome of each game based on the teams' performance in similar games.
While generating a bracket in a forward direction from the first to the last round propagates incorrect picks through the tournament, correctly picking the winners in later rounds automatically fills the bracket for several games in earlier rounds. This motivates developing bidirectional models that pick the winners based on a combination of models in forward and backward directions. The Power, Position, Upset, and bidirectional models are assessed based on the aggregate performance of millions of brackets for the five most recent tournaments (2012-2016). The proposed models allow one to estimate the likelihoods of different seed combinations by applying the estimated winning seed distributions, which accurately summarize the seeds' aggregate performance and provide a deeper understanding of the uncertainty in the games' outcomes.
- Graduation Semester
- 2016-12
- Type of Resource
- text
- Permalink
- http://hdl.handle.net/2142/95338
- Copyright and License Information
- Copyright 2016 Arash Khatibi
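The abstract above describes the Power Model as quantifying the strength of a match-up through a power function of the two seed numbers, with an exponent fitted to historical results. The abstract does not give the exact functional form, so the following is only an illustrative sketch under an assumed form, in which a seed's strength is `seed ** -alpha` and the win probability is the ratio of one strength to the sum of both. The function name `power_win_probability` and the parameter `alpha` are hypothetical and are not taken from the thesis.

```python
def power_win_probability(seed_a: int, seed_b: int, alpha: float) -> float:
    """Probability that seed_a beats seed_b under an ASSUMED power-function form.

    Assumption (not from the thesis): strength(seed) = seed ** -alpha, so a
    lower seed number means a stronger team whenever alpha > 0.
    """
    if seed_a < 1 or seed_b < 1:
        raise ValueError("seed numbers must be positive integers")
    strength_a = seed_a ** -alpha
    strength_b = seed_b ** -alpha
    # Normalize so the two win probabilities in a match-up sum to 1.
    return strength_a / (strength_a + strength_b)


if __name__ == "__main__":
    # alpha = 0 ignores seeding entirely; larger alpha favors the better seed more.
    for alpha in (0.0, 0.5, 1.0, 2.0):
        p = power_win_probability(1, 16, alpha)
        print(f"alpha={alpha}: P(seed 1 beats seed 16) = {p:.4f}")
```

Under this assumed form, a single exponent controls how strongly seeding predicts the outcome. In practice it would be estimated from past results, for example by maximum likelihood, as the abstract's phrase "estimated using the historical results" suggests.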
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois
Dissertations and Theses - Computer Science
Dissertations and Theses from the Dept. of Computer Science