Permalink
https://hdl.handle.net/2142/70725
Description
Title
Outliers in a Linear Regression Model
Author(s)
Miyashita, Hiroshi
Issue Date
1982
Department of Study
Economics
Discipline
Economics
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
Ph.D.
Degree Level
Dissertation
Keyword(s)
Statistics
Abstract
Over the last several decades, the linear regression model has become one of the most widely used tools in the social and physical sciences. Given the data, the least squares method provides the basis for statistical inference. However, researchers frequently feel that regression results are not trustworthy because of possible problems with the data, and these problems have sometimes been ignored in practice. It makes little sense to include all observations without question if some of them are in error or come from a different regime. Such observations are called outliers and should be excluded from the sample or at least treated carefully.
Several test statistics for detecting outliers have been developed. However, tests based on these statistics usually require the assumption that the errors are normally distributed. If the underlying error distribution deviates from the normal, the tests are not trustworthy; this is confirmed by a simulation study. Even if the error distribution is normal, locating more than one outlier correctly is computationally burdensome and sometimes impossible. Therefore, outliers cannot be reliably detected if the error distribution is non-normal or if more than one outlier may be present. One solution to this problem is a Bayesian approach.
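The sensitivity of such tests to non-normal errors can be illustrated with a small Monte Carlo sketch. The abstract does not describe the dissertation's own simulation design, so everything below is an assumption made for illustration: the test statistic (the largest absolute internally studentized residual), the sample size, the single regressor, and the Laplace (double exponential) alternative. The critical value is calibrated by simulation under normal errors, then the same test is applied to outlier-free data with normal and with Laplace errors.

```python
import math
import random


def max_abs_studentized_residual(x, y):
    """Largest |internally studentized residual| from an OLS fit of y on (1, x)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - 2)  # residual variance estimate
    leverage = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    return max(abs(e) / math.sqrt(s2 * (1.0 - h)) for e, h in zip(resid, leverage))


def normal_error(rng):
    return rng.gauss(0.0, 1.0)


def laplace_error(rng):
    # Laplace draw: an exponential magnitude with a random sign.
    magnitude = rng.expovariate(1.0)
    return magnitude if rng.random() < 0.5 else -magnitude


def simulate_statistics(draw_error, x, reps, rng):
    """Simulate the test statistic on outlier-free data (illustrative coefficients)."""
    stats = []
    for _ in range(reps):
        y = [1.0 + 2.0 * xi + draw_error(rng) for xi in x]
        stats.append(max_abs_studentized_residual(x, y))
    return stats


def rejection_rates(n=20, reps=2000, level=0.05, seed=0):
    """Return (rate under normal errors, rate under Laplace errors)."""
    rng = random.Random(seed)
    x = [i / (n - 1) for i in range(n)]

    # Calibrate the critical value under normal errors.
    calibration = sorted(simulate_statistics(normal_error, x, reps, rng))
    critical = calibration[math.ceil((1.0 - level) * reps) - 1]

    def rate(draw_error):
        stats = simulate_statistics(draw_error, x, reps, rng)
        return sum(s > critical for s in stats) / reps

    return rate(normal_error), rate(laplace_error)


if __name__ == "__main__":
    normal_rate, laplace_rate = rejection_rates()
    print(f"false-alarm rate, normal errors:  {normal_rate:.3f}")
    print(f"false-alarm rate, Laplace errors: {laplace_rate:.3f}")
```

Because neither sample contains an outlier, every rejection is a false alarm. Under normal errors the rate should sit near the nominal level, while the heavier-tailed Laplace errors should push it noticeably higher.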
Significance tests have little relevance in a Bayesian approach: we accommodate outliers rather than detect and drop them. Furthermore, we do not need to know how many outliers the sample contains. By constructing an appropriate prior distribution for the presence of outliers, we can derive a posterior distribution of the regression parameters. The underlying error distribution is not restricted to the normal. By introducing a class of symmetric exponential power distributions, which includes the normal as a special case, we can handle situations in which the error distribution is assumed to be non-normal. Hypothesis testing can be done by constructing a Bayesian confidence interval, which lets us test a null hypothesis in the possible presence of outliers.
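For reference, one common parameterization of the symmetric exponential power family is sketched below. The abstract does not state which parameterization the dissertation uses, so the symbols \sigma (scale) and p (shape) are assumptions made here for illustration.

```latex
f(\varepsilon \mid \sigma, p)
  = \frac{p}{2\,\sigma\,\Gamma(1/p)}
    \exp\!\left\{ -\left| \frac{\varepsilon}{\sigma} \right|^{p} \right\},
  \qquad \sigma > 0,\; p > 0 .
```

With p = 2 this is the normal density with variance \sigma^2/2. With p = 1 it is the Laplace (double exponential) density, which has heavier tails than the normal. As p \to \infty it approaches the uniform density on [-\sigma, \sigma].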