The impact of measurement scale on classification performance of inductive learning and statistical approaches
Han, Ingoo
Permalink
https://hdl.handle.net/2142/22935
Description
Title
The impact of measurement scale on classification performance of inductive learning and statistical approaches
Author(s)
Han, Ingoo
Issue Date
1990
Doctoral Committee Chair(s)
Chandler, John S.
Department of Study
Accountancy
Discipline
Accountancy
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
Ph.D.
Degree Level
Dissertation
Keyword(s)
Business Administration, Accounting
Artificial Intelligence
Language
eng
Abstract
This thesis compares inductive learning and statistical methods. It investigates how the measurement scale of the explanatory variables affects the relative performance of a statistical method (probit) and an inductive learning method (ID3). It also examines how correlation structure affects the classification behavior of the two methods.
A comparative analysis of ID3 and probit indicates that their differences in distributional assumptions, in the assumed relationship between independent and dependent variables, and in modeling basis stem largely from the different assumptions the two methods make about the measurement scale of the independent variables. This theoretical discussion leads to two hypotheses: ID3 performs relatively better with nominal variables, and probit performs relatively better with numeric variables.
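To illustrate why ID3 is naturally suited to nominal variables, the following minimal sketch computes the information gain that ID3 uses to choose a splitting attribute. It is not taken from the thesis: the toy data and the names `qualified_opinion` and `industry` are hypothetical, and only the entropy-based criterion itself is standard ID3.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels):
    """Entropy reduction obtained by splitting on one nominal attribute."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Weighted entropy of the class labels within each attribute value.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical toy data (assumption): 1 = bankrupt, 0 = healthy.
status = [1, 1, 1, 0, 0, 0]
qualified_opinion = ["yes", "yes", "yes", "no", "no", "no"]  # separates the classes
industry = ["a", "b", "a", "b", "a", "b"]                    # nearly uninformative

gain_opinion = information_gain(qualified_opinion, status)
gain_industry = information_gain(industry, status)
```

Each distinct attribute value defines its own branch, so the criterion uses no ordering or distance between values. That is the sense in which ID3 treats its inputs as nominal, whereas probit relies on a numeric index of the predictors.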
In the empirical test, simulated data are first used to provide generalizable baseline results. The equality of covariance matrices and the magnitude of correlations are manipulated in addition to the measurement scale. Next, accounting data (bankruptcy prediction) are tested to obtain results more applicable to the accounting domain. ANOVA and regression analysis are used to test the statistical significance of the effects of the measurement scale, the equality of covariance matrices, and the magnitude of correlations on the classification performance of ID3 and probit.
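A simulation of this kind could be set up roughly as follows. This is only a sketch under stated assumptions, since the abstract does not give the actual design: the group means, the cut point of 0 for dichotomizing, the equicorrelation structure, and the names `simulate_group`, `make_dataset`, and `scale_ratio` are all hypothetical.

```python
import math
import random

def simulate_group(n, n_vars, n_binary, mean, rho, scale, rng):
    """Draw n cases of equicorrelated normal variables; dichotomize the first n_binary."""
    rows = []
    for _ in range(n):
        common = rng.gauss(0.0, 1.0)  # shared factor gives pairwise correlation rho
        row = []
        for j in range(n_vars):
            z = math.sqrt(rho) * common + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
            x = mean + scale * z
            if j < n_binary:
                # Assumed rule: cut at 0 to turn a numeric variable into a binary one.
                x = 1.0 if x > 0.0 else 0.0
            row.append(x)
        rows.append(row)
    return rows

def make_dataset(n_per_group=100, n_vars=4, n_binary=2, rho=0.3,
                 scale_ratio=1.0, seed=0):
    """Two populations; scale_ratio != 1 makes their covariance matrices unequal."""
    rng = random.Random(seed)
    g0 = simulate_group(n_per_group, n_vars, n_binary, -0.5, rho, 1.0, rng)
    g1 = simulate_group(n_per_group, n_vars, n_binary, 0.5, rho, scale_ratio, rng)
    return g0 + g1, [0] * n_per_group + [1] * n_per_group

# One design cell: half the variables binary, unequal covariance matrices.
X, y = make_dataset(n_binary=2, scale_ratio=2.0)
```

Varying `n_binary`, `rho`, and `scale_ratio` across runs would give the three manipulated factors. Fitting a probit model and a decision tree to each generated dataset is left out of the sketch.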
The main hypothesis, that the classification accuracy of ID3 relative to probit increases as the proportion of binary variables in the classification model increases, is confirmed by the results from both the simulated data and the bankruptcy data. The empirical results also show that the accuracy of ID3 relative to probit is higher when the variances are unequal among populations than when they are equal.