Bio-Inspired Credit Risk Analysis
Computational Intelligence with Support Vector Machines
Lean Yu ● Shouyang Wang ● Kin Keung Lai ● Ligang Zhou
Dr. Lean Yu
Institute of Systems Science
Academy of Mathematics and System Science
Chinese Academy of Sciences
55 Zhongguancun East Road,
Haidian District Beijing,
100190, P.R. China

Prof. Dr. Shouyang Wang
Institute of Systems Science
Academy of Mathematics and System Science
Chinese Academy of Sciences
55 Zhongguancun East Road,
Haidian District Beijing,
100190, P.R. China

Prof. Dr. Kin Keung Lai
Department of Management Sciences
City University of Hong Kong
83 Tat Chee Avenue, Kowloon
Hong Kong, P.R. China

Dr. Ligang Zhou
Department of Management Sciences
City University of Hong Kong
83 Tat Chee Avenue, Kowloon
Hong Kong, P.R. China

ISBN 978-3-540-77802-8
e-ISBN 978-3-540-77803-5
Library of Congress Control Number: 2008925546
© 2008 Springer-Verlag Berlin Heidelberg
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting,
reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
in its current version, and permissions for use must always be obtained from Springer-Verlag. Violations
are liable for prosecution under the German Copyright Law.
The use of general descriptive names, registered names, trademarks, etc. in this publication does not
imply, even in the absence of a specific statement, that such names are exempt from the relevant protective
laws and regulations and therefore free for general use.
Cover design: WMXDesign GmbH, Heidelberg, Germany
Printed on acid-free paper
9 8 7 6 5 4 3 2 1
springer.com
Preface
Credit risk evaluation is one of the most important topics in financial risk management. Owing to recent financial crises and the regulatory
requirements of Basel II, credit risk analysis and assessment have become a major
focus of the financial and banking industry. For credit-granting institutions such as commercial banks and credit companies in particular, the
ability to discriminate good customers from bad ones is crucial to the success
of their business. Reliable quantitative models that predict defaults accurately are imperative so that the interested parties can take either
preventive or corrective action. Furthermore, an accurate prediction of credit risk can be translated into a
more efficient use of economic capital. Credit risk
modeling and analysis have therefore become an important issue for the sustainability and profitability of enterprises, and for both the academic
and industrial communities.
In this monograph, the authors integrate recently emerging support
vector machines (SVM) and other computational intelligence techniques
that replicate the principles of bio-inspired information processing for
credit risk modeling and analysis. SVM is selected for its unique features and powerful pattern recognition capability. First, unlike most traditional statistical models, SVM is
a class of data-driven, self-adaptive, nonlinear methods that require no specific assumptions (e.g., normal distribution in statistics) about the
underlying data generating process. This feature is particularly appealing
in practical business situations where data are abundant or easily available even though the theoretical model or the underlying relationship is
unknown. Secondly, SVM performs a nonlinear mapping from the original
input space into a high-dimensional feature space, in which it can construct
a linear discriminant function to replace the nonlinear function in the
original low-dimensional input space. This characteristic also addresses the
curse of dimensionality (the "dimension disaster" problem), because the computational complexity does not
depend on the sample dimension. Thirdly, SVM implements the structural
risk minimization strategy, rather than the empirical risk minimization strategy
used in artificial neural networks (ANN), and finds the separating hyperplane by the margin maximization principle; it therefore possesses good generalization ability.
This feature directly helps SVM escape local minima, which often
occur in the training of ANNs. Furthermore, SVM has been successfully applied to a wide range of practical problems in almost all areas of
business, industry and science. In this sense, SVM has distinct advantages over traditional statistical techniques and ANN
models for analyzing credit risk.
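The mapping and margin maximization described above can be written compactly. The following is the standard soft-margin SVM formulation as a notational sketch; the symbols (the feature map \varphi, the penalty C, the slack variables \xi_i, the multipliers \alpha_i and the kernel K) are conventional notation assumed here, not taken from this preface.

```latex
% Primal problem: maximize the margin while penalizing violations
\min_{w,\,b,\,\xi}\ \frac{1}{2}\,\|w\|^{2} + C \sum_{i=1}^{N} \xi_i
\quad \text{s.t.} \quad
y_i\bigl(w^{\top}\varphi(x_i) + b\bigr) \ge 1 - \xi_i,\qquad \xi_i \ge 0,\qquad i = 1,\dots,N

% Resulting discriminant: linear in feature space, nonlinear in input space
f(x) = \operatorname{sign}\Bigl(\sum_{i=1}^{N} \alpha_i\, y_i\, K(x_i, x) + b\Bigr),
\qquad K(x_i, x) = \varphi(x_i)^{\top}\varphi(x)
```

Because the discriminant depends on the data only through K(x_i, x), the feature map \varphi never has to be evaluated explicitly.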
The main purpose of this monograph is to develop new models
and techniques for evaluating credit risk, to report recent
progress in credit risk modeling via SVM and other computational intelligence techniques, and to present a comprehensive survey of past
research in credit risk modeling for academic researchers and
business practitioners. Accordingly, some of the most important advances in
credit risk modeling with SVM are presented. The book contains 4 parts with a total of 11 chapters, which are briefly described below.
Part I presents an analytical survey of computational intelligence in
credit risk modeling and analysis. In particular, the survey discusses the
factors that affect the credit risk classification capability of SVM. Through
a literature review and analysis, some important implications and future research directions are pointed out. The subsequent chapters take up
these research directions and provide corresponding solutions.
To address the non-optimal parameter selection in the SVM learning
algorithm observed in existing studies, Part II develops two unitary SVM models with optimal parameter selection for credit risk evaluation.
In the first unitary SVM model, presented in Chapter 2, a design of experiment (DOE) method is used to determine the optimal parameters of the
SVM model, and a nearest point algorithm (NPA) is used to
obtain the solution of the SVM model quickly under those parameters. In
the second unitary SVM model, given in Chapter 3, the parameters are determined by a direct search (DS) algorithm. Other parameter selection methods, namely genetic algorithm (GA), grid search
(GS) and design of experiment (DOE), are also run so that
the performance of the different parameter selection methods can be compared when
the proposed unitary SVM models are applied to
credit risk evaluation and analysis.
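Of the parameter selection methods named above, grid search is the simplest to illustrate. The sketch below is only an illustration under stated assumptions: it uses a small least-squares kernel classifier in the spirit of LSSVM, an RBF kernel, a toy XOR-like dataset, and hypothetical names (`train_lssvm`, `grid_search`, `C_GRID`, `SIGMA2_GRID`). None of these are taken from the book's experiments.

```python
import math

def rbf(a, b, sigma2):
    """Gaussian (RBF) kernel between two points."""
    d2 = sum((p - q) ** 2 for p, q in zip(a, b))
    return math.exp(-d2 / (2.0 * sigma2))

def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def train_lssvm(X, y, C, sigma2):
    """Solve the linear system [[0, 1^T], [1, K + I/C]] [b; alpha] = [0; y]."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf(X[i], X[j], sigma2) for j in range(n)]
        row[i + 1] += 1.0 / C  # regularization on the kernel diagonal
        A.append(row)
    sol = solve(A, [0.0] + [float(v) for v in y])
    return sol[0], sol[1:]  # bias b, coefficients alpha

def predict(X_train, b, alpha, sigma2, x):
    """Classify x as +1 or -1 from the sign of the kernel expansion."""
    score = b + sum(a * rbf(xi, x, sigma2) for a, xi in zip(alpha, X_train))
    return 1 if score >= 0 else -1

def accuracy(train, valid, C, sigma2):
    """Validation accuracy of a model trained with the given parameter pair."""
    X_tr, y_tr = train
    X_va, y_va = valid
    b, alpha = train_lssvm(X_tr, y_tr, C, sigma2)
    hits = sum(predict(X_tr, b, alpha, sigma2, x) == t for x, t in zip(X_va, y_va))
    return hits / len(y_va)

def grid_search(train, valid, C_grid, sigma2_grid):
    """Try every (C, sigma2) pair and keep the one with the best validation accuracy."""
    best = None
    for C in C_grid:
        for s2 in sigma2_grid:
            acc = accuracy(train, valid, C, s2)
            if best is None or acc > best[0]:
                best = (acc, C, s2)
    return best

# Toy XOR-like data (hypothetical; labels are +1 / -1).
TRAIN = ([(0, 0), (1, 1), (0, 1), (1, 0), (0.1, 0.1), (0.9, 0.9), (0.1, 0.9), (0.9, 0.1)],
         [1, 1, -1, -1, 1, 1, -1, -1])
VALID = ([(0.2, 0.1), (0.8, 0.9), (0.2, 0.8), (0.9, 0.2)], [1, 1, -1, -1])
C_GRID = [0.1, 1.0, 10.0]
SIGMA2_GRID = [0.1, 1.0, 10.0]

best_acc, best_C, best_sigma2 = grid_search(TRAIN, VALID, C_GRID, SIGMA2_GRID)
print(best_acc, best_C, best_sigma2)
```

Grid search evaluates every pair in the grid, so its cost grows with the product of the grid sizes. That cost is one reason to compare it against search strategies such as DS, GA and DOE.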
According to the analysis in the survey, hybrid and
ensemble models usually achieve better classification performance than
unitary SVM models. For this reason, Part III and Part IV present four
hybrid models and four ensemble models, respectively.
In the first hybrid model of Part III, rough set theory (RST) and SVM
are hybridized into a synergetic model for credit risk classification and
analysis. Unlike the existing hybrid approaches that integrate RST and
SVM, the proposed hybrid model uses SVM for feature selection and then uses RST to generate
classification rules for credit risk evaluation.
To address the computational complexity of SVM, the second hybrid
model incorporates fuzzy set theory (FST) and least squares SVM
(LSSVM) to create a least squares fuzzy SVM (LS-FSVM) for credit risk
assessment. The third hybrid model is a bilateral-weighted fuzzy SVM (FSVM)
that hybridizes SVM and FST for credit risk assessment.
In this new fuzzy SVM model, every sample is treated
as belonging to both the positive and the negative class, but with different memberships,
which are generated by fuzzy set theory. The model is applied to three typical credit datasets and obtains good classification performance. Finally, an
evolving LSSVM model based on genetic algorithm (GA) is proposed for
credit risk analysis and evaluation. This model consists of two main evolutions: input feature evolution and parameter evolution. On the one hand, a
standard GA is first used to search the possible combinations of input features, and the input features selected by the GA are used to train LSSVM. On
the other hand, another GA is used to optimize the parameters of
the feature-evolved LSSVM. For verification, three different credit datasets are used, and satisfactory classification results
are reported.
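The idea of treating every sample as a member of both classes can be made concrete with a small formulation. The following is a sketch of how a bilateral-weighted fuzzy SVM is commonly written, not a quotation from Chapter 6; the membership m_i of sample x_i in the positive class, the two slack variables \xi_i and \eta_i, the penalty C and the feature map \varphi are notation assumed here.

```latex
% Each sample x_i enters twice: as positive with weight m_i
% and as negative with weight (1 - m_i).
\min_{w,\,b,\,\xi,\,\eta}\ \frac{1}{2}\,w^{\top}w
  + C \sum_{i=1}^{N} \bigl[\, m_i\,\xi_i + (1 - m_i)\,\eta_i \,\bigr]

\text{s.t.} \quad
w^{\top}\varphi(x_i) + b \ \ge\ 1 - \xi_i, \qquad
w^{\top}\varphi(x_i) + b \ \le\ -1 + \eta_i, \qquad
\xi_i \ge 0,\quad \eta_i \ge 0,\qquad i = 1,\dots,N
```

With m_i = 1 for every good customer and m_i = 0 for every bad one, only one slack per sample is penalized and the problem reduces to the usual soft-margin SVM.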
Part IV presents four ensemble models. The first uses a multistage ensemble framework to formulate an SVM ensemble learning approach for credit risk evaluation. The second introduces a
metalearning strategy to construct an SVM-based metamodeling ensemble
method; in a sense, the proposed SVM ensemble model is actually an
SVM metamodel. In the third ensemble model, an evolutionary programming (EP) based knowledge ensemble model is proposed for credit risk
evaluation and analysis. In the last chapter of Part IV, a novel intelligent-agent-based multicriteria fuzzy group decision making (GDM) model is proposed as a multicriteria
decision-making (MCDM) tool to support credit risk assessment. Unlike
the commonly used “one-member-one-vote” or “majority-voting-rule” ensemble models, the novel fuzzy GDM model first uses several intelligent agents to evaluate the customers over a number of criteria;
the evaluation results are then fuzzified into fuzzy judgments; and finally
these fuzzy judgments are aggregated and defuzzified into a group consensus that serves as the final group decision measurement.
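The three steps of the fuzzy GDM model (fuzzify, aggregate, defuzzify) can be sketched in a few lines. This is a minimal illustration under assumptions that do not come from the preface: each agent's scores over the criteria are summarized as a triangular fuzzy number (minimum, mean, maximum), aggregation is a weighted average of the components, and defuzzification takes the centroid. The function names and the 0.5 threshold are hypothetical.

```python
from statistics import mean

def fuzzify(scores):
    """Summarize one agent's scores over the criteria as a triangular
    fuzzy number (low, mid, high). Assumed scheme: (min, mean, max)."""
    return (min(scores), mean(scores), max(scores))

def aggregate(fuzzy_judgments, weights):
    """Weighted, component-wise average of triangular fuzzy numbers."""
    total = sum(weights)
    return tuple(
        sum(w * f[i] for w, f in zip(weights, fuzzy_judgments)) / total
        for i in range(3)
    )

def defuzzify(tri):
    """Centroid of a triangular fuzzy number."""
    low, mid, high = tri
    return (low + mid + high) / 3.0

def group_decision(agent_scores, weights, threshold=0.5):
    """Return the crisp group consensus and the resulting decision."""
    judgments = [fuzzify(s) for s in agent_scores]
    consensus = defuzzify(aggregate(judgments, weights))
    return consensus, ("accept" if consensus >= threshold else "reject")

# Three hypothetical agents scoring one applicant on three criteria (0 to 1).
agent_scores = [[0.6, 0.8, 0.7], [0.4, 0.5, 0.6], [0.9, 0.7, 0.8]]
consensus, decision = group_decision(agent_scores, weights=[1, 1, 1])
print(consensus, decision)
```

Unlike majority voting, the aggregated fuzzy number keeps the spread of each agent's judgments until the final defuzzification step, so a strongly dissenting agent shifts the consensus instead of simply being outvoted.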
We would like to thank many colleagues and friends for their help and
support in preparing this monograph. First, we thank Professor Y. Nakamori of Japan Advanced Institute of Science and Technology, Professor
Yongqiao Wang of Zhejiang University of Commerce and Professor Wei
Huang of Huazhong University of Science and Technology for their contributions to the studies in this monograph; three chapters are based on
results that we achieved jointly with them. We would also like to thank several
other scientists for their helpful suggestions and valuable comments on our
research in this area, among them Professor Wuyi Yue of Konan University in Japan, Professor M. Makowski of the International Institute for Applied Systems Analysis in Austria, Professor Serge Hayward of Groupe
ESC Dijon Bourgogne in France and Professor Heping Pan of the International Institute for Financial Prediction in Australia.
Finally, we would like to thank the National Natural Science Foundation
of China (NSFC), the Knowledge Innovation Program of the Chinese Academy of Sciences (CAS), the Academy of Mathematics and Systems Science (AMSS) of CAS, the Hong Kong Research Granting Committee
(RGC), the NSFC/RGC Joint Research Scheme (No. N_CityU110/07) and
City University of Hong Kong for their financial support of our research in
this promising area.
Lean YU
Institute of Systems Science
Academy of Mathematics and Systems Science
Chinese Academy of Sciences
Beijing, 100190, China
Shouyang WANG
Institute of Systems Science
Academy of Mathematics and Systems Science
Chinese Academy of Sciences
Beijing, 100190, China
Kin Keung LAI
Department of Management Sciences
City University of Hong Kong
83 Tat Chee Avenue, Kowloon, Hong Kong
Ligang ZHOU
Department of Management Sciences
City University of Hong Kong
83 Tat Chee Avenue, Kowloon, Hong Kong
December, 2007
List of Figures
Fig. 1.1 Distribution of articles by year
Fig. 1.2 Structure of one typical multilayer perceptron
Fig. 1.3 Separating hyperplane for two separable classes with maximal margin
Fig. 1.4 Performance comparison of different models on different credit datasets
Fig. 2.1 An illustration for DOE parameter search with two iterations
Fig. 3.1 An illustrative example for grid search
Fig. 3.2 Performance of each parameter pair in initial range
Fig. 3.3 Sensitivity analysis of OA on initial space setting
Fig. 3.4 Sensitivity analysis of Se on initial space setting
Fig. 3.5 Sensitivity analysis of Sp on initial space setting
Fig. 4.1 General framework of the hybrid intelligent mining system
Fig. 7.1 General framework of the evolving LSSVM learning paradigm
Fig. 7.2 Input feature selection with GA for LSSVM
Fig. 7.3 Performance of feature evolved LSSVM on three credit datasets
Fig. 8.1 General process of multistage SVM ensemble learning model
Fig. 8.2 Bagging sampling algorithm
Fig. 9.1 A generic metalearning process
Fig. 9.2 An extended metalearning process
Fig. 9.3 SVM-based metamodeling process
Fig. 9.4 Graphical illustration for SVM-based metalearning process
Fig. 9.5 Performance comparisons with the different numbers of base models
Fig. 11.1 An illustrative sketch of the intelligent-agent-based multicriteria GDM model
Fig. 11.2 ROC curve and AUC for two different models
Fig. 11.3 A group decision table for credit scoring
Fig. 11.4 A graphic comparison for different models in German dataset
Fig. 11.5 Improving the global performance of a classifier
List of Tables
Table 1.1 Books about credit risk modeling and analysis
Table 1.2 Comparisons of different credit risk models based on different criteria
Table 1.3 Accuracy comparison of different quantitative models
Table 1.4 SVM models and their factors
Table 2.1 Performance comparison of different methods on German credit dataset
Table 3.1 Independent variables in credit evaluation model for German dataset
Table 3.2 Parameter settings for DS, GA, GS and DOE
Table 3.3 Evaluation results of four methods for LSSVM on German dataset
Table 3.4 Evaluation results of four methods for LSSVM on Australian dataset
Table 3.5 Performance comparisons of different classifiers
Table 4.1 Comparisons of different methods on corporation credit dataset
Table 4.2 Comparisons of different methods on consumer credit dataset
Table 5.1 Variables of the experimental dataset
Table 5.2 Credit risk evaluation results by LS-FSVM
Table 5.3 Performance comparisons of different classifiers
Table 6.1 Empirical results on the dataset 1
Table 6.2 Empirical results on the dataset 2
Table 6.3 Empirical results on the dataset 3
Table 7.1 Selected key features by GA-based feature selection procedure
Table 7.2 Optimal solutions of different parameters for LSSVM
Table 7.3 Computational performance comparisons using different parameter search methods for three credit datasets
Table 7.4 Performance comparisons of different models
Table 8.1 Consumer credit evaluation results with different methods
Table 8.2 Corporation credit evaluation results with different methods
Table 9.1 Performance comparison with different evaluation approaches
Table 10.1 Identification results of MDA and logit regression models
Table 10.2 Identification results of BPNN models with different designs
Table 10.3 Identification performance of SVM with different parameters
Table 10.4 Identification performance of different knowledge ensemble models
Table 10.5 Identification performance comparisons with different models
Table 10.6 McNemar values for performance pairwise comparisons
Table 11.1 Performance comparisons with different models for England dataset
Table 11.2 Performance comparisons with different models for Japanese dataset
Table 11.3 Performance comparisons with different models for German dataset
Table 11.4 McNemar’s test for pairwise performance comparison
Table of Contents
Part I
Credit Risk Analysis with Computational Intelligence: An Analytical Survey
1 Credit Risk Analysis with Computational Intelligence: A Review
1.1 Introduction
1.2 Literature Collection
1.3 Literature Investigation and Analysis
1.3.1 What is Credit Risk Evaluation Problem?
1.3.2 Typical Techniques for Credit Risk Analysis
1.3.3 Comparisons of Models
1.4 Implications on Valuable Research Topics
1.5 Conclusions
Part II
Unitary SVM Models with Optimal Parameter Selection for Credit Risk Evaluation
2 Credit Risk Assessment Using a Nearest-Point-Algorithm-based SVM with Design of Experiment for Parameter Selection
2.1 Introduction
2.2 SVM with Nearest Point Algorithm
2.3 DOE-based Parameter Selection for SVM with NPA
2.4 Experimental Analysis
2.5 Conclusions
3 Credit Risk Evaluation Using SVM with Direct Search for Parameter Selection
3.1 Introduction
3.2 Methodology Description
3.2.1 Brief Review of LSSVM
3.2.2 Direct Search for Parameter Selection
3.3 Experimental Study
3.3.1 Research Data
3.3.2 Parameter Selection with Genetic Algorithm
3.3.3 Parameters Selection with Grid Search
3.3.4 Experimental Results
3.4 Conclusions
Part III
Hybridizing SVM and Other Computational Intelligent Techniques for Credit Risk Analysis
4 Hybridizing Rough Sets and SVM for Credit Risk Evaluation
4.1 Introduction
4.2 Preliminaries of Rough Sets and SVM
4.2.1 Basic Concepts of Rough Sets
4.2.2 Basic Ideas of Support Vector Machines
4.3 Proposed Hybrid Intelligent Mining System
4.3.1 General Framework of Hybrid Intelligent Mining System
4.3.2 2D-Reductions by Rough Sets
4.3.3 Feature Selection by SVM
4.3.4 Rule Generation by Rough Sets
4.3.5 General Procedure of the Hybrid Intelligent Mining System
4.4 Experiment Study
4.4.1 Corporation Credit Dataset
4.4.2 Consumer Credit Dataset
4.5 Concluding Remarks
5 A Least Squares Fuzzy SVM Approach to Credit Risk Assessment
5.1 Introduction
5.2 Least Squares Fuzzy SVM
5.2.1 SVM
5.2.2 FSVM
5.2.3 Least Squares FSVM
5.3 Experiment Analysis
5.4 Conclusions
6 Evaluating Credit Risk with a Bilateral-Weighted Fuzzy SVM Model
6.1 Introduction
6.2 Formulation of the Bilateral-Weighted Fuzzy SVM Model
6.2.1 Bilateral-Weighting Errors
6.2.2 Formulation Process of the Bilateral-weighted fuzzy SVM
6.2.3 Generating Membership
6.3 Empirical Analysis
6.3.1 Dataset 1: UK Case
6.3.2 Dataset 2: Japanese Case
6.3.3 Dataset 3: England Case
6.4 Conclusions
7 Evolving Least Squares SVM for Credit Risk Analysis
7.1 Introduction
7.2 SVM and LSSVM
7.3 Evolving LSSVM Learning Paradigm
7.3.1 General Framework of Evolving LSSVM Learning Method
7.3.2 GA-based Input Features Evolution
7.3.3 GA-based Parameters Evolution
7.4 Research Data and Comparable Models
7.4.1 Research Data
7.4.2 Overview of Other Comparable Classification Models
7.5 Experimental Results
7.5.1 Empirical Analysis of GA-based Input Features Evolution
7.5.2 Empirical Analysis of GA-based Parameters Optimization
7.5.3 Comparisons with Other Classification Models
7.6 Conclusions
Part IV
SVM Ensemble Learning for Credit Risk Analysis
8 Credit Risk Evaluation Using a Multistage SVM Ensemble Learning Approach
8.1 Introduction
8.2 Previous Studies
8.3 Formulation of SVM Ensemble Learning Paradigm
8.3.1 Partitioning Original Data Set
8.3.2 Creating Diverse Neural Network Classifiers
8.3.3 SVM Learning and Confidence Value Generation
8.3.4 Selecting Appropriate Ensemble Members
8.3.5 Reliability Value Transformation
8.3.6 Integrating Multiple Classifiers into an Ensemble Output
8.4 Empirical Analysis
8.4.1 Consumer Credit Risk Assessment
8.4.2 Corporation Credit Risk Assessment
8.5 Conclusions
9 Credit Risk Analysis with a SVM-based Metamodeling Ensemble Approach