Bio-Inspired Credit Risk Analysis

Computational Intelligence with Support Vector Machines

Lean Yu · Shouyang Wang · Kin Keung Lai · Ligang Zhou

Dr. Lean Yu
Institute of Systems Science
Academy of Mathematics and Systems Science
Chinese Academy of Sciences
55 Zhongguancun East Road, Haidian District
Beijing 100190, P.R. China

Prof. Dr. Shouyang Wang
Institute of Systems Science
Academy of Mathematics and Systems Science
Chinese Academy of Sciences
55 Zhongguancun East Road, Haidian District
Beijing 100190, P.R. China

Prof. Dr. Kin Keung Lai
Department of Management Sciences
City University of Hong Kong
83 Tat Chee Avenue, Kowloon
Hong Kong, P.R. China

Dr. Ligang Zhou
Department of Management Sciences
City University of Hong Kong
83 Tat Chee Avenue, Kowloon
Hong Kong, P.R. China

ISBN 978-3-540-77802-8

e-ISBN 978-3-540-77803-5

Library of Congress Control Number: 2008925546

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permissions for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law.

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Cover design: WMXDesign GmbH, Heidelberg, Germany

Printed on acid-free paper

9 8 7 6 5 4 3 2 1

springer.com

Preface

Credit risk evaluation is one of the most important topics in financial risk management. Owing to recent financial crises and the regulatory concerns of Basel II, credit risk analysis and assessment have become a major focus of the financial and banking industry. For credit-granting institutions such as commercial banks and credit companies in particular, the ability to discriminate good customers from bad ones is crucial to the success of their business. Reliable quantitative models that predict defaults accurately are imperative so that the interested parties can take either preventive or corrective action. Hence, credit risk modeling and analysis have become very important for the sustainability and profitability of enterprises. Furthermore, an accurate prediction of credit risk can be translated into a more efficient use of economic capital in business. Credit risk modeling and analysis have therefore become an important issue in both the academic and industrial communities.

In this monograph, the authors try to integrate recently emerging support vector machines (SVM) and other computational intelligence techniques that replicate the principles of bio-inspired information processing for credit risk modeling and analysis. SVM is selected for credit risk modeling and analysis because of its unique features and powerful pattern recognition capability. First, unlike most traditional statistical models, SVM is a class of data-driven, self-adaptive, and nonlinear methods that do not require specific assumptions (e.g., normal distribution in statistics) on the underlying data generating process. This feature is particularly appealing for practical business situations where data are abundant or easily available, even though the theoretical model or the underlying relationship is unknown. Secondly, SVM performs a nonlinear mapping from the original input space into a high-dimensional feature space, in which it can construct a linear discriminant function to replace the nonlinear function in the original low-dimensional input space. This characteristic also alleviates the curse of dimensionality, because its computational complexity does not depend on the sample dimension. Thirdly, SVM implements the structural risk minimization strategy, instead of the empirical risk minimization strategy used in artificial neural networks (ANN), to separate classes with a hyperplane chosen by the margin maximization principle, and therefore possesses good generalization ability. SVM training is also a convex optimization problem, so it avoids the local minima that often occur in the training of ANNs. Furthermore, SVM has been successfully applied to a wide range of practical problems in almost all areas of business, industry and science. In this sense, SVM has some distinct advantages over traditional statistical techniques and ANN models when analyzing credit risk.
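The mapping into a high-dimensional feature space can be illustrated with a small sketch. It assumes a homogeneous degree-2 polynomial kernel on two-dimensional inputs, chosen here only for illustration and not taken from the monograph; the names poly_kernel and feature_map are hypothetical. The point is that the kernel value computed in the input space equals the inner product in the explicit feature space, so the feature space never has to be built.

```python
import math

def poly_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel: k(x, z) = (x . z)^2."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2

def feature_map(x):
    """Explicit feature map for that kernel on 2-D inputs (assumed example)."""
    return (x[0] ** 2, math.sqrt(2.0) * x[0] * x[1], x[1] ** 2)

def dot(u, v):
    """Plain inner product of two equal-length tuples."""
    return sum(a * b for a, b in zip(u, v))

x, z = (1.0, 2.0), (3.0, -1.0)
lhs = poly_kernel(x, z)                      # computed in the 2-D input space
rhs = dot(feature_map(x), feature_map(z))    # computed in the 3-D feature space
print(lhs, rhs)  # the two values agree up to floating-point rounding
```

The kernel evaluation costs the same regardless of how large the implicit feature space is, which is one way to read the remark about computational complexity above.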

The main purpose of this monograph is to develop some new models and techniques for evaluating credit risk, to report recent progress in credit risk modeling via SVM and other computational intelligence techniques, and to present a comprehensive survey of past research in credit risk modeling for academic researchers and business practitioners. Accordingly, some of the most important advances in credit risk modeling with SVM are presented. The book contains four parts with a total of 11 chapters, which are briefly described below.

Part I presents an analytical survey of computational intelligence in credit risk modeling and analysis. In particular, the survey discusses the factors that affect the credit risk classification capability of SVM. Through a literature review and analysis, some important implications and future research directions are pointed out. Building on the results and implications of this survey, the subsequent chapters discuss these new research directions and provide corresponding solutions.

To address the non-optimal parameter selection problem in SVM learning algorithms found in existing studies, Part II develops two unitary SVM models with optimal parameter selection to evaluate credit risk. In the first unitary SVM model, presented in Chapter 2, a design of experiment (DOE) method is used to determine the optimal parameters of the SVM model, and a nearest point algorithm (NPA) is used to quickly obtain the solutions of the SVM model with those parameters. In the second unitary SVM model, given in Chapter 3, the parameters are determined by a direct search (DS) algorithm. Other parameter selection methods, such as the genetic algorithm (GA), grid search (GS), and design of experiment (DOE), are also run so that the performance of the different parameter selection methods can be compared when the proposed unitary SVM models with optimal parameters are applied to credit risk evaluation and analysis.
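Of the parameter selection methods named above, grid search is the simplest to sketch. The following is a minimal illustration under stated assumptions, not the procedure of Chapter 3: the parameter names C and sigma, the grid ranges, and toy_score (a hypothetical stand-in for a cross-validated accuracy estimate) are all invented for the example.

```python
import itertools
import math

def grid_search(score, c_values, sigma_values):
    """Return the (C, sigma) pair with the highest score, plus that score."""
    best_pair, best_score = None, float("-inf")
    for c, sigma in itertools.product(c_values, sigma_values):
        s = score(c, sigma)
        if s > best_score:
            best_pair, best_score = (c, sigma), s
    return best_pair, best_score

def toy_score(c, sigma):
    # Hypothetical stand-in for validation accuracy; peaks at C=10, sigma=1.
    return 1.0 - 0.05 * (math.log10(c) - 1.0) ** 2 - 0.05 * math.log10(sigma) ** 2

# Logarithmically spaced grids (assumed ranges, for illustration only).
c_grid = [10.0 ** k for k in range(-2, 4)]       # 0.01 ... 1000
sigma_grid = [10.0 ** k for k in range(-2, 3)]   # 0.01 ... 100

best_pair, best_score = grid_search(toy_score, c_grid, sigma_grid)
print(best_pair, best_score)
```

In practice the score function would train and validate a model for each pair, so the cost grows with the product of the two grid sizes. That cost is the usual motivation for comparing grid search against methods such as direct search or DOE.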

According to the analysis in the survey, hybrid and ensemble models usually achieve better classification performance than unitary SVM models. For this reason, Part III and Part IV present four hybrid models and four ensemble models, respectively.

In the first hybrid model of Part III, rough set theory (RST) and SVM are hybridized into a synergetic model for credit risk classification and analysis. Unlike existing hybrid approaches that integrate RST and SVM, the proposed hybrid model uses SVM for feature selection and then uses RST to generate classification rules for credit risk evaluation. To address the computational complexity of SVM, the second hybrid model incorporates fuzzy set theory (FST) and least squares SVM (LSSVM) to create a least squares fuzzy SVM (LS-FSVM) for credit risk assessment. In the third hybrid model, a bilateral-weighted fuzzy SVM (FSVM) model hybridizing SVM and FST is proposed for credit risk assessment. In this new fuzzy SVM model, every sample is treated as belonging to both the positive and the negative class, but with different memberships, which are generated by fuzzy set theory. This model is applied to three typical credit datasets and obtains good classification performance. Finally, an evolving LSSVM model based on the genetic algorithm (GA) is proposed for credit risk analysis and evaluation. This model consists of two main evolutions: input feature evolution and parameter evolution. On the one hand, a standard GA is first used to search the possible combinations of input features, and the input features selected by the GA are used to train the LSSVM. On the other hand, another GA is used to optimize the parameters of the LSSVM using the feature-evolved LSSVM. For verification, three different credit datasets are used, and satisfactory classification results are reported.
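The input feature evolution described above can be sketched as a small GA over binary feature masks. This is only an illustration under stated assumptions: the operators (binary tournament selection, one-point crossover, bit-flip mutation, elitism), the population settings, and toy_fitness are hypothetical choices, and toy_fitness stands in for the LSSVM classification performance that would serve as the fitness in a real application.

```python
import random

def evolve_feature_mask(fitness, n_features, pop_size=20, generations=30,
                        mutation_rate=0.05, seed=0):
    """Evolve a 0/1 feature mask; returns (best mask, best fitness per step)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        next_pop = [pop[0][:]]  # elitism: carry the best mask over unchanged
        while len(next_pop) < pop_size:
            # Binary tournament selection of two parents.
            p1 = max(rng.sample(pop, 2), key=fitness)
            p2 = max(rng.sample(pop, 2), key=fitness)
            cut = rng.randint(1, n_features - 1)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history

INFORMATIVE = {0, 3, 5}  # hypothetical indices of the "useful" features

def toy_fitness(mask):
    # Reward informative features, penalize mask size (assumed toy objective).
    hits = sum(1 for i in INFORMATIVE if mask[i])
    return hits - 0.1 * sum(mask)

best_mask, history = evolve_feature_mask(toy_fitness, n_features=8)
print(best_mask, history[-1])
```

Because the best mask is always copied into the next generation, the recorded best fitness never decreases. A second GA of the same shape could search over model parameters instead of feature masks, which mirrors the two-stage structure described above.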

Of the four ensemble models in Part IV, the first presents a multistage ensemble framework to formulate an SVM ensemble learning approach for credit risk evaluation. The second ensemble model introduces a metalearning strategy to construct an SVM-based metamodeling ensemble method. In a sense, the proposed SVM ensemble model is actually an SVM metamodel. In the third ensemble model, an evolutionary programming (EP) based knowledge ensemble model is proposed for credit risk evaluation and analysis. In the last chapter of Part IV, a novel intelligent-agent-based multicriteria fuzzy group decision-making (GDM) model is proposed as a multicriteria decision-making (MCDM) tool to support credit risk assessment. Unlike the commonly used “one-member-one-vote” or “majority-voting-rule” ensemble models, the novel fuzzy GDM model first uses several intelligent agents to evaluate the customers over a number of criteria; the evaluation results are then fuzzified into fuzzy judgments; and finally these fuzzy judgments are aggregated and defuzzified into a group consensus as the final group decision measurement.
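The contrast between majority voting and the fuzzify, aggregate, defuzzify sequence can be made concrete with a short sketch. Everything specific here is an assumption for illustration rather than the model of Chapter 11: each agent's scores are summarized as a triangular fuzzy number (minimum, mean, maximum), the agents are weighted equally, the centroid is used for defuzzification, and the scores and the 0.5 cutoff are invented.

```python
from collections import Counter

def majority_vote(labels):
    """Return the most common label among the base classifiers' votes."""
    return Counter(labels).most_common(1)[0][0]

def fuzzify(scores):
    """Summarize several scores as a triangular fuzzy number (low, mid, high)."""
    return (min(scores), sum(scores) / len(scores), max(scores))

def aggregate(fuzzy_numbers, weights):
    """Weighted sum of triangular fuzzy numbers, component by component."""
    return tuple(
        sum(w * f[i] for f, w in zip(fuzzy_numbers, weights)) for i in range(3)
    )

def defuzzify(tri):
    """Centroid of a triangular fuzzy number."""
    return sum(tri) / 3.0

# Majority voting keeps only the labels.
vote = majority_vote(["good", "bad", "good"])

# Fuzzy GDM sketch: three agents score one customer on three criteria
# (hypothetical scores in [0, 1], higher meaning more creditworthy).
agent_scores = [[0.6, 0.8, 0.7], [0.4, 0.5, 0.6], [0.9, 0.7, 0.8]]
weights = [1.0 / 3.0] * 3                      # equal agent weights (assumed)
judgments = [fuzzify(s) for s in agent_scores]
consensus = aggregate(judgments, weights)
score = defuzzify(consensus)
decision = "good" if score >= 0.5 else "bad"   # assumed cutoff
print(vote, consensus, score, decision)
```

Voting discards how strongly each member leans, while the fuzzy route carries the spread of each agent's scores through to the final number before a decision is taken.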

We would like to thank many colleagues and friends for their help and support in preparing this monograph. First, we thank Professor Y. Nakamori of Japan Advanced Institute of Science and Technology, Professor Yongqiao Wang of Zhejiang University of Commerce and Professor Wei Huang of Huazhong University of Science and Technology for their contributions to the studies in this monograph. Three chapters are based on results that we achieved jointly with them. We would also like to thank several other scientists for their helpful suggestions and valuable comments on our research in this area, among them Professor Wuyi Yue of Konan University in Japan, Professor M. Makowski of the International Institute for Applied Systems Analysis in Austria, Professor Serge Hayward of Groupe ESC Dijon Bourgogne in France and Professor Heping Pan of the International Institute for Financial Prediction in Australia.

Finally, we would like to thank the National Natural Science Foundation of China (NSFC), the Knowledge Innovation Program of the Chinese Academy of Sciences (CAS), the Academy of Mathematics and Systems Science (AMSS) of CAS, the Hong Kong Research Grants Council (RGC), the NSFC/RGC Joint Research Scheme (No. N_CityU110/07) and City University of Hong Kong for their financial support of our research in this promising area.

Lean YU

Institute of Systems Science

Academy of Mathematics and Systems Science

Chinese Academy of Sciences

Beijing, 100190, China

Email: [email protected]

Shouyang WANG

Institute of Systems Science

Academy of Mathematics and Systems Science

Chinese Academy of Sciences

Beijing, 100190, China

Email: [email protected]

Kin Keung LAI

Department of Management Sciences

City University of Hong Kong

83 Tat Chee Avenue, Kowloon, Hong Kong

Email: [email protected]

Ligang ZHOU

Department of Management Sciences

City University of Hong Kong

83 Tat Chee Avenue, Kowloon, Hong Kong

Email: [email protected]

December, 2007

List of Figures

Fig. 1.1 Distribution of articles by year
Fig. 1.2 Structure of one typical multilayer perceptron
Fig. 1.3 Separating hyperplane for two separable classes with maximal margin
Fig. 1.4 Performance comparison of different models on different credit datasets
Fig. 2.1 An illustration for DOE parameter search with two iterations
Fig. 3.1 An illustrative example for grid search
Fig. 3.2 Performance of each parameter pair in initial range
Fig. 3.3 Sensitivity analysis of OA on initial space setting
Fig. 3.4 Sensitivity analysis of Se on initial space setting
Fig. 3.5 Sensitivity analysis of Sp on initial space setting
Fig. 4.1 General framework of the hybrid intelligent mining system
Fig. 7.1 General framework of the evolving LSSVM learning paradigm
Fig. 7.2 Input feature selection with GA for LSSVM
Fig. 7.3 Performance of feature evolved LSSVM on three credit datasets
Fig. 8.1 General process of multistage SVM ensemble learning model
Fig. 8.2 Bagging sampling algorithm
Fig. 9.1 A generic metalearning process
Fig. 9.2 An extended metalearning process
Fig. 9.3 SVM-based metamodeling process
Fig. 9.4 Graphical illustration for SVM-based metalearning process
Fig. 9.5 Performance comparisons with the different numbers of base models
Fig. 11.1 An illustrative sketch of the intelligent-agent-based multicriteria GDM model
Fig. 11.2 ROC curve and AUC for two different models
Fig. 11.3 A group decision table for credit scoring
Fig. 11.4 A graphic comparison for different models in German dataset
Fig. 11.5 Improving the global performance of a classifier

List of Tables

Table 1.1 Books about credit risk modeling and analysis
Table 1.2 Comparisons of different credit risk models based on different criteria
Table 1.3 Accuracy comparison of different quantitative models
Table 1.4 SVM models and their factors
Table 2.1 Performance comparison of different methods on German credit dataset
Table 3.1 Independent variables in credit evaluation model for German dataset
Table 3.2 Parameter settings for DS, GA, GS and DOE
Table 3.3 Evaluation results of four methods for LSSVM on German dataset
Table 3.4 Evaluation results of four methods for LSSVM on Australian dataset
Table 3.5 Performance comparisons of different classifiers
Table 4.1 Comparisons of different methods on corporation credit dataset
Table 4.2 Comparisons of different methods on consumer credit dataset
Table 5.1 Variables of the experimental dataset
Table 5.2 Credit risk evaluation results by LS-FSVM
Table 5.3 Performance comparisons of different classifiers
Table 6.1 Empirical results on the dataset 1
Table 6.2 Empirical results on the dataset 2
Table 6.3 Empirical results on the dataset 3
Table 7.1 Selected key features by GA-based feature selection procedure
Table 7.2 Optimal solutions of different parameters for LSSVM
Table 7.3 Computational performance comparisons using different parameter search methods for three credit datasets
Table 7.4 Performance comparisons of different models
Table 8.1 Consumer credit evaluation results with different methods
Table 8.2 Corporation credit evaluation results with different methods
Table 9.1 Performance comparison with different evaluation approaches
Table 10.1 Identification results of MDA and logit regression models
Table 10.2 Identification results of BPNN models with different designs
Table 10.3 Identification performance of SVM with different parameters
Table 10.4 Identification performance of different knowledge ensemble models
Table 10.5 Identification performance comparisons with different models
Table 10.6 McNemar values for performance pairwise comparisons
Table 11.1 Performance comparisons with different models for England dataset
Table 11.2 Performance comparisons with different models for Japanese dataset
Table 11.3 Performance comparisons with different models for German dataset
Table 11.4 McNemar’s test for pairwise performance comparison

Table of Contents

Part I
Credit Risk Analysis with Computational Intelligence: An Analytical Survey

1 Credit Risk Analysis with Computational Intelligence: A Review
1.1 Introduction
1.2 Literature Collection
1.3 Literature Investigation and Analysis
1.3.1 What is Credit Risk Evaluation Problem?
1.3.2 Typical Techniques for Credit Risk Analysis
1.3.3 Comparisons of Models
1.4 Implications on Valuable Research Topics
1.5 Conclusions

Part II
Unitary SVM Models with Optimal Parameter Selection for Credit Risk Evaluation

2 Credit Risk Assessment Using a Nearest-Point-Algorithm-based SVM with Design of Experiment for Parameter Selection
2.1 Introduction
2.2 SVM with Nearest Point Algorithm
2.3 DOE-based Parameter Selection for SVM with NPA
2.4 Experimental Analysis
2.5 Conclusions

3 Credit Risk Evaluation Using SVM with Direct Search for Parameter Selection
3.1 Introduction
3.2 Methodology Description
3.2.1 Brief Review of LSSVM
3.2.2 Direct Search for Parameter Selection
3.3 Experimental Study
3.3.1 Research Data
3.3.2 Parameter Selection with Genetic Algorithm
3.3.3 Parameters Selection with Grid Search
3.3.4 Experimental Results
3.4 Conclusions

Part III
Hybridizing SVM and Other Computational Intelligent Techniques for Credit Risk Analysis

4 Hybridizing Rough Sets and SVM for Credit Risk Evaluation
4.1 Introduction
4.2 Preliminaries of Rough Sets and SVM
4.2.1 Basic Concepts of Rough Sets
4.2.2 Basic Ideas of Support Vector Machines
4.3 Proposed Hybrid Intelligent Mining System
4.3.1 General Framework of Hybrid Intelligent Mining System
4.3.2 2D-Reductions by Rough Sets
4.3.3 Feature Selection by SVM
4.3.4 Rule Generation by Rough Sets
4.3.5 General Procedure of the Hybrid Intelligent Mining System
4.4 Experiment Study
4.4.1 Corporation Credit Dataset
4.4.2 Consumer Credit Dataset
4.5 Concluding Remarks

5 A Least Squares Fuzzy SVM Approach to Credit Risk Assessment
5.1 Introduction
5.2 Least Squares Fuzzy SVM
5.2.1 SVM
5.2.2 FSVM
5.2.3 Least Squares FSVM
5.3 Experiment Analysis
5.4 Conclusions

6 Evaluating Credit Risk with a Bilateral-Weighted Fuzzy SVM Model
6.1 Introduction
6.2 Formulation of the Bilateral-Weighted Fuzzy SVM Model
6.2.1 Bilateral-Weighting Errors
6.2.2 Formulation Process of the Bilateral-weighted fuzzy SVM
6.2.3 Generating Membership
6.3 Empirical Analysis
6.3.1 Dataset 1: UK Case
6.3.2 Dataset 2: Japanese Case
6.3.3 Dataset 3: England Case
6.4 Conclusions

7 Evolving Least Squares SVM for Credit Risk Analysis
7.1 Introduction
7.2 SVM and LSSVM
7.3 Evolving LSSVM Learning Paradigm
7.3.1 General Framework of Evolving LSSVM Learning Method
7.3.2 GA-based Input Features Evolution
7.3.3 GA-based Parameters Evolution
7.4 Research Data and Comparable Models
7.4.1 Research Data
7.4.2 Overview of Other Comparable Classification Models
7.5 Experimental Results
7.5.1 Empirical Analysis of GA-based Input Features Evolution
7.5.2 Empirical Analysis of GA-based Parameters Optimization
7.5.3 Comparisons with Other Classification Models
7.6 Conclusions

Part IV
SVM Ensemble Learning for Credit Risk Analysis

8 Credit Risk Evaluation Using a Multistage SVM Ensemble Learning Approach
8.1 Introduction
8.2 Previous Studies
8.3 Formulation of SVM Ensemble Learning Paradigm
8.3.1 Partitioning Original Data Set
8.3.2 Creating Diverse Neural Network Classifiers
8.3.3 SVM Learning and Confidence Value Generation
8.3.4 Selecting Appropriate Ensemble Members
8.3.5 Reliability Value Transformation
8.3.6 Integrating Multiple Classifiers into an Ensemble Output
8.4 Empirical Analysis
8.4.1 Consumer Credit Risk Assessment
8.4.2 Corporation Credit Risk Assessment
8.5 Conclusions

9 Credit Risk Analysis with a SVM-based Metamodeling Ensemble Approach
