International Edition

CD-ROM Included

Applied Multivariate Statistical Analysis

Fifth Edition

Richard A. Johnson

Dean W. Wichern

Digitized by the Learning Resource Center, Thai Nguyen University (Trung tâm Học liệu – ĐH TN): http://www.lrc-tnu.edu.vn

FIFTH EDITION

Applied Multivariate Statistical Analysis

RICHARD A. JOHNSON

University of Wisconsin-Madison

DEAN W. WICHERN

Texas A&M University

Pearson Education International

If you have purchased this book within the United States or Canada you should be aware that it has been wrongfully imported without the approval of the Publisher or the Author.

Acquisitions Editor: Quincy McDonald

Editor-in-Chief: Sally Yagan

Vice President/Director Production and Manufacturing: David W. Riccardi

Executive Managing Editor: Kathleen Schiaparelli

Senior Managing Editor: Linda Mihatov Behrens

Assistant Managing Editor: Bayani DeLeon

Production Editor: Steven S. Pawlowski

Manufacturing Buyer: Alan Fischer

Manufacturing Manager: Trudy Pisciotti

Marketing Manager: Angela Battle

Editorial Assistant/Supplements Editor: Joanne Wendelken

Managing Editor, Audio/Video Assets: Grace Hazeldine

Art Director: Jayne Conte

Cover Designer: Bruce Kenselaar

Illustrator: Marita Froimson

© 2002, 1998, 1992, 1988, 1982 by Prentice-Hall, Inc.

Upper Saddle River, NJ 07458

All rights reserved. No part of this book may be reproduced, in any form or by any means, without permission in writing from the publisher.

Printed in the United States of America

10 9 8 7 6 5 4 3

ISBN 0-13-121973-1

Pearson Education LTD.

Pearson Education Australia PTY, Limited

Pearson Education Singapore, Pte. Ltd.

Pearson Education North Asia Ltd.

Pearson Education Canada, Ltd.

Pearson Educacion de Mexico, S.A. de C.V.

Pearson Education - Japan

Pearson Education Malaysia, Pte. Ltd.

Pearson Education, Upper Saddle River, New Jersey

To the memory of my mother and my father.

R. A. J.

To Dorothy, Michael, and Andrew.

D. W. W.

Contents

PREFACE

1 ASPECTS OF MULTIVARIATE ANALYSIS

1.1 Introduction

1.2 Applications of Multivariate Techniques

1.3 The Organization of Data

Arrays

Descriptive Statistics

Graphical Techniques

1.4 Data Displays and Pictorial Representations

Linking Multiple Two-Dimensional Scatter Plots

Graphs of Growth Curves

Stars

Chernoff Faces

1.5 Distance

1.6 Final Comments

Exercises

References

2 MATRIX ALGEBRA AND RANDOM VECTORS

2.1 Introduction

2.2 Some Basics of Matrix and Vector Algebra

Vectors

Matrices

2.3 Positive Definite Matrices

2.4 A Square-Root Matrix

2.5 Random Vectors and Matrices

2.6 Mean Vectors and Covariance Matrices

Partitioning the Covariance Matrix

The Mean Vector and Covariance Matrix for Linear Combinations of Random Variables

Partitioning the Sample Mean Vector and Covariance Matrix

2.7 Matrix Inequalities and Maximization

Supplement 2A: Vectors and Matrices: Basic Concepts

Vectors

Matrices

Exercises

References

3 SAMPLE GEOMETRY AND RANDOM SAMPLING

3.1 Introduction

3.2 The Geometry of the Sample

3.3 Random Samples and the Expected Values of the Sample Mean and Covariance Matrix

3.4 Generalized Variance

Situations in which the Generalized Sample Variance Is Zero

Generalized Variance Determined by $|R|$ and Its Geometrical Interpretation

Another Generalization of Variance

3.5 Sample Mean, Covariance, and Correlation As Matrix Operations

3.6 Sample Values of Linear Combinations of Variables

Exercises

References

4 THE MULTIVARIATE NORMAL DISTRIBUTION

4.1 Introduction

4.2 The Multivariate Normal Density and Its Properties

Additional Properties of the Multivariate Normal Distribution

4.3 Sampling from a Multivariate Normal Distribution and Maximum Likelihood Estimation

The Multivariate Normal Likelihood

Maximum Likelihood Estimation of $\mu$ and $\Sigma$

Sufficient Statistics

4.4 The Sampling Distribution of $\bar{X}$ and $S$

Properties of the Wishart Distribution

4.5 Large-Sample Behavior of $\bar{X}$ and $S$

4.6 Assessing the Assumption of Normality

Evaluating the Normality of the Univariate Marginal Distributions

Evaluating Bivariate Normality

4.7 Detecting Outliers and Cleaning Data

Steps for Detecting Outliers

4.8 Transformations To Near Normality

Transforming Multivariate Observations

Exercises

References

5 INFERENCES ABOUT A MEAN VECTOR

5.1 Introduction

5.2 The Plausibility of $\mu_0$ as a Value for a Normal Population Mean

5.3 Hotelling's $T^2$ and Likelihood Ratio Tests

General Likelihood Ratio Method

5.4 Confidence Regions and Simultaneous Comparisons of Component Means

Simultaneous Confidence Statements

A Comparison of Simultaneous Confidence Intervals with One-at-a-Time Intervals

The Bonferroni Method of Multiple Comparisons

5.5 Large Sample Inferences about a Population Mean Vector

5.6 Multivariate Quality Control Charts

Charts for Monitoring a Sample of Individual Multivariate Observations for Stability

Control Regions for Future Individual Observations

Control Ellipse for Future Observations

$T^2$-Chart for Future Observations

Control Charts Based on Subsample Means

Control Regions for Future Subsample Observations

5.7 Inferences about Mean Vectors when Some Observations Are Missing

5.8 Difficulties Due to Time Dependence in Multivariate Observations

Supplement 5A: Simultaneous Confidence Intervals and Ellipses as Shadows of the p-Dimensional Ellipsoids

Exercises

References

6 COMPARISONS OF SEVERAL MULTIVARIATE MEANS

6.1 Introduction

6.2 Paired Comparisons and a Repeated Measures Design

Paired Comparisons

A Repeated Measures Design for Comparing Treatments

6.3 Comparing Mean Vectors from Two Populations

Assumptions Concerning the Structure of the Data

Further Assumptions when $n_1$ and $n_2$ Are Small

Simultaneous Confidence Intervals

The Two-Sample Situation when $\Sigma_1 \ne \Sigma_2$

6.4 Comparing Several Multivariate Population Means (One-Way Manova)

Assumptions about the Structure of the Data for One-way MANOVA

A Summary of Univariate ANOVA

Multivariate Analysis of Variance (MANOVA)

6.5 Simultaneous Confidence Intervals for Treatment Effects

6.6 Two-Way Multivariate Analysis of Variance

Univariate Two-Way Fixed-Effects Model with Interaction

Multivariate Two-Way Fixed-Effects Model with Interaction

6.7 Profile Analysis

6.8 Repeated Measures Designs and Growth Curves

6.9 Perspectives and a Strategy for Analyzing Multivariate Models

Exercises

References

7 MULTIVARIATE LINEAR REGRESSION MODELS

7.1 Introduction

7.2 The Classical Linear Regression Model

7.3 Least Squares Estimation

Sum-of-Squares Decomposition

Geometry of Least Squares

Sampling Properties of Classical Least Squares Estimators

7.4 Inferences About the Regression Model

Inferences Concerning the Regression Parameters

Likelihood Ratio Tests for the Regression Parameters

7.5 Inferences from the Estimated Regression Function

Estimating the Regression Function at $z_0$

Forecasting a New Observation at $z_0$

7.6 Model Checking and Other Aspects of Regression

Does the Model Fit?

Leverage and Influence

Additional Problems in Linear Regression

7.7 Multivariate Multiple Regression

Likelihood Ratio Tests for Regression Parameters

Other Multivariate Test Statistics

Predictions from Multivariate Multiple Regressions

7.8 The Concept of Linear Regression

Prediction of Several Variables

Partial Correlation Coefficient

7.9 Comparing the Two Formulations of the Regression Model

Mean Corrected Form of the Regression Model

Relating the Formulations

7.10 Multiple Regression Models with Time Dependent Errors

Supplement 7A: The Distribution of the Likelihood Ratio for the Multivariate Multiple Regression Model

Exercises

References

8 PRINCIPAL COMPONENTS

8.1 Introduction

8.2 Population Principal Components

Principal Components Obtained from Standardized Variables

Principal Components for Covariance Matrices with Special Structures

8.3 Summarizing Sample Variation by Principal Components

The Number of Principal Components

Interpretation of the Sample Principal Components

Standardizing the Sample Principal Components

8.4 Graphing the Principal Components

8.5 Large Sample Inferences

Large Sample Properties of $\hat{\lambda}_i$ and $\hat{e}_i$

Testing for the Equal Correlation Structure

8.6 Monitoring Quality with Principal Components

Checking a Given Set of Measurements for Stability

Controlling Future Values

Supplement 8A: The Geometry of the Sample Principal Component Approximation

The p-Dimensional Geometrical Interpretation

The n-Dimensional Geometrical Interpretation

Exercises

References

9 FACTOR ANALYSIS AND INFERENCE FOR STRUCTURED COVARIANCE MATRICES

9.1 Introduction

9.2 The Orthogonal Factor Model

9.3 Methods of Estimation

The Principal Component (and Principal Factor) Method

A Modified Approach: the Principal Factor Solution

The Maximum Likelihood Method

A Large Sample Test for the Number of Common Factors

9.4 Factor Rotation

Oblique Rotations

9.5 Factor Scores

The Weighted Least Squares Method

The Regression Method

9.6 Perspectives and a Strategy for Factor Analysis

9.7 Structural Equation Models

The LISREL Model

Construction of a Path Diagram

Covariance Structure

Estimation

Model-Fitting Strategy

Supplement 9A: Some Computational Details for Maximum Likelihood Estimation

Recommended Computational Scheme

Maximum Likelihood Estimators of $\rho = L_z L_z' + \Psi_z$

Exercises

References

10 CANONICAL CORRELATION ANALYSIS

10.1 Introduction

10.2 Canonical Variates and Canonical Correlations

10.3 Interpreting the Population Canonical Variables

Identifying the Canonical Variables

Canonical Correlations as Generalizations of Other Correlation Coefficients

The First r Canonical Variables as a Summary of Variability

A Geometrical Interpretation of the Population Canonical Correlation Analysis

10.4 The Sample Canonical Variates and Sample Canonical Correlations

10.5 Additional Sample Descriptive Measures

Matrices of Errors of Approximations

Proportions of Explained Sample Variance

10.6 Large Sample Inferences

Exercises

References

11 DISCRIMINATION AND CLASSIFICATION

11.1 Introduction

11.2 Separation and Classification for Two Populations

11.3 Classification with Two Multivariate Normal Populations

Classification of Normal Populations When $\Sigma_1 = \Sigma_2 = \Sigma$

Scaling

Classification of Normal Populations When $\Sigma_1 \ne \Sigma_2$

11.4 Evaluating Classification Functions

11.5 Fisher's Discriminant Function: Separation of Populations

11.6 Classification with Several Populations

The Minimum Expected Cost of Misclassification Method

Classification with Normal Populations

11.7 Fisher's Method for Discriminating among Several Populations

Using Fisher's Discriminants to Classify Objects

11.8 Final Comments

Including Qualitative Variables

Classification Trees

Neural Networks

Selection of Variables

Testing for Group Differences

Graphics

Practical Considerations Regarding Multivariate Normality

Exercises

References

12 CLUSTERING, DISTANCE METHODS, AND ORDINATION

12.1 Introduction

12.2 Similarity Measures

Distances and Similarity Coefficients for Pairs of Items

Similarities and Association Measures for Pairs of Variables

Concluding Comments on Similarity

12.3 Hierarchical Clustering Methods

Single Linkage

Complete Linkage

Average Linkage

Ward's Hierarchical Clustering Method

Final Comments: Hierarchical Procedures

12.4 Nonhierarchical Clustering Methods

K-means Method

Final Comments: Nonhierarchical Procedures

12.5 Multidimensional Scaling

The Basic Algorithm

12.6 Correspondence Analysis

Algebraic Development of Correspondence Analysis

Inertia

Interpretation in Two Dimensions

Final Comments

12.7 Biplots for Viewing Sampling Units and Variables

Constructing Biplots

12.8 Procrustes Analysis: A Method for Comparing Configurations

Constructing the Procrustes Measure of Agreement

Supplement 12A: Data Mining

Introduction

The Data Mining Process

Model Assessment

Exercises

References

APPENDIX

DATA INDEX

SUBJECT INDEX

Preface

INTENDED AUDIENCE

This book originally grew out of our lecture notes for an "Applied Multivariate Analysis" course offered jointly by the Statistics Department and the School of Business at the University of Wisconsin-Madison. Applied Multivariate Statistical Analysis, Fifth Edition, is concerned with statistical methods for describing and analyzing multivariate data. Data analysis, while interesting with one variable, becomes truly fascinating and challenging when several variables are involved. Researchers in the biological, physical, and social sciences frequently collect measurements on several variables. Modern computer packages readily provide the numerical results to rather complex statistical analyses. We have tried to provide readers with the supporting knowledge necessary for making proper interpretations, selecting appropriate techniques, and understanding their strengths and weaknesses. We hope our discussions will meet the needs of experimental scientists, in a wide variety of subject matter areas, as a readable introduction to the statistical analysis of multivariate observations.

LEVEL

Our aim is to present the concepts and methods of multivariate analysis at a level that is readily understandable by readers who have taken two or more statistics courses. We emphasize the applications of multivariate methods and, consequently, have attempted to make the mathematics as palatable as possible. We avoid the use of calculus. On the other hand, the concepts of a matrix and of matrix manipulations are important. We do not assume the reader is familiar with matrix algebra. Rather, we introduce matrices as they appear naturally in our discussions, and we then show how they simplify the presentation of multivariate models and techniques.

The introductory account of matrix algebra, in Chapter 2, highlights the more important matrix algebra results as they apply to multivariate analysis. The Chapter 2 supplement provides a summary of matrix algebra results for those with little or no previous exposure to the subject. This supplementary material helps make the book self-contained and is used to complete proofs. The proofs may be ignored on the first reading. In this way we hope to make the book accessible to a wide audience.

In our attempt to make the study of multivariate analysis appealing to a large audience of both practitioners and theoreticians, we have had to sacrifice a consistency of level. Some sections are harder than others. In particular, we have summarized a voluminous amount of material on regression in Chapter 7. The resulting presentation is rather succinct and difficult the first time through. We hope instructors will be able to compensate for the unevenness in level by judiciously choosing those sections, and subsections, appropriate for their students and by toning them down if necessary.

ORGANIZATION AND APPROACH

The methodological "tools" of multivariate analysis are contained in Chapters 5 through 12. These chapters represent the heart of the book, but they cannot be assimilated without much of the material in the introductory Chapters 1 through 4. Even those readers with a good knowledge of matrix algebra or those willing to accept the mathematical results on faith should, at the very least, peruse Chapter 3, "Sample Geometry," and Chapter 4, "Multivariate Normal Distribution."

Our approach in the methodological chapters is to keep the discussion direct and uncluttered. Typically, we start with a formulation of the population models, delineate the corresponding sample results, and liberally illustrate everything with examples. The examples are of two types: those that are simple and whose calculations can be easily done by hand, and those that rely on real-world data and computer software. These will provide an opportunity to (1) duplicate our analyses, (2) carry out the analyses dictated by exercises, or (3) analyze the data using methods other than the ones we have used or suggested.

The division of the methodological chapters (5 through 12) into three units allows instructors some flexibility in tailoring a course to their needs. Possible sequences for a one-semester (two-quarter) course are indicated schematically.

Each instructor will undoubtedly omit certain sections from some chapters to cover a broader collection of topics than is indicated by these two choices.

For most students, we would suggest a quick pass through the first four chapters (concentrating primarily on the material in Chapter 1; Sections 2.1, 2.2, 2.3, 2.5, 2.6, and 3.6; and the "assessing normality" material in Chapter 4) followed by a selection of methodological topics. For example, one might discuss the comparison of mean vectors, principal components, factor analysis, discriminant analysis and clustering. The discussions could feature the many "worked out" examples included in
