Applied Statistics
From Bivariate Through Multivariate Techniques
Rebecca M. Warner
University of New Hampshire
SAGE Publications
Los Angeles • London • New Delhi • Singapore
Copyright © 2008 by Sage Publications, Inc.
All rights reserved. No part of this book may be reproduced or utilized in any form or by any means, electronic
or mechanical, including photocopying, recording, or by any information storage and retrieval system, without
permission in writing from the publisher.
For information:
Sage Publications, Inc.
2455 Teller Road
Thousand Oaks, California 91320
E-mail: [email protected]
Sage Publications India Pvt. Ltd.
B 1/11 Mohan Cooperative Industrial Area
Mathura Road, New Delhi 110 044
India
Sage Publications Ltd.
1 Oliver's Yard
55 City Road
London EC1Y 1SP
United Kingdom
Sage Publications Asia-Pacific Pte. Ltd.
33 Pekin Street #02-01
Far East Square
Singapore 048763
Printed in the United States of America
Library of Congress Cataloging-in-Publication Data
Warner, Rebecca M.
Applied statistics: from bivariate through multivariate techniques/Rebecca M. Warner.
p. cm.
Includes bibliographical references and index.
ISBN-13: 978-0-7619-2772-3 (cloth)
1. Social sciences—Statistical methods. 2. Psychology—Statistical methods. 3. Multivariate analysis. I. Title.
HA31.35.W37 2007
519.5'35—dc22 2006033700
This book is printed on acid-free paper.
09 10 11 10 9 8 7 6 5 4 3 2
Acquisitions Editor: Vicki Knight
Associate Editor: Sean Connelly
Editorial Assistant: Lauren Habib
Production Editor: Laureen A. Shea
Copy Editors: Linda Gray and QuADS
Typesetter: C&M Digitals (P) Ltd.
Indexer: Will Ragsdale
Cover Designer: Candice Harman
Marketing Manager: Stephanie Adams
Contents
Preface xxi
Acknowledgments xxv
Chapter 1. Review of Basic Concepts 1
1.1 Introduction 1
1.2 A Simple Example of a Research Problem 2
1.3 Discrepancies Between Real and Ideal Research Situations 2
1.4 Samples and Populations 3
1.5 Descriptive Versus Inferential Uses of Statistics 4
1.6 Levels of Measurement and Types of Variables 6
1.7 The Normal Distribution 10
1.8 Research Design 15
1.8.1 Experimental Design 16
1.8.2 Quasi-Experimental Design 19
1.8.3 Nonexperimental Research Design 19
1.8.4 Between-Subjects Versus Within-Subjects
or Repeated Measures 20
1.9 Parametric Versus Nonparametric Statistics 21
1.10 Additional Implicit Assumptions 25
1.11 Selection of an Appropriate Bivariate Analysis 26
1.12 Summary 29
Comprehension Questions 37
Chapter 2. Introduction to SPSS: Basic Statistics,
Sampling Error, and Confidence Intervals 41
2.1 Introduction 41
2.2 Research Example: Description of a Sample of HR Scores 43
2.3 Sample Mean (M) 48
2.4 Sum of Squared Deviations and Sample Variance (s²) 54
2.5 Degrees of Freedom (df) for a Sample Variance 55
2.6 Why Is There Variance? 57
2.7 Sample Standard Deviation (s) 58
2.8 Assessment of Location of a Single X Score Relative to a Distribution of Scores 59
2.9 A Shift in Level of Analysis: The Distribution of Values of M Across
Many Samples From the Same Population 62
2.10 An Index of Amount of Sampling Error: The Standard Error of the Mean (σM) 63
2.11 Effect of Sample Size (N) on the Magnitude of the Standard Error (σM) 64
2.12 Sample Estimate of the Standard Error of the Mean (SEM) 67
2.13 The Family of t Distributions 70
2.14 Confidence Intervals 71
2.14.1 The General Form of a CI 71
2.14.2 Setting Up a CI for M When σ Is Known 71
2.14.3 Setting Up a CI for M When the Value of σ Is Not Known 73
2.14.4 Reporting CIs 74
2.15 Summary 75
Appendix on SPSS 76
Comprehension Questions 77
Chapter 3. Statistical Significance Testing 81
3.1 The Logic of Null Hypothesis Significance Testing (NHST) 81
3.2 Type I Versus Type II Error 84
3.3 Formal NHST Procedures: The z Test for a Null Hypothesis
About One Population Mean 85
3.3.1 Obtaining a Random Sample From the Population of Interest 86
3.3.2 Formulating a Null Hypothesis (H0) for the One-Sample z Test 86
3.3.3 Formulating an Alternative Hypothesis (H1) 87
3.3.4 Choosing a Nominal Alpha Level 89
3.3.5 Determining the Range of z Scores Used to Reject H0 89
3.3.6 Determining the Range of Values of M Used to Reject H0 90
3.3.7 Reporting an "Exact" p Value 92
3.4 Common Research Practices Inconsistent With Assumptions and Rules for NHST 94
3.4.1 Use of Convenience Samples 95
3.4.2 Modification of Decision Rules After the Initial Decision 95
3.4.3 Conducting Large Numbers of Significance Tests 96
3.4.4 Impact of Violations of Assumptions on Risk of Type I Error 96
3.5 Strategies to Limit Risk of Type I Error 97
3.5.1 Use of Random and Representative Samples 97
3.5.2 Adherence to the Rules for NHST 97
3.5.3 Limit the Number of Significance Tests 97
3.5.4 Bonferroni-Corrected Per-Comparison Alpha Levels 98
3.5.5 Replication of Outcome in New Samples 98
3.5.6 Cross-Validation 99
3.6 Interpretation of Results 100
3.6.1 Interpretation of Null Results 100
3.6.2 Interpretation of Statistically Significant Results 101
3.7 When Is a t Test Used Instead of a z Test? 102
3.8 Effect Size 103
3.8.1 Evaluation of "Practical" (vs. Statistical) Significance 103
3.8.2 Formal Effect Size Index: Cohen's Little d 104
3.9 Statistical Power Analysis 106
3.10 Numerical Results for a One-Sample t Test Obtained From SPSS 115
3.11 Guidelines for Reporting Results 118
3.12 Summary 119
3.12.1 Logical Problems With NHST 119
3.12.2 Other Applications of the t Ratio 120
3.12.3 What Does It Mean to Say "p < .05"? 122
Comprehension Questions 123
Chapter 4. Preliminary Data Screening 125
4.1 Introduction: Problems in Real Data 125
4.2 Quality Control During Data Collection 126
4.3 Example of an SPSS Data Worksheet 126
4.4 Identification of Errors and Inconsistencies 132
4.5 Missing Values 133
4.6 Empirical Example of Data Screening for Individual Variables 135
4.6.1 Frequency Distribution Tables 135
4.6.2 Removal of Impossible or Extreme Scores 137
4.6.3 Bar Chart for a Categorical Variable 140
4.6.4 Histogram for a Quantitative Variable 141
4.7 Identification and Handling of Outliers 152
4.8 Screening Data for Bivariate Analyses 156
4.8.1 Bivariate Data Screening for Two Categorical Variables 156
4.8.2 Bivariate Data Screening for One Categorical
and One Quantitative Variable 160
4.8.3 Bivariate Data Screening for Two Quantitative Variables 162
4.9 Nonlinear Relations 166
4.10 Data Transformations 169
4.11 Verifying That Remedies Had the Desired Effects 172
4.12 Multivariate Data Screening 173
4.13 Reporting Preliminary Data Screening 173
4.14 Summary and Checklist for Data Screening 176
Comprehension Questions 179
Chapter 5. Comparing Group Means Using the Independent Samples t Test 181
5.1 Research Situations Where the Independent Samples t Test Is Used 181
5.2 A Hypothetical Research Example 182
5.3 Assumptions About the Distribution of Scores on the
Quantitative Dependent Variable 185
5.3.1 Quantitative, Approximately Normally Distributed 185
5.3.2 Equal Variances of Scores Across Groups
(the Homogeneity of Variance Assumption) 185
5.3.3 Independent Observations Both Between and Within Groups 186
5.3.4 Robustness to Violations of Assumptions 186
5.4 Preliminary Data Screening 188
5.5 Issues in Designing a Study 191
5.6 Formulas for the Independent Samples t Test 191
5.6.1 The Pooled Variances t Test 193
5.6.2 Computation of the Separate Variances t Test and Its Adjusted df 195
5.6.3 Evaluation of Statistical Significance of a t Ratio 195
5.6.4 Confidence Interval Around M1 - M2 197
5.7 Conceptual Basis: Factors That Affect the Size of the t Ratio 197
5.7.1 Design Decisions That Affect the Difference Between
Group Means, M1 - M2 198
5.7.2 Design Decisions That Affect Pooled Within-Group Variance 199
5.7.3 Design Decisions About Sample Sizes, n1 and n2 200
5.7.4 Summary: Factors That Influence the Size of t 200
5.8 Effect Size Indexes for t 201
5.8.1 Eta Squared (η²) 201
5.8.2 Cohen's d 202
5.8.3 Point Biserial r (rpb) 202
5.9 Statistical Power and Decisions About Sample Size for
the Independent Samples t Test 203
5.10 Describing the Nature of the Outcome 205
5.11 SPSS Output and Model Results Section 206
5.12 Summary 209
Comprehension Questions 211
Chapter 6. One-Way Between-Subjects Analysis of Variance 215
6.1 Research Situations Where One-Way Between-Subjects
Analysis of Variance (ANOVA) Is Used 215
6.2 Hypothetical Research Example 217
6.3 Assumptions About Scores on the Dependent Variable
for One-Way Between-S ANOVA 217
6.4 Issues in Planning a Study 218
6.5 Data Screening 220
6.6 Partition of Scores Into Components 221
6.7 Computations for the One-Way Between-S ANOVA 225
6.7.1 Comparison Between the Independent Samples
t Test and One-Way Between-S ANOVA 225
6.7.2 Summarizing Information About Distances Between
Group Means: Computing MSbetween 227
6.7.3 Summarizing Information About Variability of
Scores Within Groups: Computing MSwithin 228
6.7.4 The F Ratio: Comparing MSbetween With MSwithin 230
6.7.5 Patterns of Scores Related to the Magnitudes of MSbetween and MSwithin 231
6.7.6 Expected Value of F When H0 Is True 233
6.7.7 Confidence Intervals (CIs) for Group Means 234
6.8 Effect-Size Index for One-Way Between-S ANOVA 234
6.9 Statistical Power Analysis for One-Way Between-S ANOVA 235
6.10 Nature of Differences Among Group Means 236
6.10.1 Planned Contrasts 236
6.10.2 Post Hoc or "Protected" Tests 239
6.11 SPSS Output and Model Results 241
6.12 Summary 248
Comprehension Questions 251
Chapter 7. Bivariate Pearson Correlation 255
7.1 Research Situations Where Pearson r Is Used 255
7.2 Hypothetical Research Example 260
7.3 Assumptions for Pearson r 261
7.4 Preliminary Data Screening 264
7.5 Design Issues in Planning Correlation Research 269
7.6 Computation of Pearson r 269
7.7 Statistical Significance Tests for Pearson r 271
7.7.1 Testing the Hypothesis That ρXY = 0 271
7.7.2 Testing Other Hypotheses About ρXY 273
7.7.3 Assessing Differences Between Correlations 275
7.7.4 Reporting Many Correlations: Need to Control
Inflated Risk of Type I Error 277
7.7.4.1 Limiting the Number of Correlations 277
7.7.4.2 Cross-Validation of Correlations 278
7.7.4.3 Bonferroni Procedure: A More Conservative
Alpha Level for Tests of Individual Correlations 278
7.8 Setting Up CIs for Correlations 278
7.9 Factors That Influence the Magnitude and Sign of Pearson r 279
7.9.1 Pattern of Data Points in the X, Y Scatter Plot 279
7.9.2 Biased Sample Selection: Restricted Range or Extreme Groups 281
7.9.3 Correlations for Samples That Combine Groups 284
7.9.4 Control of Extraneous Variables 284
7.9.5 Disproportionate Influence by Bivariate Outliers 285
7.9.6 Shapes of Distributions of X and Y 287
7.9.7 Curvilinear Relations 290
7.9.8 Transformations of Data 290
7.9.9 Attenuation of Correlation Due to Unreliability of Measurement 291
7.9.10 Part-Whole Correlations 292
7.9.11 Aggregated Data 292
7.10 Pearson r and r² as Effect Size Indexes 292
7.11 Statistical Power and Sample Size for Correlation Studies 294
7.12 Interpretation of Outcomes for Pearson r 295
7.12.1 "Correlation Does Not Necessarily Imply Causation"
(So What Does It Imply?) 295
7.12.2 Interpretation of Significant Pearson r Values 296
7.12.3 Interpretation of a Nonsignificant Pearson r Value 297
7.13 SPSS Output and Model Results Write-Up 297
7.14 Summary 304
Comprehension Questions 305
Chapter 8. Alternative Correlation Coefficients 309
8.1 Correlations for Different Types of Variables 309
8.2 Two Research Examples 312
8.3 Correlations for Rank or Ordinal Scores 317
8.4 Correlations for True Dichotomies 318
8.4.1 Point Biserial r (rpb) 319
8.4.2 Phi Coefficient (φ) 321
8.5 Correlations for Artificially Dichotomized Variables 323
8.5.1 Biserial r (rb) 323
8.5.2 Tetrachoric r (rtet) 324
8.6 Assumptions and Data Screening for Dichotomous Variables 324
8.7 Analysis of Data: Dog Ownership and Survival After a Heart Attack 325
8.8 Chi-Square Test of Association (Computational Methods for Tables of Any Size) 329
8.9 Other Measures of Association for Contingency Tables 329
8.10 SPSS Output and Model Results Write-Up 330
8.11 Summary 334
Comprehension Questions 335
Chapter 9. Bivariate Regression 338
9.1 Research Situations Where Bivariate Regression Is Used 338
9.2 A Research Example: Prediction of Salary From Years of Job Experience 340
9.3 Assumptions and Data Screening 342
9.4 Issues in Planning a Bivariate Regression Study 342
9.5 Formulas for Bivariate Regression 344
9.6 Statistical Significance Tests for Bivariate Regression 347
9.7 Setting Up Confidence Intervals Around Regression Coefficients 350
9.8 Factors That Influence the Magnitude and Sign of b 351
9.8.1 Factors That Affect the Size of the b Coefficient 352
9.8.2 Comparison of Coefficients for Different
Predictors or for Different Groups 352
9.9 Effect Size/Partition of Variance in Bivariate Regression 353
9.10 Statistical Power 356
9.11 Raw Score Versus Standard Score Versions of the Regression Equation 356
9.12 Removing the Influence of X From the Y Variable by
Looking at Residuals From Bivariate Regression 357
9.13 Empirical Example Using SPSS 358
9.13.1 Information to Report From a Bivariate Regression 365
9.14 Summary 369
Comprehension Questions 374
Chapter 10. Adding a Third Variable: Preliminary Exploratory Analyses 378
10.1 Three-Variable Research Situations 378
10.2 First Research Example 380
10.3 Exploratory Statistical Analyses for
Three-Variable Research Situations 381
10.4 Separate Analysis of the X1, Y Relationship for Each Level of
the Control Variable X2 382
10.5 Partial Correlation Between X1 and Y, Controlling for X2 387
10.6 Understanding Partial Correlation as the Use of
Bivariate Regression to Remove Variance Predictable
by X2 From Both X1 and Y 389
10.7 Computation of Partial r From Bivariate Pearson Correlations 390
10.8 Intuitive Approach to Understanding Partial r 394
10.9 Significance Tests, Confidence Intervals, and
Statistical Power for Partial Correlations 395
10.9.1 Statistical Significance of Partial r 395
10.9.2 Confidence Intervals for Partial r 395
10.9.3 Effect Size, Statistical Power, and
Sample Size Guidelines for Partial r 395
10.10 Interpretation of Various Outcomes for rY1.2 and rY1 396
10.11 Two-Variable Causal Models 399
10.12 Three-Variable Models: Some Possible Patterns
of Association Among X1, Y, and X2 401
10.12.1 X1 and Y Are Not Related Whether
You Control for X2 or Not 402
10.12.2 X2 Is Irrelevant to the X1, Y Relationship 403
10.12.3 When You Control for X2, the X1, Y
Correlation Drops to 0 or Close to 0 403
10.12.3.1 Completely Spurious Correlation 404
10.12.3.2 Completely Mediated Association Between X1 and Y 405
10.12.4 When You Control for X2, the Correlation Between
X1 and Y Becomes Smaller (but Does Not Drop
to 0 and Does Not Change Sign) 407
10.12.4.1 X2 Partly Accounts for the X1, Y
Association, or X1 and X2 Are Correlated Predictors of Y 407
10.12.4.2 X2 Partly Mediates the X1, Y Relationship 408
10.12.5 When You Control for X2, the X1, Y
Correlation Becomes Larger Than r1Y or
Becomes Opposite in Sign Relative to r1Y 409
10.12.5.1 Suppression of Error Variance in a Predictor Variable 410
10.12.5.2 A Second Type of Suppression 413
10.12.6 "None of the Above" 414
10.13 Mediation Versus Moderation 415
10.13.1 Preliminary Analysis to Identify Possible Moderation 415
10.13.2 Preliminary Analysis to Detect Possible Mediation 417
10.13.3 Experimental Tests for Mediation Models 417
10.14 Model Results 418
10.15 Summary 419
Comprehension Questions 421
Chapter 11. Multiple Regression With Two Predictor Variables 423
11.1 Research Situations Involving Regression
With Two Predictor Variables 423
11.2 Hypothetical Research Example 425
11.3 Graphic Representation of Regression Plane 426
11.4 Semipartial (or "Part") Correlation 427
11.5 Graphic Representation of Partition of Variance in
Regression With Two Predictors 428