Measurement Error Models

WAYNE A. FULLER

Iowa State University

Ames, Iowa

JOHN WILEY & SONS

New York Chichester Brisbane Toronto Singapore

A NOTE TO THE READER

This book has been electronically reproduced from

digital information stored at John Wiley & Sons, Inc.

We are pleased that the use of this new technology

will enable us to keep works of enduring scholarly

value in print as long as there is a reasonable demand

for them. The content of this book is identical to

previous printings.

Reproduction or translation of any part of this work

beyond that permitted by Section 107 or 108 of the

1976 United States Copyright Act without the permission

of the copyright owner is unlawful. Requests for

permission or further information should be addressed to

the Permissions Department, John Wiley & Sons, Inc.

Library of Congress Cataloging in Publication Data:

Fuller, Wayne A.

Measurement error models.

(Wiley series in probability and mathematical

statistics. Applied probability and statistics,

ISSN 0271-6356)

Bibliography: p.

Includes index.

1. Error analysis (Mathematics) 2. Regression

analysis. I. Title. II. Series.

QA275.F85 1987

ISBN 0-471-86187-1

Printed and bound in the United States of America by Braun-Brumfield, Inc.

10 9 8 7 6 5 4

To Doug and Bret

Preface

The study of regression models wherein the independent variables are measured with error predates the twentieth century. There has been a continuing

interest in the problem among statisticians and there is considerable literature on the subject. Also, for over 80 years, studies have documented the

presence of sizable measurement error in data collected from human respondents. Despite these two lines of research, only a fraction of the statistical

studies appearing in the literature use procedures designed for explanatory

variables measured with error.

This book is an outgrowth of research on the measurement error, also

called response error, in data collected from human respondents. The book

was written with the objective of increasing the use of statistical techniques

explicitly recognizing the presence of measurement error. To this end, a

number of real examples have been included in the text. An attempt has been

made to choose examples from a variety of areas of application, but the

reader will understand if many of the examples have an agricultural aspect.

The book may be used as a text for a graduate course concentrating on

statistical analyses in the presence of measurement error. It is hoped that

it will also find use as an auxiliary text in courses on statistical methodology

that heretofore have ignored, or given cursory treatment to, the problems

associated with measurement error. Chapter 1 was developed to provide an

introduction to techniques for a range of simple models. While the models of

Chapter 1 are special cases of models discussed in later chapters, it is felt

that the concepts are better communicated with the small models. There is

some flexibility in the order in which the material can be covered. One can

move from a section in Chapter 1 to the corresponding section in Chapter 2

or Chapter 4. To facilitate flexible use, Sections 1.2, 1.3, and 1.4 are largely

self-supporting. As a result, there is some duplication in the treatment of

topics such as prediction. Some repetition seems advantageous because the

models of this book differ from those typically encountered by students in

courses on regression estimation.

The proofs of most of the theorems require a background of statistical

theory. One will be comfortable with the proofs only if one has an understanding of large sample theory. Also, the treatment assumes a background

in ordinary linear regression methods. In attempting to make the book useful to those interested in the methods, as well as to those interested in an

introduction to the theory, derivations are concentrated in the proofs of

theorems. It is hoped that the text material, the statements of the theorems,

and the examples will serve the person interested in applications.

Computer programs are required for any extensive application of the

methods of this book. Perhaps the most general program for normal distribution linear models is LISREL® VI by Jöreskog and Sörbom. LISREL

VI is available in SPSSX™ and can be used for a wide range of models of

the factor type. A program with similar capabilities, which can also perform

some least squares fitting of the type discussed in Section 4.2, is EQS

developed by Bentler. EQS is available from BMDP® Statistical Software,

Inc. Dan Schnell has placed the procedures of Chapter 2 and Section 3.1

in a program for the IBM® Personal Computer AT. This program, called

EV CARP, is available from the Statistical Laboratory, Iowa State University.

The packages SAS and BMDP contain algorithms for simple factor analysis.

A program, ISU Factor, written with Proc MATRIX of SAS by Sastry

Pantula, Department of Statistics, North Carolina State University, can be

used to estimate the factor model, to estimate multivariate models with

known error variances, and to estimate the covariance matrix of the factor

estimates. A program for nonlinear models, written with Proc MATRIX of

SAS by Dan Schnell, is available from Iowa State University.

I have been fortunate to work with a number of graduate students on

topics related to those of this text. Each has contributed to my understanding of the field, but none is to be held responsible for remaining

shortcomings. I express my sincere thanks to each of them. In chronological order they are James S. DeGracie, Angel Martinez-Garza, George E.

Battese, A. Ronald Gallant, Gordon D. Booth, Kirk M. Wolter, Michael A.

Hidiroglou, Randy Lee Carter, P. Fred Dahm, Fu-hua Yu, Ronald Mowers,

Yasuo Amemiya, Sastry Pantula, Tin-Chiu Chua, Hsien-Ming Hung, Daniel

Schnell, Stephen Miller, Nancy Hasabelnaby, Edina Miazaki, Neerchal

Nagaraj, and John Eltinge. I owe a particular debt to Yasuo Amemiya for

proofs of many theorems and for reading and repair of much of the manuscript. I thank Sharon Loubert, Clifford Spiegelman, and Leonard Stefanski

for useful comments. I also express my appreciation to the United Kingdom

Science and Engineering Research Council and the U.S. Army European

Research Office for supporting the “Workshop on Functional and Structural

Relationships and Factor Analysis” held at Dundee, Scotland, August 24

through September 9, 1983. Material presented at that stimulating conference

had an influence on several sections of this book. I am grateful to Jane

Stowe, Jo Ann Hershey, and Christine Olson for repeated typings of the

manuscript. A part of the research for this book was supported by joint

statistical agreements with the United States Bureau of the Census and by

cooperative research agreements with the Statistical Reporting Service of the

United States Department of Agriculture.

WAYNE A. FULLER

Ames, Iowa

February 1987

LISREL is a registered trademark of Scientific Software, Inc. SPSSX is a trademark of SPSS,

Inc. BMDP is a registered trademark of BMDP Statistical Software, Inc. SAS is a registered

trademark of SAS Institute, Inc. IBM AT is a registered trademark of International Business

Machines, Inc.

Contents

List of Examples xv

List of Principal Results xix

List of Figures xxiii

1. A Single Explanatory Variable 1

1.1. Introduction, 1

1.1.1. Ordinary Least Squares and Measurement Error, 1

1.1.2. Estimation with Known Reliability Ratio, 5

1.1.3. Identification, 9

1.2. Measurement Variance Known, 13

1.2.1. Introduction and Estimators, 13

1.2.2. Sampling Properties of the Estimators, 15

1.2.3. Estimation of True x Values, 20

1.2.4. Model Checks, 25

1.3. Ratio of Measurement Variances Known, 30

1.3.1. Introduction, 30

1.3.2. Method of Moments Estimators, 30

1.3.3. Least Squares Estimation, 36

1.3.4. Tests of Hypotheses for the Slope, 44

1.4. Instrumental Variable Estimation, 50

1.5. Factor Analysis, 59

1.6. Other Methods and Models, 72

1.6.1. Distributional Knowledge, 72

1.6.2. The Method of Grouping, 73

1.6.3. Measurement Error and Prediction, 74

1.6.4. Fixed Observed X, 79

Appendix 1.A. Large Sample Approximations, 85

Appendix 1.B. Moments of the Normal Distribution, 88

Appendix 1.C. Central Limit Theorems for Sample Moments, 89

Appendix 1.D. Notes on Notation, 95

2. Vector Explanatory Variables 100

2.1. Bounds for Coefficients, 100

2.2. The Model with an Error in the Equation, 103

2.2.1. Estimation of Slope Parameters, 103

2.2.2. Estimation of True Values, 113

2.2.3. Higher-Order Approximations for Residuals and True Values, 118

2.3. The Model with No Error in the Equation, 124

2.3.1. The Functional Model, 124

2.3.2. The Structural Model, 139

2.3.3. Higher-Order Approximations for Residuals and True Values, 140

2.4. Instrumental Variable Estimation, 148

2.5. Modifications to Improve Moment Properties, 163

2.5.1. An Error in the Equation, 164

2.5.2. No Error in the Equation, 173

2.5.3. Calibration, 177

Appendix 2.A. Language Evaluation Data, 181

3. Extensions of the Single Relation Model 185

3.1. Nonnormal Errors and Unequal Error Variances, 185

3.1.1. Introduction and Estimators, 186

3.1.2. Models with an Error in the Equation, 193

3.1.3. Reliability Ratios Known, 199

3.1.4. Error Variance Functionally Related to Observations, 202

3.1.5. The Quadratic Model, 212

3.1.6. Maximum Likelihood Estimation for Known Error Covariance Matrices, 217

3.2. Nonlinear Models with No Error in the Equation, 225

3.2.1. Introduction, 225

3.2.2. Models Linear in x, 226

3.2.3. Models Nonlinear in x, 229

3.2.4. Modifications of the Maximum Likelihood Estimator, 247

3.3. The Nonlinear Model with an Error in the Equation, 261

3.3.1. The Structural Model, 261

3.3.2. General Explanatory Variables, 263

3.4. Measurement Error Correlated with True Value, 271

3.4.1. Introduction and Estimators, 271

3.4.2. Measurement Error Models for Multinomial Random Variables, 272

Appendix 3.A. Data for Examples, 281

4. Multivariate Models 292

4.1. The Classical Multivariate Model, 292

4.1.1. Maximum Likelihood Estimation, 292

4.1.2. Properties of Estimators, 303

4.2. Least Squares Estimation of the Parameters of a Covariance Matrix, 321

4.2.1. Least Squares Estimation, 321

4.2.2. Relationships between Least Squares and Maximum Likelihood, 333

4.2.3. Least Squares Estimation for the Multivariate Functional Model, 338

4.3. Factor Analysis, 350

4.3.1. Introduction and Model, 350

4.3.2. Maximum Likelihood Estimation, 353

4.3.3. Limiting Distribution of Factor Estimators, 360

Appendix 4.A. Matrix-Vector Operations, 382

Appendix 4.B. Properties of Least Squares and Maximum Likelihood Estimators, 396

Appendix 4.C. Maximum Likelihood Estimation for Singular Measurement Covariance, 404

Bibliography 409

Author Index 433

Subject Index 435

List of Examples

Number Topic

1.2.1 Corn-nitrogen. Error variance of explanatory variable known. Estimates, 18

1.2.2 Corn-nitrogen. Estimated true values, 23

1.2.3 Corn-nitrogen. Residual plot, 26

1.3.1 Pheasants. Ratio of error variances known, 34

1.3.2 Rat spleens. Both error variances known, 40

1.3.3 Rat spleens. Tests and confidence intervals, 48

1.4.1 Earthquake magnitudes. Instrumental variable, 56

1.5.1 Corn hectares. Factor model, 63

1.5.2 Corn hectares. Standardized factors, 69

1.6.1 Corn-nitrogen. Prediction for random model, 75

1.6.2 Earthquakes. Prediction in another population, 77

2.2.1 Coop managers. Error variances estimated, 110

2.2.2 Coop managers. Estimated true values, 114

2.2.3 Corn-nitrogen. Variances of estimated true values and residuals, 121

2.3.1 Apple trees. Estimated error covariance, 130

2.3.2 Corn-moisture experiment. Estimated error covariance, 134

2.3.3 Coop managers. Test for equation variance, 138

2.3.4 Rat spleens. Variances of estimated true values and residuals, 142

2.4.1 Language evaluation. Instrumental variables, 154

2.4.2 Firm value. Instrumental variables, 158

2.5.1 Corn-nitrogen. Calibration, 179

3.1.1 Corn-nitrogen. Duplicate determinations used to estimate error variance, 197

3.1.2 Farm size. Reliability ratios known, 201

3.1.3 Textiles. Different slopes in different groups, 204

3.1.4 Pig farrowings. Unequal error variances, 207

3.1.5 Tonga earthquakes. Quadratic model, 214

3.1.6 Quadratic. Both error variances known, 215

3.1.7 Supernova. Unequal error variances, 221

3.2.1 Created data. Linear in true values, 226

3.2.2 Berea sandstone. Nonlinear, 230

3.2.3 Berea sandstone. Nonlinear multivariate, 234

3.2.4 Hip prosthesis. Implicit nonlinear, 244

3.2.5 Quadratic, maximum likelihood. Large errors, 247

3.2.6 Pheasants. Alternative estimators of variance of estimated slope, 255

3.2.7 Pheasants. Alternative form for estimated variance of slope, 257

3.2.8 Quadratic. Large errors. Modified estimators, 257

3.3.1 Quadratic. Error in the equation. Weighted, 266

3.3.2 Moisture response model. Nonlinear, 268

3.4.1 Unemployment. Binomial, 275

4.1.1 Mixing fractions. Known error covariance matrix, 308

4.1.2 Cattle genetics. Error covariance matrix estimated, 313

4.2.1 Two earthquake samples. Least squares estimation, 325

4.2.2 Corn hectares. Estimation of linear model, 330

4.2.3 Corn hectares. Distribution-free variance estimation, 332

4.2.4 Earthquakes. Least squares iterated to maximum likelihood, 337

4.2.5 Corn hectares. Least squares estimation fixed and random models, 344

4.3.1 Bekk smoothness. One factor, 364

4.3.2 Language evaluation. Two factors, 369

4.3.3 Language evaluation. Not identified, 374

List of Principal Results

Theorem Topic

1.2.1 Approximate distribution of estimators for simple model with error variance of explanatory variable known, 15

1.3.1 Approximate distribution of estimators for simple model with ratio of error variances known, 32

1.4.1 Approximate distribution of instrumental variable estimators for simple model, 53

1.6.1 Distribution of estimators when the observed explanatory variable is controlled, 81

1.A.1 Large sample distribution of a function of sample means, 85

1.C.1 Large sample distribution of first two sample moments, 89

1.C.2 Limiting distribution of sample second moments containing fixed components, IID observations, 92

1.C.3 Limiting distribution of sample second moments containing fixed components, independent observations, 94

2.2.1 Limiting distribution of estimators for vector model with error covariance matrix of explanatory variables known, 108

2.3.1 Maximum likelihood estimators for vector model with no error in the equation, 124

2.3.2 Limiting distribution of estimators for vector model with no error in the equation. Limit for small error variances and (or) large sample size, 127

2.4.1 Limiting distribution of instrumental variable estimator, 151