

Applied Econometrics

Lecture 3: Outliers, Leverage and Influence

‘Life is the art of drawing sufficient conclusions from insufficient premises’

SAMUEL BUTLER

1) Introduction

The estimates of the regression parameters can be strongly influenced by a few extreme observations. A residual plot lets us pick out the individual data points whose residuals are unusually high or low. We may therefore use the residual plot to find outliers, that is, observations that are inadequately captured by the regression model itself.

2) Identification of outliers

• The percentiles that cut the data into four quarters have special names: the 25th and 75th percentiles are called the lower and upper quartiles (QL and QU).

• The lower quartile is the value in position [integer((n+1)/2) + 1]/2 counting from the bottom of the ordered list, where integer(.) denotes the integer part. The upper quartile is the value in the same position counting from the top.

• A data point Y0 is considered to be an outlier if

Y0 < QL − 1.5 IQR or Y0 > QU + 1.5 IQR

where IQR is the interquartile range (IQR = QU − QL) (Source: Hoaglin, 1983).
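The quartile and 1.5 IQR rules above can be sketched in Python as follows. This is only an illustrative sketch: the function names `fourths` and `iqr_outliers` and the sample data are hypothetical, and the handling of a fractional position (averaging the two neighbouring ordered values) is an assumed convention that the text does not spell out.

```python
import math


def fourths(values):
    """Return (QL, QU) using the position rule [integer((n+1)/2) + 1]/2."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one observation")
    depth = (math.floor((n + 1) / 2) + 1) / 2  # whole number or ends in .5
    lo, hi = math.floor(depth), math.ceil(depth)  # 1-based positions
    # Assumption: a fractional position averages the two neighbouring values.
    q_lower = (ordered[lo - 1] + ordered[hi - 1]) / 2  # counted from the bottom
    q_upper = (ordered[n - lo] + ordered[n - hi]) / 2  # counted from the top
    return q_lower, q_upper


def iqr_outliers(values, k=1.5):
    """Return the points below QL - k*IQR or above QU + k*IQR."""
    q_lower, q_upper = fourths(values)
    iqr = q_upper - q_lower
    low_fence = q_lower - k * iqr
    high_fence = q_upper + k * iqr
    return [v for v in values if v < low_fence or v > high_fence]


# Made-up data: n = 9, so the position is (5 + 1)/2 = 3.
data = [2, 3, 4, 5, 6, 7, 8, 9, 30]
print(fourths(data))       # (4.0, 8.0)
print(iqr_outliers(data))  # [30]
```

With these data the fences are 4 − 1.5 × 4 = −2 and 8 + 1.5 × 4 = 14, so only the value 30 is flagged.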

3) Outliers

An outlier is a point that is far removed from its fitted value (i.e., it has a large residual). Large in this context does not refer to the absolute size of a residual but to its size relative to most of the other residuals in the regression.

In univariate analysis, an outlier is defined with reference to the variable's own mean. In bivariate analysis, an outlier is a point with a large residual (i.e., its Y value is far removed from its fitted value).

Apart from graphical methods, we can also rely on special statistics to detect outliers. To compare a large residual with the other residuals, we may calculate the standardized residual, which is simply the residual divided by the standard error of the estimate (ei/s). But an outlier in the data set will inflate the standard error of the regression. Hence we use the studentized residual
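As a rough illustration of the standardized residual ei/s and of how a single outlier inflates s, consider the sketch below. It assumes a simple regression of Y on one X with an intercept, so s is computed with n − 2 degrees of freedom. The helper names and the data are hypothetical. It covers only the standardized residual described above.

```python
import math


def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def standardized_residuals(x, y):
    """Return ([e_i / s], s), with s the standard error of the estimate."""
    a, b = fit_line(x, y)
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # Assumption: two estimated parameters, hence n - 2 degrees of freedom.
    s = math.sqrt(sum(e * e for e in residuals) / (len(x) - 2))
    return [e / s for e in residuals], s


# Made-up data: y is roughly 2x, and the fifth point is pushed up by 10.
x = list(range(1, 11))
y = [2 * xi + (0.1 if xi % 2 else -0.1) for xi in x]
y[4] += 10

z_all, s_all = standardized_residuals(x, y)
# Refit without the suspicious point to see how much it inflated s.
_, s_clean = standardized_residuals(x[:4] + x[5:], y[:4] + y[5:])

worst = max(range(len(z_all)), key=lambda i: abs(z_all[i]))
print("largest |e_i/s| at x =", x[worst])
print("s with the point:", s_all, " s without it:", s_clean)
```

Because the outlier enters the calculation of s, its own standardized residual is pulled toward zero, which is the problem the passage raises.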

Written by Nguyen Hoang Bao May 20, 2004
