Heteroskedasticity in Linear Regression Models
LM-Stat (for testing joint significance of independent variables)
Other large sample tests: the Lagrange Multiplier (LM) statistic for testing joint significance of independent variables
Consider the model:
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + u$
Explain the procedure of the LM test for the null hypothesis that $x_4$ and $x_5$ have no effect on $y$ once the other factors have been controlled for (sketched in code below).
The null hypothesis: $H_0\colon \beta_4 = 0, \beta_5 = 0$.
Estimate the restricted model: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + u$. Get the residuals $\tilde u$.
Regress $\tilde u$ on $x_1, x_2, x_3, x_4, x_5$ (with an intercept). Get the R-squared, $R^2_{\tilde u}$.
Compute $LM = n R^2_{\tilde u}$. Reject the null if $LM$ exceeds the critical value of the $\chi^2_2$ distribution at the predetermined significance level.
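A minimal sketch of this procedure in Python, using numpy, statsmodels, and scipy on simulated data; the variable names x1 through x5 and all coefficient values are invented for the example:

```python
# Sketch of the LM test for H0: beta4 = beta5 = 0 on simulated data
# (the model and coefficients below are made up for illustration).
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 5))                      # x1 ... x5
y = 1.0 + X[:, 0] - 0.5 * X[:, 1] + 0.3 * X[:, 2] + rng.normal(size=n)

# Step 1: restricted model (drop x4, x5); keep the residuals u_tilde
X_r = sm.add_constant(X[:, :3])
u_tilde = sm.OLS(y, X_r).fit().resid

# Step 2: regress u_tilde on ALL independent variables; keep R^2
X_ur = sm.add_constant(X)
r2_u = sm.OLS(u_tilde, X_ur).fit().rsquared

# Step 3: LM = n * R^2, approximately chi2(2) under H0 (2 restrictions)
LM = n * r2_u
p_value = stats.chi2.sf(LM, df=2)
print(f"LM = {LM:.3f}, p-value = {p_value:.3f}")
```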
Heteroskedasticity
Remember:
- Assumption MLR 5 (Homoskedasticity): $\mathrm{Var}(u \mid x_1, \dots, x_k) = \sigma^2$ or $\mathrm{Var}(u \mid \mathbf{x}) = \sigma^2$
- Sampling variances of the OLS slope estimators:
Under MLR 1 through MLR 5, conditional on the sample values of the independent variables,
(5.1) $\mathrm{Var}(\hat\beta_j) = \dfrac{\sigma^2}{SST_j (1 - R_j^2)}$
for $j = 1, \dots, k$, where $SST_j = \sum_{i=1}^n (x_{ij} - \bar x_j)^2$ is the total variation in $x_j$ and $R_j^2$ is the R-squared from regressing $x_j$ on all other independent variables (including an intercept).
- Equation (5.1) still has an unknown parameter, $\sigma^2$. The unbiased estimator of $\sigma^2$ is
(5.2) $\hat\sigma^2 = \dfrac{\sum_{i=1}^n \hat u_i^2}{n - k - 1} = \dfrac{SSR}{n - k - 1}$
The term $n - k - 1$ is the degrees of freedom (df) for the general OLS problem with n observations and k independent variables (there are $k + 1$ parameters in a regression model with k independent variables and an intercept).
- For constructing confidence intervals and conducting tests, we need to estimate the standard deviation of $\hat\beta_j$, which is just the square root of the variance:
(5.3) $\mathrm{sd}(\hat\beta_j) = \dfrac{\sigma}{\sqrt{SST_j (1 - R_j^2)}}$
Since $\sigma$ is unknown, we replace it with its estimator $\hat\sigma$. This gives the standard error of $\hat\beta_j$:
(5.4) $\mathrm{se}(\hat\beta_j) = \dfrac{\hat\sigma}{\sqrt{SST_j (1 - R_j^2)}}$
(Formula (5.4) is verified numerically in the sketch after this list.)
- The standard error formula in (5.4) is not a valid estimator of $\mathrm{sd}(\hat\beta_j)$ if the errors exhibit heteroskedasticity.
- The presence of heteroskedasticity does not cause bias in the $\hat\beta_j$, but it does lead to bias in the usual formula for $\mathrm{Var}(\hat\beta_j)$, which then invalidates the standard errors.
- If we suspect heteroskedasticity, then the “usual” OLS standard errors are invalid and some corrective action should be taken.
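To make formulas (5.2) and (5.4) concrete, here is a minimal numerical check in Python on simulated homoskedastic data (all variable names and coefficient values are invented): we compute $\mathrm{se}(\hat\beta_1)$ by hand from $SST_1$, $R_1^2$, and $\hat\sigma$, and compare it with the standard error statsmodels reports.

```python
# Verify the standard-error formula (5.4) by hand on simulated data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n, k = 400, 2
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)               # correlated regressors
y = 2.0 + 1.5 * x1 - x2 + rng.normal(size=n)     # homoskedastic errors

X = sm.add_constant(np.column_stack([x1, x2]))
fit = sm.OLS(y, X).fit()

# Pieces of (5.4) for beta_1: SST_1, R_1^2 (x1 on the other regressor),
# and sigma-hat squared from (5.2)
sst1 = np.sum((x1 - x1.mean()) ** 2)
r2_1 = sm.OLS(x1, sm.add_constant(x2)).fit().rsquared
sigma2_hat = np.sum(fit.resid ** 2) / (n - k - 1)          # (5.2)
se_beta1 = np.sqrt(sigma2_hat / (sst1 * (1 - r2_1)))       # (5.4)

print(se_beta1, fit.bse[1])   # the two numbers should agree
```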
Heteroskedasticity in OLS
- Homoskedasticity assumption for multiple regression: the variance of the unobservable error , conditional on the explanatory variables, is the same (= constant) for all combinations of outcomes of the explanatory variables.
- Homoskedasticity fails whenever the variance of the unobservables changes across different segments of the population. The segments are determined by the different values of the explanatory variables.
- Example:
- Saving equation: $sav = \beta_0 + \beta_1 inc + u$. Homoskedasticity requires $\mathrm{Var}(u \mid inc) = \sigma^2$; it fails if the variance of the unobserved factors affecting saving increases with income (a small simulation follows below).
- Wage equation: $wage = \beta_0 + \beta_1 educ + \beta_2 exper + \beta_3 tenure + u$. Homoskedasticity requires that the variance of the error does not depend on the levels of education, experience, or tenure. That is, $\mathrm{Var}(u \mid educ, exper, tenure) = \sigma^2$.
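The saving example can be made concrete with a short simulation. This is an illustrative sketch only: the income range, coefficients, and variance function $\mathrm{Var}(u \mid inc) = \sigma^2 \, inc$ are invented for the demonstration.

```python
# Illustration: errors whose variance grows with income, as in the saving
# example. Everything here is simulated; 'inc' and the coefficients are
# made up.
import numpy as np

rng = np.random.default_rng(2)
n = 10_000
inc = rng.uniform(1.0, 10.0, size=n)              # income levels
u = rng.normal(scale=np.sqrt(inc))                # Var(u | inc) = inc
sav = 50.0 + 0.1 * inc + u

# Sample variance of the error in low- vs high-income groups
print("Var(u | low inc) :", u[inc < 3].var())
print("Var(u | high inc):", u[inc > 8].var())     # noticeably larger
```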
- Homoskedasticity is needed to justify the usual t tests, F tests, and confidence intervals for OLS estimation of the linear regression model. In the presence of heteroskedasticity:
- The usual OLS t statistics do not have t distributions, and the problem is not resolved by increasing the sample size.
- Similarly, F statistics are no longer F distributed.
- The LM statistic no longer has an asymptotic chi-square distribution.
- In summary, the statistics we use to test hypotheses under the Gauss-Markov assumptions are not valid in the presence of heteroskedasticity.
- If $\mathrm{Var}(u \mid \mathbf{x})$ is not constant, OLS is no longer BLUE. We will see it is possible to find estimators that are more efficient than OLS in the presence of heteroskedasticity (but this requires knowing the form of the heteroskedasticity).
Heteroskedasticity-robust procedure
- Heteroskedasticity-robust procedure: methods that remain valid (at least in large samples) whether or not the errors have constant variance.
- In the presence of heteroskedasticity, the OLS estimates are still useful, but the standard errors and the t, F, and LM statistics should be modified so that they are valid, at least asymptotically.
- Consider the model with a single independent variable
(5.5) $y_i = \beta_0 + \beta_1 x_i + u_i$
The first four Gauss-Markov assumptions hold but not MLR 5; the errors contain heteroskedasticity:
(5.6) $\mathrm{Var}(u_i \mid x_i) = \sigma_i^2$
The OLS estimator for $\beta_1$ is still the same:
(5.7) $\hat\beta_1 = \dfrac{\sum_{i=1}^n (x_i - \bar x) y_i}{\sum_{i=1}^n (x_i - \bar x)^2}$
and can be written as a function of the population parameter:
(5.8) $\hat\beta_1 = \beta_1 + \dfrac{\sum_{i=1}^n (x_i - \bar x) u_i}{\sum_{i=1}^n (x_i - \bar x)^2}$.
Under assumptions MLR 1 through MLR 4 (without the homoskedasticity assumption) we can show that
(5.9) $\mathrm{Var}(\hat\beta_1) = \dfrac{\sum_{i=1}^n (x_i - \bar x)^2 \sigma_i^2}{SST_x^2}$
where $SST_x = \sum_{i=1}^n (x_i - \bar x)^2$ is the total sum of squares of the $x_i$.
Proof of (5.9):
Let $d_i = x_i - \bar x$, so that $SST_x = \sum_{i=1}^n d_i^2$. Then, using (5.8) and the fact that the errors are independent across observations (random sampling),
$\mathrm{Var}(\hat\beta_1) = \mathrm{Var}\!\left(\dfrac{\sum_{i=1}^n d_i u_i}{SST_x}\right) = \dfrac{\sum_{i=1}^n d_i^2 \,\mathrm{Var}(u_i)}{SST_x^2} = \dfrac{\sum_{i=1}^n d_i^2 \sigma_i^2}{SST_x^2}$.
- When $\sigma_i^2 = \sigma^2$ for all $i$, this formula reduces to the usual form, $\sigma^2 / SST_x$. Equation (5.9) explicitly shows that, in the simple regression case, the variance formula for the OLS estimator derived under homoskedasticity is no longer valid when heteroskedasticity is present.
- White (1980) gives a valid estimator of $\mathrm{Var}(\hat\beta_1)$, for heteroskedasticity of any form:
(5.10) $\widehat{\mathrm{Var}}(\hat\beta_1) = \dfrac{\sum_{i=1}^n (x_i - \bar x)^2 \hat u_i^2}{SST_x^2}$
where the $\hat u_i$ denote the OLS residuals from the initial regression of $y$ on $x$.
- For the general multiple regression model $y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + u$, a valid estimator of $\mathrm{Var}(\hat\beta_j)$ under assumptions MLR 1 through MLR 4 is
(5.11) $\widehat{\mathrm{Var}}(\hat\beta_j) = \dfrac{\sum_{i=1}^n \hat r_{ij}^2 \hat u_i^2}{SSR_j^2}$
where $\hat r_{ij}$ denotes the i-th residual from regressing $x_j$ on all other independent variables, and $SSR_j$ is the sum of squared residuals from this regression (Note: recall the partialling-out representation of the OLS estimates).
- The square root of the quantity in (5.11) is called the heteroskedasticity-robust standard error for $\hat\beta_j$. In econometrics, these standard errors are usually attributed to White (1980).
- Once heteroskedasticity-robust standard errors are obtained, a heteroskedasticity-robust t statistic can be computed using the standard formula (a sketch in code follows)
(5.12) $t = \dfrac{\text{estimate} - \text{hypothesized value}}{\text{standard error}}$.
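A minimal sketch of (5.11) in Python, on simulated data with invented coefficients: the robust variance for $\hat\beta_1$ is computed by hand via the partialling-out residuals and compared with statsmodels' "HC0" covariance option, which implements the same White (1980) formula.

```python
# Heteroskedasticity-robust standard errors: formula (5.11) by hand vs.
# statsmodels' HC0 option. Simulated data, illustrative names.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 1_000
x1 = rng.uniform(1, 5, size=n)
x2 = rng.normal(size=n)
u = rng.normal(scale=x1)                  # Var(u | x) depends on x1
y = 1.0 + 2.0 * x1 + 0.5 * x2 + u

X = sm.add_constant(np.column_stack([x1, x2]))
fit = sm.OLS(y, X).fit()

# (5.11) for beta_1: residuals r_i1 from regressing x1 on the other regressor
aux = sm.OLS(x1, sm.add_constant(x2)).fit()
r1, ssr1 = aux.resid, np.sum(aux.resid ** 2)
var_rob = np.sum(r1 ** 2 * fit.resid ** 2) / ssr1 ** 2
print(np.sqrt(var_rob))                              # robust se by hand

print(sm.OLS(y, X).fit(cov_type="HC0").bse[1])       # should match HC0
```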
- If the heteroskedasticity-robust standard errors are valid more often than the usual OLS standard errors, why bother with the OLS standard errors at all?
- One reason: if the homoskedasticity assumption holds and the errors are normally distributed, then the OLS t statistics have exact t distributions regardless of n. The robust standard errors and robust t statistics are justified only as n becomes large.
- With small sample sizes, the robust t statistics can have distributions that are not very close to the t distribution, which can throw off our inference.
- The usual F statistic:
(5.13) $F = \dfrac{(SSR_r - SSR_{ur})/q}{SSR_{ur}/(n - k - 1)}$
where $SSR_{ur}$ is the sum of squared residuals from the unrestricted model, $SSR_r$ is the sum of squared residuals from the restricted model, and $q$ is the number of restrictions (i.e., the number of explanatory variables dropped). Note that $SSR_r \ge SSR_{ur}$.
- Special case: most regression packages include a special F statistic for testing overall significance of the regression. In this case, $H_0$ is that none of the explanatory variables has an effect on $y$, i.e., $H_0\colon \beta_1 = \beta_2 = \dots = \beta_k = 0$. This F statistic is
(5.14) $F = \dfrac{R^2 / k}{(1 - R^2)/(n - k - 1)}$
Note that (5.14) is a special form of (5.13), i.e., (5.13) is more general (a sketch computing (5.13) directly follows below).
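A minimal sketch computing the usual F statistic (5.13) from the restricted and unrestricted sums of squared residuals; the data are simulated and the model, the q = 2 dropped variables, and the coefficients are invented for the example.

```python
# Computing the F statistic (5.13) directly from restricted/unrestricted
# SSRs. Simulated data; we test q = 2 exclusion restrictions.
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(4)
n, k, q = 300, 5, 2
X = rng.normal(size=(n, k))
y = 1.0 + X[:, :3] @ np.array([1.0, -0.5, 0.3]) + rng.normal(size=n)

ssr_ur = sm.OLS(y, sm.add_constant(X)).fit().ssr          # unrestricted
ssr_r = sm.OLS(y, sm.add_constant(X[:, :3])).fit().ssr    # drop x4, x5

F = ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - k - 1))       # (5.13)
p_value = stats.f.sf(F, q, n - k - 1)
print(f"F = {F:.3f}, p-value = {p_value:.3f}")
```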
- Equations (5.13) and (5.14) are no longer valid when heteroskedasticity is present (the robust versions have no simple form and will not be presented here).
- To test multiple exclusion restrictions in a way that is robust to heteroskedasticity, use the heteroskedasticity-robust LM statistic.
- Consider the model
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + u$.
Suppose we would like to test $H_0\colon \beta_4 = 0, \beta_5 = 0$.
- The usual LM statistic can be computed as follows (it only requires OLS regressions):
(1) First estimate the restricted model (i.e., the model without $x_4$ and $x_5$); obtain the residuals $\tilde u$;
(2) Regress $\tilde u$ on all of the independent variables (including $x_4$, $x_5$, and an intercept);
(3) Compute $LM = n R^2_{\tilde u}$, where $R^2_{\tilde u}$ is the R-squared from the regression of $\tilde u$ on the $x_j$.
- The robust version of the LM statistic involves some extra work (the steps are sketched in code after this list):
(1) Obtain the residuals $\tilde u$ from the restricted model.
(2) Obtain the residuals $\tilde r_4$ from the regression of $x_4$ on $x_1$, $x_2$, $x_3$.
(3) Obtain the residuals $\tilde r_5$ from the regression of $x_5$ on $x_1$, $x_2$, $x_3$.
(Thus we regress each of the independent variables excluded under the null on all of the included independent variables, and keep the residuals each time.)
(4) Form the products between each $\tilde r_j$ and $\tilde u$, i.e., $\tilde r_4 \tilde u$ and $\tilde r_5 \tilde u$.
(5) Run the regression of 1 on $\tilde r_4 \tilde u$, $\tilde r_5 \tilde u$, without an intercept.
(6) Compute $LM = n - SSR_1$, where $SSR_1$ is the sum of squared residuals from the regression in (5). Under $H_0$, LM is distributed approximately as $\chi^2_2$.
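A minimal sketch of the six steps in Python; the data are simulated and the heteroskedasticity function and coefficients are invented for the example.

```python
# The heteroskedasticity-robust LM test for H0: beta4 = beta5 = 0,
# following steps (1)-(6) above. Simulated data, illustrative names.
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(5)
n = 500
X = rng.normal(size=(n, 5))                             # x1 ... x5
u = rng.normal(scale=np.exp(0.3 * X[:, 0]))             # heteroskedastic errors
y = 1.0 + X[:, 0] - 0.5 * X[:, 1] + 0.3 * X[:, 2] + u

X_incl = sm.add_constant(X[:, :3])                      # x1, x2, x3 (+ intercept)

# (1) residuals from the restricted model
u_tilde = sm.OLS(y, X_incl).fit().resid
# (2)-(3) residuals from regressing each excluded variable on the included ones
r4 = sm.OLS(X[:, 3], X_incl).fit().resid
r5 = sm.OLS(X[:, 4], X_incl).fit().resid
# (4) products of each r_j with u_tilde
Z = np.column_stack([r4 * u_tilde, r5 * u_tilde])
# (5) regress 1 on the products, WITHOUT an intercept
ssr1 = sm.OLS(np.ones(n), Z).fit().ssr
# (6) LM = n - SSR_1, approximately chi2(2) under H0
LM = n - ssr1
print(f"LM = {LM:.3f}, p-value = {stats.chi2.sf(LM, df=2):.3f}")
```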
Testing for heteroskedasticity
- We focus on more modern tests, which test the assumption that the variance of the error does not depend on the independent variables.
The BP test for heteroskedasticity
- Consider the model
(5.15) $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$
where assumptions MLR 1 through MLR 4 are maintained. In particular, we assume $E(u \mid x_1, \dots, x_k) = 0$, so that OLS is unbiased and consistent.
- The null hypothesis is that assumption MLR 5 is true:
(5.16) $H_0\colon \mathrm{Var}(u \mid x_1, x_2, \dots, x_k) = \sigma^2$
If we cannot reject $H_0$ at a sufficiently small significance level, we conclude that heteroskedasticity is not a problem.
- Since we assume a zero conditional expectation, $\mathrm{Var}(u \mid \mathbf{x}) = E(u^2 \mid \mathbf{x})$, and so the null hypothesis of homoskedasticity is equivalent to
(5.17) $H_0\colon E(u^2 \mid x_1, x_2, \dots, x_k) = E(u^2) = \sigma^2$
This shows that, in order to test for violation of the homoskedasticity assumption, we want to test whether $u^2$ is related (in expected value) to one or more of the explanatory variables. If $H_0$ is false, the expected value of $u^2$, given the independent variables, can be virtually any function of the $x_j$.
- A simple approach is to assume a linear function:
(5.18) $u^2 = \delta_0 + \delta_1 x_1 + \delta_2 x_2 + \dots + \delta_k x_k + v$
where $v$ is an error term with mean zero given the $x_j$.
- The null hypothesis of homoskedasticity is then
(5.19) $H_0\colon \delta_1 = \delta_2 = \dots = \delta_k = 0$
- Then either the F or LM statistic for the overall (joint) significance of the independent variables in explaining $u^2$ can be used to test (5.19).
- Since we do not know the actual errors in the population model, we use the OLS residuals and estimate the regression
(5.20) $\hat u^2 = \delta_0 + \delta_1 x_1 + \delta_2 x_2 + \dots + \delta_k x_k + \text{error}$
- The F statistic and the LM statistic both depend on the R-squared from regression (5.20), which we call $R^2_{\hat u^2}$.
- The F statistic is
(5.21) $F = \dfrac{R^2_{\hat u^2}/k}{(1 - R^2_{\hat u^2})/(n - k - 1)}$
which is distributed as $F_{k,\, n-k-1}$ under the null hypothesis of homoskedasticity.
- The LM statistic is
(5.22) $LM = n \cdot R^2_{\hat u^2}$
which is distributed asymptotically as $\chi^2_k$.
The LM version is called the Breusch-Pagan test for heteroskedasticity (BP test). This form is suggested by Koenker (1981).
- The BP test for heteroskedasticity, step by step (sketched in code after this list):
- Estimate the model by OLS. Obtain the squared OLS residuals, $\hat u^2$.
- Run the auxiliary regression (5.20) by OLS: the squared OLS residuals on the independent variables. Keep the R-squared from this regression, $R^2_{\hat u^2}$.
- Form either the F statistic or the LM statistic and compute the p-value. The distribution is $F_{k,\, n-k-1}$ for the former and $\chi^2_k$ for the latter.
- If the p-value is sufficiently small, we reject the null hypothesis of homoskedasticity.
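A minimal sketch of the BP test in Python, done once by hand via the auxiliary regression (5.20) and once with statsmodels' built-in het_breuschpagan (whose default should return the same $n R^2$ form of the LM statistic); the data are simulated and the variance function is invented.

```python
# Breusch-Pagan test: the auxiliary regression by hand, checked against
# statsmodels' built-in het_breuschpagan. Simulated data.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan
from scipy import stats

rng = np.random.default_rng(6)
n, k = 500, 3
X = rng.uniform(1, 5, size=(n, k))
u = rng.normal(scale=X[:, 0])                   # variance depends on x1
y = 1.0 + X @ np.array([0.5, -0.2, 0.1]) + u

Xc = sm.add_constant(X)
resid = sm.OLS(y, Xc).fit().resid

# Auxiliary regression (5.20): squared residuals on the regressors
r2 = sm.OLS(resid ** 2, Xc).fit().rsquared
LM = n * r2                                     # (5.22)
print(f"LM = {LM:.3f}, p-value = {stats.chi2.sf(LM, df=k):.3f}")

# Same LM statistic from the built-in test
lm, lm_pval, fstat, f_pval = het_breuschpagan(resid, Xc)
print(f"built-in: LM = {lm:.3f}, F = {fstat:.3f}")
```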
The White test for heteroskedasticity
An alternative test for heteroskedasticity is to assume that under the null hypothesis the squared error $u^2$ is uncorrelated with all the independent variables $x_j$, the squares of the independent variables $x_j^2$, and all the cross-products $x_j x_h$ (for $j \ne h$). This is motivated by White (1980).
- When the model contains $k = 3$ independent variables, the White test is based on an estimation of
$\hat u^2 = \delta_0 + \delta_1 x_1 + \delta_2 x_2 + \delta_3 x_3 + \delta_4 x_1^2 + \delta_5 x_2^2 + \delta_6 x_3^2 + \delta_7 x_1 x_2 + \delta_8 x_1 x_3 + \delta_9 x_2 x_3 + v$
- Compared with the Breusch-Pagan test, this equation has six more regressors. The White test for heteroskedasticity is the LM statistic or F statistic (both tests have asymptotic justification) for testing that all of the $\delta_j$ in this equation are zero, except for the intercept.
- The White test uses many degrees of freedom for models with even a moderate number of independent variables. (E.g., with 6 independent variables in the model, the White regression would involve 27 regressors.) Wooldridge proposed an alternative regression that preserves the spirit of the White test but conserves degrees of freedom.
- The (special case of the) White test for heteroskedasticity, step by step (sketched in code after this list):
- Estimate the model by OLS. Obtain the OLS residuals $\hat u$ and the fitted values $\hat y$. Compute the squared residuals $\hat u^2$ and the squared fitted values $\hat y^2$.
- Run the auxiliary regression by OLS: the squared OLS residuals on $\hat y$ and $\hat y^2$ (with an intercept),
$\hat u^2 = \delta_0 + \delta_1 \hat y + \delta_2 \hat y^2 + \text{error}$
Note that $\hat y$ is a linear function of all the $x_j$, while $\hat y^2$ is a particular function of all the squares and cross-products of the $x_j$. Keep the R-squared from this regression, $R^2_{\hat u^2}$.
- Form either the F statistic or the LM statistic for the null hypothesis $H_0\colon \delta_1 = 0, \delta_2 = 0$ and compute the p-value. The distribution is $F_{2,\, n-3}$ for the former and $\chi^2_2$ for the latter.
- If the p-value is sufficiently small, we reject the null hypothesis of homoskedasticity.
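A minimal sketch of the special-case White test in Python (simulated data, invented variance function): regress the squared residuals on $\hat y$ and $\hat y^2$, then form $LM = n R^2_{\hat u^2}$.

```python
# Special case of the White test: regress squared residuals on y-hat and
# y-hat squared, then use LM = n * R^2 ~ chi2(2). Simulated data.
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(7)
n = 500
X = rng.uniform(1, 5, size=(n, 3))
u = rng.normal(scale=X[:, 0])
y = 1.0 + X @ np.array([0.5, -0.2, 0.1]) + u

fit = sm.OLS(y, sm.add_constant(X)).fit()
yhat, resid = fit.fittedvalues, fit.resid

# Auxiliary regression: u-hat^2 on y-hat and y-hat^2 (plus intercept)
Z = sm.add_constant(np.column_stack([yhat, yhat ** 2]))
r2 = sm.OLS(resid ** 2, Z).fit().rsquared

LM = n * r2
print(f"LM = {LM:.3f}, p-value = {stats.chi2.sf(LM, df=2):.3f}")
```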
If heteroskedasticity is a problem, some corrective measure should be taken. One possibility is to just use the heteroskedasticity-robust standard errors and test statistics discussed previously. Another possibility is to use Weighted Least Squares Estimation.
Weighted least squares estimation
- When heteroskedasticity is present, OLS is still unbiased but no longer the most efficient estimator. A more efficient estimator than OLS exists, and it produces t and F statistics that have t and F distributions.
- The caveat is that we must be very specific about the nature of the heteroskedasticity.
- Assume that
(5.23) $\mathrm{Var}(u \mid \mathbf{x}) = \sigma^2 h(\mathbf{x})$
where $h(\mathbf{x})$ is some function of the explanatory variables that determines the heteroskedasticity; since variances must be positive, $h(\mathbf{x}) > 0$. Assume for a moment that the function $h(\mathbf{x})$ is known (but $\sigma^2$ is still unknown). For a random drawing, we can write
(5.24) $\mathrm{Var}(u_i \mid \mathbf{x}_i) = \sigma^2 h(\mathbf{x}_i) = \sigma^2 h_i$.
Example: consider the simple saving function
(5.25) $sav_i = \beta_0 + \beta_1 inc_i + u_i$
(5.26) $\mathrm{Var}(u_i \mid inc_i) = \sigma^2 inc_i$.
Here $h(inc) = inc$; the variance of the error is proportional to the level of income. This means that as income increases, the variability in saving increases. The standard deviation of $u_i$ conditional on $inc_i$ is $\sigma \sqrt{inc_i}$.
Given the heteroskedasticity in (5.26), note that if we divide (5.25) by $\sqrt{inc_i}$ we will have a new error with constant variance. Thus we can transform (5.25) into
(5.27) $sav_i / \sqrt{inc_i} = \beta_0 / \sqrt{inc_i} + \beta_1 \sqrt{inc_i} + u_i / \sqrt{inc_i}$
or, in more general terms,
(5.28) $y_i / \sqrt{h_i} = \beta_0 / \sqrt{h_i} + \beta_1 (x_{i1}/\sqrt{h_i}) + \dots + \beta_k (x_{ik}/\sqrt{h_i}) + u_i / \sqrt{h_i}$
- In this transformed equation, the new error $u_i^* = u_i / \sqrt{h_i}$ has zero mean and a constant variance, $\sigma^2$, conditional on $\mathbf{x}_i$:
$\mathrm{Var}(u_i^* \mid \mathbf{x}_i) = E\!\left[(u_i/\sqrt{h_i})^2 \mid \mathbf{x}_i\right] = \dfrac{E(u_i^2 \mid \mathbf{x}_i)}{h_i} = \dfrac{\sigma^2 h_i}{h_i} = \sigma^2$.
- This means that if the original equation (5.25) satisfies the first four Gauss-Markov assumptions, then the transformed equation (5.28) satisfies all five Gauss-Markov assumptions.
- Thus, equation (5.28) can be estimated by OLS, and we can simply use the resulting statistics for inference.
- The resulting estimates are examples of generalized least squares (GLS) estimators. The GLS estimators for correcting heteroskedasticity are called weighted least squares (WLS) estimators. In our example, the weight is $1/h_i = 1/inc_i$.
- The idea of WLS is to give less weight to observations with a higher error variance; by comparison, OLS gives each observation the same weight. Mathematically, the WLS estimates are the values of the $b_j$ that minimize
(5.29) $\sum_{i=1}^n (y_i - b_0 - b_1 x_{i1} - \dots - b_k x_{ik})^2 / h_i$
- Bringing the square root of $1/h_i$ inside the squared residuals, the weighted sum of squared residuals is identical to the sum of squared residuals in the transformed variables
(5.30) $\sum_{i=1}^n (y_i^* - b_0 x_{i0}^* - b_1 x_{i1}^* - \dots - b_k x_{ik}^*)^2$
where $y_i^* = y_i/\sqrt{h_i}$, $x_{i0}^* = 1/\sqrt{h_i}$, and $x_{ij}^* = x_{ij}/\sqrt{h_i}$. The WLS estimators that minimize (5.29) are simply the OLS estimators from (5.30).
- Most regression packages have a feature for computing WLS: we specify, as usual, the dependent and independent variables, along with the weighting function $1/h_i$. That is, we specify weights proportional to the inverse of the variance, not to the inverse of the standard deviation (a sketch in code follows).
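A minimal sketch of WLS for the saving example, assuming $h(inc) = inc$ (all data simulated, coefficients invented): statsmodels' WLS with weights $1/inc_i$ is compared with OLS on the transformed equation (5.27), which yields identical point estimates.

```python
# WLS for the saving example with h(inc) = inc: weights are 1/h_i = 1/inc_i.
# Simulated data; the coefficients are made up.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
n = 1_000
inc = rng.uniform(1.0, 10.0, size=n)
sav = 50.0 + 0.1 * inc + rng.normal(scale=np.sqrt(inc))   # Var(u|inc) = inc

X = sm.add_constant(inc)
ols = sm.OLS(sav, X).fit()
wls = sm.WLS(sav, X, weights=1.0 / inc).fit()             # weights = 1/h_i

# Equivalent by hand: OLS on the transformed equation (5.27)
w = 1.0 / np.sqrt(inc)
ols_t = sm.OLS(sav * w, X * w[:, None]).fit()

print(ols.params, ols.bse)        # unbiased but inefficient
print(wls.params, wls.bse)        # more efficient under correct h
print(ols_t.params)               # identical to the WLS point estimates
```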
What are the properties of WLS if our choice of $h(\mathbf{x})$ is incorrect?
- WLS continues to be unbiased and consistent for estimating the $\beta_j$.
- However, the reported standard errors and t and F statistics are not valid if we incorrectly specify the form of the heteroskedasticity.
- WLS is only guaranteed to be more efficient than OLS if we have correctly chosen the form of heteroskedasticity.
- When we are not certain about the form of heteroskedasticity:
- Use OLS and compute robust standard errors and test statistics