Understanding Statistical Significance and Portfolio Diversification in Finance
Understanding Statistical Significance in Regression Analysis
Recall that the probability of rejecting a correct null hypothesis is equal to the size of the test, denoted α. The possibility of rejecting a correct null hypothesis arises because test statistics are random variables: even when the null is true, they will take on extreme values that fall in the rejection region some of the time by chance alone. A consequence of this is that it will almost always be possible to find significant relationships between variables if enough variables are examined. The implication is that for any regression, if enough explanatory variables are employed, often one or more will be significant by chance alone. More concretely, if a test of size α% is used, on average one in every (100/α) regressions will have a significant slope coefficient by chance alone.
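To make this concrete, the short simulation below (an illustrative sketch, not taken from any study) regresses a purely random y on an unrelated x many times and counts how often the slope is "significant" at the 5% level; the rejection rate comes out close to α, i.e., roughly one regression in twenty.

```python
# Illustrative sketch: how often does an unrelated regressor look "significant"?
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_obs, n_regressions, alpha = 100, 10_000, 0.05

rejections = 0
for _ in range(n_regressions):
    x = rng.standard_normal(n_obs)
    y = rng.standard_normal(n_obs)   # y is genuinely unrelated to x
    result = stats.linregress(x, y)
    if result.pvalue < alpha:        # two-sided test of H0: slope = 0
        rejections += 1

print(f"Fraction of spuriously significant slopes: {rejections / n_regressions:.3f}")
# Expected to be close to alpha = 0.05, i.e. about one regression in twenty.
```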
Trying many variables in a regression without basing the selection of the candidate variables on a financial or economic theory is known as data mining or data snooping. The result in such cases is that the true significance level will be considerably greater than the nominal significance level assumed. For example, suppose that 20 separate regressions are conducted using a 5% nominal significance level; by chance alone we would expect roughly one of them to contain a significant regressor, so if three do, the true significance level is considerably higher than 5% (three out of twenty is already 15%). If the researcher then reports only the three regressions containing significant regressors and states that they are significant at the 5% level, inappropriate conclusions concerning the true significance of the variables would result.
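A back-of-the-envelope calculation (assuming, purely for illustration, that the 20 regressions are independent) shows how quickly the chance of at least one spurious rejection grows relative to the nominal 5% level:

```python
# Probability of at least one spuriously significant regressor across a batch
# of independent regressions (illustrative assumption of independence).
alpha, n_regressions = 0.05, 20

p_at_least_one = 1 - (1 - alpha) ** n_regressions
print(f"P(at least one spurious rejection) = {p_at_least_one:.2f}")  # ~0.64, far above 0.05
```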
As well as ensuring that the selection of candidate regressors for inclusion in a model is made on the basis of financial or economic theory, another way to avoid data mining is by examining the forecast performance of the model in an out-of-sample data set (see chapter 5). The idea is essentially that a proportion of the data is not used in model estimation but is retained for model testing. A relationship observed in the estimation period that is purely the result of data mining, and is therefore spurious, is very unlikely to be repeated for the out-of-sample period. Therefore, models that are the product of data mining are likely to fit very poorly and to give very inaccurate forecasts for the out-of-sample period.
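As a rough illustration of this procedure, the sketch below (using simulated data and an assumed 80/20 chronological split, not any particular dataset from the chapter) estimates a simple linear model on the first part of the sample and then evaluates its forecasts over the retained hold-out period:

```python
# Minimal sketch of out-of-sample evaluation, assuming a single-regressor
# linear model and an 80/20 chronological split.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
x = rng.standard_normal(250)
y = 0.5 * x + rng.standard_normal(250)           # hypothetical data

split = int(0.8 * len(y))                        # estimation period vs hold-out period
X_in, X_out = sm.add_constant(x[:split]), sm.add_constant(x[split:])
model = sm.OLS(y[:split], X_in).fit()            # estimate on the in-sample data only

forecasts = model.predict(X_out)                 # forecast the retained period
oos_mse = np.mean((y[split:] - forecasts) ** 2)
print(f"In-sample R^2: {model.rsquared:.3f}, out-of-sample MSE: {oos_mse:.3f}")
# A data-mined (spurious) relationship would typically show a respectable
# in-sample fit but poor forecasts over the hold-out period.
```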
Jensen’s Alpha and Mutual Fund Performance
Jensen systematically tested the performance of mutual funds, and in particular examined whether any beat the market. He used a sample of annual returns on the portfolios of 115 mutual funds over the period 1945–64. Each of the 115 funds was subjected to a separate OLS time-series regression of the form:
Rjt − Rft = αj + βj(Rmt − Rft) + ujt
where Rjt is the return on portfolio j at time t, Rft is the return on a risk-free proxy (a 1-year government bond), Rmt is the return on a market portfolio proxy, ujt is an error term, and αj, βj are parameters to be estimated. The quantity of interest is the significance of αj, since this parameter defines whether the fund outperforms or underperforms the market index. Thus the null hypothesis is given by: H0 : αj = 0. A positive and significant αj for a given fund would suggest that the fund is able to earn significant abnormal returns in excess of the market-required return for a fund of this given riskiness. This coefficient has become known as Jensen’s alpha.
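The regression is straightforward to estimate by OLS for any single fund. The sketch below uses simulated returns rather than Jensen's original data, so the series and parameter values are purely illustrative:

```python
# Sketch of the Jensen regression for a single fund j, using simulated
# returns (hypothetical fund, market, and risk-free series, not Jensen's data).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
T = 120
r_f = np.full(T, 0.003)                                          # risk-free proxy
r_m = 0.008 + 0.04 * rng.standard_normal(T)                      # market portfolio proxy
r_j = r_f + 1.1 * (r_m - r_f) + 0.02 * rng.standard_normal(T)    # fund returns (alpha = 0 by construction)

X = sm.add_constant(r_m - r_f)                                   # regressor: market excess return
results = sm.OLS(r_j - r_f, X).fit()                             # dependent variable: fund excess return

alpha_j, beta_j = results.params                                 # alpha_j is Jensen's alpha
print(f"alpha_j = {alpha_j:.4f} (p-value {results.pvalues[0]:.3f}), beta_j = {beta_j:.3f}")
# A positive, significant alpha_j would indicate abnormal returns beyond those
# justified by the fund's riskiness; here it should be insignificant, since the
# simulated fund earns no abnormal return.
```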
Portfolio Diversification and Risk Reduction
The return on a two-asset portfolio is Rp = w1 r1 + w2 r2, with w1 + w2 = 1. Diversification reduces risk to a degree that depends on the correlation between the assets held: combining assets that are not perfectly correlated lowers portfolio risk without sacrificing expected return, improving the risk–return trade-off. If the correlation were +1, the assets would move in perfect lockstep and no risk reduction would occur; if it were −1, the weights could be chosen so that portfolio risk is eliminated entirely. Starting from the portfolio variance, we can derive the weights that minimise the portfolio's variance (risk).
Var(rp) = w1² Var(r1) + w2² Var(r2) + 2 w1 w2 Cov(r1, r2)

Substituting w2 = 1 − w1 and minimising with respect to w1 by setting the derivative to zero:

dVar(rp)/dw1 = 2 w1 Var(r1) − 2(1 − w1) Var(r2) + 2 Cov(r1, r2) − 4 w1 Cov(r1, r2) = 0

Solving for w1 gives the minimum-variance weight:

w1 = (Var(r2) − Cov(r1, r2)) / (Var(r1) + Var(r2) − 2 Cov(r1, r2)),   w2 = 1 − w1
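As a numerical check of the formula, the sketch below plugs in illustrative (hypothetical) variances and a covariance, computes the minimum-variance weights, and confirms that the resulting portfolio variance is lower than that of either asset held alone:

```python
# Numerical check of the minimum-variance weight formula with illustrative inputs.
var_1, var_2, cov_12 = 0.04, 0.09, 0.006   # hypothetical Var(r1), Var(r2), Cov(r1, r2)

w1 = (var_2 - cov_12) / (var_1 + var_2 - 2 * cov_12)   # minimum-variance weight on asset 1
w2 = 1 - w1

port_var = w1**2 * var_1 + w2**2 * var_2 + 2 * w1 * w2 * cov_12
print(f"w1 = {w1:.3f}, w2 = {w2:.3f}, portfolio variance = {port_var:.4f}")
# With these inputs the portfolio variance (about 0.030) is below both individual
# variances, illustrating the risk reduction achieved through diversification.
```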