WOLFRAM|DEMONSTRATIONS PROJECT

Endogeneity Bias

Controls (values shown):
parameters of true linear model: α = 1, β = 0.5
correlation between independent variable and error term: ρ = -0.76
display options: true model, true model points, observed data points, OLS estimation
generate data
parameters of distributions: μ = 10, σ = 5, n = 30
Endogeneity is one of the major concerns of contemporary empirical studies in economics and econometrics. This Demonstration aims to show the geometric meaning of this phenomenon in the simplest setting, namely a model with a single explanatory variable (also known as the independent variable).
We use a population regression function [1] of the simple form $Y = \alpha + \beta X$, where $\alpha$ and $\beta$ are true parameters that are never known, to generate observable data of the form $Y_i = \alpha + \beta X_i + \epsilon_i$, where $\epsilon_i$ is the error term (or disturbance) for each simulated observation $X_i$. Simulated variation of the error term is controlled by the parameter $\sigma$. The purpose of fitting methods such as ordinary least squares (OLS) is to estimate the true parameters of the model given observable data. A fitted model usually has the form $\hat{Y} = a + bX$ (sample regression function [1]).
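The data-generating step and the OLS fit described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Demonstration's own code: it assumes that the explanatory variable and the error term are normally distributed and independent of each other, that μ is the mean of the explanatory variable, and that the two spreads are given separately. The names `simulate`, `ols`, `sigma_x` and `sigma_eps` are hypothetical.

```python
import random

def simulate(alpha, beta, mu, sigma_x, sigma_eps, n, rng):
    """Draw n points from Y_i = alpha + beta * X_i + eps_i.

    Assumption (not stated in the text): X ~ Normal(mu, sigma_x) and
    eps ~ Normal(0, sigma_eps), drawn independently of each other.
    """
    xs = [rng.gauss(mu, sigma_x) for _ in range(n)]
    ys = [alpha + beta * x + rng.gauss(0.0, sigma_eps) for x in xs]
    return xs, ys

def ols(xs, ys):
    """Return (a, b) that minimize the sum of squared residuals of Y = a + b X."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # slope estimate
    a = y_bar - b * x_bar    # intercept estimate
    return a, b

rng = random.Random(0)  # fixed seed so the run is repeatable
xs, ys = simulate(alpha=1.0, beta=0.5, mu=10.0,
                  sigma_x=5.0, sigma_eps=5.0, n=30, rng=rng)
a, b = ols(xs, ys)
print(f"fitted model: Y = {a:.3f} + {b:.3f} X")
```

With only 30 noisy points the fitted a and b will generally differ from the true values 1 and 0.5; on data that lie exactly on a line, `ols` recovers the line's intercept and slope.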
The parameter $\rho$ is the most important for this Demonstration. It models the correlation between the vectors $X$ and $\epsilon$: if $\rho = 0$, there is no covariance between the vectors; otherwise there is covariance, which is largest in magnitude at $\rho = 1$ or $\rho = -1$.
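To see why this correlation matters, the sketch below draws the explanatory variable and the error term with a chosen correlation and compares the OLS slope with the value it approaches in large samples, $\beta + \rho\,\sigma_\epsilon/\sigma_X$ (the true slope plus the covariance of $X$ and $\epsilon$ divided by the variance of $X$). The sampling scheme is an assumption, since the text does not say how the Demonstration builds its correlated vectors: here both are built from two independent standard normal draws. The names `simulate_endogenous`, `ols_slope`, `asymptotic_slope`, `sigma_x` and `sigma_eps` are hypothetical.

```python
import math
import random

def simulate_endogenous(alpha, beta, rho, mu, sigma_x, sigma_eps, n, rng):
    """Draw n points from Y_i = alpha + beta * X_i + eps_i with Corr(X, eps) = rho.

    Assumption: X and eps are jointly normal, built from two independent
    standard normal draws z1 and z2.
    """
    xs, ys = [], []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        x = mu + sigma_x * z1
        # eps has standard deviation sigma_eps and correlation rho with x
        eps = sigma_eps * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        xs.append(x)
        ys.append(alpha + beta * x + eps)
    return xs, ys

def ols_slope(xs, ys):
    """Return the OLS slope b of the fitted line Y = a + b X."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

def asymptotic_slope(beta, rho, sigma_x, sigma_eps):
    """Large-sample value of the OLS slope: beta + Cov(X, eps) / Var(X)."""
    return beta + rho * sigma_eps / sigma_x

rng = random.Random(1)  # fixed seed so the run is repeatable
xs, ys = simulate_endogenous(alpha=1.0, beta=0.5, rho=-0.76, mu=10.0,
                             sigma_x=5.0, sigma_eps=5.0, n=20000, rng=rng)
print("OLS slope:        ", round(ols_slope(xs, ys), 3))
print("large-sample value:", round(asymptotic_slope(0.5, -0.76, 5.0, 5.0), 3))
```

Under these assumptions a negative $\rho$ pulls the estimated slope below the true $\beta$, and a positive $\rho$ pushes it above; at $\rho = 0$ the large-sample value is $\beta$ itself.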