Comparing Regression Models with and without Data Transformation

[Interactive panel: constant $b$ for the regression model $y = e^{bx}$. Transformed data: $b = 0.272202$. Untransformed data: $b = 0.234158$.]

This Demonstration shows the difference between regression models with and without data transformation. The transformed case estimates $b$ by minimizing the sum of squared differences between $\ln(y)$ and $bx$. The untransformed case estimates $b$ by minimizing the sum of squared differences between $y$ and $e^{bx}$.

Details

In this Demonstration, we fit the regression model $y = e^{bx}$ to given data $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ and plot the result. To find the regression without transforming the data, we need to minimize the sum of the squares of the residuals

$$S_r = \sum_{i=1}^{n} \left( y_i - e^{b x_i} \right)^2.$$

To find $b$, we minimize $S_r$ with respect to $b$. The value of $b$ is hence given by solving the nonlinear equation

$$\sum_{i=1}^{n} \left( y_i x_i e^{b x_i} - x_i e^{2 b x_i} \right) = 0. \tag{1}$$
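Equation (1) has no closed-form solution for $b$, so it has to be solved numerically. The following is a minimal sketch using bisection, which is one possible root-finding choice and not necessarily the method used by the Demonstration. The function names, the bracket `[lo, hi]`, and the sample data are assumptions made for illustration.

```python
import math


def normal_equation(b, xs, ys):
    """Left-hand side of equation (1) for a trial value of b."""
    return sum(y * x * math.exp(b * x) - x * math.exp(2 * b * x)
               for x, y in zip(xs, ys))


def solve_b_untransformed(xs, ys, lo, hi, tol=1e-12, max_iter=200):
    """Bisection on equation (1); lo and hi must bracket a sign change."""
    f_lo = normal_equation(lo, xs, ys)
    f_hi = normal_equation(hi, xs, ys)
    if f_lo == 0:
        return lo
    if f_hi == 0:
        return hi
    if f_lo * f_hi > 0:
        raise ValueError("equation (1) does not change sign on [lo, hi]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = normal_equation(mid, xs, ys)
        if f_mid == 0 or hi - lo < tol:
            return mid
        if f_lo * f_mid < 0:
            hi = mid                 # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid    # root lies in the upper half
    return 0.5 * (lo + hi)


# Hypothetical noise-free data generated from y = e^{0.25 x}
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [math.exp(0.25 * x) for x in xs]
b_untransformed = solve_b_untransformed(xs, ys, lo=0.0, hi=1.0)
```

Because the sample data lie exactly on $y = e^{0.25x}$, the solver should recover a value very close to $0.25$. For real data the bracket has to be chosen so that the left-hand side of equation (1) changes sign across it.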

To avoid having to solve a nonlinear equation, we can transform the data and then use linear regression formulas to calculate $b$. In this case, taking the natural logarithm of both sides of $y = e^{bx}$ gives

$$\ln(y) = bx.$$

Then $b$ is given by minimizing

$$S_r = \sum_{i=1}^{n} \left( \ln(y_i) - b x_i \right)^2,$$

which gives

$$b = \frac{\sum_{i=1}^{n} x_i \ln(y_i)}{\sum_{i=1}^{n} x_i^2}. \tag{2}$$
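Equation (2) can be evaluated directly, with no iteration. The sketch below assumes every $y_i$ is positive, since the logarithm is otherwise undefined. The function name and the sample data are hypothetical.

```python
import math


def fit_b_transformed(xs, ys):
    """Equation (2): b = sum(x_i ln y_i) / sum(x_i^2)."""
    if any(y <= 0 for y in ys):
        raise ValueError("ln(y) is undefined for non-positive y")
    denominator = sum(x * x for x in xs)
    if denominator == 0:
        raise ValueError("all x values are zero, so b is undetermined")
    numerator = sum(x * math.log(y) for x, y in zip(xs, ys))
    return numerator / denominator


# Hypothetical noisy data scattered around an exponential trend
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.4, 1.5, 2.3, 2.6, 3.7]
b_transformed = fit_b_transformed(xs, ys)
```

Note that the line $\ln(y) = bx$ has no intercept term, so the formula is the least-squares slope for a line through the origin.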

In this Demonstration, we show the regression model curves corresponding to values of $b$ from equation (1) (untransformed) and equation (2) (transformed).
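The two criteria generally select different values of $b$ for the same data, because one measures residuals in $y$ and the other in $\ln(y)$. As a rough illustration, the sketch below scans a grid of candidate values and picks the minimizer of each sum of squares. The grid scan is chosen for transparency rather than efficiency, and the data, the function names, and the search range $[0, 1]$ are assumptions.

```python
import math


def sse_untransformed(b, xs, ys):
    """Sum of squared residuals between y_i and e^{b x_i}."""
    return sum((y - math.exp(b * x)) ** 2 for x, y in zip(xs, ys))


def sse_transformed(b, xs, ys):
    """Sum of squared residuals between ln(y_i) and b x_i."""
    return sum((math.log(y) - b * x) ** 2 for x, y in zip(xs, ys))


# Hypothetical noisy data scattered around an exponential trend
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.4, 1.5, 2.3, 2.6, 3.7]

# Candidate values b = 0.0000, 0.0001, ..., 1.0000 (assumed search range)
grid = [k / 10000 for k in range(10001)]
b_untransformed = min(grid, key=lambda b: sse_untransformed(b, xs, ys))
b_transformed = min(grid, key=lambda b: sse_transformed(b, xs, ys))

print(f"untransformed b ~ {b_untransformed:.4f}")
print(f"transformed   b ~ {b_transformed:.4f}")
```

Each estimate is the best grid value under its own criterion, so neither is wrong. They answer slightly different least-squares questions.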

For more information, see A. Kaw, D. Nguyen, and E. Kalu, Numerical Methods with Applications, 2010.

External Links

Regression (Wolfram MathWorld)

Radial Velocity Curve Fitting

Permanent Citation

Vincent Shatlock, Autar Kaw

"Comparing Regression Models with and without Data Transformation"
http://demonstrations.wolfram.com/ComparingRegressionModelsWithAndWithoutDataTransformation/
Wolfram Demonstrations Project
Published: July 20, 2011