2.3 Law of Large Numbers

The Law of Large Numbers tells us that as n→∞, the sample average defined by

X̄ = (1/n) ∑_{i=1}^{n} X_i

will be near the population average μ with a given probability. Given n samples from a population, we don’t expect X̄ to exactly match μ. The Law of Large Numbers allows us to make a statement about the difference X̄ − μ. Specifically, the statement involves the probability that |X̄ − μ| is smaller than a certain value.
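For reference, the weak form of the Law (the version proved in the references below) can be written as:

```latex
% Weak Law of Large Numbers: the sample mean converges in probability to mu
\lim_{n\to\infty} P\!\left(\left|\bar{X}_n - \mu\right| \ge \epsilon\right) = 0
\quad \text{for every } \epsilon > 0,
\qquad \bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i .
```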
To answer the following questions, you do not need to understand the Law of Large Numbers. However, if you are interested, more formal definitions and proofs are given in
  • Orloff and Bloom, Reading 6b
  • Bulmer, Chapter 6
  • DeGroot, Chapter 6
  • Rozanov, p. 69

    Assigned tasks part a

    1. Draw n=100 values from a population of Gaussian-distributed numbers with mean μ=0 and standard deviation σ=1.
    In[]:=
    take[n_]:=RandomVariate[NormalDistribution[],n]
    2. Compute X̄.
    In[]:=
    x=Mean[take[100]]
    Out[]=
    -0.140964
    3. Repeat steps 1 and 2 10000 times and plot a histogram of the X̄s.
    In[]:=
    Histogram[tbl100=Table[Mean[take[100]],{10000}],PlotLabel->Column[{"x̄'s of 10000 samples of size 100",Row[{"Mean[x̄] = ",Round[Mean[tbl100],.00001]," s = ",Round[StandardDeviation[tbl100],.00001]}]},Center],LabelStyle->13,AspectRatio->1/4,PlotRange->All,ImageSize->Large]
    Out[]=

    Assigned tasks part b

    1. For n=100, what fraction of the 10000 X̄s were in the range [-0.01,0.01]?
    In[]:=
    {exact=Count[Table[Mean[take[100]],{10000}],x_/;Abs[x]<=.01]/10000,N[exact]}
    Out[]=
    {369/5000, 0.0738}
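As a quick cross-check (my addition, not part of the assignment): since the mean of n=100 standard-normal draws is itself Normal(0, 1/10), the probability of landing in [-0.01, 0.01] has a closed form, which comes out near the simulated 0.0738.

```python
# Cross-check (my addition, not part of the assignment): with n = 100 draws
# from Normal(0, 1), the sample mean is Normal(0, 1/10), so the probability
# of landing in [-0.01, 0.01] has a closed form via the error function.
import math

n = 100
sd_of_mean = 1 / math.sqrt(n)        # sigma/sqrt(n) = 0.1
z = 0.01 / sd_of_mean                # interval half-width in sigma units
p = math.erf(z / math.sqrt(2))       # P(|mean| <= 0.01) ≈ 0.0797
```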
    2. How does the fraction depend on n?
    The fraction approaches 1 as n→∞. Since the samples are random, there is no guarantee the fraction will reach 1 at any finite n. But the Law of Large Numbers holds that it will become arbitrarily close. By that, I mean for any real error ϵ>0, no matter how small, and any shortfall δ>0, the Law guarantees there is some N∈ℕ such that P(|X̄_n − μ| ≥ ϵ) < δ for all n>N.
    In[]:=
    tbl=Table
    n
    10
    ,
    CountTable[Mean[take[
    n
    10
    ]],{10000}],x_/;x<=
    1
    100
    
    10000
    ,{n,1,6};
    In[]:=
    ListLogLogPlotTable[Labeled[tbl[[i]],N[tbl[[i,2]]]],{i,1,Length[tbl]}],AspectRatio->
    1
    3
    ,Joined->True,PlotMarkers->Automatic,ImageSize->Large,LabelStyle->12,PlotRange->{0,1},PlotLabel->"Fraction of 10000 samples with
    X
    ∈ 0 ± 0.01
    ",AxesLabel->{"Sample Size",""},ImagePadding->Automatic
    Out[]=
    3. For n=100, what is the range [-ϵ,ϵ] within which 99% of the 10000 X̄s fall?
    In[]:=
    Sort[(tbl100-Mean[tbl100])][[{51,-51}]]
    Out[]=
    {-0.258892,0.252217}
    4. How does ϵ depend on n?
    It follows from the answer to prompt 2 above that ϵ→0 as n→∞. More specifically, since the standard deviation of X̄ is σ/√n, ϵ shrinks like 1/√n.
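To see that scaling empirically, here is an illustrative Python sketch (my own addition, with an assumed repetition count, not notebook code): estimate the 99% half-width for several n and check that it roughly halves when n quadruples.

```python
# Illustrative sketch (my addition, not notebook code): estimate the 99%
# half-width epsilon of the sample-mean distribution for several n, using
# standard-normal draws (mu = 0, sigma = 1) and 2000 repetitions.
import random
import statistics

random.seed(0)

def epsilon99(n, reps=2000):
    # Half-width such that ~99% of sample means of size n fall in [-eps, eps].
    devs = sorted(abs(statistics.fmean(random.gauss(0, 1) for _ in range(n)))
                  for _ in range(reps))
    return devs[int(0.99 * reps) - 1]

eps = {n: epsilon99(n) for n in (25, 100, 400)}
# CLT predicts eps ≈ 2.58/sqrt(n), so quadrupling n should roughly halve eps.
```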
    5. How does your answer change if the distribution changes (that is, if you draw values from a distribution other than Gaussian)?
    The answer may be different. One assumption needed for the CLT to apply is a finite mean and standard deviation. The PDF p(x) = 1/x² for x ≥ 1, for example, has no finite mean or variance: ∫_1^∞ x·(1/x²) dx = ∞. I assume other assumptions protect the theorem from distributions like the mean weight of samples from an urn with 5 million objects of weight 1 and one object of weight 5 million, even though that would have finite μ and σ.
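To back up the 1/x² example numerically (an illustrative sketch I added, not notebook code): truncating the mean integral at a cutoff M gives ln M, which grows without bound.

```python
# Numeric check (my addition): for the PDF p(x) = 1/x^2 on [1, inf), the
# truncated mean integral  ∫_1^M x p(x) dx = ∫_1^M dx/x = ln(M)  keeps
# growing with the cutoff M, so no finite mean exists.
import math

def mean_up_to(M, steps=100_000):
    # Midpoint-rule approximation of ∫_1^M dx/x.
    h = (M - 1) / steps
    return sum(h / (1 + (k + 0.5) * h) for k in range(steps))

vals = [mean_up_to(10.0 ** k) for k in (1, 2, 3)]
# Each factor of 10 in the cutoff adds ~ln(10) ≈ 2.30; the sum never settles.
```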
    590 students: Be prepared to discuss in class how this experiment is related to the Law of Large Numbers.

    2.4 Central Limit Theorem

    The Central Limit Theorem says that for large n,

    X̄ = (1/n) ∑_{i=1}^{n} X_i

    is Gaussian-distributed with mean μ and standard deviation σ/√n. Important: this theorem (usually) applies even if the distribution of the values used in computing X̄ is not Gaussian. With the Central Limit Theorem, we can make statements such as, “I took a sample of n values and computed X̄. If I took many samples and computed many X̄s, 95% of the time the range X̄ ± 1.96 σ/√n would include μ.”
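As a language-independent illustration (my own sketch, using an Exponential distribution with μ = σ = 1 as an assumed non-Gaussian source): the means of n = 100 draws should follow Normal(1, 1/10).

```python
# Illustrative sketch (my addition): draw sample means from a skewed,
# non-Gaussian source, Exponential with rate 1 (so mu = sigma = 1), and
# check they follow Normal(mu, sigma/sqrt(n)) as the CLT predicts.
import random
import statistics

random.seed(1)
n, reps = 100, 5000
means = [statistics.fmean(random.expovariate(1.0) for _ in range(n))
         for _ in range(reps)]

mu_hat = statistics.fmean(means)   # CLT: should be close to mu = 1
sd_hat = statistics.stdev(means)   # CLT: should be close to 1/sqrt(100) = 0.1
```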

    Verify CLT

    In the previous problem, you computed a histogram of 10000 X̄s. Based on the Central Limit Theorem,... Create one or two plots that demonstrate these points. Pay attention to your annotations. Save your code as HW2_4. and plots as HW2_4.png (use subplots).
    Take a large sample from a Poisson distribution with μ=2, which is heavily skewed.
    In[]:=
    poisson=RandomVariate[PoissonDistribution[2],{1000000}];
    Histogram[poisson,{1/3},PlotLabel->"Poisson μ = 2, σ² = 2",AspectRatio->1/3,LabelStyle->13,ImageSize->Large]
    Out[]=
    Then show that the distribution of x̄₁₀₀ is normal with mean x̄ ≈ 2 and standard deviation s ≈ Sqrt[2/100] ≈ 0.14.
    In[]:=
    means100=Table[Mean[RandomSample[poisson,100]],{10000}];
    Show[Histogram[means100,Automatic,"PDF",LabelStyle->13,PlotLabel->Row[{"x̄₁₀₀ Distribution"," μ = ",N[Mean[means100]]," σ = ",N[StandardDeviation[means100]]}]],Plot[Callout[PDF[NormalDistribution[2,Sqrt[2/100]],x],N[Sqrt[2/100]],{2.2,2}],{x,3/2,5/2}],ImageSize->Large,AspectRatio->1/3]
    Out[]=

    Except when it doesn’t