Suggested Errata

Introduction to Probability version dated 04/19/2023
​
Dear Marc,
​
While catching up this month, I came across a number of suggested errata; see the text below.

Thank you for putting together this course. It was a lot of fun as a fast-track probability review.
​
Cheers,
​
Dave

2 Combinatorial Analysis

Course Notes Sections: “Combinations” and “Multinomial Coefficient”

The first sentence of the section “Combinations”:
​
“Now consider you are choosing i out of n identical objects. What is the number of different groups of i objects?”
​
The text should be:
​
“Now consider you are choosing i out of n unique (distinguishable) objects. What is the number of different groups of i objects you can form?”
​
The change matters. If all n objects are identical, every group of i objects looks the same, so there are no different groups to count. Moreover, the binomial coefficient only gives the number of combinations when the n elements are distinguishable (unique), i.e. non-repeating.
​
We can see this for ourselves in the WL by creating 2-permutations of the letters of the words “rest” and “reset”, which have no repeating letters and one letter (“e”) occurring twice, respectively.
In[]:=
Length@Permutations[Characters["rest"],{2}]
Length@Permutations[Characters["reset"],{2}]
Out[]=
12
Out[]=
13
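A minimal sketch of the same point for combinations rather than permutations, assuming we count 2-letter groups with Subsets (a choice made here for illustration, not taken from the course notes; the helper name groups is hypothetical). Subsets treats elements by position, so duplicate groups are merged by sorting each group and taking the Union:
In[]:=
(* number of distinct 2-letter groups of a word, ignoring order *)
groups[word_String]:=Length@Union[Sort/@Subsets[Characters[word],{2}]]
{groups["rest"],Binomial[4,2]}
{groups["reset"],Binomial[5,2]}
Out[]=
{6,6}
Out[]=
{7,10}
With distinct letters the count matches the binomial coefficient; with the repeated “e” it does not.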
The same applies to the section “Multinomial Coefficient”, but here the change is more subtle. If the n elements are not all identical, the multinomial coefficient can be used to count the number of arrangements in which some elements repeat.
​
For example, the multinomial coefficient gives the number of permutations of all letters in the word “reset”, in which one letter, “e”, occurs twice. In the WL we can see it for ourselves:
In[]:=
Length@Permutations@Characters["reset"]
Multinomial@@Values@Counts@Characters["reset"]
Out[]=
60
Out[]=
60
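As a cross-check, a short sketch of the usual multiset formula n!/(n1! n2! … nk!), with the letter counts of “reset” written out by hand (r: 1, e: 2, s: 1, t: 1):
In[]:=
(* 5 letters in total, with "e" occurring twice *)
5!/(1!*2!*1!*1!)
Out[]=
60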
Compare this with a word without any repeating letters, e.g. “rest”: the number of permutations of its n letters is simply Factorial[n].
In[]:=
Length@Permutations@Characters["rest"]
Out[]=
24

7 Random Variables

Exercise 2

The probability function in the question is not the same as the one used in the solution and should read:
p[x_]:=Piecewise[{{1/8,x==8},{(-2+Sqrt[4+x^2])/(x^2*Sqrt[5+x^2]),x!=8}}]

10 Properties of Random Variables

Exercise 3

The solution text should be:
E[X_1 + X_2 + X_3 + X_4] = E[X_1] + E[X_2] + E[X_3] + E[X_4]
Var[X_1 + X_2 + X_3 + X_4] = Var[X_1] + Var[X_2] + Var[X_3] + Var[X_4]
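A minimal sketch to check both identities, assuming four independent normal variables with made-up parameters (neither the distributions nor the parameters come from the exercise; the names dists and sum are hypothetical). Note that the variance identity relies on the variables being independent, or at least uncorrelated:
In[]:=
(* illustrative, independent components; parameters are assumptions *)
dists={NormalDistribution[1,1],NormalDistribution[2,2],NormalDistribution[3,3],NormalDistribution[4,4]};
sum=TransformedDistribution[x1+x2+x3+x4,{x1\[Distributed]dists[[1]],x2\[Distributed]dists[[2]],x3\[Distributed]dists[[3]],x4\[Distributed]dists[[4]]}];
(* each pair should contain two equal values *)
{Mean[sum],Total[Mean/@dists]}
{Variance[sum],Total[Variance/@dists]}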

15 The Multinomial Distribution

Exercise 2

The cost functions in the question and the solution differ. The cost function in the question is: y1+2y2+3y3.

18 The Normal Distribution

Exercise 4

The text should mention “Height” instead of “Control Mass”.

Exercise 5

The term “cube” usually means raising to the power 3; what is meant here is “cube root”.

19 The Multinormal Distribution

Exercise 4

The weight meant here is WholeWeight.

21 Mixture Distributions

Exercise 4

The question is unclear and needs an introduction to the data properties. The text could be:
We need the ResourceData “Sample Data: Myrtles”, from which we select the “Points” property for our analysis.
​
We want to create an estimated distribution of a mixture of 3 multi-normal distributions, also known as a Gaussian Mixture Model, or GMM. Next we will compare a contour plot of the GMM with a contour plot of the SmoothKernelDistribution. Can you guess what the data represents?

Exercise 5

The question is not very clear and could be:
We will look at the petal and sepal lengths and widths of Fisher’s Iris dataset. The dataset can be found in the ResourceData.
  • Create a mixture distribution of the sepal lengths and widths, where all three species are equally weighted. (Assume a multi-normal distribution.)
  • Compute an estimated distribution of the sepal lengths and widths, aggregated over all species. (Assume a multi-normal distribution.)
  • Finally, create an empirical distribution, aggregated over all species, using the SmoothKernelDistribution.
  • Create a contour plot of all three distributions.

23 The Law of Large Numbers

    Exercise 1

    According to Wikipedia, the Chebyshev inequality is:
    P(|X - μ| ≥ kσ) ⩽ 1/k^2
    So the inequality in the course notes may not be correct. This seems to agree with the solution of Exercise 1, where X - μ ≥ kσ, so if X - μ = 12 and σ = 6 then k should be 2.
    Source: https://en.wikipedia.org/wiki/Chebyshev's_inequality

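    A minimal sketch of the bound for the numbers quoted for Exercise 1 of this lesson (deviation 12, σ = 6), compared with the exact two-sided tail of a normal distribution. The normal distribution and its mean of 0 are assumptions made here for illustration only; they are not part of the exercise:
    In[]:=
    k=12/6;
    (* Chebyshev bound 1/k^2 *)
    1/k^2
    (* exact tail probability under the assumed normal distribution *)
    N@Probability[Abs[x]>=k*6,x\[Distributed]NormalDistribution[0,6]]
    Out[]=
    1/4
    Out[]=
    0.0455003
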
    Exercise 3

    The text would be more accurate with this:
    ​
    “Suppose the lifetime variable X of an appliance has an exponential distribution with an average lifetime of 10 years. Compute an asymptotic approximation of order 3 for the expectation of ..”
    ​
    A hint about the value that the parameter “a” (or alpha) should approach, e.g. “a->-1”, would make the concept clearer. The example in the course notes that computes the AsymptoticProbability of order 3 states that the parameter “b” approaches 1, so it is not clear that we need to use -1 in this exercise.

    Exercise 4

    The text of the exercise is not clear. Is the experience scored from 1 to 10 or from 3 to 10? The solution suggests it should be 3 to 10.

    Exercise 5

    The text does not state whether we should generate random variates from each individual distribution or from a composite distribution. The solution suggests the latter. The text could be:

    Generate random variates from a combined distribution consisting of a Normal, a Cauchy and a StudentT distribution. The distributions’ average is 0 and the standard deviation is random. The StudentTDistribution has 2 degrees of freedom.
    ​
    Plot the progression of the average of 1000 simulated samples. Compare the convergence rate to that of Exercise 4.
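    A minimal sketch of such a simulation, assuming that “combined distribution” means an equally weighted MixtureDistribution and that the random scale is drawn once with RandomReal. Both are guesses made here for illustration, and the names s, mix and data are hypothetical:
    In[]:=
    (* random scale shared by the three components; the range is an assumption *)
    s=RandomReal[{1,3}];
    mix=MixtureDistribution[{1,1,1},{NormalDistribution[0,s],CauchyDistribution[0,s],StudentTDistribution[0,s,2]}];
    data=RandomVariate[mix,1000];
    (* running average after 1, 2, ..., 1000 samples *)
    ListLinePlot[Accumulate[data]/Range[1000],AxesLabel->{"n","running mean"}]
    Because the Cauchy component has no finite mean, the running average need not settle down.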

    24 Normal Approximations to the Binomial

    Example in Lesson Notes

    The text should follow the solution:
    ​
    You roll a fair six-sided die 180 times. What is the probability you roll a one at least 20 times and at most 30 times?
    ​
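    The normal approximation is not correct if we want to replicate the binomial solution. With a continuity correction of 0.5 on both bounds:
    In[]:=
    dist=BinomialDistribution[180,1/6];
    Probability[20-0.5<x<30+0.5,x\[Distributed]NormalDistribution[Mean[dist],StandardDeviation[dist]]]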
    Out[]=
    0.521963
    Now the result is closer to the binomial one.
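    To make the comparison concrete, a short sketch that evaluates the exact binomial probability for the same event, assuming the event is 20 ≤ x ≤ 30 as the continuity-corrected bounds imply:
    In[]:=
    (* exact probability under the binomial model, for comparison *)
    N@Probability[20<=x<=30,x\[Distributed]BinomialDistribution[180,1/6]]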

    Exercise 1

    The question is:
    ​
    What is the probability more than 100 and 300 or less adults sleep with a comfort object?
    ​
    So we should exclude 100 and include 300:
    In[]:=
    comfortDist=NormalDistribution[1000*1/3,Sqrt[1000*2/3*1/3]];
    Probability[100+0.5<x<=300+0.5,x\[Distributed]comfortDist]
    Out[]=
    0.0138141
    Not a big deal, but there is a subtle difference (x ⩽ 300.5 instead of x < 300.5).

    Quiz 6

    Problem 6

    The only choice I could compute is in there, but it’s not marked as correct. Is there (still) an issue with the correct answer in the framework?

    Practice Exam

    Problem 15

    The normal approximation is appropriate because the variance is ≥ 10, so we expect to see it in the solution, but it does not appear there.
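    A quick sketch of the variance check behind this rule of thumb, using the values n = 500 and p = 0.049 from the problem:
    In[]:=
    (* binomial variance n p (1-p) *)
    500*0.049*(1-0.049)
    Out[]=
    23.2995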
    ​
    The solution should include:
    In[]:=
    n=500;
    p=0.049;
    var=n p (1-p);
    Probability[20+0.5<x<30-0.5,x\[Distributed]NormalDistribution[n p,Sqrt[var]]]
    Out[]=
    0.646221

    Problem 16

    The answer is incorrect if we follow the solution.
    It should be 9.28606.