Chapter Exercises: Solutions

Chapter 1

  • Problem 1

We start with Bayes’ rule and then substitute the given equality \(P(A \vert B) = P(B \vert A)\) on the left-hand side: \[\begin{align*} P(A \vert B) &= \frac{P(B \vert A)P(A)}{P(B)} \\ \Rightarrow ~~~ P(B \vert A) &= \frac{P(B \vert A)P(A)}{P(B)} \\ \Rightarrow ~~~ 1 &= \frac{P(A)}{P(B)} \\ \Rightarrow ~~~ P(A) &= P(B) \,. \end{align*}\] Now let’s look at a 2 \(\times\) 2 table:

|             | \(B\)                | \(\bar{B}\)          |            |
|-------------|----------------------|----------------------|------------|
| \(A\)       | \(P(A \cap B)\)      | \(P(A)-P(A \cap B)\) | \(P(A)\)   |
| \(\bar{A}\) | \(P(A)-P(A \cap B)\) | 0.2                  | \(1-P(A)\) |
|             | \(P(A)\)             | \(1-P(A)\)           | \(1\)      |

So we have that \[\begin{align*} P(A \cap B) + 2 [ P(A)-P(A \cap B) ] + 0.2 &= 1 \\ \Rightarrow ~~~ P(A \cap B) + 2 [ P(A)-P(A \cap B) ] &= 0.8 \\ \Rightarrow ~~~ 2 P(A) - P(A \cap B) &= 0.8 \\ \Rightarrow ~~~ P(A \cap B) &= 2 P(A) - 0.8 \,. \end{align*}\]


  • Problem 2

We are given that \(A \subset B\), so \[\begin{align*} P(A \vert B)= \frac{P(B \cap A)}{P(B)} = \frac{P(A)}{P(B)} \neq P(A) \,, \end{align*}\] provided that \(P(A) > 0\) and \(P(B) < 1\).


  • Problem 3

We have that \[\begin{align*} P(A \cap B| A\cup B) = \frac{P[(A \cap B) \cap (A \cup B)]}{P(A\cup B)} = \frac{P(A \cap B)}{P(A\cup B)} = \frac{1 - P(\bar{A} \cup \bar{B})}{1 - P(\bar{A} \cap \bar{B})} \,. \end{align*}\]


  • Problem 4

(a) We have that \[\begin{align*} P(B|A) + P(\bar{B}|A) &= 1 \\ \Rightarrow ~~~ P(B|A) &= \frac{1}{4} = \frac{1}{3}P(\bar{B}|A) \\ \Rightarrow ~~~ \frac{P(A \cap B)}{P(A)} &= \frac{1}{3} \frac{P(A \cap \bar{B})}{P(A)} \\ \Rightarrow ~~~ P(A \cap B) &= x = \frac{1}{3} \frac{1}{6} = \frac{1}{18} \,. \end{align*}\]

(b) Here, we utilize a \(2 \times 2\) table:

|             | \(B\) | \(\bar{B}\) |       |
|-------------|-------|-------------|-------|
| \(A\)       | 1/18  | 1/6         | 4/18  |
| \(\bar{A}\) |       | 1/3         | 14/18 |
|             | 1/2   | 1/2         | 1     |

Thus \[\begin{align*} P(\bar{A} \cap B) = \frac{14}{18} - \frac{6}{18} = \frac{8}{18} \end{align*}\] and \[\begin{align*} P(B|\bar{A}) = \frac{P(\bar{A} \cap B)}{P(\bar{A})} = \frac{8/18}{14/18} = \frac{8}{14} = \frac{4}{7} \,. \end{align*}\]


  • Problem 5

One way to approach this problem is to write \[\begin{align*} P(\bar{A}|\bar{B}) = \frac{P(\bar{A}\cap\bar{B})}{P(\bar{B})} = \frac{1 - P(A \cup B)}{1 - P(B)} \,; \end{align*}\] equivalently, we can write \[\begin{align*} P(\bar{A}|\bar{B}) = 1 - P(A|\bar{B}) = 1 - \frac{P(\bar{B}|A)P(A)}{P(\bar{B})} = 1 - \frac{\left[1 - P(B|A)\right]P(A)}{1 - P(B)} \,. \end{align*}\]


  • Problem 6

Using a decision tree:

  • \(P(W_1) = 2/5\) and \(P(B_2 \vert W_1) = 3/4\), so \(P(B_2 \cap W_1) = 3/10\); and
  • \(P(B_1) = 3/5\) and \(P(B_2 \vert B_1) = 1/2\), so \(P(B_2 \cap B_1) = 3/10\).

Thus, by adding the probabilities of these disjoint events, we determine that \(P(B_2) = P(B_2 \cap W_1) + P(B_2 \cap B_1) = 3/5\).


  • Problem 7

If we denote the event of knowing the answer as \(K\), \(P(K) = 0.6\) and \(P(\bar{K}) = 0.4\), and if we denote the event of getting the question correct as \(C\), we know that \(P(C \vert \bar{K}) = 0.2\). It is implicit in the wording of the question that \(P(C \vert K) = 1\). Ultimately, we want to determine \(P(K \vert C)\): \[\begin{align*} P(K \vert C) = \frac{P(C \vert K)P(K)}{P(C)} = \frac{P(C \vert K)P(K)}{P(C \vert K)P(K)+P(C \vert \bar{K})P(\bar{K})} = \frac{1 \cdot 0.6}{1 \cdot 0.6 + 0.2 \cdot 0.4} = \frac{60}{68} = \frac{15}{17} \,. \end{align*}\]
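As a sanity check, this posterior is easy to approximate by simulation. The sketch below (in R, with made-up variable names) follows the setup above and should return a value near \(15/17 \approx 0.88\).

```r
# Monte Carlo check of P(K | C); assumes P(K) = 0.6, P(C | K) = 1, P(C | K-bar) = 0.2
set.seed(101)
n       <- 1e6
knows   <- runif(n) < 0.6                        # event K
correct <- ifelse(knows, TRUE, runif(n) < 0.2)   # event C
mean(knows[correct])                             # estimate of P(K | C), about 15/17
```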


  • Problem 8

We are given that \(P(S) = 0.15\), \(P(D \vert S) = 10 \cdot P(D \vert \bar{S})\), and \(P(D) = 0.01\). So \[\begin{align*} P(S \vert D) &= \frac{P(D\vert S)P(S)}{P(D)} = \frac{P(D \vert S)P(S)}{P(D \vert S)P(S) + P(D \vert \bar{S})P(\bar{S})} \\ &= \frac{P(D \vert S)P(S)}{P(D \vert S)P(S) + P(D \vert S)(1 - P(S))/10} \\ &= \frac{P(S)}{P(S)+1/10-P(S)/10} = \cdots = 30/47 \,. \end{align*}\]


  • Problem 9

The compound event of winning is \(W = \{ 1\cap3, 1\cap1\cap2, 1\cap1\cap3, \ldots \}\). Let the event \(S\) denote rolling a 1, with \(P(S) = 1/3\), and the event \(F\) denote rolling a 2 or a 3, with \(P(F) = 2/3\). Then \[\begin{align*} P(W) &= P(1\cap3) + P(1\cap1\cap2) + \ldots \\ &= \frac{1}{2} P(S) P(F) + P(S)^2 P(F) + P(S)^3 P(F) + \ldots \\ &= \frac{1}{2}\frac{1}{3}\frac{2}{3} + P(S)^2 P(F) \left(1 + P(S) + \ldots \right)\\ & = \frac{1}{9} + P(S)^2 P(F) \frac{1}{1 - P(S)} = \frac{1}{9} + \frac{P(S)^2 P(F) }{P(F)} = \frac{1}{9} + P(S)^2 = \frac{2}{9} \,. \end{align*}\] Alternatively, we can define the compound event of losing, \(L = \{2, 3, 1\cap2 \}\), and write that \[\begin{align*} P(L) &= \frac{1}{3} + \frac{1}{3} + \left(\frac{1}{3}\right)^2 = \frac{7}{9} \\ \Rightarrow ~~~ P(W) &= 1 - P(L) = 1 - \frac{7}{9} = \frac{2}{9} \,. \end{align*}\]


  • Problem 10

There are multiple ways to do this. Here is one: \[\begin{align*} P(H_1 \cup T_2) &= P(H_1) + P(T_2) - P(H_1 \cap T_2) = P(H_1) + P(T_2) - P(T_2 | H_1) P(H_1)\\ &= P(H_1) + P(T_2|H_1)P(H_1) + P(T_2|T_1)P(T_1) - P(T_2 | H_1) P(H_1)\\ &= P(H_1) + P(T_2|T_1)P(T_1) = 0.5 + 0.4 \cdot 0.5 = 0.7 \,. \end{align*}\] Note: \(P(T_2|T_1) = 1 - P(H_2|T_1)= 0.4\), since \(H_2\) is just \(\bar{T}_2\).


  • Problem 11

Let \(N\) be the event that a sample contains nitrates and \(R\) be the event that the sample burns red. We are given that \(P(N) = 0.2\), \(P(R|N) = 0.9\), and \(P(R|\bar{N}) = 0.15\).

(a) We seek \(P(R)\), which is \[\begin{align*} P(R|N)P(N) + P(R|\bar{N})P(\bar{N}) = 0.9 \cdot 0.2 + 0.15 \cdot (1-0.2) = 0.30 \,. \end{align*}\]

(b) We seek \(P(N|R)\), which is \[\begin{align*} \frac{P(R|N)P(N)}{P(R)} = \frac{0.9 \cdot 0.2}{0.3} = 0.60 \,. \end{align*}\]


  • Problem 12

(a) Let \(M\), \(V\), and \(O\) denote the event of flying on a major airline, private plane, or other aircraft, respectively, and let \(B\) denote the event of being a business traveler. The Law of Total Probability tells us that \[\begin{align*} P(B) &= P(B \vert M) P(M) + P(B \vert V)P(V)+ P(B \vert O)P(O) = 0.5\cdot 0.6 + 0.6\cdot 0.3 + 0.9 \cdot 0.1 = 0.57 \,. \end{align*}\]

(b) We use Bayes’ rule to determine that \[\begin{align*} P(V \vert B) &= \frac{P(B \vert V)P(V)}{P(B)} = \frac{0.6 \cdot 0.3}{0.57} = \frac{6}{19} \,. \end{align*}\]


  • Problem 13

We can use, e.g., a decision tree to determine that the probability mass function for \(X\) is

| \(x\) | \(p_X(x)\)              |
|-------|-------------------------|
| 1     | 1/2                     |
| 2     | 1/2 \(\cdot\) 2/3 = 1/3 |
| 3     | 1/2 \(\cdot\) 1/3 = 1/6 |

(a) The expected value is \[\begin{align*} E[X] = \sum_{x=1}^3 x p_X(x) = 1 \cdot \frac{1}{2} + 2 \frac{1}{3} + 3 \frac{1}{6} = \frac{5}{3} \,. \end{align*}\]

(b) We use the shortcut formula to find the variance: \[\begin{align*} E[X^2] &= \sum_{x=1}^3 x^2 p_X(x) = 1 \cdot \frac{1}{2} + 4 \frac{1}{3} + 9 \frac{1}{6} = \frac{10}{3} \\ \Rightarrow ~~~ V[X] &= E[X^2] - E[X]^2 = \frac{10}{3}- \left(\frac{5}{3}\right)^2 = \frac{5}{9} \,. \end{align*}\]
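A quick numerical check of (a) and (b), assuming the pmf in the table above:

```r
x <- c(1, 2, 3)
p <- c(1/2, 1/3, 1/6)
EX  <- sum(x * p)         # E[X] = 5/3
EX2 <- sum(x^2 * p)       # E[X^2] = 10/3
c(EX, EX2 - EX^2)         # mean and variance; the variance is 5/9
```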


  • Problem 14

(a) Let \(W_2\) denote the event of observing two white balls, and let 1, 2, and 3 denote the event of choosing Bowl 1, 2, and 3, respectively. We apply the Law of Total Probability here: \[\begin{align*} P(W_2) &= P(W_2 \vert 1)P(1) + P(W_2 \vert 2)P(2)+ P(W_2 \vert 3)P(3) \\ &= 0 \cdot 1/3 + P(W_2 \vert 2) \cdot 1/3 + 1 \cdot 1/3 \,. \end{align*}\] Bowl 2 has two white balls and one black ball. Thus \(P(W_2 \vert 2) = 2/3 \cdot 1/2 = 1/3\), since there’s a 2/3 chance of drawing a white ball first, and conditional on that a 1/2 chance of drawing a white ball second. So \[\begin{align*} P(W_2) = 1/3 \cdot 1/3 + 1/3 = 4/9 \,. \end{align*}\]

(b) We seek \(P(3 \vert W_2)\). We apply Bayes’ rule here: \[\begin{align*} P(3 \vert W_2) = \frac{P(W_2 \vert 3)P(3)}{P(W_2)} = \frac{1 \cdot 1/3}{4/9} = 3/4 \,. \end{align*}\]


  • Problem 15

(a) We are given that \(P(A) = 0.8\), \(P(\bar{A}) = 0.2\), \(P(F \vert A) = 0.2\), and \(P(F \vert \bar{A}) = 0.1\). Using the Law of Total Probability, we find that \[\begin{align*} P(F) = P(F \vert A)P(A) + P(F \vert \bar{A})P(\bar{A}) = 0.2 \cdot 0.8 + 0.1 \cdot 0.2 = 0.18 \,. \end{align*}\]

(b) We seek \(P(A \vert F)\). Using Bayes’ rule, \[\begin{align*} P(A \vert F) = \frac{P(F \vert A)P(A)}{P(F)} = \frac{0.2 \cdot 0.8}{0.18} = \frac{0.16}{0.18} = \frac{8}{9} \,. \end{align*}\]


  • Problem 16

We know that \[\begin{align*} E[X] = \sum x p_X(x) = 0 \cdot p_X(0) + 1 \cdot 0.5 + 2 \cdot p_X(2) + 3 \cdot 0.3 = 0.5 + 2\,p_X(2) + 0.9 = 1.6 \,. \end{align*}\] Thus \(p_X(2) = 0.1\). Since \(\sum p_X(x) = 1\), we also know \(p_X(0) = 0.1\). Next, we compute \(E[X^2]\): \[\begin{align*} E[X^2] = \sum x^2 p_X(x) = 1 \cdot 0.5 + 4 \cdot 0.1 + 9 \cdot 0.3 = 3.6 \,. \end{align*}\] So \(V[X] = 3.6 - (1.6)^2 = 3.6 - 2.56 = 1.04\), and \(\sigma \approx 1.02\). Thus \[\begin{align*} P(\mu - \sigma \leq X \leq \mu+\sigma) &= P(1.6-1.02 \leq X \leq 1.6+1.02) = P(0.58 \leq X \leq 2.62) \\ &= p_X(1) + p_X(2) = 0.5 + 0.1 = 0.6 \,. \end{align*}\]
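The arithmetic above can be verified directly; a short check, assuming the completed pmf \(p_X(0)=0.1\), \(p_X(1)=0.5\), \(p_X(2)=0.1\), \(p_X(3)=0.3\), is:

```r
x <- 0:3
p <- c(0.1, 0.5, 0.1, 0.3)
mu    <- sum(x * p)                           # 1.6
sigma <- sqrt(sum(x^2 * p) - mu^2)            # about 1.02
sum(p[x >= mu - sigma & x <= mu + sigma])     # 0.6
```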


  • Problem 17

Note that one can derive the answer without ever determining the value of \(c\): \[\begin{align*} P(X > 1.25 | X < 1.5) &= \frac{P(X > 1.25 \cap X < 1.5)}{P(X < 1.5)} = \frac{P(1.25 < X < 1.5)}{P(X < 1.5)} \\ &= \frac{\int_{5/4}^{3/2}\frac{c}{x^2}dx}{\int_{1}^{3/2}{\frac{c}{x^2}}dx} = \frac{-x^{-1}|_{5/4}^{3/2}}{-x^{-1}|_{1}^{3/2}} = \frac{(4/5 - 2/3)}{(1-2/3)} = \frac{12/15 - 10/15}{5/15} = \frac{2}{5} \,. \end{align*}\]
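Since \(c\) cancels in the ratio, a numerical check can simply set \(c = 1\):

```r
f <- function(x) 1 / x^2             # the pdf shape; the constant c cancels in the ratio
num <- integrate(f, 1.25, 1.5)$value
den <- integrate(f, 1.00, 1.5)$value
num / den                            # 0.4 = 2/5
```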


  • Problem 18

(a) Later, we will recognize that this is a beta distribution and we thus would be able to use its properties to derive, e.g., a value for \(c\). In the meantime, we will apply brute-force integration: \[\begin{align*} \frac{1}{c} &= \left( \int_0^1 x^2 dx - \int_0^1 x^3 dx \right) = \left. \frac{x^3}{3} \right|_0^1 - \left. \frac{x^4}{4} \right|_0^1 = \frac13 - \frac14 = \frac{1}{12} \,. \end{align*}\] Thus \(c = 12\).

(b) We have that \[\begin{align*} P(X > 1/2) &= 12 \int_{1/2}^1 x^2(1-x) dx = 12 \left[ \frac{x^3}{3}\bigg|_{1/2}^1 - \frac{x^4}{4}\bigg|_{1/2}^1 \right] = \left(4 - \frac{1}{2}\right) - \left(3 - \frac{3}{16}\right) = \frac{11}{16} \,. \end{align*}\]

(c) The expected value is \[\begin{align*} E[X] &= 12 \left( \int_0^1 x^3 dx - \int_0^1 x^4 dx \right) = 12 \left( \left. \frac{x^4}{4} \right|_0^1 - \left. \frac{x^5}{5} \right|_0^1 \right) = 12 \left( \frac14 - \frac15 \right) = \frac{12}{20} = \frac35 \,. \end{align*}\]
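As noted in part (a), this pdf is that of a Beta(3,2) distribution, so all three parts can be checked against R’s beta functions:

```r
1 / beta(3, 2)                         # the normalizing constant c = 12
pbeta(0.5, 3, 2, lower.tail = FALSE)   # P(X > 1/2) = 11/16 = 0.6875
3 / (3 + 2)                            # the Beta(3,2) mean, 3/5
```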


  • Problem 19

The following 2 \(\times\) 2 table shows the possible outcomes of the experiment, i.e., the values of \(X = \vert Y_2 - Y_1 \vert\):

|             | \(Y_2 = 0\) | \(Y_2 = 1\) |
|-------------|-------------|-------------|
| \(Y_1 = 0\) | 0           | 1           |
| \(Y_1 = 1\) | 1           | 0           |

Each outcome is equally likely. Hence \(P(X = 0) = 1/2\) and \(P(X=1) = 1/2\).

(a) We seek \(\sigma = \sqrt{V[X]} = \sqrt{E[X^2]-(E[X])^2}\): \[\begin{align*} E[X] &= 0 \cdot 1/2 + 1 \cdot 1/2 = 1/2~~\mbox{and}\\ E[X^2] &= 0^2 \cdot 1/2 + 1^2 \cdot 1/2 = 1/2 \,. \end{align*}\] So \(\sigma = \sqrt{1/2-1/4} = 1/2\).

(b) The expected value is \(\mu = E[X] = 1/2\), while the standard deviation is \(\sigma = 1/2\), so ultimately we are asking for \(P(1/2-1/2 < X < 1/2+1/2) = P(0 < X < 1)\)…which equals zero because there are no probability masses between 0 and 1 exclusive.


  • Problem 20

(a) The mean of the distribution is \[\begin{align*} E[X] = \int_0^1 x f_X(x) dx = \int_0^1 3x^3 dx = \left. \frac{3}{4}x^4 \right|_0^1 = \frac{3}{4} \,. \end{align*}\]

(b) We first compute \(E[X^2]\): \[\begin{align*} E[X^2] = \int_0^1 x^2 f_X(x) dx = \int_0^1 3x^4 dx = \left. \frac{3}{5}x^5 \right|_0^1 = \frac{3}{5} \,. \end{align*}\] We then apply the shortcut formula to determine the variance: \[\begin{align*} V[X] = E[X^2] - (E[X])^2 = \frac{3}{5} - \left(\frac34\right)^2 = \frac{3}{80} \,. \end{align*}\]

(c) The expected value of the difference is \[\begin{align*} E[2X_1 - 2X_2] = 2E[X_1] - 2E[X_2] = 2 \cdot \frac34 - 2 \cdot \frac34 = 0 \,. \end{align*}\]

(d) The variance of the difference is \[\begin{align*} V[2X_1 - 2X_2] = 4V[X_1] + 4V[X_2] = 4\left(\frac{3}{80} + \frac{3}{80}\right) = \frac{3}{10} \,. \end{align*}\]


  • Problem 21

(a) We are dealing with a continuous random variable. The pdf is thus \[\begin{align*} f_X(x) = \frac{d}{dx}F_X(x) = 3x^2 ~~~~~~ x \in [0,1] \,. \end{align*}\]

(b) Given \(f_X(x) = 3x^2\) from part (a), we can compute \(V[X]\): \[\begin{align*} V[X] &= E[X^2] - (E[X])^2 \\ &= \int_0^1 x^2 f_X(x) dx - \left[\int_0^1 x f_X(x) dx\right]^2 = \int_0^1 3 x^4 dx - \left[\int_0^1 3 x^3 dx\right]^2 \\ &= \left.\frac{3x^5}{5}\right|_0^1 - \left[ \left.\frac{3x^4}{4}\right|_0^1 \right]^2 = \frac{3}{5} - \left( \frac{3}{4} \right)^2 = \frac{48}{80} - \frac{45}{80} = \frac{3}{80} \,. \end{align*}\]

(c) We seek \(P\left(X < \frac{1}{2} \vert X > \frac{1}{4}\right)\): \[\begin{align*} P\left(X < \frac{1}{2} \vert X > \frac{1}{4}\right) &= \frac{P\left(X < \frac{1}{2} \cap X > \frac{1}{4}\right)}{P\left(X > \frac{1}{4}\right)} = \frac{P\left(\frac{1}{4} < X < \frac{1}{2}\right)}{P\left(X > \frac{1}{4}\right)} \\ &= \frac{F_X\left(\frac{1}{2}\right)-F_X\left(\frac{1}{4}\right)}{1 - F_X\left(\frac{1}{4}\right)} = \frac{\left(\frac{1}{2}\right)^3 - \left(\frac{1}{4}\right)^3}{1 - \left(\frac{1}{4}\right)^3} = \frac{\frac{1}{8} - \frac{1}{64}}{1 - \frac{1}{64}} = \frac{\frac{7}{64}}{\frac{63}{64}} = \frac{1}{9} \,. \end{align*}\]


  • Problem 22

First, we write down the pmf:

| \(x\)      | 0   | 1    | 3    |
|------------|-----|------|------|
| \(p_X(x)\) | 0.5 | 0.25 | 0.25 |

Second, we compute \(E[X]\) and \(E[X^2]\): \[\begin{align*} E[X] &= \sum_{x} x p_X(x) = 0 \cdot 0.5 + 1\cdot 0.25 + 3 \cdot 0.25 = 1 \,. \\ E[X^2] &= \sum_{x} x^2 p_X(x) = 0 \cdot 0.5 + 1\cdot 0.25 + 3^2 \cdot 0.25 = 2.5 \,. \end{align*}\] Thus \(V[X] = E[X^2] - E[X]^2 = 2.5 - 1 = 1.5\).


  • Problem 23

(a) We know that \(F_X(1) = 1\), so \(c(1^2) = 1 ~~~ \Rightarrow ~~~ c = 1\).

(b) Given that \(f_X(x) = 2x\), for \(0 \leq x \leq 1\), \[\begin{align*} E[X] = \int_0^1 x 2x dx = \frac{2}{3}x^3\bigg|_0^1 = \frac{2}{3} \,. \end{align*}\]

(c) We utilize the shortcut formula to determine the standard deviation: \[\begin{align*} E[X^2] &= \int_0^1 x^2 2x dx = \frac{2}{4}x^4\bigg|_0^1 = \frac{1}{2} \\ V[X] &= E[X^2] - E[X]^2 = \frac{1}{2} - \frac{4}{9} = \frac{1}{18} \\ \Rightarrow ~~~ \sigma &= \sqrt{V[X]} = \sqrt{\frac{1}{18}} = 0.236 \,. \end{align*}\]


  • Problem 24

(a) If \(b\) is larger, then \(a\) is smaller. Given that the minimum value of \(a\) for a valid pdf is zero, we can determine that \[\begin{align*} \int_1^2 bx dx = 1 = b \frac{x^2}{2}\bigg|_1^2 = b\left(\frac{4}{2} - \frac{1}{2} \right) ~~~ \Rightarrow ~~~ b = \frac{2}{3} \,. \end{align*}\]

(b) We have that \[\begin{align*} F_X(x) = \left\{ \begin{array}{cc} 0 & x \leq 0 \\ \int_0^xa dz = az\bigg|_0^x = ax = \frac{1}{2}x & x \in (0,1] \\ \int_0^1 a dz + \int_1^xbz dz = F_X(1) + b\frac{z^2}{2}\bigg|_1^x = \frac{1}{2} + \frac{1}{6}(x^2 - 1) & x \in (1,2] \\ 1 & x > 2 \end{array} \right. \,. \end{align*}\]


  • Problem 25

(a) \(E[X] = 2 = 1(0.4) + y (0.2) ~~~ \Rightarrow ~~~ y = 8\).

(b) \(E[X] = 1(0.4) + 3 (0.2) = 1\) and \(E[X^2] = 1(0.4) + 9 (0.2) = 2.2\). Therefore \[\begin{align*} \sigma = \sqrt{E[X^2] - E[X]^2} = \sqrt{1.2} \,. \end{align*}\]

(c) \(P(\mu - \sigma \leq X \leq \mu + \sigma) = P(1 -\sqrt{1.2} \leq X \leq 1 +\sqrt{1.2} ) = p_X(0) + p_X(1) = 0.8\).


  • Problem 26

(a) \(\int_0^1 c \, dx = 1 - 0.2 - 0.2 = 0.6 = cx\bigg|_0^1 = c \, \Rightarrow \, c = 0.6\).

(b) \(E[X] = (-1)(0.2) + \int_0^1 0.6x dx + 2 (0.2) = -0.2 + 0.3 + 0.4 = 0.5\).

(c) The cdf is constantly \(0\) before \(-1\); constantly \(0.2\) between \([-1,0)\); a line that connects these two points \((0, 0.2) - (1,0.8)\) between \([0,1)\); constantly \(0.8\) between \([1,2)\); then constantly \(1\) after \(2\).


  • Problem 27

(a) \(F_X(x) = \int_0^x f_X(z) dz = \int_0^x e^{-z} dz = -e^{-z} \bigg|_0^x = 1-e^{-x}\).

(b) \(P(X > 1) = 1 - F_X(1) = 1 - (1 - e^{-1}) = e^{-1}\).

(c) We have that \[\begin{align*} P(X < 1/2 \cup X>2) &= F_X(1/2) + (1 - F_X(2)) \\ &= 1 - e^{-1/2} + (1 - (1 - e^{-2})) = 1 - e^{-1/2} + e^{-2} \,. \end{align*}\]

(d) We have that \[\begin{align*} P(X < 2 | X>1) = \frac{P(1 < X < 2)}{P(X>1)} = \frac{F_X(2) - F_X(1)}{1 - F_X(1)} = \frac{(1 - e^{-2}) - (1 - e^{-1})}{1 - (1 - e^{-1})} =1 - e^{-1} \,. \end{align*}\]


  • Problem 28

(a) The pdf is symmetric around zero \(\Rightarrow E[X] = 0\): \[\begin{align*} E[X] = \int_{-1}^0 -x^2 dx + \int_0^1 x^2 dx = - \frac{x^3}{3}\bigg|_{-1}^0 + \frac{x^3}{3}\bigg|_{0}^1 = -\frac{1}{3} + \frac{1}{3} = 0 \,. \end{align*}\]

(b) The variance is \[\begin{align*} V[X] = E[X^2] - E[X]^2 = \int_{-1}^0 -x^3 dx + \int_0^1 x^3 dx = - \frac{x^4}{4}\bigg|_{-1}^0 + \frac{x^4}{4}\bigg|_{0}^1 = \frac{1}{4} + \frac{1}{4} = \frac{1}{2} \,. \end{align*}\]

(c) We have that \(F_X(0) = \frac{1}{2}\)…thus: \[\begin{align*} F_X(x) \text{ for } x\in\left[0,1\right] = \frac{1}{2} + \int_0^x z dz = \frac{1}{2} + \frac{z^2}{2}\bigg|_0^x = \frac{1}{2} + \frac{x^2}{2} \,. \end{align*}\]


  • Problem 29

(a) We have that \[\begin{align*} f_X(x) = \frac{d}{dx}F_X(x) = \begin{cases} 0 & \text{ if } x \leq 0\\ 2cx & \text{ if } x \in (0, 1]\\ 0 &\text{ if } x>1 \end{cases} \,. \end{align*}\] At \(x = 1\), \(cx^2 = 1\), and so \(c = 1\).

(b) \(E[X] = \int_0^1 2x^2 dx = \frac{2x^3}{3}\bigg|_0^1 = \frac{2}{3}\).


  • Problem 30

(a) A valid pdf has integral 1 over the domain. Thus \[\begin{align*} \int_0^1 c dx + \int_1^\infty e^{-x} dx &= 1 \\ c \int_0^1 dx &= 1 - \int_1^\infty e^{-x} dx \\ c \cdot 1 &= 1 + \left. e^{-x}\right|_1^\infty \\ c &= 1 + (0 - e^{-1}) = 1 - e^{-1} \,. \end{align*}\]

(b) The cdf for \(x \in [0,1)\) is \(c \int_0^x dz = cx\). For \(x \in [1,\infty)\), we have \[\begin{align*} \int_0^x f(z) dz &= \int_0^1 c dz + \int_1^x e^{-z} dz \\ &= c - \left. e^{-z}\right|_1^x = c - (e^{-x} - e^{-1}) = c + (e^{-1} - e^{-x}) \,. \end{align*}\] The cdf is thus

| \(x\)      | \((-\infty,0)\) | \([0,1)\) | \([1,\infty)\)            |
|------------|-----------------|-----------|---------------------------|
| \(F_X(x)\) | 0               | \(cx\)    | \(c + (e^{-1} - e^{-x})\) |

  • Problem 31

(a) The sum of the probability masses has to equal 1, so \(c = 1 - 1/6 - 1/6 = 2/3\).

(b) The cdf is 0 for \(x \in (-\infty,-1)\), then 1/6 for \(x \in [-1,0)\), then 1/6+2/3 = 5/6 for \(x \in [0,1)\)…thus \(F_X(0.5) = 5/6\).

(c) The generalized inverse cdf is \[\begin{align*} x = F_X^{-1}(q) = \mbox{inf}\{ x : F_X(x) \geq q \} \,, \end{align*}\] i.e., the smallest value of \(x\) such that \(F_X(x) \geq q = 0.9\). Given that the cdf jumps from 5/6 (=0.833) to 1 at \(x=1\), the value that we want is \(x=1\).

(d) The variance is \(V[X] = E[X^2] - (E[X])^2\): \[\begin{align*} E[X] &= \sum_{x} x p_X(x) = -1 \cdot \frac16 + 0 \cdot \frac23 + 1 \cdot \frac16 = 0 \,. \\ E[X^2] &= \sum_{x} x^2 p_X(x) = (-1)^2 \cdot \frac16 + (0)^2 \cdot \frac23 + (1)^2 \cdot \frac16 = \frac13 \,. \end{align*}\] Thus \(V[X] = \frac13\).


  • Problem 32

(a) In the range \([0,1]\), \(f_X(x)\) is \[\begin{align*} f_X(x) = \frac{d}{dx} x^3 = 3x^2 \,. \end{align*}\]

(b) The probability is \[\begin{align*} P\left(\frac14 \leq X \leq \frac34\right) &= F_X\left(\frac34\right) - F_X\left(\frac14\right) = \left(\frac34\right)^3 - \left(\frac14\right)^3 = \frac{27}{64} - \frac{1}{64} = \frac{26}{64} = \frac{13}{32} \,. \end{align*}\]

(c) The conditional probability is \[\begin{align*} P\left(X \leq \frac14 \vert X \leq \frac12\right) &= \frac{P\left(X \leq \frac14 \cap X \leq \frac12\right)}{P\left(X \leq \frac12\right)} = \frac{P\left(X \leq \frac14\right)}{P\left(X \leq \frac12\right)} \\ &= \frac{F_X\left(\frac14\right)}{F_X\left(\frac12\right)} = \frac{(1/4)^3}{(1/2)^3} = \frac{8}{64} = \frac18 \,. \end{align*}\]

(d) The inverse cdf is given by \[\begin{align*} q = x^3 ~~~ \Rightarrow ~~~ x = F_X^{-1}(q) = q^{1/3} \,. \end{align*}\]


  • Problem 33

(a) The expected value is \[\begin{align*} E[Y] = E[X_1+2X_2-3X_3-4] = E[X_1] + 2E[X_2] - 3E[X_3] - 4 = 1 + 2 - 3 - 4 = -4 \,. \end{align*}\]

(b) The variance is \[\begin{align*} V[Y] = V[X_1+2X_2-3X_3-4] = V[X_1] + 4V[X_2] + 9V[X_3] = 1 + 4 + 9 = 14 \,. \end{align*}\]

(c) By “reversing” the shortcut formula, we find that \[\begin{align*} E[Y^2] = V[Y] + (E[Y])^2 = 14 + (-4)^2 = 30 \,. \end{align*}\]


  • Problem 34

(a) We know that \(F_X(2) = 1 = c \cdot 2^3\), so \(c = 1/8\).

(b) \(f_X(x) = (d/dx)F_X(x) = 3cx^2\).

(c) The conditional probability is \[\begin{align*} P(X < 1/2 \vert X < 1) &= \frac{P(X < 1/2 \cap X < 1)}{P(X < 1)} = \frac{P(X < 1/2)}{P(X < 1)} = \frac{F_X(1/2)}{F_X(1)} = \frac{(1/2)^3}{1^3} = 1/8 \,. \end{align*}\]


  • Problem 35

We are given that \(X \vert \theta\) is a continuous random variable, and that \(\Theta\) is a discrete random variable, so the appropriate Law of Total Probability expression is \[\begin{align*} f_X(x) &= \sum_{\theta} f_{X \vert \theta}(x \vert \theta) p_{\Theta}(\theta) \\ &= \sum_{\theta} \theta x^{\theta-1} p_{\Theta}(\theta) \\ &= 1 \cdot x^{1-1} \cdot p_{\Theta}(\theta = 1) + 2 \cdot x^{2-1} \cdot p_{\Theta}(\theta = 2) \\ &= 1 \cdot x^{1-1} \cdot 1/2 + 2 \cdot x^{2-1} \cdot 1/2 \\ &= x + 1/2 \,, \end{align*}\] with \(x \in [0,1]\).


  • Problem 36

(a) The expected value \(E[X]\) is \[\begin{align*} E[X] &= \int_0^1 x f_X(x) dx = \int_0^1 x 6 x (1-x) dx \\ &= 6 \left[ \int_0^1 x^2 dx - \int_0^1 x^3 dx \right] = 6 \left[ \left.\frac{x^3}{3}\right|_0^1 - \left.\frac{x^4}{4}\right|_0^1 \right] = 6 \left( \frac{1}{3} - \frac{1}{4} \right) = 6 \frac{1}{12} = \frac{1}{2} \,. \end{align*}\]

(b) We know that for any distribution, \(E[\bar{X}] = \mu = E[X]\). Hence \(E[\bar{X}] = 1/2\).


  • Problem 37

(a) The biases are \[\begin{align*} B[\hat{\mu}_1] &= E[\hat{\mu}_1 - \mu] = E[\hat{\mu}_1] = E\left[X_1-X_2\right] = E[X_1] - E[X_2] = \mu - \mu = 0 \\ B[\hat{\mu}_2] &= E[\hat{\mu}_2 - \mu] = E[\hat{\mu}_2] = E\left[\frac{X_1+X_2}{2}\right] = \frac12 \left(E[X_1] + E[X_2]\right) = \frac12 \left(\mu + \mu\right) = 0 \,. \end{align*}\] Both estimators are unbiased.

(b) The variances are \[\begin{align*} V[\hat{\mu}_1] &= V\left[X_1-X_2\right] = V[X_1] + V[X_2] = \sigma^2 + \sigma^2 = 2\sigma^2 = \frac23\\ V[\hat{\mu}_2] &= V\left[\frac{X_1+X_2}{2}\right] = \frac14 \left(V[X_1] + V[X_2]\right) = \frac14 \left(\sigma^2 + \sigma^2\right) = \frac{\sigma^2}{2} = \frac16\,. \end{align*}\]

(c) The mean-squared errors are \[\begin{align*} MSE[\hat{\mu}_1] &= (B[\hat{\mu}_1])^2 + V[\hat{\mu}_1] = 0^2 + 2\sigma^2 = 2\sigma^2 = \frac23 \\ MSE[\hat{\mu}_2] &= (B[\hat{\mu}_2])^2 + V[\hat{\mu}_2] = 0^2 + \frac{\sigma^2}{2} = \frac{\sigma^2}{2} = \frac16 \,. \end{align*}\]

(d) \(\hat{\mu}_2\) has the lower mean-squared error, so it is the better estimator.


  • Problem 38

(a) The likelihood function is \[\begin{align*} \mathcal{L}(a \vert \mathbf{x}) &= \prod_{i=1}^n f_X(x_i) = \prod_{i=1}^n a (1+x_i)^{-(a+1)} = a^n [\prod_{i=1}^n (1+x_i)]^{-(a+1)} \,. \end{align*}\] The log-likelihood function is thus \[\begin{align*} \ell(a \vert \mathbf{x}) &= \log \left[ a^n [\prod_{i=1}^n (1+x_i)]^{-(a+1)} \right] = \log a^n + \log \left[ [\prod_{i=1}^n (1+x_i)]^{-(a+1)} \right] \\ &= n \log a - (a+1) \log [\prod_{i=1}^n (1+x_i)] = n \log a - (a+1) \sum_{i=1}^n \log (1+x_i) \,. \end{align*}\]

(b) The maximum likelihood estimate is \[\begin{align*} \frac{d}{da} \ell(a \vert \mathbf{x}) &= \frac{n}{a} - \sum_{i=1}^n \log (1+x_i) = 0 \\ \Rightarrow ~~~ \frac{n}{a} &= \sum_{i=1}^n \log (1+x_i) \\ \Rightarrow ~~~ a &= \frac{n}{\sum_{i=1}^n \log (1+x_i)} \,. \end{align*}\] We rewrite this into “probabilistic” notation: \[\begin{align*} \hat{a}_{MLE} = \frac{n}{\sum_{i=1}^n \log (1+X_i)} \,. \end{align*}\]

(c) We invoke the invariance property of MLEs to write that \[\begin{align*} \hat{\mu}_{MLE} &= \frac{1}{\hat{a}_{MLE}-1} = \frac{1}{[n/\sum_{i=1}^n \log (1+X_i)]-1} = \frac{\sum_{i=1}^n \log (1+X_i)}{n-\sum_{i=1}^n \log (1+X_i)} \,. \end{align*}\]
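A numerical sanity check of the closed-form MLE, using simulated data (the seed, sample size, and true value of \(a\) below are arbitrary choices): the inverse-cdf draw uses \(F_X(x) = 1 - (1+x)^{-a}\), and the closed form should agree with a direct numerical maximization of \(\ell(a \vert \mathbf{x})\).

```r
set.seed(202)
a.true <- 3
n      <- 500
x      <- (1 - runif(n))^(-1 / a.true) - 1       # draws with cdf 1 - (1+x)^(-a)
a.hat  <- n / sum(log(1 + x))                    # closed-form MLE from part (b)
loglik <- function(a) n * log(a) - (a + 1) * sum(log(1 + x))
a.num  <- optimize(loglik, c(0.01, 20), maximum = TRUE)$maximum
c(a.hat, a.num)                                  # the two should match closely
```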


  • Problem 39

(a) The log-likelihood is \[\begin{align*} \ell(\theta \vert \mathbf{x}) &= \sum_{i=1}^n \log f_X(x_i \vert \theta) = \sum_{i=1}^n \log \theta + (\theta-1) \log(1-x_i) = n \log \theta + (\theta - 1) \sum_{i=1}^n \log(1-x_i) \,. \end{align*}\]

(b) The first derivative of \(\ell(\theta \vert \mathbf{x})\) is \[\begin{align*} \frac{d}{d\theta} \left[ n \log \theta + (\theta - 1) \sum_{i=1}^n \log(1-x_i) \right] &= \frac{n}{\theta} + \sum_{i=1}^n \log(1-x_i) \,. \end{align*}\] We set this equal to zero and solve for \(\theta\): \[\begin{align*} \hat{\theta}_{MLE} = -\frac{n}{\sum_{i=1}^n \log(1-X_i)} \,. \end{align*}\]

(c) We utilize the invariance property of the MLE: \[\begin{align*} \hat{\mu}_{MLE} = \frac{1}{1 + \hat{\theta}_{MLE}} = \frac{\sum_{i=1}^n \log(1-X_i)}{\left[ \sum_{i=1}^n \log(1-X_i) \right] - n} \,. \end{align*}\]


  • Problem 40

We first derive the log-likelihood and its derivative: \[\begin{align*} \ell(x_1, x_2, \dots, x_n | \alpha) &= \sum_{i=1}^n \left[ \log \alpha + (\alpha-1) \log x_i \right] = n \log \alpha + (\alpha - 1) \sum_{i=1}^n \log x_i\\ \implies ~~~ \ell'(x_1, x_2, \dots, x_n | \alpha) &= \frac{n}{\alpha} + \sum_{i=1}^n \log x_i \,. \end{align*}\] We set \(\frac{n}{\hat{\alpha}_{MLE}} + \sum_{i=1}^n \log x_i = 0\) and find that \[\begin{align*} \hat{\alpha}_{MLE} = -\frac{n}{\sum_{i=1}^n \log X_i} \,. \end{align*}\]


  • Problem 41

(a) The log-likelihood and its derivative are \[\begin{align*} \ell (x_1,\dots, x_n | p) &= \log (1-p) \sum_{i=1}^n (x_i - 1) + n \log p\\ \implies \ell' (x_1,\dots, x_n | \hat{p}) &= - \frac{\sum_{i=1}^n x_i - n}{1 - \hat{p}} + \frac{n}{\hat{p}} = 0 \,. \end{align*}\] From \[\begin{align*} \frac{\sum_{i=1}^n x_i - n}{1 - \hat{p}} = \frac{n}{\hat{p}} \quad \Rightarrow \quad \left(\sum_{i=1}^n x_i - n\right) \hat{p} = n (1 - \hat{p}) \quad \Rightarrow \quad \hat{p} \,\sum_{i=1}^n x_i = n \,, \end{align*}\] it follows that \(\hat{p}_{MLE} = \frac{n}{\sum_{i=1}^n X_i}\). By the invariance property of the MLE, we find that \[\begin{align*} \widehat{1/p}_{MLE} = \frac{\sum_{i=1}^n X_i}{n} \,. \end{align*}\]

(b) The variance of this estimator is \[\begin{align*} V\left[\widehat{1/p}_{MLE} \right] = V\left[\frac{\sum_{i=1}^n X_i}{n} \right] = \frac{V(X_1)}{n} = \frac{1-p}{np^2} \,. \end{align*}\]
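A simulation check of part (b), assuming each \(X_i\) counts the number of trials up to and including the first success (note that R’s rgeom() counts failures, hence the +1); the parameter values below are arbitrary.

```r
set.seed(303)
p <- 0.3; n <- 50; reps <- 1e4
est <- replicate(reps, mean(rgeom(n, p) + 1))   # one MLE of 1/p per replicate
c(var(est), (1 - p) / (n * p^2))                # simulated vs theoretical variance
```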


  • Problem 42

(a) The bias of the estimator is \[\begin{align*} B[\hat{\theta}] = E[\hat{\theta}-\theta] = E[\hat{\theta}]-\theta = E\left[\frac{X_1-X_2}{2}\right] - \theta = \frac12 ( E[X_1] - E[X_2] ) - \theta = \frac12 ( \theta - \theta ) - \theta = -\theta \,. \end{align*}\]

(b) The variance of the estimator is \[\begin{align*} V[\hat{\theta}] = V\left[\frac{X_1-X_2}{2}\right] = \frac14(V[X_1]+V[X_2]) = \frac14 \cdot \frac{2(2\theta)^2}{12} = \frac{\theta^2}{6} \,. \end{align*}\]

(c) The mean-squared error of the estimator is \[\begin{align*} MSE[\hat{\theta}] = (B[\hat{\theta}])^2 + V[\hat{\theta}] = \theta^2 + \frac{\theta^2}{6} = \frac76\theta^2 \,. \end{align*}\]

(d) Changing the sign does not change the variance! And “unbiased” means the bias is zero. So: \[\begin{align*} MSE[\hat{\theta}'] = V[\hat{\theta}] = \frac{\theta^2}{6} \,. \end{align*}\]


  • Problem 43

(a) The bias of the estimator is \[\begin{align*} B[\hat{\theta}] = E[\hat{\theta}-\theta] = E\left[X_1 - \frac{X_2}{n}\right] - \theta = E[X_1] - \frac{E[X_2]}{n} - \theta = \theta - \frac{\theta}{n} - \theta = -\frac{\theta}{n} \,. \end{align*}\]

(b) The variance of the estimator is \[\begin{align*} V[\hat{\theta}] = V\left[X_1 - \frac{X_2}{n}\right] = V[X_1] + \frac{1}{n^2}V[X_2] = \left(1+\frac{1}{n^2}\right)\theta^2 \,. \end{align*}\]


  • Problem 44

(a) The likelihood is \[\begin{align*} \mathcal{L}(\theta \vert \mathbf{x}) &= \prod_{i=1}^n f_X(x_i \vert \theta) = \prod_{i=1}^n c e^{-\theta x_i} \theta^{x_i-1} = c^n e^{-\theta \sum_{i=1}^n x_i} \theta^{\sum_{i=1}^n x_i - n} \,. \end{align*}\] Thus the log-likelihood is \[\begin{align*} \ell(\theta \vert \mathbf{x}) &= n \log c - \theta \sum_{i=1}^n x_i + (\sum_{i=1}^n x_i - n) \log \theta \,. \end{align*}\]

(b) The derivative of \(\ell(\theta \vert \mathbf{x})\) with respect to \(\theta\) is \[\begin{align*} \ell'(\theta \vert \mathbf{x}) &= - \sum_{i=1}^n x_i + \frac{\sum_{i=1}^n x_i - n}{\theta} \,. \end{align*}\] Setting this to zero and solving for \(\theta\), we get \[\begin{align*} \hat{\theta}_{MLE} = \frac{\sum_{i=1}^n X_i - n}{\sum_{i=1}^n X_i} = \frac{\bar{X}-1}{\bar{X}} \,. \end{align*}\]

(c) We utilize the invariance property of the MLE: \[\begin{align*} \hat{\mu}_{MLE} = \frac{1}{1-\hat{\theta}_{MLE}} = \frac{1}{1-(\bar{X}-1)/\bar{X}} = \frac{\bar{X}}{\bar{X}-(\bar{X}-1)} = \bar{X} = \frac{1}{n} \sum_{i=1}^n X_i \,. \end{align*}\]


  • Problem 45

(a) The cdf for the given pdf is \[\begin{align*} F_Y(y) = \int_0^y a(1+v)^{-a-1} dv \,. \end{align*}\] This is a \(u^n du\)-style integral, meaning that \[\begin{align*} F_Y(y) &= \int_0^y a(1+v)^{-a-1} dv = a \frac{1}{-a} \left. (1+v)^{-a}\right|_0^y = -\left((1+y)^{-a} - 1^{-a}\right) = 1 - (1+y)^{-a} \,. \end{align*}\]

(b) \(E[Y]\) decreases with \(a\); combining that fact with our desire to determine a lower bound, we focus on the fourth line of the confidence interval reference table and solve \[\begin{align*} F_Y(y_{\rm obs} \vert a) - \alpha &= 0 \\ \Rightarrow ~~~ 1 - (1+y_{\rm obs})^{-a} - \alpha &= 0 \\ \Rightarrow ~~~ (1+y_{\rm obs})^{-a} &= 1 - \alpha \\ \Rightarrow ~~~ -a \log (1+y_{\rm obs}) &= \log(1-\alpha) \\ \Rightarrow ~~~ \hat{a}_L &= -\frac{\log(1-\alpha)}{\log(1+y_{\rm obs})} \,. \end{align*}\]


  • Problem 46

(a) The fact that we wish to perform a lower-tail test, combined with the fact that \(E[Y]\) increases with \(\mu\), means that we will focus on the third line of the hypothesis test reference table and solve for \[\begin{align*} y_{\rm RR} = F_Y^{-1}(\alpha \vert \mu_o) \,. \end{align*}\] We have that \[\begin{align*} F_Y(y_{\rm RR} \vert \mu_o) - \alpha &= 0 \\ \Rightarrow ~~~ \frac{1}{1+\exp(-y_{\rm RR})} &= \alpha \\ \Rightarrow ~~~ 1+\exp(-y_{\rm RR}) &= \frac{1}{\alpha} \\ \Rightarrow ~~~ \exp(-y_{\rm RR}) &= \frac{1}{\alpha}-1 \\ \Rightarrow ~~~ y_{\rm RR} &= -\log\left(\frac{1}{\alpha}-1\right) \,. \end{align*}\] Since \(1/0.05 = 20\), we find that \(y_{\rm RR} = -\log(19)\).

(b) As stated in the hypothesis test reference table, the rejection region is \(y_{\rm obs} < y_{\rm RR}\). Thus if \(y_{\rm obs} \geq y_{\rm RR}\), we fail to reject the null hypothesis.

(c) If we go from a lower-tail test (with \(E[Y]\) increasing with \(\mu\)) to a two-tail test, then the lower rejection region boundary will shift because we put \(\alpha/2 = 0.025\) into the equation rather than \(\alpha = 0.05\). This moves the boundary further from \(\mu_o = 0\). (Specifically, \(1/\alpha\) would now be 40, so the new \(y_{\rm RR} = -\log(39)\), which is further from 0 than \(-\log(19)\).)

Chapter 2

  • Problem 1

(a) We can rewrite the conditional expression as the ratio of two unconditional probabilities and work from there: \[\begin{align*} P(X>2 \vert X>0) &= \frac{P(X>2 \cap X>0)}{P(X>0)} = \frac{P(X>2)}{P(X>0)} = \frac{P(Z>\frac{2-4}{2})}{P(Z>\frac{0-4}{2})} \\ &= \frac{P(Z>-1)}{P(Z>-2)} = \frac{1-P(Z\leq-1)}{1-P(Z\leq-2)} = \frac{1-\Phi(-1)}{1- \Phi(-2)} = \frac{\Phi(1)}{\Phi(2)} \,. \end{align*}\] In the last step we take advantage of the symmetry of the standard normal: \(\Phi(z) = 1 - \Phi(-z)\).

(b) \[\begin{align*} P(X<c \vert X<4) &= 0.5 = \frac{P(X<c \cap X<4)}{P(X<4)}\\ &= \frac{P(X<c)}{P(X<4)} = \frac{P(Z<\frac{c-4}{2})}{P(Z<\frac{4-4}{2})} = \frac{\Phi\left(\frac{c-4}{2}\right)}{\Phi(0)} = 2\Phi\left(\frac{c-4}{2}\right) \,. \end{align*}\] Therefore \[\begin{align*} \Phi\left(\frac{c-4}{2}\right) &= \frac14 \\ \Rightarrow ~~~ \frac{c-4}{2} &= \Phi^{-1}\left( \frac14 \right)\\ \Rightarrow ~~~ c &= 4 + 2 \Phi^{-1}\left( \frac14 \right) \,. \end{align*}\]
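Both parts are easy to evaluate numerically, assuming \(X \sim \mathcal{N}(4, 2^2)\) as in the standardizations above:

```r
pnorm(1) / pnorm(2)                   # part (a): Phi(1)/Phi(2), about 0.861
c.val <- 4 + 2 * qnorm(1/4)           # part (b): about 2.65
pnorm(c.val, 4, 2) / pnorm(4, 4, 2)   # should return 0.5
```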


  • Problem 2

We utilize the Law of the Unconscious Statistician here: \[\begin{align*} E[e^{-Z^2/2}] &= \int_{-\infty}^{\infty} e^{-z^2/2} f_Z(z) dz = \frac{1}{\sqrt{2 \pi}}\int_{-\infty}^{\infty} e^{-z^2/2} e^{-z^2/2} dz = \frac{1}{\sqrt{2 \pi}}\int_{-\infty}^{\infty} e^{-z^2} dz \,. \end{align*}\] We do not know how to do this integral by hand. However, if we make the substitution \(x = \sqrt{2}z\) (with \(dx = \sqrt{2}dz\)), then \[\begin{align*} E[e^{-Z^2/2}] &= \frac{1}{\sqrt{2}\sqrt{2 \pi}}\int_{-\infty}^{\infty} e^{-x^2/2} dx = \frac{1}{\sqrt{2}} \left( \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-x^2/2} dx \right) = \frac{1}{\sqrt{2}} \,. \end{align*}\] In the last step, we take advantage of the fact that the integral is that of a normal distribution with mean \(\mu = 0\) and variance \(\sigma^2 = 1\).
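Two quick numerical checks that the answer is \(1/\sqrt{2} \approx 0.707\):

```r
integrate(function(z) exp(-z^2 / 2) * dnorm(z), -Inf, Inf)$value   # direct integration
mean(exp(-rnorm(1e6)^2 / 2))                                       # Monte Carlo estimate
```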


  • Problem 3

(a) Rejection occurs if \(M\) is \(< 4\) or \(> 6\): \[\begin{align*} P(M < 4 \cup M > 6) &= P\left(Z < \frac{4-5}{2}\right) + P\left(Z > \frac{6-5}{2}\right) \\ &= \Phi\left(-\frac12\right) + \left(1 - \Phi\left(\frac12\right)\right) \\ &= \Phi\left(-\frac12\right) + \Phi\left(-\frac12\right) = 2 \Phi\left(-\frac12\right) \,. \end{align*}\]

(b) We start with the expression \[\begin{align*} \Phi\left(-\frac{1}{\sigma}\right)+ \left(1 - \Phi\left(\frac{1}{\sigma}\right)\right) = 2 \Phi\left(-\frac{1}{\sigma}\right) = 0.02 \,. \end{align*}\] We then solve for \(\sigma\): \[\begin{align*} \Phi\left(-\frac{1}{\sigma}\right) &= 0.01\\ \Rightarrow ~~~ \frac{-1}{\sigma} &= \Phi^{-1}(0.01)\\ \Rightarrow ~~~ \sigma &= -\frac{1}{\Phi^{-1}(0.01)} \,. \end{align*}\]


  • Problem 4

We want to determine the value of \(c\) such that \(P(X_1 < \mu-c \cap X_2 > \mu) = 1/4\). Because \(X_1\) and \(X_2\) are independent, we know that \[\begin{align*} P(X_1 < \mu-c \cap X_2 > \mu) = P(X_1 < \mu-c)P(X_2 > \mu) \,. \end{align*}\] Thus \[\begin{align*} \frac14 &= P(X_1 < \mu-c)P(X_2 > \mu) = P\left(Z_1 < \frac{\mu-c-\mu}{\sigma}\right)P\left(Z_2 > \frac{\mu-\mu}{\sigma}\right) \\ &= P\left(Z_1 < -\frac{c}{\sigma}\right)P(Z_2 > 0) = \Phi\left(-\frac{c}{\sigma}\right)[1 - \Phi(0)] \\ &= \frac12 \Phi\left(-\frac{c}{\sigma}\right) \\ \Rightarrow ~~~ \frac12 &= \Phi\left(-\frac{c}{\sigma}\right) \\ \Rightarrow ~~~ \Phi^{-1}\left(\frac12\right) &= -\frac{c}{\sigma} \\ \Rightarrow ~~~ c &= -\sigma \Phi^{-1}\left(\frac12\right) = 0 \,. \end{align*}\] Above, we take advantage of the fact that a normal with mean zero is symmetric about the \(y\)-axis, so \(\Phi(0) = 1/2\).


  • Problem 5

We can use the moment-generating function to generate moments, like the mean \(E[X]\): \[\begin{align*} E[X] = \left. \frac{dm}{dt} \right|_{t=0} = \left. \frac{(1-b^2t^2)ae^{at} - e^{at}(-2b^2t)}{(1-b^2t^2)^2} \right|_{t=0} = \frac{(1-0)ae^0 + e^0(0)}{(1-0)^2} = a \,. \end{align*}\]


  • Problem 6

(a) We utilize the Law of the Unconscious Statistician and write down that \[\begin{align*} m(t) = E[e^{tX}] = \sum_{x=0}^2 e^{tx} p_X(x) = e^0(0.2) + e^t(0.4) + e^{2t}(0.4) = 0.2 + 0.4e^t(1+e^t) \,. \end{align*}\]

(b) The first derivative of the mgf, evaluated at \(t = 0\), is the expected value: \[\begin{align*} E[X] = \left.\frac{d}{dt} m(t)\right|_{t=0} = \left.(0.4e^t + 0.8e^{2t})\right|_{t=0} = 0.4+0.8 = 1.2 \,. \end{align*}\]


  • Problem 7

(a) We utilize the Law of the Unconscious Statistician and write down that \[\begin{align*} m_X(t) = E[e^{tx}] = k \int_0^a e^{tx}e^{-x} dx = k \int_0^a e^{(t-1)x}dx = \frac{k}{(t-1)} e^{(t-1)x}\bigg|_0^a = \frac{k}{(t-1)}\left[ e^{(t-1)a} -1\right] \,. \end{align*}\]

(b) We can determine the expected value by differentiating \(m_X(t)\)\[\begin{align*} E[X] = \frac{d m_{X}(t)}{dt}\bigg|_0 = \frac{(t-1)ke^{(t-1)a}a - (ke^{(t-1)a} - k)}{(t-1)^2}\bigg|_0 = k - ke^{-a}(a+1) = k(1-e^{-a}(a+1)) \,. \end{align*}\] …or by brute-force, via integration by parts… \[\begin{align*} E[X] = \int_0^a kxe^{-x}dx = k\left( -xe^{-x}\bigg|_0^a - e^{-x}\bigg|_0^a\right) = k \left( -ae^{-a} + 0 - e^{-a} + 1\right) = k(1-e^{-a}(a+1)) \,. \end{align*}\]


  • Problem 8

(a) The calculations of the two moment-generating functions are similar: \[\begin{align*} m_{X_1}(t) &= E[e^{tX_1}] = 0.4 \cdot e^{t \cdot 0} + 0.6 \cdot e^{t \cdot 1} = 0.4 + 0.6e^t \\ m_{X_2}(t) &= E[e^{tX_2}] = 0.2 \cdot e^{t \cdot 0} + 0.8 \cdot e^{t \cdot 1} = 0.2 + 0.8e^t \,. \end{align*}\]

(b) Given that \(Y = X_1 + 2X_2\), we have that \[\begin{align*} m_Y(t) &= m_{X_1}(t) m_{X_2}(2t) = (0.4 + 0.6e^t)(0.2 + 0.8e^{2t}) \\ &= 0.08 + 0.12e^t + 0.32e^{2t} + 0.48e^{3t} \,. \end{align*}\]


  • Problem 9

(a) The moment-generating function is \[\begin{align*} E[e^{tX}] = \sum_x e^{tx} p_X(x) = \frac12 \left( e^{-t} + e^t \right) \,. \end{align*}\]

(b) If \(\bar{X} = (\sum_i X_i)/n\), then \[\begin{align*} m_{\bar{X}}(t) = \prod_{i=1}^n m_{X_i}\left(\frac{t}{n}\right) = \frac{1}{2^n} \left( e^{-t/n} + e^{t/n} \right)^n \,. \end{align*}\]


  • Problem 10

The moment-generating function is \[\begin{align*} E[e^{tX}] &= \int_0^{1/2} \frac13 e^{tx} dx + \int_{1/2}^1 \frac23 e^{tx} dx \\ &= \frac{1}{3t} \left. e^{tx} \right|_0^{1/2} + \frac{2}{3t} \left. e^{tx} \right|_{1/2}^1 \\ &= \frac{1}{3t} \left(e^{t/2}-1\right) + \frac{2}{3t} \left(e^t - e^{t/2}\right) \,. \end{align*}\] We can combine terms: \[\begin{align*} E[e^{tX}] = \frac{1}{3t}\left(e^{t/2} - 1 + 2e^t - 2e^{t/2}\right) = \frac{1}{3t} \left(2e^t - e^{t/2} - 1\right) \,. \end{align*}\]


  • Problem 11

(a) Because the variance is unknown, we make inferences about \(\mu\) utilizing the \(t\) distribution: \[\begin{align*} P(3 \leq \bar{X} \leq 6) &= P\left( \frac{3-\mu}{S/\sqrt{n}} \leq \frac{\bar{X} -\mu}{S/\sqrt{n}} \leq \frac{6 - \mu}{S/\sqrt{n}} \right) \\ &= P\left(-\frac{2}{3/4} \leq T \leq \frac{1}{3/4}\right) = F_T\left(\frac43\right) - F_T\left(-\frac83\right) \,, \end{align*}\] where the number of degrees of freedom is \(n - 1 = 15\).

(b) As instructed, here we utilize the Central Limit Theorem: \[\begin{align*} P(3 \leq \bar{X} \leq 6) &= P\left( \frac{3-\mu}{\sigma/\sqrt{n}} \leq \frac{\bar{X} -\mu}{\sigma/\sqrt{n}} \leq \frac{6 - \mu}{\sigma/\sqrt{n}} \right) \\ &= P\left(-\frac{2}{5/7} \leq Z \leq \frac{1}{5/7}\right) = \Phi\left(\frac75\right) - \Phi\left(-\frac{14}{5}\right) \,. \end{align*}\]
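The two answers can be evaluated in R (with 15 degrees of freedom in part (a)):

```r
pt(4/3, df = 15) - pt(-8/3, df = 15)   # part (a)
pnorm(7/5) - pnorm(-14/5)              # part (b)
```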


  • Problem 12

(a) The moment-generating function for a normal distribution with mean 1 and variance 1 is \[\begin{align*} m_X(t) = \exp\left( \mu t + \frac{\sigma^2}{2}t^2 \right) = \exp\left( t + \frac12 t^2 \right) \,. \end{align*}\] Thus the mgf for \(Y = X_1 - X_2\) is \[\begin{align*} m_Y(t) = m_{X_1}(t) \cdot m_{X_2}(-t) = \exp\left( t + \frac12 t^2 \right) \exp\left( -t + \frac12 (-t)^2 \right) = \exp\left(t^2\right) \,. \end{align*}\] This is the mgf for a normal distribution with mean 0 and variance 2.

(b) This is far simpler than it might first appear: \[\begin{align*} P(X_1 \geq X_2) = P(X_1 - X_2 \geq 0) = \frac12 \,. \end{align*}\] This follows from the fact that the distribution for \(X_1 - X_2\) (derived in part (a)) is symmetric around the coordinate \(x=0\).

(c) We have that \[\begin{align*} P(X_1 \leq 0 \cup X_1 \geq 2) &= P(X_1 \leq 0) + P(X_1 \geq 2) \\ &= P(X_1 \leq 0) + P(X_1 \leq 0) = 2P(X_1 \leq 0)\\ &= 2P\left(\frac{X_1-\mu}{\sigma} \leq \frac{0-\mu}{\sigma}\right) \\ &= 2P\left(Z \leq -\frac{1}{\sigma}\right) = 2\Phi\left(-\frac{1}{\sigma}\right) = 0.2 \\ \Rightarrow ~~~ \Phi\left(-\frac{1}{\sigma}\right) &= 0.1 \\ \Rightarrow ~~~ -\frac{1}{\sigma} &= \Phi^{-1}(0.1) \\ \Rightarrow ~~~ \sigma &= -\frac{1}{\Phi^{-1}(0.1)} \,. \end{align*}\]


  • Problem 13

We can answer this question using either moment-generating functions or a general transformation.

The mgf of \(\bar{X}\) is \[\begin{align*} m_{\bar{X}}(t) = \exp \left( \mu t + \frac{\sigma^2 t^2}{2n}\right) \,. \end{align*}\] Let \(U = \bar{X} - \mu\). Then, \[\begin{align*} m_U(t) = e^{-\mu t} m_{\bar{X}}(t) = \exp(-\mu t)\exp\left(\mu t + \frac{\sigma^2 t^2}{2n}\right) = \exp\left(\frac{\sigma^2 t^2}{2n}\right) \,. \end{align*}\] This is the mgf for a normal distribution with mean 0 and variance \(\sigma^2/n\).

Alternatively, let \(U = \bar{X}-\mu\), so \(\bar{X} = U + \mu\), and \[\begin{align*} F_U(u) = P(U \leq u) = P(g(\bar{X}) \leq u) = P(\bar{X} \leq g^{-1}(u)) &= P(\bar{X} \leq u + \mu) \\ &= \int_{-\infty}^{u+\mu} f_{\bar{X}}(\bar{x}) d\bar{x} \\ &= F_{\bar{X}}(u+\mu) \,. \end{align*}\] Thus \[\begin{align*} f_U(u) = \frac{d}{du}F_U(u) = \frac{d}{du}F_{\bar{X}}(u+\mu) &= f_{\bar{X}}(u+\mu) \\ &= \frac{1}{\sqrt{2 \pi (\sigma^2/n)}} \exp \left( - \frac{(u + \mu - \mu)^2}{2(\sigma^2/n)} \right) \\ &= \frac{1}{\sqrt{2 \pi (\sigma^2/n)}} \exp \left( - \frac{u^2}{2(\sigma^2/n)} \right) \,. \end{align*}\] This is the pdf for a normal distribution with mean 0 and variance \(\sigma^2/n\).


  • Problem 14

We want to write down an evaluatable expression for \(P(\vert \bar{X} - \mu \vert \leq 1)\), meaning that we want to change the expression inside the parentheses so that the statistic matches one for which we know the sampling distribution: \[\begin{align*} P(\vert \bar{X} - \mu \vert \leq 1) &= P\left( \frac{\vert \bar{X} - \mu \vert}{S/\sqrt{n}} \leq \frac{1}{S/\sqrt{n}}\right) = P\left(\vert T \vert \leq \frac{\sqrt{n}}{S}\right) \\ &= P\left(-\frac{4}{2} \leq T \leq \frac{4}{2}\right) = P(-2 \leq T \leq 2) \,, \end{align*}\] where \(T\) is a \(t\)-distributed random variable for \(\nu = n-1 = 15\) dof.


  • Problem 15

We are given that \(X \sim \mathcal{N}(30,\sigma^2)\), so \[\begin{align*} P(X \geq 25) = P\left(Z \geq \frac{25-30}{\sigma}\right) = 1 - P\left(Z \leq -\frac{5}{\sigma}\right) = 1 - \Phi\left(-\frac{5}{\sigma}\right) = 0.9 \,. \end{align*}\] Thus \[\begin{align*} \Phi\left( -\frac{5}{\sigma}\right) &= 0.1 \\ \Rightarrow ~~~ -\frac{5}{\sigma} = \Phi^{-1}(0.1) \\ \Rightarrow ~~~ \sigma = -\frac{5}{\Phi^{-1}(0.1)} \,. \end{align*}\]


  • Problem 16

We utilize the general transformation framework here: \[\begin{align*} F_U(u) &= P(U \leq u) = P(X^3 \leq u) = P(X \leq u^{1/3}) = \int_0^{u^{1/3}} 3x^2 dx = \left. x^3 \right|_0^{u^{1/3}} = u \,. \end{align*}\] Thus \[\begin{align*} f_U(u) = \frac{d}{du} F_U(u) = 1 \,, \end{align*}\] for \(u \in [0,1]\).


  • Problem 17

We utilize the general transformation framework here: \[\begin{align*} F_W(w) &= P(W \leq w) = P(X^2 \leq w) = P(X \leq \sqrt{w}) = \int_0^{\sqrt{w}} dx = \left. x \right|_0^{\sqrt{w}} = \sqrt{w} \,. \end{align*}\] (Note that we know \(x > 0\), so we need not worry about the root \(-\sqrt{w}\).) Thus \[\begin{align*} f_W(w) = \frac{d}{dw} \sqrt{w} = \frac{1}{2\sqrt{w}} \,, \end{align*}\] for \(w \in [0,1]\). The desired probability is thus \[\begin{align*} P(1/4 \leq w \leq 1/2) = \int_{1/4}^{1/2} \frac12 w^{-1/2} dw = \left. w^{1/2} \right|_{1/4}^{1/2} = \frac{\sqrt{2}}{2} - \frac12 = 0.207 \,. \end{align*}\]
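A simulation check, assuming \(X \sim\) Uniform(0,1) (consistent with the integral above) and \(W = X^2\):

```r
set.seed(404)
w <- runif(1e6)^2
mean(w >= 1/4 & w <= 1/2)   # about 0.207 = sqrt(2)/2 - 1/2
```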


  • Problem 18

We utilize the general transformation framework here: \[\begin{align*} F_U(u) &= P(U \leq u) = P(X^2 + 4 \leq u) = P(X \leq \sqrt{u-4}) = \int_0^{\sqrt{u-4}} e^{-x} dx \\ &= -e^{-x}\bigg|_0^{\sqrt{u-4}} = 1 - e^{-\sqrt{u-4}} \,. \end{align*}\] Thus \[\begin{align*} f_U(u) = \frac{d}{du} F_U(u) = \frac{d}{du}(1 - e^{-\sqrt{u-4}}) = \frac{1}{2\sqrt{u-4}}e^{-\sqrt{u-4}} \,. \end{align*}\] The domain can be found in this instance by plugging in the lower bound of the domain of \(x\): \(x = 0 ~\Rightarrow~ u=4\). Thus \(u \in [4,\infty)\).


  • Problem 19

(a) We utilize the general transformation framework here: \[\begin{align*} F_U(u) &= P(U \leq u) = P(-2X \leq u) = P(X \geq -u/2) = \frac{1}{\beta} \int_{-u/2}^{\infty} e^{-x/\beta} dx = -e^{-x/\beta}\bigg|_{-u/2}^{\infty} = e^{u/(2\beta)} \,. \end{align*}\] Thus \[\begin{align*} f_U(u) = \frac{d}{du}e^{u/(2\beta)} = \frac{1}{2\beta}e^{u/(2\beta)} \,. \end{align*}\] Since \(U = -2X\), the domain is \(u \in (-\infty,0]\).

(b) No, since the domain, \((-\infty,0]\), is not the domain of an exponential distribution.


  • Problem 20

(a) We utilize the general transformation framework here: \[\begin{align*} F_U(u) &= P(U \leq u) = P(X^2 - 1 \leq u)\\ &= P(- \sqrt{u + 1} \leq X \leq \sqrt{u + 1}) = \int_{- \sqrt{u + 1}}^{\sqrt{u + 1}}\frac{dx}{2} = \sqrt{u+1} \,. \end{align*}\] (Because the domain for \(f_X(x)\) is \([-1,1]\), we have to be careful when setting the limits for the integral.) Thus \[\begin{align*} f_U(u) = \frac{d}{du} \sqrt{u+1} = \frac{1}{2 \sqrt{u+1}} \,. \end{align*}\] When we plot \(u = x^2-1\), we would see that over the range \(x \in [-1,1]\), the values of \(u\) range from \(-1\) to 0. Hence the domain of \(f_U(u)\) is \(u \in [-1,0]\).

(b) We ask for \(E[U+1]\) because that makes for a more straightforward integral than \(E[U]\): \[\begin{align*} E[U+1] = \int_{-1}^0 (u+1) \frac{du}{2\sqrt{u+1}} = \frac{1}{2}\int_{-1}^0 \sqrt{(u+1)} du = \frac{1}{3}(u+1)^{3/2}\bigg|_{-1}^0 = \frac{1}{3} \,. \end{align*}\]


  • Problem 21

We utilize the general transformation framework here: \[\begin{align*} F_U(u) &= P(U \leq u) = P(e^{-X} \leq u) = P(-X \leq \log(u)) = P(X \geq -\log(u)) \\ &= \int_{-\log(u)}^{1} 3x^2 dx = x^3\bigg|_{-\log(u)}^{1} = 1 + (\log(u))^3 \,. \end{align*}\] Thus \[\begin{align*} f_U(u) = \frac{d}{du} (1 + (\log(u))^3) = 3(\log(u))^2\frac{1}{u} \,. \end{align*}\] As \(x\) goes from 0 to 1, \(u\) goes from 1 to \(e^{-1}\). Hence the domain of \(f_U(u)\) is \(u \in [e^{-1},1]\).


  • Problem 22

(a) We utilize the general transformation framework here: \[\begin{align*} F_U(u) &= P(U \leq u) = P(e^X \leq u) = P(X \leq \log u) = \int_0^{\log u} \theta x^{\theta-1} dx = \left. x^\theta \right|_0^{\log u} = (\log u)^\theta \,. \end{align*}\] Thus \[\begin{align*} f_U(u) = \frac{d}{du}F_U(u) = \theta (\log u)^{\theta-1} \frac{1}{u} \,, \end{align*}\] for \(u \in [1,\exp(1)]\).

(b) If \(U = 2X\) and \(X = U/2\), then \[\begin{align*} F_U(u) = \ldots = \int_0^{u/2} \theta x^{\theta-1} dx = \left. x^\theta \right|_0^{u/2} = \left(\frac{u}{2}\right)^\theta \,. \end{align*}\] Thus \[\begin{align*} P(U \geq 1) = 1 - P(U < 1) = 1 - F_U(1) = 1 - \left(\frac12\right)^\theta \,. \end{align*}\] There is no need to derive \(f_U(u)\) here.


  • Problem 23

We do not know (or cannot easily work with) the sampling distribution for \(S^2\), so we algebraically transform the quantities within the probability expression: \[\begin{align*} P(1 \leq S^2 \leq 2) = P\left( \frac{(n-1) \cdot 1}{\sigma^2} \leq \frac{(n-1)S^2}{\sigma^2} \leq \frac{(n-1) \cdot 2}{\sigma^2} \right) \,. \end{align*}\] So \[\begin{align*} X &= \frac{(n-1)S^2}{\sigma^2} \sim \chi_{n-1}^2 = \chi_{10}^2 \\ a &= \frac{n-1}{\sigma^2} = 5 \\ b &= \frac{2(n-1)}{\sigma^2} = 10 \,. \end{align*}\]
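With these values (\(n-1 = 10\) and \(\sigma^2 = 2\)), the probability itself evaluates to:

```r
pchisq(10, df = 10) - pchisq(5, df = 10)   # about 0.45
```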


  • Problem 24

We do not know the distribution for \(\bar{X}\sqrt{n}/S\), but we do know that \(\frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t(n-1)\). Hence: \[\begin{align*} P\left(a \leq \frac{\bar{X}}{S/\sqrt{n}} \leq b\right) &= P\left(a - \frac{\mu}{S/\sqrt{n}}\leq \frac{\bar{X} -\mu}{S/\sqrt{n}} \leq b -\frac{\mu}{S/\sqrt{n}} \right) \\ &= F_T\left(b - \frac{\mu}{S/\sqrt{n}}\right) - F_T\left(a - \frac{\mu}{S/\sqrt{n}}\right) \,, \end{align*}\] where \(F_T\) is the cdf for \(t\) distribution with \(n-1\) dof.


  • Problem 25

(a) We do not know the sampling distribution for \(\bar{X}/S\), so we algebraically transform the quantities within the probability expression: \[\begin{align*} P\left(\frac{\bar{X}}{S} \leq a\right) &= P\left(\frac{\bar{X}}{S} - \frac{\mu}{S} \leq a- \frac{\mu}{S} \right) \\ &= P\left( \sqrt{n}\frac{\bar{X}-\mu}{S} \leq \sqrt{n}\left(a- \frac{\mu}{S} \right)\right) \\ &= P\left(T \leq \sqrt{n}\left(a- \frac{\mu}{S} \right)\right) = F_T\left(\sqrt{n}\left(a- \frac{\mu}{S} \right)\right) \,. \end{align*}\] where \(F_T\) is the cdf for \(t\) distribution with \(n-1\) dof.

(b) We have that \[\begin{align*} b = P\left(\frac{\bar{X}}{\sigma} \leq a\right) &= P\left(\frac{\bar{X}}{\sigma} - \frac{\mu}{\sigma} \leq a- \frac{\mu}{\sigma} \right) \\ &= P\left( \sqrt{n}\frac{\bar{X}-\mu}{\sigma} \leq \sqrt{n}\left(a- \frac{\mu}{\sigma} \right)\right) \\ &= P\left(Z \leq \sqrt{n}\left(a- \frac{\mu}{\sigma} \right)\right) = \Phi\left(\sqrt{n}\left(a- \frac{\mu}{\sigma} \right)\right) \,. \end{align*}\] Thus \[\begin{align*} \sqrt{n}\left(a- \frac{\mu}{\sigma} \right) &= \Phi^{-1}(b) \\ \Rightarrow ~~~ n &= \left(\frac{\Phi^{-1}(b)}{a - \mu/\sigma}\right)^2 \,. \end{align*}\]


  • Problem 26

(a) The Fisher information for a single datum, \(I(\theta)\), is \[\begin{align*} I(\theta) = -E\left[ \frac{d^2}{d \theta^2} \log f_X(x \vert \theta) \right] \,.\end{align*}\] So we start by finding the log of \(f_X(x \vert \theta)\): \[\begin{align*} \log f_X(x \vert \theta) = \log \theta + (\theta-1) \log(1-x) \,, \end{align*}\] and then we derive the second derivative: \[\begin{align*} \frac{d}{d\theta} \log f_X(x \vert \theta) &= \frac{1}{\theta} + \log(1-x) \\ \frac{d^2}{d\theta^2} \log f_X(x \vert \theta) &= -\frac{1}{\theta^2} \,. \end{align*}\] Last, we find the expected value: \[\begin{align*} I(\theta) = -E\left[ -\frac{1}{\theta^2} \right] = \frac{1}{\theta^2} \,. \end{align*}\]

(b) The Cramer-Rao Lower Bound is \[\begin{align*} \frac{1}{I_n(\theta)} = \frac{1}{nI(\theta)} = \frac{1}{n(1/\theta^2)} = \frac{\theta^2}{n} \,. \end{align*}\]


  • Problem 27

(a) The log-likelihood is given by \[\begin{align*} \ell(\theta \vert x) = -n \log \theta + \sum_{i=1}^n x_i + \frac{n}{\theta} - \frac{\sum_{i=1}^n e^{x_i}}{\theta} \,, \end{align*}\] and its first derivative is given by \[\begin{align*} \frac{d}{d\theta} \ell(\theta \vert x) = -\frac{n}{\theta} - \frac{n}{\theta^2} + \frac{\sum_{i=1}^n e^{x_i}}{\theta^2} \,. \end{align*}\] If we set this expression to zero, we find that \[\begin{align*} 0 &= -n - \frac{n}{\theta} + \frac{\sum_{i=1}^n e^{x_i}}{\theta} \\ \Rightarrow ~~~ \frac{\sum_{i=1}^n e^{x_i}-n}{\theta} &= n \\ \Rightarrow ~~~ \theta &= \frac{1}{n} \left( \sum_{i=1}^n e^{x_i}-n \right) \\ \Rightarrow ~~~ \hat{\theta}_{MLE} &= \frac{1}{n} \left( \sum_{i=1}^n e^{X_i}-n \right) \,. \end{align*}\]

(b) We find the Cramer-Rao Lower Bound as follows: \[\begin{align*} \log f_X(x) &= -\log (\theta) + x + \frac{1}{\theta} -\frac{e^x}{\theta} \\ \Rightarrow ~~~ \frac{d}{d\theta} \log f_X(x) &= -\frac{1}{\theta} - \frac{1}{\theta^2} + \frac{1}{\theta^2} e^x \\ \Rightarrow ~~~ \frac{d^2}{d\theta^2} \log f_X(x) &= \frac{1}{\theta^2} + \frac{2}{\theta^3} - \frac{2e^x}{\theta^3} \,. \end{align*}\] Thus \[\begin{align*} I(\theta) &= -E\left[\frac{d^2}{d\theta^2} \log f(X)\right] = -\frac{1}{\theta^2} -\frac{2}{\theta^3} + \frac{2}{\theta^3} E[e^X] \\ &= -\frac{1}{\theta^2} -\frac{2}{\theta^3} + \frac{2}{\theta^3}(\theta+1) = \frac{1}{\theta^2} \,, \end{align*}\] and thus \(I_n(\theta) = n/\theta^2\) and \[\begin{align*} V[\hat\theta] \geq \frac{\theta^2}{n} \,. \end{align*}\] The variance of the MLE is \(\theta^2 / n\), because \[\begin{align*} V\left[ \frac{1}{n} \sum_{i=1}^n e^{X_i} - 1 \right] = V\left[ \frac{1}{n} \sum_{i=1}^n e^{X_i} \right] = \frac{1}{n^2} V\left[ \sum_{i=1}^n e^{X_i} \right] = \frac{n}{n^2} V\left[ e^X \right] = \frac{\theta^2}{n} \,. \end{align*}\] Thus the MLE achieves the CRLB.
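A simulation sketch of part (b): the given log-density is that of \(X = \log(1+Y)\) with \(Y\) exponential with mean \(\theta\), which gives a way to simulate; the sampled variance of the MLE should be close to the CRLB \(\theta^2/n\). (The parameter values below are arbitrary.)

```r
set.seed(505)
theta <- 2; n <- 100; reps <- 5000
one.mle <- function() {
  x <- log(1 + rexp(n, rate = 1 / theta))   # X = log(1+Y), Y ~ Exponential(mean theta)
  mean(exp(x)) - 1                          # the MLE from part (a)
}
c(var(replicate(reps, one.mle())), theta^2 / n)   # both should be about 0.04
```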


  • Problem 28

The log-likelihood is \[\begin{align*} \ell(p \vert \mathbf{x}) = \sum_{i = 1}^n [x_i \log(p) + (1 - x_i) \log (1-p)]. \end{align*}\] The first two derivatives are \[\begin{align*} \frac{d}{dp} \ell(p \vert \mathbf{x}) &= \sum_{i = 1}^n \bigg[ \frac{x_i}{p} - \frac{(1 - x_i)}{1-p}\bigg] \\ \frac{d^2}{dp^2} \ell(p \vert \mathbf{x}) &= \sum_{i = 1}^n \bigg[-\frac{x_i}{p^2}- \frac{(1- x_i)}{(1-p)^2}\bigg] \,. \end{align*}\] The Fisher information \(I_n(p)\) is therefore \[\begin{align*} I_n(p) = E\bigg[-\frac{d^2}{dp^2} \ell(p \vert \mathbf{X} )\bigg] = \sum_{i = 1}^n \bigg[\frac{E[X_i]}{p^2} + \frac{E[(1- X_i)]}{(1-p)^2}\bigg] = \frac{n}{p(1-p)} \,, \end{align*}\] and the asymptotic distribution of the maximum likelihood estimate \(\hat{p}_{MLE}\) is thus \(\mathcal{N}(p,p(1-p)/n)\). (Note: we never had to explicitly derive the MLE to determine its asymptotic distribution.)


  • Problem 29

(a) The Fisher information \(I(a)\) is \[\begin{align*} I(a) = -E\left[\frac{\partial^2}{\partial a^2} \log f_X(x \vert a)\right] \,, \end{align*}\] where
\[\begin{align*} \log f_X(x \vert a) = \log a - (a+1) \log x \,. \end{align*}\]
Thus
\[\begin{align*} I(a) = -E\left[\frac{\partial^2}{\partial a^2} (\log a - (a+1) \log x)\right] = -E\left[\frac{\partial}{\partial a} \left(\frac{1}{a} - \log x\right)\right] = -E\left[-\left(\frac{1}{a^2}\right)\right] = \frac{1}{a^2} \,. \end{align*}\]

(b) The Cramer-Rao Lower Bound is \(1/I_n(a)\), where \(I_n(a) = nI(a)\), and is thus \(a^2/n\).

(c) The maximum likelihood estimator for a distribution parameter converges in distribution to a normal random variable as \(n \rightarrow \infty\), with zero bias and variance given by the CRLB. Thus in the asymptotic limit, \[\begin{align*} \hat{a}_{MLE} \sim \mathcal{N}\left(a,\frac{a^2}{n}\right) \,. \end{align*}\]


  • Problem 30

(a) We have that \[\begin{align*} P(\vert \bar{X} - \mu \vert \geq 1) &= P\left(\frac{\vert \bar{X} - \mu \vert}{\sigma/\sqrt{n}} \geq \frac{1}{\sigma/\sqrt{n}}\right) = P\left( \vert Z \vert \geq \frac{\sqrt{n}}{\sigma} \right) \\ &= P\left(Z \leq -\frac{\sqrt{n}}{\sigma} \cup Z \geq \frac{\sqrt{n}}{\sigma}\right) = P(Z \leq -2 \cup Z \geq 2) \\ &= \Phi(-2) + (1-\Phi(2)) = 2\Phi(-2) \,. \end{align*}\]

(b) We know that \(X_+ = n\bar{X}\), so since \(\bar{X}\) is distributed normally via the CLT, \(X_+\) is as well. We also know that \(E[X_+] = nE[\bar{X}] = 100 \cdot 10 = 1000\), and \(V[X_+] = n^2V[\bar{X}] = 10,000 \cdot \frac{25}{100} = 2500\). Hence \(X_+ \sim \mathcal{N}(1000,2500)\).


  • Problem 31

Because the distribution for the time needed to complete the task is not given, we must fall back on the Central Limit Theorem: \[\begin{align*} T_{\text{A}} \sim \mathcal{N}(\mu_{\text{A}}, \frac{\sigma^2_{\text{A}}}{n}) &= \mathcal{N}(40,36/36=1) \\ T_{\text{B}} \sim \mathcal{N}(\mu_{\text{B}}, \frac{\sigma^2_{\text{B}}}{n}) &= \mathcal{N}(38,36/36=1) \,. \end{align*}\] We are interested in computing the probability \(P(T_{\text{A}} - T_{\text{B}} < 0)\). Using the method of moment-generating functions, we know that \[\begin{align*} m_{\Delta T}(t) &= m_{T_{\text{A}}}(t) m_{T_{\text{B}}}(-t) = \exp\left(\mu_{\text{A}}t + \frac{\sigma^2_{\text{A}}t^2}{2n}\right)\exp\left(-\mu_{\text{B}}t + \frac{\sigma^2_{\text{B}}t^2}{2n}\right)\\ &= \exp\left((\mu_{\text{A}} -\mu_{\text{B}}) t + \left(\frac{\sigma^2_{\text{A}}+\sigma^2_{\text{B}}}{n}\right)\frac{t^2}{2}\right) \,, \end{align*}\] and thus, approximately, \[\begin{align*} T_{\text{A}} - T_{\text{B}} \sim \mathcal{N}\left(\mu_{\text{A}} - \mu_{\text{B}}, \frac{\sigma^2_{\text{A}} + \sigma^2_{\text{B}}}{n}\right) = \mathcal{N}(2,2) \,. \end{align*}\] So, in the end, we have that \[\begin{align*} P(T_{\text{A}} - T_{\text{B}} < 0) = P\left(\frac{T_{\text{A}} - T_{\text{B}}-2}{\sqrt{2}} < \frac{0-2}{\sqrt{2}}\right) = P(Z < -\sqrt{2}) = \Phi(-\sqrt{2}) \,. \end{align*}\]


  • Problem 32

Under the conditions of the Central Limit Theorem, we know that \(\bar{X} \sim \mathcal{N}(10,4^2/64 = 1/4)\). Therefore \[\begin{align*} P(\bar{X} > 9) = 1 - P(\bar{X} \leq 9) \approx 1 - P\left(Z \leq \frac{9-10}{1/2}\right) = 1 - \Phi(-2) = \Phi(2) \,. \end{align*}\]


  • Problem 33

We have that \(n = 300\) and that \(\sigma^2 = \frac{(6 - 0)^2}{12} = 3\), so that \(\sigma = \sqrt{3}\) and \(\sigma/\sqrt{n} = \sqrt{3/300} = 0.1\). Therefore, using the Central Limit Theorem, we find that \[\begin{align*} P(2.8 \leq \bar{X} \leq 3.2) &= P\left( \frac{2.8 - 3}{0.1} \leq \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}\leq \frac{3.2 - 3}{0.1} \right) \\ &\approx P(-2 \leq Z \leq 2) = \Phi(2) - \Phi(-2) = 2\Phi(2) - 1 = 1 - 2\Phi(-2) \,. \end{align*}\]


  • Problem 34

By the Central Limit Theorem, we know that \(X_+ \sim \mathcal{N}(n\mu,n\sigma^2)\), so \[\begin{align*} P(X_+ < (n+1)\mu) &= P\left( \frac{X_+ - n\mu}{\sqrt{n}\sigma} < \frac{(n+1)\mu - n\mu}{\sqrt{n}\sigma} \right) \\ &\approx P\left( Z < \frac{\mu}{\sqrt{n}\sigma} \right) = \Phi\left(\frac{\mu}{\sqrt{n}\sigma}\right) = \Phi\left(\frac{1}{2}\right) \,. \end{align*}\]


  • Problem 35

(a) This is a Central Limit Theorem question. Since \(n \geq 30\), we assume \(\bar{X} \sim \mathcal{N}(\mu,\sigma^2/n)\). Thus we have that \[\begin{align*} P\left(\bar{X} \leq \mu-\sigma/10\right) = P\left(\frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \leq \frac{\mu-\sigma/10-\mu}{\sigma/\sqrt{n}}\right) \approx P(Z \leq -\sqrt{n}/10 ) = \Phi(-1) = 1 - \Phi(1) \,. \end{align*}\]

(b) If we don’t know \(\sigma^2\), we simply plug in \(S^2\) and proceed as we did before…remember, we never use the \(t\) distribution in CLT-related problems. \[\begin{align*} P\left(\bar{X} \geq \mu+1 \right) &= 1 - P\left(\frac{\bar{X}-\mu}{S/\sqrt{n}} \leq \frac{\mu + 1 - \mu}{S/\sqrt{n}}\right) \\ &\approx 1 - P\left(Z \leq \frac{\sqrt{n}}{S}\right) = 1 - \Phi\left(\frac{10}{5}\right) = 1 - \Phi(2) = \Phi(-2) \,. \end{align*}\]


  • Problem 36

(a) For a confidence interval, we need to solve the equation \[\begin{align*} F_Y(y_{\rm obs} \vert \theta) - q = 0 \end{align*}\] for \(\theta\), given \(q\). Here, we have the statistic \(w_{\rm obs} = (n-1)s_{\rm obs}^2/\sigma^2\), which is sampled from a chi-square distribution for \(n-1\) degrees of freedom. Hence the function we will use is pchisq().

(b) We are calling pchisq(), and the first argument would be the observed statistic value (n-1)*s2.obs/sigma2.

(c) We are solving for \(\sigma^2\), which has to be positive. So of the three choices, the only proper one is c(0.001,1000).
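Putting parts (a)–(c) together, a minimal sketch of the interval solver (with hypothetical values n = 11, s2.obs = 2, and a 95% interval) might look like the following; each call to uniroot() solves \(F_Y(y_{\rm obs} \vert \sigma^2) - q = 0\) for one bound.

```r
f <- function(sigma2, n, s2.obs, q) pchisq((n - 1) * s2.obs / sigma2, n - 1) - q
n <- 11; s2.obs <- 2; alpha <- 0.05                                   # hypothetical values
lo <- uniroot(f, c(0.001, 1000), n = n, s2.obs = s2.obs, q = 1 - alpha/2)$root
hi <- uniroot(f, c(0.001, 1000), n = n, s2.obs = s2.obs, q = alpha/2)$root
c(lo, hi)   # matches (n-1)*s2.obs / qchisq(c(1 - alpha/2, alpha/2), n - 1)
```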


  • Problem 37

(a) To perform hypothesis tests about the variance of a normal distribution, the appropriate test statistic under the null is \(Y = (n-1)S^2/\sigma_o^2\), which is sampled from a chi-square distribution for \(n-1\) degrees of freedom.

(b) When testing hypotheses about the normal population variance, it is the case that lower values of \(\sigma^2\) lead to lower values of \(S^2\), hence we place the rejection region in the lower tail of the sampling distribution. Lower-tail rejection is \(c = F_W^{-1}(\alpha)\), with the number of degrees of freedom being \(n-1\).

(c) We start by noting that the power under the null is, by definition, \(\alpha\): \[\begin{align*} P(Y < c \vert \sigma^2 = \sigma_o^2) &= \alpha \\ P(Y < c) &= \, ?\\ P\left(Y\frac{\sigma_o^2}{\sigma_a^2} < c\frac{\sigma_o^2}{\sigma_a^2}\right) &= \, ?\\ P\left(\frac{(n-1)S^2}{\sigma_a^2} < c\frac{\sigma_o^2}{\sigma_a^2}\right) &= \, ?\\ P\left(\frac{(n-1)S^2}{\sigma_a^2} < c\frac{\sigma_o^2}{\sigma_a^2} \vert \sigma^2 = \sigma_a^2 \right) &= 1 - \beta = power\\ P\left(W < c\frac{\sigma_o^2}{\sigma_a^2} \vert \sigma^2 = \sigma_a^2 \right) &= 1 - \beta = power\\ \Rightarrow ~~ power = F_W\left(c\frac{\sigma_o^2}{\sigma_a^2}\right) \end{align*}\]

(d) We are performing a lower-tail test, so the power increases from \(\alpha\) to 1 as \(\sigma_a^2\) decreases from \(\sigma_o^2\), and decreases from \(\alpha\) to 0 as \(\sigma_a^2\) increases from \(\sigma_o^2\). Thus if \(\sigma_a^2 > \sigma_o^2\), the power is less than \(\alpha\).
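
As a sketch of how the power formula might be evaluated (the values of n, alpha, and the two variances below are illustrative assumptions, not values from the problem):

n        <- 15
alpha    <- 0.05
sigma2.o <- 4
sigma2.a <- 2                                   # alternative smaller than the null value
c.rr <- qchisq(alpha, n-1)                      # lower-tail rejection-region boundary
pchisq(c.rr * sigma2.o/sigma2.a, n-1)           # power = F_W(c * sigma_o^2 / sigma_a^2); exceeds alpha here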


  • Problem 38

(a) The \(p\)-value can be written down directly by plugging \(x_{\rm obs}\) and \(a_o\) into the given cdf, which, because we sample a single datum, is the cdf for the sampling distribution: \[\begin{align*} p = 1 - \left(\frac{1}{x_{\rm obs}}\right)^{a_o} = 1 - \left(\frac{4}{5}\right)^2 = 1 - \frac{16}{25} = \frac{9}{25} = 0.36 \,. \end{align*}\]

(b) The \(p\)-value is \(> \alpha\), so we “fail to reject” the null hypothesis and conclude that we have insufficient data to rule out that \(a = 2\).

(c) We solve this via the inverse cdf, utilizing an appropriate row of the hypothesis test reference table: \[\begin{align*} &\alpha = 1 - \left(\frac{1}{x_{\rm RR}}\right)^{a_o} \\ \Rightarrow ~~~ &\left(\frac{1}{x_{\rm RR}}\right)^{a_o} = 1 - \alpha \\ \Rightarrow ~~~ &\frac{1}{x_{\rm RR}} = (1-\alpha)^{1/a_o} \\ \Rightarrow ~~~ &x_{\rm RR} = \frac{1}{(1-\alpha)^{1/a_o}} \\ \Rightarrow ~~~ &x_{\rm RR} = \frac{1}{(0.95)^{1/2}} = 1.026 \,. \end{align*}\]

(d) The test power is \[\begin{align*} power(a=4) = 1 - \left(\frac{1}{x_{\rm RR}}\right)^{a} = 1 - \left((0.95)^{1/2}\right)^4 = 1 - (0.95)^2 = 0.0975 \,. \end{align*}\] (The test power is low…you cannot easily reject a null with a single datum.)


  • Problem 39

(a) The estimator for \(\hat{\beta}_1'\) is \[\begin{align*} \hat{\beta}_1' = \frac{(\sum_{i=1}^n Y_i a x_i) - n a \bar{x} \bar{Y}}{(\sum_{i=1}^n (ax_i)^2) - n a^2 \bar{x}^2} = \frac{a \left[(\sum_{i=1}^n Y_i x_i) - n \bar{x} \bar{Y}\right]}{a^2 \left[(\sum_{i=1}^n x_i^2) - n \bar{x}^2\right]} = \frac{\hat{\beta}_1}{a} \,. \end{align*}\]

(b) We have that \[\begin{align*} V[\hat{\beta}_1'] = V\left[\frac{\hat{\beta}_1}{a}\right] = \frac{1}{a^2} V[\hat{\beta}_1] \,. \end{align*}\]

(c) The test statistic is \[\begin{align*} T' = \frac{\hat{\beta}_1' - 0}{\sqrt{V[\hat{\beta}_1']}} = \frac{(1/a) \hat{\beta}_1}{(1/a) \sqrt{V[\hat{\beta}_1]}} = \frac{\hat{\beta}_1}{\sqrt{V[\hat{\beta}_1]}} = T \,. \end{align*}\]


  • Problem 40

(a) Since \(x_i\)s are fixed, \(E[Y_i]= \beta_0 + \beta_1 x_i\) for all \(i = 1,\ldots,n\) and \(E [\bar{Y}] = \beta_0 + \beta_1 \bar{x}\). \[\begin{align*} E [\hat\beta_{1, OLS} ] &= E \left[ \frac{\sum_{i=1}^n x_i Y_i - n \bar{x}\bar{Y}}{\sum_{i=1}^n x_i^2 - n \bar{x}^2} \right]\\ &= \frac{E \left[\sum_{i=1}^n x_i Y_i - n \bar{x}\bar{Y}\right]}{\sum_{i=1}^n x_i^2 - n \bar{x}^2} = \frac{\sum_{i=1}^n x_i E [Y_i] - n \bar{x} E [\bar{Y}]}{\sum_{i=1}^n x_i^2 - n \bar{x}^2}\\ &= \frac{\sum_{i=1}^n x_i (\beta_0 + \beta_1 x_i) - n \bar{x} (\beta_0 + \beta_1 \bar{x})}{\sum_{i=1}^n x_i^2 - n \bar{x}^2}\\ &= \frac{\sum_{i=1}^n \beta_1 x_i^2 - \beta_1 n \bar{x}^2}{\sum_{i=1}^n x_i^2 - n \bar{x}^2} = \beta_1 \,. \end{align*}\]

(b) No. We only used the linearity of expectation and the assumption that \(E[\epsilon_i] = 0\) for all \(i=1,\dots,n\).
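
As a sanity check, a short simulation sketch (the x values, coefficients, and error distribution below are arbitrary choices, not part of the problem) shows the average slope estimate landing near the true \(\beta_1\) even with non-normal errors:

set.seed(101)
n <- 20; x <- seq(1, 10, length.out=n)          # fixed predictor values
beta0 <- 1; beta1 <- 2
beta1.hat <- replicate(5000, {
  eps <- runif(n, -3, 3)                        # mean-zero but decidedly non-normal errors
  Y   <- beta0 + beta1*x + eps
  (sum(x*Y) - n*mean(x)*mean(Y)) / (sum(x^2) - n*mean(x)^2)
})
mean(beta1.hat)                                 # close to beta1 = 2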


  • Problem 41

(a) The Shapiro-Wilk test result indicates that we would reject the null hypothesis that the residuals are normally distributed. In addition, the residual quantiles shown in the summary indicate that the residuals are highly skewed. So…“violated.”

(b) The estimated slope is the \(t\) value times the standard error, so \(-4.624 \times 0.1189\) (which is \(-0.550\)).

(c) The number of degrees of freedom for a simple linear regression model is \(n-2\), so \(n = 28+2 = 30\).

(d) It is simply the square of the “Residual standard error,” or \(1.888^2\) (which is \(3.565\)).

(e) The slope is negative, so the data are negatively correlated, and the correlation estimate itself, \(R\), is the square root of “R-squared.” So the answer is \(-\sqrt{0.4329}\) or \(-(0.4329^{1/2})\) (which is \(-0.658\)).

Chapter 3

  • Problem 1

The median of a \(\mathcal{N}(2,4)\) is \(\tilde{\mu} = 2\), thus \(P(X_i > 2) = 0.5\) by inspection. Now, let \(Y\) be the number of values \(> 2\). Then \[\begin{align*} P(Y=m) = \binom{n}{m} \left(\frac12\right)^m \left(\frac12\right)^{n-m} = \binom{n}{m} \left(\frac12\right)^n \,. \end{align*}\]


  • Problem 2

(a) We fix the number of successes to \(s=1\), with the random variable being the number of tickets we need to buy before buying the one that allows us to win the raffle. There are two correct answers here: \(F\) is sampled from the geometric distribution, with \(p=0.4\), or from the negative binomial distribution, with \(s=1\) and \(p=0.4\).

(b) We have that \(W = 2-F\), so \[\begin{align*} P(W > 0) = P(2-F > 0) &= P(F < 2) = p_F(0) + p_F(1) \\ &= (0.4)^1(1-0.4)^0 + (0.4)^1(1-0.4)^1 = 0.4 + 0.24 = 0.64 \,. \end{align*}\]

(c) Let \(X\) be the number of winning tickets. Then \[\begin{align*} P(X = 1 \vert X \geq 1) &= \frac{P(X = 1 \cap X \geq 1)}{P(X \geq 1)} = \frac{P(X=1)}{1-P(X=0)} \\ &= \frac{\binom{2}{1}(0.4)^1(0.6)^1}{1 - \binom{2}{0}(0.4)^0(0.6)^2} = \frac{2 \cdot 0.24}{1 - 0.36} = 0.48/0.64 = 0.75 \,. \end{align*}\]


  • Problem 3

The first step is to write down the probability mass function for the number of insured drivers, given that the number of insured drivers is odd: \[\begin{align*} P(X=1 \vert X=1 \cup X=3) &= \frac{p_X(1)}{p_X(1)+p_X(3)} = \frac{\binom{3}{1}(1/2)^1(1/2)^2}{\binom{3}{1}(1/2)^1(1/2)^2 + \binom{3}{3}(1/2)^3(1/2)^0} \\ &= \frac{3}{3 + 1} = \frac34 \\ P(X=3 \vert X=1 \cup X=3) &= 1 - P(X=1 \vert X=1 \cup X=3) = \frac14 \,. \end{align*}\] So the expected value is \[\begin{align*} E[X \vert X=1 \cup X=3] = \sum_{1,3} x p_X(x \vert x=1 \cup x=3) = 1 \cdot 3/4 + 3 \cdot 1/4 = 3/2 \,. \end{align*}\]


  • Problem 4

(a) We are conducting a negative binomial experiment with \(s = 2\). The random variable is the number of failures…here, \(X = 2\). So: \[\begin{align*} p_X(2) = \binom{2+2-1}{2} \left(\frac{1}{2}\right)^2 \left(\frac{1}{2}\right)^2 = \frac{3!}{1!2!} \frac{1}{16} = \frac{3}{16} \,. \end{align*}\]

(b) The sum of the data is negatively binomially distributed for \(s=4\) successes and probability of success \(p=1/2\). The overall number of failures here is \(X = 1\). So \[\begin{align*} p_X(1) = \binom{1+4-1}{1} \left(\frac{1}{2}\right)^4 \left(\frac{1}{2}\right)^1 = \frac{4!}{1!3!} \frac{1}{32} = \frac{4}{32} = \frac18 \,. \end{align*}\]

(c) This is a negative binomial experiment, so \(E[X] = s(1-p)/p\), which decreases as \(p\) increases. Referring to the confidence interval reference table, we see that \(q = 1-\alpha = 0.9\).


  • Problem 5

(a) We have that \(f_X(x) = 3x^2\) and thus that \(F_X(x) = x^3\). Plugging these into the formula for \(f_{(j)}(x)\) (along with \(j=1\)) yields \[\begin{align*} f_{(1)}(x) = \frac{4!}{3!0!}(1-x^3)^3 3 x^2 = 12(1-x^3)^3x^2 \,, \end{align*}\] for \(x \in [0,1]\).

(b) We have that \[\begin{align*} E[X_{(4)}] = \int_0^1 x \cdot 12x^{11} \, dx = \int_0^1 12x^{12} \, dx = \left.\frac{12}{13}x^{13}\right|_0^1 = \frac{12}{13} \,. \end{align*}\]


  • Problem 6

(a) The cdf is \(F_X(x) = \int_0^x y dy = \frac{x^2}{2}\). Thus \[\begin{align*} f_{(3)} = 3x\left[F_X(x)\right]^{2} = 3x\frac{x^4}{4} = \frac{3}{4} x^5 \end{align*}\] for \(x \in [0,\sqrt{2}]\).

(b) The variance is \(V[X_{(3)}] = E[X_{(3)}^2] - (E[X_{(3)}])^2\), where \[\begin{align*} E[X_{(3)}] &= \int_0^{\sqrt{2}} x \left(\frac{3}{4} x^5\right)dx =\frac{3}{4} \frac{x^7}{7}\bigg|_0^{\sqrt{2}} = \frac{3\cdot 2^3}{28} \sqrt{2} = \frac{6}{7}\sqrt{2} = 1.212 \,, \end{align*}\] and \[\begin{align*} E[X_{(3)}^2] &= \int_0^{\sqrt{2}} x^2 \left(\frac{3}{4} x^5\right)dx =\frac{3}{4} \frac{x^8}{8}\bigg|_0^{\sqrt{2}} = \frac{3\cdot2^4}{32} = \frac{3}{2} \,. \end{align*}\] Thus \[\begin{align*} V[X_{(3)}] = \frac{3}{2} - \frac{36\cdot 2}{49} = \frac{147 - 144}{98} = \frac{3}{98} = 0.031 \,. \end{align*}\]


  • Problem 7

We have that \(f_X(x) = 1\), \(F_X(x) = x\), and \(j = \frac{n+1}{2} = 2\), so \[\begin{align*} f_{(2)}(x) = \frac{3!}{1!1!}x^1(1-x)^1(1) = 6x(1-x) \end{align*}\] for \(x \in [0,1]\). Therefore \[\begin{align*} P\left(\frac{1}{3} \leq X_{(2)} \leq \frac{2}{3}\right) &= \int_{1/3}^{2/3} 6x(1-x) dx = 6\left[ \frac{x^2}{2}\bigg|_{1/3}^{2/3} - \frac{x^3}{3}\bigg|_{1/3}^{2/3}\right]\\ &= 6\left[ \frac{1}{2} \left( \frac{4}{9} - \frac{1}{9}\right) - \frac{1}{3} \left( \frac{8}{27} - \frac{1}{27}\right) \right]\\ &= 6\left[ \frac{3}{18} - \frac{7}{81} \right] = 6\left[ \frac{27}{162} - \frac{14}{162} \right] = \frac{13}{27} = 0.481 \,. \end{align*}\]


  • Problem 8

We have that \(f_X(x) = e^{-x}\) for \(x \geq 0\) and thus that \(F_X(x) = 1-e^{-x}\) over the same domain. Thus \[\begin{align*} f_{(2)}(x) &= \frac{3!}{1!1!}(1 - e^{-x})^1\left[1 - (1- e^{-x}) \right]^1 e^{-x} = 6(1 - e^{-x})e^{-x}e^{-x} \\ &= 6(1 - e^{-x})e^{-2x} = 6e^{-2x} - 6e^{-3x} \,, \end{align*}\] and \[\begin{align*} E[X_{(2)}] &= \underbrace{\int_0^{\infty} 6xe^{-2x} dx}_{\text{by } y=2x, \, dy/2 = dx} - \underbrace{\int_0^{\infty} 6xe^{-3x} dx}_{\text{by } y=3x, \, dy/3 = dx} \\ &= \int_0^{\infty} \frac{3}{2}y e^{-y} dy - \int_0^{\infty} \frac{2}{3}y e^{-y} dy\\ &= \frac{3}{2} \Gamma(2) - \frac{2}{3}\Gamma(2) = \frac{3}{2} - \frac{2}{3} =\frac{5}{6} \,. \end{align*}\]


  • Problem 9

(a) A probability density function is the derivative of its associated cumulative distribution function, so \[\begin{align*} f_X(x) = \frac{d}{dx} x^3 = 3x^2 \,. \end{align*}\]

(b) The maximum order statistic has pdf \[\begin{align*} f_{(n)}(x) &= n f_X(x) [F_X(x)]^{n-1} \\ &= n (3x^2) [x^3]^{n-1} = 3n x^2 x^{3n-3} = 3n x^{3n-1} \,. \end{align*}\]

(c) We have that \[\begin{align*} F_{(n)}(x) = [F_X(x)]^n ~~ \Rightarrow ~~ F_{(n)}(x) = x^{3n} \,. \end{align*}\] We can also show this via integration: \[\begin{align*} F_{(n)}(x) = \int_0^x f_{(n)}(y) dy = \int_0^x 3n y^{3n-1} dy = \left. y^{3n}\right|_0^x = x^{3n} \,. \end{align*}\]

(d) The expected value is \[\begin{align*} E[X_{(n)}] = \int_0^1 x f_{(n)}(x) dx = \int_0^1 3n x^{3n} dx = \left. \frac{3n}{3n+1} x^{3n+1} \right|_0^1 = \frac{3n}{3n+1} \,. \end{align*}\]


  • Problem 10

(a) The cdf within the domain is \[\begin{align*} F_X(x) = \int_0^x \frac12 y dy = \left. \frac14 y^2 \right|_0^x = \frac{x^2}{4} \,. \end{align*}\]

(b) We plug \(n\), \(f_X(x)\), and \(F_X(x)\) into the order statistic pdf equation (where \(j = n = 2\)): \[\begin{align*} f_{(2)}(x) = 2 f_X(x) \left[ F_X(x) \right]^{2-1} \left[ 1 - F_X(x) \right]^{2-2} = 2 \left( \frac12 \right) x \left( \frac14 \right) x^2 = \frac14 x^3 \,. \end{align*}\]

(c) The expected value is \[\begin{align*} E[X_{(2)}] = \int_0^2 x f_{(2)}(x) dx = \int_0^2 \frac14 x^4 dx = \frac{1}{20} \left. x^5 \right|_0^2 = \frac{32}{20} = \frac85 = 1.6 \,. \end{align*}\]

(d) They cannot be independent: given the value of one, the other has to be either smaller (\(X_{(1)}\)) or larger (\(X_{(2)}\)).


  • Problem 11

Let \(X_1, \ldots, X_n\) denote the samples from the Bernoulli distribution. The log-likelihood is \[\begin{align*} \ell(X_1, \ldots, X_n | p) = \sum_{i = 1}^n [X_i \log(p) + (1 - X_i) \log (1-p)] \,. \end{align*}\] The first two derivatives are \[\begin{align*} \frac{d}{dp} \ell(X_1, \ldots, X_n | p) &= \sum_{i = 1}^n\bigg[ \frac{X_i}{p} - \frac{(1 - X_i)}{1-p}\bigg] \\ \frac{d^2}{dp^2} \ell(X_1,\ldots, X_n | p ) &= \sum_{i = 1}^n \bigg[-\frac{X_i}{p^2}- \frac{(1- X_i)}{(1-p)^2}\bigg] \,. \end{align*}\] The Fisher information is thus \[\begin{align*} I_n(p) = E\bigg[-\frac{d^2}{dp^2} \ell(X_1,\ldots, X_n | p )\bigg] &= \sum_{i=1}^n \bigg[\frac{E[X_i]}{p^2} + \frac{E[(1- X_i)]}{(1-p)^2}\bigg] \\ &= \sum_{i=1}^n \bigg[\frac{p}{p^2} + \frac{1-p}{(1-p)^2}\bigg] \\ &= \sum_{i=1}^n \bigg[\frac{1}{p} + \frac{1}{1-p}\bigg] \\ &= \sum_{i=1}^n \bigg[\frac{1-p}{p(1-p)} + \frac{p}{p(1-p)}\bigg] \\ &= \frac{n}{p(1-p)} \,, \end{align*}\] and the asymptotic distribution of the MLE is \(\mathcal{N}(p,\frac{p(1-p)}{n})\).
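
A brief simulation sketch (the values of n and p are arbitrary assumptions) comparing the variance of \(\hat{p} = \bar{X}\) with \(1/I_n(p) = p(1-p)/n\):

set.seed(42)
n <- 200; p <- 0.3
p.hat <- replicate(10000, mean(rbinom(n, 1, p)))  # the MLE from each simulated sample
var(p.hat)                                        # simulated variance of the MLE
p*(1-p)/n                                         # 1/I_n(p) = 0.00105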


  • Problem 12

(a) The log-likelihood and its derivative are \[\begin{align*} \ell(p \vert \mathbf{x}) &= \log (1-p) \sum_{i=1}^n (x_i - 1) + n \log p\\ \ell'(p \vert \mathbf{x}) &= -\frac{\sum_{i=1}^n x_i - n}{1 - p} + \frac{n}{p} \,. \end{align*}\] Setting the derivative to zero, we find that \[\begin{align*} \frac{\sum_{i=1}^n x_i - n}{1 - p} &= \frac{n}{p} \\ \Rightarrow ~~~ \left(\sum_{i=1}^n x_i - n\right) p &= n (1 - p) \\ \Rightarrow ~~~ p \,\sum_{i=1}^n x_i &= n \\ \Rightarrow ~~~ \hat{p} &= \frac{1}{\bar{X}} \,. \end{align*}\] Using the invariance property of the MLE, we find that \(\widehat{1/p}_{MLE} = \bar{X}\).

(b) The variance of this estimator is \[\begin{align*} V\left[\widehat{1/p}_{MLE}\right] = V \left[ \frac{\sum_{i=1}^n X_i}{n} \right] = \frac{V[X]}{n} = \frac{1 - p}{np^2} \,. \end{align*}\]


  • Problem 13

The likelihood for \(p\) is \[\begin{align*} \mathcal{L}(p \vert \mathbf{x}) = \prod_{i=1}^n p_X(x_i \vert p) = \prod_{i=1}^n -\frac{1}{\log(1-p)} \frac{p^{x_i}}{x_i} = \underbrace{(-1)^n \prod_{i=1}^n \frac{1}{x_i}}_{h(\mathbf{x})} \cdot \underbrace{\frac{1}{[\log(1-p)]^n} p^{\sum_{i=1}^n x_i}}_{g(p,\mathbf{x})} \,. \end{align*}\]
Given the expression for \(g(\cdot)\), we can see that a sufficient statistic for \(p\) is \(Y = \sum_{i=1}^n X_i\).


  • Problem 14

(a) The likelihood is \[\begin{align*} \mathcal{L}(a,b \vert \mathbf{x}) &= \prod_{i=1}^n a b x_i^{a-1} (1-x_i^a)^{b-1}\\ &= a^n b^n \left(\prod_{i=1}^n x_i\right)^{a-1} \left(\prod_{i=1}^n (1-x_i^a)\right)^{b-1} \end{align*}\] At first glance, it seems that we can take \[\begin{align*} \mathbf{Y} = \left\{ \prod_{i=1}^n x_i, \prod_{i=1}^n (1-x_i^a) \right\} \end{align*}\] as the joint sufficient statistics. However, note that the parameter \(a\) occurs in the second statistic. Because this second statistic includes a parameter value, it cannot be a sufficient statistic…and thus we conclude that we cannot identify joint sufficient statistics for \(a\) and \(b\).

(b) With \(a=1\) the density function becomes \(f_X(x) = b \cdot (1-x)^{b-1}\), with resulting likelihood \[\begin{align*} \mathcal{L}(b \vert \mathbf{x}) &= \prod_{i=1}^n b \cdot (1-x_i)^{b-1}\\ &= b^n \left(\prod_{i=1}^n (1-x_i)\right)^{b-1}\\ &= h(\mathbf{x}) g(b,\mathbf{x}) \,. \end{align*}\] Hence, a sufficient statistic for \(b\) is \(Y = \prod_{i=1}^n (1-X_i)\).


  • Problem 15

(a) The likelihood is \[\begin{align*} \mathcal{L}(\beta \vert \mathbf{x}) = \prod_{i=1}^n f_X(x_i \vert \beta) &= \prod_{i=1}^n \frac{x_i}{\beta^2} \exp\left(-\frac{x_i}{\beta}\right) \\ &= \underbrace{\prod_{i=1}^n x_i}_{h(\mathbf{x})} \cdot \underbrace{\frac{1}{\beta^{2n}} \exp\left(-\frac{1}{\beta}\sum_{i=1}^n x_i\right)}_{g(\beta,\mathbf{x})} \,. \end{align*}\] We can examine \(g(\cdot)\) and immediately identify that a sufficient statistic for \(\beta\) is \(Y = \sum_{i=1}^n X_i\).

(b) We have that \[\begin{align*} E[Y] = E[\sum_{i=1}^n X_i] = \sum_{i=1}^n E[X_i] = \sum_{i=1}^n 2\beta = 2n\beta \,. \end{align*}\] Hence \[\begin{align*} E\left[\frac{Y}{2n}\right] = \beta \end{align*}\] and \(\hat{\beta}_{MVUE} = Y/2n = \bar{X}/2\).

(c) Utilizing the general rule from 235: \[\begin{align*} V[\hat{\beta}_{MVUE}] = V\left[\frac{\bar{X}}{2}\right] = \frac{V[\bar{X}]}{4} = \frac{V[X]}{4n} = \frac{2\beta^2}{4n} = \frac{\beta^2}{2n} \,. \end{align*}\]

(d) The first step is to write down the log-likelihood for one datum: \[\begin{align*} \ell(\beta \vert x) = \log f_X(x \vert \beta) = \log x - \frac{x}{\beta} - 2\log\beta \,. \end{align*}\] We take the first two derivatives: \[\begin{align*} \frac{d\ell}{d\beta} &= \frac{x}{\beta^2} - \frac{2}{\beta} \\ \frac{d^2\ell}{d\beta^2} &= -\frac{2x}{\beta^3} + \frac{2}{\beta^2} \,, \end{align*}\] and then compute the expected value: \[\begin{align*} I(\beta) = E\left[ \frac{2X}{\beta^3} - \frac{2}{\beta^2} \right] = \frac{2}{\beta^3}E[X] - \frac{2}{\beta^2} = \frac{2}{\beta^3}(2\beta) - \frac{2}{\beta^2} = \frac{2}{\beta^2} \,. \end{align*}\] Thus \(I_n(\beta) = (2n)/\beta^2\) and the CRLB is \(1/I_n(\beta) = \beta^2/(2n)\). The MVUE achieves the CRLB.


  • Problem 16

(a) We can factorize the likelihood as follows: \[\begin{align*} \mathcal{L}(\theta \vert \mathbf{x}) = \prod_{i=1}^n \frac{1}{\theta} e^{x_i} e^{-e^{x_i}/\theta} = e^{\sum_{i=1}^n x_i} \theta^{-n} e^{-(\sum_{i=1}^n e^{x_i})/\theta} \,. \end{align*}\] The first term does not contain \(\theta\) and thus can be ignored. Thus we identify \(Y = \sum_{i=1}^n e^{X_i}\) as a sufficient statistic.

(b) We can determine \(E[Y]\) by noticing that \(Y \sim\) Gamma\((n,\theta)\), as stated in the question…so \(E[Y] = n\theta\) and \(E[Y/n] = \theta\). Thus the MVUE for \(\theta\) is \((\sum_{i=1}^n e^{X_i})/n\).

(c) The MVUE will be a function of the sufficient statistic for \(\theta\), so let’s try \((\sum_{i=1}^n e^{X_i})^2\): \[\begin{align*} E\left[\left(\sum_{i=1}^n e^{X_i}\right)^2\right] = V\left[\left(\sum_{i=1}^n e^{X_i}\right)\right] + E\left[\sum_{i=1}^n e^{X_i}\right]^2 = n \theta^2 + (n\theta)^2 = n(n+1)\theta^2 \,. \end{align*}\] Therefore \((\sum_{i=1}^n e^{X_i})^2/(n(n+1))\) is the MVUE for \(\theta^2\).


  • Problem 17

(a) We factorize the likelihood: \[\begin{align*} \mathcal{L}(a \vert \mathbf{x}) = \prod_{i=1}^n \sqrt{\frac{2}{\pi}} \frac{x_i^2}{a^3} e^{-x_i^2/(2a^2)} = \left[ \left(\frac{2}{\pi}\right)^{n/2} \left( \prod_{i=1}^n x_i^2 \right) \right] \cdot \left[ \frac{1}{a^{3n}} e^{-(\sum_{i=1}^n x_i^2)/(2a^2)} \right] = h(\mathbf{x}) \cdot g(a,\mathbf{x}) \,. \end{align*}\] We can read off from the \(g(\cdot)\) function term that \(Y = \sum_{i=1}^n X_i^2\). (Including the minus sign, for instance, is fine because a function of a sufficient statistic is itself sufficient and we will get to the same MVUE in the end.)

(b) We utilize the shortcut formula: \[\begin{align*} E[X^2] = V[X] + (E[X])^2 = a^2 \frac{(3 \pi - 8)}{\pi} + (2a)^2 \frac{2}{\pi} = 3 a^2 + \frac{8 a^2}{\pi} - \frac{8 a^2}{\pi} = 3 a^2 \,. \end{align*}\]

(c) We compute the expected value for \(Y\): \[\begin{align*} E[Y] = E\left[\sum_{i=1}^n X_i^2\right] = \sum_{i=1}^n E[X_i^2] = n E[X^2] = 3 n a^2 \,. \end{align*}\] Thus the expected value for \(Y/(3n)\) is \(a^2\): \[\begin{align*} \widehat{a^2}_{MVUE} = \frac{1}{3n} \sum_{i=1}^n X_i^2 \,. \end{align*}\]

(d) There is no invariance principle for the MVUE. Maybe the desired result holds and maybe it doesn’t, but we cannot simply state that it does.


  • Problem 18

We are constructing an upper-tail test where the test statistic is trivially \(Y = X\). (So the NP Lemma does not really come into play here, given the lack of choices for the test statistic.) The expected value of \(Y\) is \[\begin{align*} E[Y] = \int_0^2 y f_Y(y) dy = \int_0^2 \frac{\theta}{2^\theta} y^{\theta} dy = \left. \frac{\theta}{2^\theta} \frac{y^{\theta+1}}{\theta+1} \right|_0^2 = \frac{2\theta}{\theta+1} \,. \end{align*}\] \(E[Y]\) increases as \(\theta\) increases, so we will be on the “yes” line of the hypothesis test reference table. Hence the rejection region will be of the form \[\begin{align*} y_{\rm obs} > F_Y^{-1}(1-\alpha \vert \theta_o) \,. \end{align*}\] The cdf \(F_Y(y)\) is \[\begin{align*} F_Y(y) = \int_0^y \frac{\theta}{2^\theta} u^{\theta-1} du = \left. \frac{u^\theta}{2^\theta}\right|_0^y = \left(\frac{y}{2}\right)^\theta \,, \end{align*}\] and the inverse cdf \(F_Y^{-1}(q)\) is \(y = 2q^{1/\theta}\). Hence the test we seek rejects the null hypothesis if \[\begin{align*} y_{\rm obs} > 2(1-\alpha)^{1/\theta_o} \,. \end{align*}\] The rejection-region boundary does not depend on \(\theta_a\), so we know that the test is the most powerful one for all alternative values \(\theta_a > \theta_o\)…thus it is a uniformly most powerful test.
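
A Monte Carlo sketch confirming that the rejection rule has size \(\alpha\) under the null (the values \(\theta_o = 2\) and \(\alpha = 0.05\) are assumptions made for illustration):

set.seed(1)
theta.o <- 2; alpha <- 0.05
y <- 2*runif(100000)^(1/theta.o)            # inverse-transform draws from F_Y(y) = (y/2)^theta
mean(y > 2*(1-alpha)^(1/theta.o))           # rejection rate under the null; close to alpha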


  • Problem 19

(a) The likelihood is \[\begin{align*} \mathcal{L}(\beta \vert \mathbf{x}) = \prod_{i=1}^n \frac{\theta}{\beta}x_i^{\theta-1}\exp\left(-\frac{x_i^\theta}{\beta}\right) = \left[ \theta^n \prod_{i=1}^n x_i^{\theta-1} \right] \cdot \left[ \frac{1}{\beta^n} \exp\left(-\frac{1}{\beta} \sum_{i=1}^n x_i^\theta \right) \right] = h(\mathbf{x}) \cdot g(\beta,\mathbf{x}) \,, \end{align*}\] thus a sufficient statistic for \(\beta\) is \(Y = \sum_{i=1}^n X_i^\theta\).

(b) We are given that \(X^\theta \sim\) Exp(\(\beta\)). The mgf for an exponential distribution with mean \(\beta\) is \[\begin{align*} m_{X^\theta}(t) = (1 - \beta t)^{-1} \,, \end{align*}\] and hence the mgf for \(Y = \sum_{i=1}^n X_i^\theta\) will be \[\begin{align*} m_Y(t) = \prod_{i=1}^n (1 - \beta t)^{-1} = \left[ (1 - \beta t)^{-1} \right]^n = (1 - \beta t)^{-n} \,. \end{align*}\] Following the hint given in the question, we find that \(Y\) is a gamma-distributed random variable with “shape” parameter \(n\) and “scale” parameter \(\beta\). (There are two common parameterizations of the gamma distribution\(-\)shape/scale and shape/rate\(-\)and it is imperative to determine the correct one! This will impact the answer to part (c).)

(c) The statistic \(Y\) has expected value \(E[Y] = n\beta\), which increases with \(\beta\). Hence we utilize the upper-tail/yes line of the hypothesis test reference table: \(y_{\rm RR} = F_Y^{-1}(1 - \alpha \vert \beta_o)\), or, in code,

y.rr <- qgamma(1-alpha,shape=n,scale=beta.o)

  • Problem 20

(a) The moment-generating function for the random variable \(X\) is \[\begin{align*} m_X(t) = E\left[e^{tX}\right] &= \int_b^\infty e^{tx} \frac{1}{\theta} e^{-(x-b)/\theta} dx \\ &= e^{b/\theta} \frac{1}{\theta} \int_b^\infty e^{-x(1/\theta - t)} dx \\ &= e^{b/\theta} \frac{1}{\theta} \frac{e^{-b(1/\theta - t)}}{(1/\theta-t)} \\ &= e^{bt} (1-t\theta)^{-1} \,. \end{align*}\] (Here we make the implicit assumption that \(t < 1/\theta\), so that the integral evaluated at \(\infty\) is zero.) This is the final answer, but recall that when \(X = U+b\), \(m_X(t) = e^{bt} m_U(t)\). Since we recognize that \((1-t\theta)^{-1}\) is the mgf for an exponential distribution, we can state that \(U = X-b\) is an exponentially distributed random variable.

(b) The mgf for \(Y = \sum_{i=1}^n X_i\) is \[\begin{align*} m_Y(t) = \prod_{i=1}^n e^{bt} (1-t\theta)^{-1} = \left[ e^{bt} (1-t\theta)^{-1} \right]^n = e^{nbt} (1-t\theta)^{-n} \,. \end{align*}\]

(c) Going back to our answer for (a) (and our answer for the previous problem), we recognize that the mgf for \(\sum_{i=1}^n U_i\) is \((1-t\theta)^{-n}\), which is the mgf for a gamma distribution with shape parameter \(n\) and scale parameter \(\theta\). Hence \(Y' = Y - nb \sim \text{Gamma}(n,\theta)\).

(d) We are on the lower-tail/yes line of the hypothesis test reference table: \(y_{\rm RR}' = F_{Y'}^{-1}(\alpha \vert \theta_o)\), or, in code,

y.rr.prime <- qgamma(alpha,shape=n,scale=theta.o)

We would reject the null hypothesis if \(y_{\rm obs} - nb < y_{\rm RR}'\). Because this test is constructed using a sufficient statistic and because no value of the alternative hypothesis appears in the definition of the rejection region, we indeed have defined a uniformly most powerful test of \(H_o : \theta = \theta_o\) versus \(H_a : \theta < \theta_o\).


  • Problem 21

(a) Let’s first find a sufficient statistic: \[\begin{align*} \mathcal{L}(\theta \vert \mathbf{x}) = \prod_{i=1}^n \theta e^{-\theta x_i} = \theta^n e^{-\theta \sum_{i=1}^n x_i} \,. \end{align*}\] A sufficient statistic is \(Y = \sum_{i=1}^n X_i\). We are conducting a lower-tail test, and since \(E[X] = 1/\theta\) decreases as \(\theta\) increases, we are on the “no” line of the reference table. We reject the null if \(y_{\rm obs} = \sum_{i=1}^n x_i > y_{\rm RR}\).

(b) \(\theta_o\) is plugged in to compute the rejection-region boundary, but \(\theta_a\) does not appear at all. Hence the defined test is uniformly most powerful, since it is most powerful for any value of \(\theta_a < \theta_o\).


  • Problem 22

(a) The sampling distribution is Binom(\(nk,p\)). We can determine this using the method of moment-generating functions, if necessary.

(b) \(E[Y] = nkp\) increases with \(p\), so we are on the upper-tail/“yes” line of the hypothesis test reference tables. The rejection-region boundary is given by \(F_Y^{-1}(1-\alpha \vert \theta_o)\), or, in code, with \(p_o\) in place of \(\theta_o\),

qbinom(1-alpha,n*k,p.o)

(c) For an upper-tail/yes test, the \(p\)-value is \(1 - F_Y(y_{\rm obs} \vert \theta_o)\). In code, with \(p_o\) in place of \(\theta_o\), the \(p\)-value is

1 - pbinom(y.obs,n*k,p.o)

However, we have to apply a discreteness correction, because otherwise we will not be summing over the correct range of \(y\) values, i.e., our \(p\)-value will be wrong. Here, that factor is \(-1\), applied to the input. So…

1 - pbinom(y.obs-1,n*k,p.o)

is the final answer.


  • Problem 23

This is straightforward if we remember to set the link function to the equation for the line: \[\begin{align*} -(E[Y \vert x])^{-1} = \beta_0 + \beta_1 x ~~~ \Rightarrow ~~~ (E[Y \vert x])^{-1} = -\beta_0 - \beta_1 x ~~~ \Rightarrow ~~~ E[Y \vert x] = (-\beta_0 - \beta_1 x)^{-1} \,. \end{align*}\]


  • Problem 24

(a) The degrees of freedom for the residual deviance is \(n-p\), where \(p\) is the number of parameters (here, two: \(\beta_0\) and \(\beta_1\)). Hence \(n = 32\).

(b) \(\beta_1\) is set to zero to compute the null deviance. So \(-2\log\mathcal{L}_{\rm max} = 43.230\).

(c) The odds are \(O(x) = \exp(\hat{\beta}_0 + \hat{\beta}_1 x)\) or just \(\exp(\hat{\beta}_0)\) for \(x = 0\), meaning that \(O(x=0) = \exp(12.040)\).

(d) The estimated slope \(\hat{\beta}_1\) is negative, and we know that \(O(x+1) = O(x) \exp(\hat{\beta}_1)\), so we know that \(O(x+1) < O(x)\)…the odds decrease as \(x\) increases.


  • Problem 25

(a) We have that \[\begin{align*} O(x) = \frac{p \vert x}{1 - p \vert x} = \frac{0.1}{1-0.1} = \frac19 = 0.111 \,. \end{align*}\]

(b) The new odds are \[\begin{align*} O(589+100) = \exp(\hat{\beta}_0 + \hat{\beta_1}(589+100)) = O(589) \exp(100\hat{\beta}_1) = \frac19 \exp(0.14684) = 0.129 \,. \end{align*}\]

(c) We have that \(Y_1 = 0\) and \(\hat{Y}_1 = 0.07\), so \[\begin{align*} d_1 &= \mbox{sign}(Y_1-\hat{Y}_1)\sqrt{-2[Y_1\log\hat{Y}_1+(1-Y_1)\log(1-\hat{Y}_1)]} = \mbox{sign}(-0.07) \sqrt{-2\log(0.93)} \\ &= -\sqrt{-2\log(0.93)} = -0.381 \,. \end{align*}\]

(d) The null deviance is computed assuming \(\beta_1 = 0\). This model lies “farther” from the observed data than the model with \(\hat{\beta}_1 = 0.00147\), meaning it deviates more from the data, meaning that the deviance would be higher.


  • Problem 26

Let’s start by collecting the basic pieces of information that we would combine in a Naive Bayes classification model: \[\begin{align*} p(0) = 3/5 ~~\mbox{and}~~ p(1) = 2/5 \,, \end{align*}\] where \(0\) and \(1\) are the two response (i.e., \(Y\)) values. Next up, the conditionals: \[\begin{align*} p(x1 = N \vert 0) = 2/3 ~~ &\mbox{and}& ~~ p(x1 = Y \vert 0) = 1/3 \\ p(N \vert 1) = 1/2 ~~ &\mbox{and}& ~~ P(Y \vert 1) = 1/2 \\ \\ p(x2 = T \vert 0) = 2/3 ~~ &\mbox{and}& ~~ p(x2 = F \vert 0) = 1/3 \\ p(T \vert 1) = 1/2 ~~ &\mbox{and}& ~~ P(F \vert 1) = 1/2 \,. \end{align*}\] The estimated probability of observing a datum of Class 0 given Y and F is thus \[\begin{align*} p(0 \vert Y,F) &= \frac{p(Y \vert 0) p(F \vert 0) p(0)}{p(Y \vert 0) p(F \vert 0) p(0) + p(Y \vert 1) p(F \vert 1) p(1)} \\ &= \frac{1/3 \cdot 1/3 \cdot 3/5}{1/3 \cdot 1/3 \cdot 3/5 + 1/2 \cdot 1/2 \cdot 2/5} \\ &= \frac{1/15}{1/15 + 1/10} = \frac{2/30}{2/30+3/30} = \frac{2}{5} \,. \end{align*}\]
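
The arithmetic can be verified with a few lines of R, using the probabilities tabulated above:

p0 <- 3/5; p1 <- 2/5                        # class priors
num   <- (1/3)*(1/3)*p0                     # p(Y|0) * p(F|0) * p(0)
denom <- num + (1/2)*(1/2)*p1               # ... plus p(Y|1) * p(F|1) * p(1)
num/denom                                   # 0.4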


  • Problem 27

(a) The pdf is of the form \[\begin{align*} k x^{\alpha-1} (1-x)^{\beta-1} \,, \end{align*}\] with \(0 \leq x \leq 1\), so what we have is a beta distribution: \(X \sim\) Beta\((1,2)\).

(b) We have that \[\begin{align*} E[X] = \frac{\alpha}{\alpha+\beta} = \frac{1}{3} \end{align*}\] and \[\begin{align*} E[X^2] = V[X] + (E[X])^2 = \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)} + \left(\frac{1}{3}\right)^2 = \frac{2}{36} + \frac{4}{36} = \frac{1}{6} \,. \end{align*}\]

(c) These expressions are straightforward to evaluate: \[\begin{align*} E[C] = E[10X] = 10E[X] = \frac{10}{3} = 3.333 \end{align*}\] and \[\begin{align*} V[C] = E[C^2] - (E[C])^2 = E[100X^2] - \frac{100}{9} = 100E[X^2] - \frac{100}{9} = \frac{150}{9} - \frac{100}{9} = \frac{50}{9} = 5.556 \,. \end{align*}\]


  • Problem 28

(a) \(f_X(x) = 12x - 24x^2 + 12x^3 = 12x(1-x)^2\) for \(x \in [0,1]\)…so this is a Beta(2,3) distribution.

(b) \(X \sim {\rm Beta}(2,3) \Rightarrow E[X] = \alpha/(\alpha+\beta) = 2/(2+3) = 2/5 = 0.4\).


  • Problem 29

One way to solve this problem is to utilize the shortcut formula: \(E[X^2] = V[X] + (E[X])^2\). With this in hand: \[\begin{align*} V[X] = \frac{\alpha \beta}{(\alpha + \beta)^2(\alpha + \beta + 1)} = \frac{6}{25 \cdot 6} = \frac{1}{25} \end{align*}\] and \[\begin{align*} (E[X])^2 = \left(\frac{\alpha}{\alpha + \beta}\right)^2 = \left(\frac{2}{5}\right)^2 = \frac{4}{25} \,. \end{align*}\] Therefore, \[\begin{align*} E[X^2] = \frac{1}{25} + \frac{4}{25} = \frac{1}{5} \,. \end{align*}\] A second way to solve this problem is by brute-force integration: \[\begin{align*} E[X^2] &= \int_0^1 x^2 \frac{x(1-x)^2}{B(2,3)}dx = \int_0^1 \frac{x^3(1-x)^2}{B(2,3)}dx = \int_0^1 \frac{x^3(1-x)^2}{B(2,3)} \frac{B(4,3)}{B(4,3)}dx \\ &= \frac{B(4,3)}{B(2,3)}\underbrace{\int_0^1 \frac{x^3(1-x)^2}{B(4,3)}dx}_{=1} \\ &= \frac{B(4,3)}{B(2,3)} = \frac{\Gamma(4) \Gamma(3)}{\Gamma(7)}\frac{\Gamma(5)}{\Gamma(2)\Gamma(3)} = \frac{\Gamma(4) \Gamma(5)}{\Gamma(2)\Gamma(7)} = \frac{3! 4!}{1!6!} = \frac{1}{5} \,. \end{align*}\]
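
A one-line numerical check of the result, using R's built-in beta density:

integrate(function(x) x^2 * dbeta(x, 2, 3), 0, 1)   # returns 0.2 = 1/5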


  • Problem 30

(a) The median is the second sampled datum. The pdf for \(X\) is \(f_X(x) = 3x^2\) and \(F_X(x) = x^3\), both for \(x \in [0,1]\). Thus \[\begin{align*} f_{(2)}(x) = \frac{3!}{1!1!} [x^3]^1 [1 - x^3]^1 3x^2 = 18 x^5 (1 - x^3) \,, \end{align*}\] for \(x \in [0,1]\), and \[\begin{align*} E[X_{(2)}] &= \int_0^1 x 18 x^5 (1-x^3) dx = 18 \int_0^1 (x^6 - x^9) dx \\ &= 18 \left( \left.\frac{x^7}{7}\right|_0^1 - \left.\frac{x^{10}}{10}\right|_0^1 \right) = 18 \left( \frac{1}{7}-\frac{1}{10} \right) = \frac{18 \cdot 3}{70} = \frac{27}{35} \,. \end{align*}\]


  • Problem 31

(a) This is a Beta(2,2) distribution.

(b) We can determine \(c\) via brute-force integration: \[\begin{align*} c \int_0^1 x(1-x) dx = c\left[ \int_0^1 x dx - \int_0^1 x^2 dx \right] &= c \left[ \left.\frac{x^2}{2}\right|_0^1 - \left.\frac{x^3}{3}\right|_0^1 \right] \\ &= c \left[ \frac{1}{2} - \frac{1}{3} \right] \\ &= c \frac{1}{6} = 1 ~~\Rightarrow~~ c = 6 \,. \end{align*}\] Alternatively, we can recognize that \[\begin{align*} c &= \frac{1}{B(2,2)} = \frac{\Gamma(4)}{\Gamma(2) \Gamma(2)} = \frac{3!}{1! 1!} = 6 \,. \end{align*}\]

(c) Since \(\alpha = \beta\), the distribution is symmetric around \(x = 1/2\), which is its mean value.

(d) We have that \[\begin{align*} P(X \leq 1/4 \vert X \leq 1/2) &= \frac{P(X \leq 1/4 \cap X \leq 1/2)}{P(X \leq 1/2)} = \frac{P(X \leq 1/4)}{P(X \leq 1/2)} = \frac{P(X \leq 1/4)}{1/2} \\ &= 2P(X \leq 1/4) = 2 \int_0^{1/4} 6 x (1-x) dx = 12 \int_0^{1/4} x (1-x) dx \\ &= 12 \left[ \left.\frac{x^2}{2}\right|_0^{1/4} - \left.\frac{x^3}{3}\right|_0^{1/4} \right] = 12 \left[ \frac{1}{32} - \frac{1}{192} \right] = 12 \frac{5}{192} = \frac{60}{192} = \frac{30}{96} = \frac{5}{16} \,. \end{align*}\]


  • Problem 32

(a) We carry out a chi-square goodness-of-fit test: \[\begin{align*} W = \sum_{i=1}^n \frac{(X_i - kp_i)^2}{kp_i} = \frac{1}{7}[(10-7)^2+(5-7)^2+(6-7)^2] = 2 \,. \end{align*}\]

(b) There are \(m=3\) outcomes, but we lose one degree of freedom because of the constraint that \(\sum_{i=1}^m X_i = 21\), so the number of degrees of freedom is 2.

(c) We are given that \(\alpha = 0.1\), so we want \(F_W^{-1}(0.9) = 4.61\).


  • Problem 33

(a) The correct answer is homogeneity, since the researcher splits the respondents into groups by giving them different types of leaflets to read, and the goal is to determine whether the willingness to spend government funds is homogeneous across the leaflet types.

(b) Since the chi-square statistic involves summing squared differences over all combinations of leaflets and spending opinions, there are \(12 = 3 \cdot 4\) terms.

(c) Since all but one category of each factor provides free information (the last is determined by the totals), the test statistic follows a chi-square distribution with \(6 = (3-1) \cdot (4-1)\) degrees of freedom.

(d) The correct answer is (i), since the \(p\)-value is defined as the probability of observing a result at least as extreme as what was actually observed, under the null hypothesis, and larger values of the test statistic here correspond to larger differences between what we would expect under the null and what we observe.


  • Problem 34

(a) The exponential pdf with mean 1 is \(f_X(x) = e^{-x}\), for \(x \geq 0\). Therefore, the probability of arriving between 1 and 2 minutes after the previous person is \[\begin{align*} \int_1^2 e^{-x} dx = -\left.e^{-x}\right|_1^2 = 0.233 \,. \end{align*}\]

(b) The probabilities for the other two “bins” are \[\begin{align*} \int_0^1 e^{-x} dx &= -\left.e^{-x}\right|_0^1 = 0.632 \\ \int_2^\infty e^{-x} dx &= 1 - 0.632 - 0.233 = 0.135 \,. \end{align*}\] We can now carry out a chi-square goodness-of-fit test: \[\begin{align*} W = \frac{(52-63.2)^2}{63.2} + \frac{(23-23.3)^2}{23.3} + \frac{(25-13.5)^2}{13.5} = 1.985 + 0.004 + 9.796 = 11.785 \,, \end{align*}\] and \(W \sim \chi_2^2\). The rejection region is \(W > w_{\rm RR} = 5.991\) (i.e., qchisq(0.95,2)), and the \(p\)-value is 0.0028 (i.e., 1-pchisq(11.785,2)). We have sufficient evidence to reject the null hypothesis and conclude that the time elapsed between people walking through a particular door is not exponentially distributed with mean 1.
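
The same test can be run directly with chisq.test(), using the observed counts and the exponential bin probabilities from above:

obs <- c(52, 23, 25)                                # counts in the bins [0,1), [1,2), [2,Inf)
p   <- c(1-exp(-1), exp(-1)-exp(-2), exp(-2))       # Exponential(1) bin probabilities
chisq.test(obs, p = p)                              # X-squared of about 11.8 on 2 df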

(c) If \(n = 10\) and there are 3 bins, then there is no way that \(np_i \geq 5\) for all bins. Thus the chi-square goodness-of-fit test should not be applied.


  • Problem 35

(a) We have that \(p_{\rm out} = 8/9\) and \(p_{\rm in} = 1/9\), so \(kp_{\rm out} = 180 (8/9) = 160\) and \(kp_{\rm in} = 180 (1/9) = 20\).

(b) We perform a chi-square goodness-of-fit test: \[\begin{align*} W = \frac{(150-160)^2}{160} + \frac{(30-20)^2}{20} = \frac{100}{160} + \frac{100}{20} = \frac58 + 5 = 5.625~(\mbox{or}~5~5/8) \,. \end{align*}\]

(c) \(W\) is sampled from a chi-square distribution for \(2-1 = 1\) degree of freedom.

(d) The rejection region for a chi-square GoF test is \(W > w_{\rm RR}\), so, since 5.625 is greater than 3.841, we would reject the null hypothesis.


  • Problem 36

When the null hypothesis is true, the \(p\)-value is sampled uniformly between 0 and 1. Hence the probability of rejecting the null is \(P(p \leq \alpha) = \alpha\). When we collect \(k\) \(p\)-values, we are fixing the number of trials, so \(X\), the number of \(p\)-values observed to be \(\leq \alpha\), will be binomially distributed, and thus \[\begin{align*} P(X = x) = p_X(x) = {k \choose x} \alpha^x (1-\alpha)^{k-x} \,. \end{align*}\]
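
A small simulation sketch (the values of k and \(\alpha\) are arbitrary) illustrating the claim:

set.seed(7)
k <- 50; alpha <- 0.05
p.values <- runif(k)               # under the null, p-values are Uniform(0,1)
sum(p.values <= alpha)             # one draw of X, which is Binomial(k, alpha)
dbinom(0:3, k, alpha)              # P(X = 0), ..., P(X = 3)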


  • Problem 37

This is a moment-generating function problem. For a geometric distribution, \[\begin{align*} m_{X_i}(t) = \frac{p}{[1 - (1-p)e^t]} \,. \end{align*}\] As for the sum, \[\begin{align*} m_{Y}(t) = \prod_{i=1}^3 m_{X_i}(t) = \left(\frac{p}{[1 - (1-p)e^t]}\right)^3 \,, \end{align*}\] which is the mgf for a negative binomial distribution; specifically, \(Y \sim\) NBinom(\(3,p\)).


  • Problem 38

The moment-generating functions for \(X_1\) and \(X_2\) are \[\begin{align*} m_{X_1}(t) &= E[e^{tX_1}] = e^{ta}p(a) = e^{at} \\ m_{X_2}(t) &= (1-p) + pe^t \,. \end{align*}\] As for the sum, \[\begin{align*} m_Y(t) = m_{X_1}(t) m_{X_2}(t) = e^{at}((1-p) + pe^t) = p e^{(a+1)t} + (1-p)e^{at} \,. \end{align*}\] Given this, \[\begin{align*} E[Y] &= \frac{d m_Y(t)}{dt}\bigg|_0 = p(a+1)e^{(a+1)t} + (1-p)ae^{at} \bigg|_0 = p(a+1) + (1-p)a = a+p \\ E[Y^2] &= \frac{d^2 m_Y(t)}{dt^2}\bigg|_0 = p(a+1)^2 e^{(a+1)t} + (1-p)a^2e^{at}\bigg|_0 = p(a+1)^2 + (1-p)a^2 = a^2+2ap+p \\ \Rightarrow V[Y] &= E[Y^2] - (E[Y])^2 = a^2+2ap+p - (a^2+2ap + p^2) = p(1-p) \,. \end{align*}\]


  • Problem 39

(a) The outcomes are discrete, we sample with replacement, there are two outcomes (either \(\leq 3\) or \(> 3\)), and the number of trials is fixed…so we are dealing with the binomial distribution: \[\begin{align*} P(Z > 3) &= 1 - P(Z \leq 3) = 1 - \Phi(3) = p \\ \Rightarrow P(X = 1) &= {100 \choose 1}(1 - \Phi(3))^1(1-1+\Phi(3))^{99} \\ &= 100(1-\Phi(3))\Phi(3)^{99} ~~{\rm or}~~ 100\Phi(-3)\Phi(3)^{99} \,. \end{align*}\]

(b) We have that \[\begin{align*} T \sim {\rm Geom}(p=1-\Phi(3)) ~~~ \Rightarrow ~~~ E[T] = 1/p = 1/(1-\Phi(3)) = 1/\Phi(-3) \,. \end{align*}\]


  • Problem 40

(a) This is an order statistics problem. We know that \[\begin{align*} f_X(x) = e^{-x} ~~~\mbox{and}~~~ F_X(x) = 1 - e^{-x} \end{align*}\] for \(x \geq 0\) and that \(n = 2\). We plug this information into the formula for the pdf of the minimum value: \[\begin{align*} f_{(1)}(x) = nf_X(x)[1-F_X(x)]^{n-1} = 2e^{-x}[e^{-x}]^1 = 2e^{-2x} \end{align*}\] for \(x \geq 0\).

(b) The moment-generating function is \[\begin{align*} m_{X_{(1)}}(t) = E[e^{tX}] = 2 \int_0^\infty e^{tx} e^{-2x} dx = 2\int_0^\infty e^{-(2-t)x} dx = \frac{2}{2-t} = \frac{1}{1-t/2} \,. \end{align*}\]

(c) The form of the pdf is that of an exponential with \(\beta = 1/2\); you could also infer this from the form of the mgf (which for an exponential is \(1/(1-\beta t)\)). So: \(X_{(1)} \sim\) Exponential(1/2).


  • Problem 41

It is simpler to do this problem if we compute \(E[1+X]\): \[\begin{align*} E[1+X] &= \int_0^\infty (1+x) \frac{x^{\alpha-1}(1+x)^{-\alpha-\beta}}{B(\alpha,\beta)} dx \\ &= \int_0^\infty \frac{x^{\alpha-1}(1+x)^{-\alpha-(\beta-1)}}{B(\alpha,\beta)} dx \,. \end{align*}\] In the numerator, \(\beta \rightarrow (\beta-1)\)…so we need to change \(B(\alpha,\beta) \rightarrow B(\alpha,\beta-1)\) in the denominator: \[\begin{align*} E[1+X] &= \int_0^\infty \frac{B(\alpha,\beta-1)}{B(\alpha,\beta-1)} \frac{x^{\alpha-1}(1+x)^{-\alpha-(\beta-1)}}{B(\alpha,\beta)} dx \\ &= \frac{B(\alpha,\beta-1)}{B(\alpha,\beta)} \int_0^\infty \frac{x^{\alpha-1}(1+x)^{-\alpha-(\beta-1)}}{B(\alpha,\beta-1)} dx \\ &= \frac{B(\alpha,\beta-1)}{B(\alpha,\beta)} \\ &= \frac{\Gamma(\alpha)\Gamma(\beta-1)}{\Gamma(\alpha+\beta-1)} \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} \\ &= \frac{\Gamma(\beta-1)}{\Gamma(\alpha+\beta-1)} \frac{(\alpha+\beta-1)\Gamma(\alpha+\beta-1)}{(\beta-1)\Gamma(\beta-1)} \\ &= \frac{\alpha+\beta-1}{\beta-1} = \frac{\alpha}{\beta-1} + 1 \,. \end{align*}\] Hence \(E[X] = E[1+X] - 1 = \alpha/(\beta-1)\).


  • Problem 42

(a) We are conducting a chi-square goodness-of-fit test with expected counts \(kp_1 = kp_2 = 24/4 = 6\) and \(kp_3 = 12\): \[\begin{align*} W = 2 \times \frac{(8-6)^2}{6} + \frac{(8-12)^2}{12} = \frac{4}{3} + \frac{4}{3} = \frac{8}{3} \,. \end{align*}\]

(b) The number of degrees of freedom is \(m-1 = 2\) and the rejection region boundary is given by

qchisq(0.95,2)
## [1] 5.991465

(c) The \(p\)-value is given by

1 - pchisq(8/3,2)
## [1] 0.2635971

(d) The data \(\{8,8,8\}\) are sampled according to a multinomial distribution.

Chapter 4

  • Problem 1

(a) This is a Poisson problem: \(X \sim\) Poisson(\(\lambda = 2 \cdot 1/4 = 1/2\)). So \[\begin{align*} \mu = E[X] = \lambda = 1/2 ~~\text{and}~~ \sigma = \sqrt{V[X]} = \sqrt{\lambda} = \sqrt{1/2} \approx 0.707 \,. \end{align*}\] Thus \[\begin{align*} P(1/2 \leq X \leq 1/2+1.414) = p(1) = \frac{\lambda^1}{1!} e^{-\lambda} = \frac12 e^{-1/2} = 0.303 \,. \end{align*}\]

(b) We have that \(X \sim\) Exp(\(\beta = 1/2\)) (since there is a half-hour on average between calls). By the memorylessness property, \(P(X > 1/2 \vert X > 1/4) = P(X > 1/4)\). Thus \[\begin{align*} P(X > 1/4) = \int_{1/4}^\infty \frac{1}{\beta} e^{-x/\beta} dx = \int_{1/4}^\infty 2 e^{-2x} dx = \left.-e^{-2x}\right|_{1/4}^\infty = e^{-1/2} = 0.607 \,. \end{align*}\]

(c) The overall time \(T\) is \(10X + (10-X)\), where \(X\), the number of calls from the friend, is sampled from a binomial distribution with \(k = 10\) and \(p = 0.2\). Thus the average total number of minutes is \[\begin{align*} E[T] = E[10X + (10-X)] = E[9X+10] = 9E[X] + 10 = 9kp + 10 = 28 \,. \end{align*}\]
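
Quick numerical checks of parts (a) and (b) in R:

dpois(1, lambda=1/2)          # part (a): (1/2) exp(-1/2), about 0.303
1 - pexp(1/4, rate=2)         # part (b): exp(-1/2), about 0.607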


  • Problem 2

(a) The number of (successful) shots can be infinite; only on average is the number of shots in eight minutes going to be four. So we are working with a Poisson distribution whose parameter \(\lambda\) (the expected number of successful shots) is \[\begin{align*} \lambda = \left(\frac{1 \text{ shot}}{2 \text{ min}}\right)\left(\frac{1}{2} \, \frac{\text{success}}{\text{shot}}\right)\left(8 \text{ min}\right) = 2 \,. \end{align*}\] Let \(X\) be the number of successful shots. Then \[\begin{align*} P(X \leq 1) = \frac{\lambda^0}{0!}e^{-\lambda} + \frac{\lambda^1}{1!}e^{-\lambda} = e^{-2}(1+2) = 3e^{-2} \,. \end{align*}\]

(b) We know that \(E[X] = \lambda = 2\), \(V[X] = \lambda = 2\), and \(\sigma = \sqrt{\lambda} = \sqrt{2}\). Thus \[\begin{align*} P(2 - \sqrt{2} < X < 2 + \sqrt{2}) &= p_X(1) + p_X(2) + p_X(3) = e^{-\lambda}\left(\lambda + \frac{\lambda^2}{2} + \frac{\lambda^3}{6} \right) \\ &= e^{-2}\left(2 + 2+ \frac{8}{6} \right) = \frac{16}{3} e^{-2} = 0.722 \,. \end{align*}\]


  • Problem 3

The particular event in question happens 8 times per year in the three states, and thus the expected number of events in a two-year window is 16. The appropriate distribution in this case is the Poisson distribution. If the total number of observed events is denoted \(X\), then \(X \sim\) Poisson(\(\lambda\) = 16), and \(E[X] = V[X] = 16\).


  • Problem 4

(a) We utilize the shortcut formula: \[\begin{align*} E[X^2] = V[X] + (E[X])^2 = 2\sigma^2 - \frac{\pi}{2}\sigma^2 + \frac{\pi}{2}\sigma^2 = 2\sigma^2 \,. \end{align*}\]

(b) The first population moment is \(\mu_1' = E[X] = \sigma\sqrt{\pi/2}\) and the first sample moment is \(m_1' = (1/n)\sum_{i=1}^n X_i = \bar{X}\). We set these equal and determine that \[\begin{align*} \hat{\sigma}_{MoM} = \sqrt{\frac{2}{\pi}} \bar{X} \,. \end{align*}\]

(c) The bias is \(E[\hat{\theta}-\theta] = E[\hat{\theta}] - \theta\), or \[\begin{align*} B[\hat{\theta}_{MoM}] &= E\left[\sqrt{\frac{2}{\pi}} \bar{X}\right] - \sigma = \sqrt{\frac{2}{\pi}} E\left[\bar{X}\right] - \sigma = \sqrt{\frac{2}{\pi}} E\left[X\right] - \sigma = \sqrt{\frac{2}{\pi}} \sqrt{\frac{\pi}{2}} \sigma - \sigma = 0 \,. \end{align*}\]

(d) The variance is \[\begin{align*} V[\hat{\theta}_{MoM}] &= V\left[\sqrt{\frac{2}{\pi}} \bar{X}\right] = \frac{2}{\pi} V\left[\bar{X}\right] = \frac{2}{\pi} \frac{V\left[X\right]}{n} = \frac{2}{n\pi} \frac{(4-\pi)\sigma^2}{2} = \frac{(4-\pi)\sigma^2}{n\pi} \,. \end{align*}\]

(e) The second population moment is \(\mu_2' = E[X^2] = 2\sigma^2\) (from part a) and the second sample moment is \(m_2' = (1/n)\sum_{i=1}^n X_i^2 = \overline{X^2}\). We set these equal and determine that \[\begin{align*} \widehat{\sigma^2}_{MoM} = \frac{\overline{X^2}}{2} \,. \end{align*}\]


  • Problem 5

(a) We have that \(\mu_1' = \frac{1}{p}\) and \(m_1' = X\). It follows from moment equation \(\mu_1' = m_1'\) that \(\frac{1}{p} = X\), so \(\hat{p}_{MoM} = \frac{1}{X}\).

(b) We have that \(\mu_2' = \frac{1 - p}{p^2} + \frac{1}{p^2} = \frac{2 - p}{p^2}\) and \(m_2' = X^2\). It follow from the second moment equation \(\mu_2' = m_2'\) that \[\begin{align*} \frac{2-p}{p^2} & = X^2 \\ \Rightarrow ~~~ 2-p & = p^2 X^2 \\ \Rightarrow ~~~ p^2 X^2 +p -2 &= 0 \\ \Rightarrow ~~~ \hat{p}_{MoM} & = \frac{-1 + \sqrt{1 + 8 X^2}}{2X^2} \,. \end{align*}\]


  • Problem 6

(a) The expected value for a beta distribution is \(E[X] = \alpha/(\alpha+\beta)\), so, using the first sample moment, we get that \[\begin{align*} E[X] &= \bar{X} \\ \Rightarrow ~~~ \frac{\alpha}{\alpha+\beta} &= \bar{X} \\ \Rightarrow ~~~ \frac{\alpha+\beta}{\alpha} = 1 + \frac{\beta}{\alpha} &= \frac{1}{\bar{X}} \\ \Rightarrow ~~~ \frac{\beta}{\alpha} &= \frac{1}{\bar{X}}-1 \\ \Rightarrow ~~~ \hat{\beta}_{MoM} &= \alpha\left(\frac{1}{\bar{X}}-1\right) \,. \end{align*}\]

(b) There is no invariance property for the method-of-moments estimator, so the answer is no.


  • Problem 7

(a) The argument list is fine.

(b) We need to change mean(X) to sum(X) and lambda to n*lambda.

(c) The reference table tells us that a one-sided lower bound where \(E[U]\) increases with \(\lambda\) will have a value of \(q\) equal to \(1-\alpha\). So we plug in 0.95.

(d) Confidence intervals derived from discrete sampling distributions will have coverages \(\geq 1-\alpha\), so, here, we would say “greater than or equal to 95%.”


  • Problem 8

(a) The likelihood ratio test statistic is \[\begin{align*} \lambda_{LR} = \frac{\mbox{sup}_{\theta \in \Theta_o} \mathcal{L}(\theta \vert \mathbf{x})}{\mbox{sup}_{\theta \in \Theta} \mathcal{L}(\theta \vert \mathbf{x})} \end{align*}\] Here, that becomes \[\begin{align*} \lambda_{LR} &= \frac{\mathcal{L}(p_o \vert \mathbf{x})}{\mathcal{L}(\hat{p}_{MLE} \vert \mathbf{x})} = \frac{p_o^{\sum_{i=1}^n X_i}(1-p_o)^{n-\sum_{i=1}^n X_i}}{\bar{X}^{\sum_{i=1}^n X_i}(1-\bar{X})^{n-\sum_{i=1}^n X_i}} = \frac{p_o^U(1-p_o)^{n-U}}{\bar{X}^U(1-\bar{X})^{n-U}} \,. \end{align*}\]

(b) The test is a two-sided test, so we cannot proclaim it to be uniformly most powerful. However, it very well may be…we just cannot say with the information we have at hand. So: “maybe.”


  • Problem 9

(a) The maximum-likelihood estimate is \[\begin{align*} \ell(\lambda \vert x) &= x \log \lambda - \log x! - \lambda \\ \Rightarrow ~~~ \ell'(\lambda \vert x) &= \frac{x}{\lambda} - 1 = 0\\ \Rightarrow ~~~ \hat{\lambda}_{MLE} &= X \,. \end{align*}\] which here takes on the value \(x_{\rm obs}\).

(b) The likelihood-ratio test statistic is \[\begin{align*} \frac{\mbox{sup}_{\theta \in \Theta_o}\mathcal{L}(\theta \vert \mathbf{x})}{\mbox{sup}_{\theta \in \Theta}\mathcal{L}(\theta \vert \mathbf{x})} \,. \end{align*}\] Here, that means that for the numerator, we insert the Poisson pmf (remember: one datum) with \(\lambda_o\) plugged in, i.e., \[\begin{align*} \frac{\lambda_o^{x_{\rm obs}}}{x_{\rm obs}!} e^{-\lambda_o} \,, \end{align*}\] while for the denominator, we plug in the MLE for \(\lambda\), i.e., \[\begin{align*} \frac{x_{\rm obs}^{x_{\rm obs}}}{x_{\rm obs}!} e^{-x_{\rm obs}} \,. \end{align*}\] So the ratio is \[\begin{align*} \left( \frac{\lambda_o}{x_{\rm obs}} \right)^{x_{\rm obs}} e^{-(\lambda_o-x_{\rm obs})} \,. \end{align*}\]

(c) The expression is \[\begin{align*} W = -2 \log \lambda_{LR} = -2 x_{\rm obs} \log \left( \frac{\lambda_o}{x_{\rm obs}} \right) + 2 (\lambda_o-x_{\rm obs}) \,. \end{align*}\]

(d) \(W\) is sampled from a chi-square distribution for 1 degree of freedom.


  • Problem 10

(a) The factorized likelihood is \[\begin{align*} \mathcal{L}(\lambda \vert \mathbf{x}) = \prod_{i=1}^n \frac{\lambda^{x_i}}{x_i!}e^{-\lambda} = \left(\frac{1}{\prod_{i=1}^n x_i!}\right) \cdot \lambda^{\sum_{i=1}^n x_i} e^{-n\lambda} = h(\mathbf{x}) \cdot g(\lambda,\mathbf{x}) \,. \end{align*}\] The sufficient statistic is \(Y = \sum_{i=1}^n X_i\).

(b) The sum of \(n\) iid Poisson random variables is a Poisson random variable with parameter \(n\lambda\). The moment-generating function for a Poisson random variable is \(m_X(t) = \exp(\lambda(e^t-1))\), so the mgf for \(Y\) is \[\begin{align*} m_Y(t) = \prod_{i=1}^n m_{X_i}(t) = \left[ m_{X_i}(t) \right]^n = \exp(n\lambda(e^t-1)) \,. \end{align*}\] This is the mgf for a Poisson(\(n\lambda\)) distribution.

(c) Recall that in an LRT context, \(\Theta = \Theta_o \cup \Theta_a\); in other words, the null must contain all possible values of \(\theta\) that are not in the alternative. Hence: \(H_o : \theta \geq \theta_o\). This inequality does not actually change how the test is constructed, but does change how we interpret it: the true \(\alpha\) for this test will be less than or equal to the stated \(\alpha\).

(d) We are performing a lower-tail test, and we are on the “yes” line (since \(E[Y] = n\lambda\) increases with \(\lambda\)). From the reference table, that means \(y_{\rm RR}\) is equal to \(F_Y^{-1}(\alpha \vert n\lambda_o)\), which in code is qpois(0.05,n*lambda.o). (Recall that there are no discreteness corrections in rejection-region boundary computations.)


  • Problem 11

(a) The likelihood function for the sample is: \[\begin{align*} \mathcal{L}(\beta \vert \mathbf{x}) = \prod_{i=1}^{n} \frac{1}{\beta}e^{-x_i/\beta} = \frac{1}{\beta^n} e^{-\frac{\sum_{i=1}^{n}x_i}{\beta}} \,. \end{align*}\] Under the null \(H_0 : \beta = 1\), the likelihood function becomes \[\begin{align*} \mathcal{L}(\beta_0 \vert \mathbf{x}) = e^{-\sum_{i=1}^{n}x_i} = e^{-n\bar x} \,, \end{align*}\] while under the alternative, \[\begin{align*} \sup_{\beta > 0} \mathcal{L}(\beta \vert \mathbf{x}) &= \mathcal{L}(\hat{\beta}_{MLE} \vert \mathbf{x})\\ & = \frac{1}{\bar x^n} e^{-\frac{\sum_{i=1}^{n}x_i}{\bar x}} = \frac{1}{\bar x^n} e^{-n} \,. \end{align*}\] So the likelihood ratio test statistic is \[\begin{align*} \lambda_{LR} &= \frac{\mathcal{L}(\beta_o\vert \mathbf{x})}{\sup_{\beta > 0} \mathcal{L}(\beta \vert \mathbf{x})}\\ &= \frac{e^{-n\bar x}}{\frac{1}{\bar x^n} e^{-n}} \\ &= \bar x^ne^{-n(\bar x-1)} \,. \end{align*}\]

(b) Under the null hypothesis, the number of degrees of freedom is \(r_o = 0\), because \(\beta = 1\) is set to a constant. Under the alternative hypothesis, the number of degrees of freedom is \(r = 1\), because we have one free parameter: \(\beta\). Therefore, according to Wilks’ theorem, the number of degrees of freedom for the \(\chi^2\) distribution is \(r - r_o = 1-0=1\). Under the large-\(n\) approximation, \(-2\log(\lambda_{LR}) \sim \chi^2(1)\). Therefore, the rejection region corresponds to: \[\begin{align*} -2\log(\lambda_{LR}) &> \chi^2_{0.95, 1}\\ \Rightarrow ~~~ -2\log\left(\bar x^ne^{-n(\bar x-1)}\right) &> \chi^2_{0.95,1} = 3.84 \\ \Rightarrow ~~~ n\left(\log(\bar x) - \bar x+1\right) &< -\frac{3.84}{2} \,. \end{align*}\]
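
As a sketch of how this rule plays out numerically (the sample size n = 25 below is an assumption, not a value from the problem):

n <- 25; alpha <- 0.05
xbar <- seq(0.5, 2, by=0.001)
W <- -2*n*(log(xbar) - xbar + 1)            # -2 log(lambda_LR) as a function of xbar
range(xbar[W <= qchisq(1-alpha, 1)])        # approximate acceptance interval for xbar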


  • Problem 12

(a) \(H_0: p = p_0 = 0.5\) and \(H_a: p \neq 0.5\)

(b) \(\Theta_0 = \{p_0\}\) and \(\Theta_a = \{p\, \vert\, p \in [0,1] ~~\text{and}~~ p \neq p_0\}\)

(c) \(r_0 = 0\) (\(p\) is fixed) and \(r = 1\)

(d) The likelihood ratio test statistic is \[\begin{align*} \lambda = \frac{\mathcal{L}(p_0 \vert x)}{\mathcal{L}(\hat{p}_{MLE} \vert x)} = \frac{\frac{1000!}{550!450!} 0.5^{550} (1-0.5)^{450} }{ \frac{1000!}{550!450!} 0.55^{550} (1-0.55)^{450}} = \frac{0.5^{1000}}{0.55^{550} \cdot 0.45^{450}} = 0.00668 \,, \end{align*}\] where we make use of the fact that \(\hat{p}_{MLE} = x/n = 0.55\).

(e) We have that \(W_{\rm obs} = -2 \log(\lambda_{LR}) = 10.017\). According to Wilks’ theorem, the \(p\)-value is \[\begin{align*} \int_{W_{\rm obs}}^\infty f_W(w) dw \,, \end{align*}\] for 1 degree of freedom, or 1 - pchisq(10.017,1) (= 0.00155).
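
The statistic and \(p\)-value can be reproduced directly from the counts (550 successes in 1000 tosses):

x <- 550; n <- 1000; p0 <- 0.5
lambda <- dbinom(x, n, p0) / dbinom(x, n, x/n)   # the binomial coefficients cancel in the ratio
W      <- -2*log(lambda)                         # about 10.02
1 - pchisq(W, 1)                                 # p-value, about 0.0016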

(f) We have sufficient evidence to reject the null hypothesis and thus to conclude that the coin is not a fair one.


  • Problem 13

By inspection, \(X \sim\) Gamma(3,2/3). Thus \(E[X] = \alpha \beta = 2\) and \(V[X] = \alpha \beta^2 = 3 (2/3)^2 = 4/3\).


  • Problem 14

(a) \(E[X] = \alpha \beta\) and \(V[X] = \alpha \beta^2\), so \(V[X]/E[X] = \beta = 10/5 = 2\), and \(\alpha = 5/2 = 2.5\).

(b) \(\beta = 2\) and \(\alpha = 2.5\) \(\Rightarrow\) chi-square distribution (for 5 degrees of freedom).


  • Problem 15

We have that \[\begin{align*} E[X^{-1}] &= \int_0^\infty \frac1x f_X(x) dx = \int_0^\infty \frac1x \frac{x^{\nu/2-1}}{2^{\nu/2}} \frac{e^{-x/2}}{\Gamma(\nu/2)} dx \\ &= \int_0^\infty \frac{x^{\nu/2-2}}{2^{\nu/2}} \frac{e^{-x/2}}{\Gamma(\nu/2)} dx = \int_0^\infty \frac{x^{\nu/2-2}}{2^{\nu/2}} \frac{2^{-1}}{2^{-1}} \frac{e^{-x/2}}{\Gamma(\nu/2)} \frac{\Gamma(\nu/2-1)}{\Gamma(\nu/2-1)} dx \\ &= 2^{-1} \frac{\Gamma(\nu/2-1)}{\Gamma(\nu/2)} \int_0^\infty \frac{x^{\nu/2-2}}{2^{\nu/2-1}} \frac{e^{-x/2}}{\Gamma(\nu/2-1)} dx = 2^{-1} \frac{\Gamma(\nu/2-1)}{\Gamma(\nu/2)} \\ &= \frac12 \frac{\Gamma(\nu/2-1)}{(\nu/2-1)\Gamma(\nu/2-1)} = \frac{1}{2(\nu/2-1)} = \frac{1}{\nu-2} \,. \end{align*}\]


  • Problem 16

We have that \[\begin{align*} E[X] = \int_0^\infty x f_X(x) dx &= \int_0^\infty x \frac{\beta^\alpha}{\Gamma(\alpha)} \frac{1}{x^{\alpha+1}} e^{-\beta/x} dx \\ &= \int_0^\infty \frac{\beta^\alpha}{\Gamma(\alpha)} \frac{1}{x^{\alpha}} e^{-\beta/x} dx \\ &= \frac{\beta^{\alpha}}{\beta^{\alpha-1}} \frac{\Gamma(\alpha-1)}{\Gamma(\alpha)} \int_0^\infty \frac{\beta^{\alpha-1}}{\Gamma(\alpha-1)} \frac{1}{x^{\alpha}} e^{-\beta/x} dx \\ &= \frac{\beta^{\alpha}}{\beta^{\alpha-1}} \frac{\Gamma(\alpha-1)}{\Gamma(\alpha)} \cdot 1 \\ &= \beta \frac{\Gamma(\alpha-1)}{(\alpha-1)\Gamma(\alpha-1)} \\ &= \frac{\beta}{\alpha-1} \,. \end{align*}\]


  • Problem 17

(a) If \(\beta = 2\) and \(\alpha\) is a half-integer or an integer, then \(X\) is sampled from a “chi-square” distribution. (Note that because \(\alpha\) is a half-integer here, we cannot answer “Erlang” or “exponential.”)

(b) The gamma pdf with \(\alpha = 3/2\) is \[\begin{align*} f_X(x) = \frac{x^{1/2}}{\beta^{3/2}} \frac{e^{-x/\beta}}{\Gamma(3/2)} \,, \end{align*}\] so the likelihood function is \[\begin{align*} \mathcal{L}(\beta \vert \mathbf{x}) = \prod_{i=1}^n \frac{x_i^{1/2}}{\beta^{3/2}} \frac{e^{-x_i/\beta}}{\Gamma(3/2)} = \frac{\sqrt{\prod_{i=1}^n x_i}}{[\Gamma(3/2)]^n} \cdot \frac{e^{-(\sum_{i=1}^n x_i)/\beta}}{\beta^{3n/2}} = h(\mathbf{x}) \cdot g(\beta,\mathbf{x}) \,. \end{align*}\] We can read off of the \(g(\cdot)\) function that \(Y = \sum_{i=1}^n X_i\) is a sufficient statistic for \(\beta\).

(c) We start by computing \[\begin{align*} E[Y] = E\left[ \sum_{i=1}^n X_i \right] = \sum_{i=1}^n E[X_i] = nE[X] = n \alpha \beta = \frac{3}{2}n\beta \,. \end{align*}\] Thus \[\begin{align*} E\left[\frac{2Y}{3n}\right] = \beta \end{align*}\] and \[\begin{align*} \hat{\beta}_{MVUE} = \frac{2Y}{3n} = \frac{2}{3}\bar{X} \,. \end{align*}\]

(d) The first population moment is \(\mu_1' = E[X] = (3/2)\beta\) and the first sample moment is \(m_1' = (1/n)\sum_{i=1}^n X_i = \bar{X}\). We set these equal and find that \[\begin{align*} \hat{\beta}_{MoM} = \frac{2}{3}\bar{X} \,. \end{align*}\]

(e) Because the MoM is equivalent to the MVUE, we know immediately that the bias of the MoM is 0.


  • Problem 18

We have that \[\begin{align*} E[X^{1/2}] = \int_0^\infty x^{1/2} f_X(x) dx &= \int_0^\infty x^{1/2} \frac{x^{\alpha-1}}{\beta^\alpha}\frac{e^{-x/\beta}}{\Gamma(\alpha)} dx \\ &= \int_0^\infty \frac{x^{\alpha-1/2}}{\beta^\alpha}\frac{e^{-x/\beta}}{\Gamma(\alpha)} dx \\ &= \int_0^\infty \frac{x^{\alpha-1/2}}{\beta^\alpha}\frac{e^{-x/\beta}}{\Gamma(\alpha)} \frac{\beta^{\alpha+1/2}}{\beta^{\alpha+1/2}} \frac{\Gamma(\alpha+1/2)}{\Gamma(\alpha+1/2)} dx \\ &= \frac{\beta^{\alpha+1/2}}{\beta^{\alpha}} \frac{\Gamma(\alpha+1/2)}{\Gamma(\alpha)} \int_0^\infty \frac{x^{\alpha-1/2}}{\beta^{\alpha+1/2}}\frac{e^{-x/\beta}}{\Gamma(\alpha+1/2)} dx \\ &= \frac{\beta^{\alpha+1/2}}{\beta^{\alpha}} \frac{\Gamma(\alpha+1/2)}{\Gamma(\alpha)} \\ &= \sqrt{\beta} \frac{\Gamma(\alpha+1/2)}{\Gamma(\alpha)} \,. \end{align*}\] We note that in general, \[\begin{align*} E[X^k] = \beta^k \frac{\Gamma(\alpha+k)}{\Gamma(\alpha)} \,. \end{align*}\]


  • Problem 19

(a) The overdispersion parameter in a negative binomial regression is dubbed “Theta” and is thus 214,488.

(b) If the overdispersion parameter is \(\infty\), then Poisson regression is recovered. The value here is sufficiently large that we can say with confidence that there is no overdispersion. (Backing up this conclusion are the nearly identical results obtained when learning both regression models.)

(c) The answer is the Likelihood Ratio test. The statistic is the difference in the deviance values, which we assume under the null is chi-square distributed for the difference in the numbers of degrees of freedom.

(d) The null hypothesis in the LRT is \(\beta_1 = 0\) and the alternative is \(\beta_1 \neq 0\). The test statistic is so large (\(\approx\) 165), especially considering that the expected value for a chi-square distribution is 1 for 1 degree of freedom, that we can safely conclude that \(\beta_1 \neq 0\).


  • Problem 20

(a) The moment-generating function for a Gamma(1,1) distribution is \[\begin{align*} m_X(t) = (1-\beta t)^{-\alpha} = (1-t)^{-1} \,. \end{align*}\] Thus the mgf for \(\bar{X}\) is \[\begin{align*} m_{\bar{X}}(t) = \prod_{i=1}^{100} m_X\left(\frac{t}{n}\right) = \left(1-\frac{t}{100}\right)^{-100} \,, \end{align*}\] which we recognize as a Gamma(100,1/100) distribution.

(b) The expected value of \(X\) is \(E[X] = 1 \cdot 1 = 1\), so the quantity we seek is \[\begin{align*} P(\bar{X} \geq E[X]) = 1 - P(\bar{X} < 1) = 1 - F_{\bar{X}}(1) \,, \end{align*}\] which rendered in R is

1 - pgamma(1,shape=100,scale=1/100)
## [1] 0.4867012

  • Problem 21

(a) The entropy is \[\begin{align*} E[-\log p_X(x)] &= E\left[-X \log \lambda + \lambda + \log X! \right] = - E[X] \log \lambda + \lambda + E[\log X!] \\ &= \lambda (1 - \log \lambda) + E[\log X!] \,. \end{align*}\]

(b) The probability-generating function is \[\begin{align*} E[z^X] &= \sum_{x=0}^\infty z^x \frac{\lambda^x}{x!} e^{-\lambda} = \sum_{x=0}^\infty \frac{(z\lambda)^x}{x!} e^{-\lambda} = \sum_{x=0}^\infty \frac{(z\lambda)^x}{x!} e^{-\lambda} \frac{e^{-z\lambda}}{e^{-z\lambda}} = \frac{e^{-\lambda}}{e^{-z\lambda}} \sum_{x=0}^\infty \frac{(z\lambda)^x}{x!} e^{-z\lambda} = \frac{e^{-\lambda}}{e^{-z\lambda}} = e^{(z-1)\lambda} \,. \end{align*}\]
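
We can numerically confirm the closed form by truncating the sum; the values \(z = 0.7\) and \(\lambda = 3\) are illustrative choices, not part of the problem.

z <- 0.7; lambda <- 3
sum(z^(0:200) * dpois(0:200, lambda))   # truncated sum over x
exp((z - 1) * lambda)                   # closed form; the values agree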


  • Problem 22

(a) The functional form and domain specify: gamma (or Erlang).

(b) We can read off of \(e^{-2x} = e^{-x/(1/2)}\) that \(\beta = 1/2\), and we can read off of \(x^2 = x^{\alpha-1}\) that \(\alpha = 3\).

(c) The easy way: \(E[X] = \alpha\beta = 3/2\). The hard way: \[\begin{align*} E[X] = \int_0^\infty x f_X(x) dx &= \int_0^\infty 4x^3 e^{-2x} dx \,. \end{align*}\] We set \(u = 2x\) and \(du = 2dx\); the bounds of integration do not change. Thus \[\begin{align*} E[X] &= \int_0^\infty 4x^3 e^{-2x} dx \\ &= \int_0^\infty 4 \left(\frac{u}{2}\right)^3 e^{-u} \frac{du}{2} \\ &= \frac14 \int_0^\infty u^3 e^{-u} du \\ &= \frac14 \Gamma(4) = \frac14 3! = \frac64 = \frac32 \,. \end{align*}\]
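
As a numerical check of the “hard way,” we can hand the same integral to R:

integrate(function(x) 4 * x^3 * exp(-2 * x), 0, Inf)$value   # approximately 1.5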


  • Problem 23

(a) The first population moment is \(E[X] = \sqrt{2/\pi} \sigma\) and the first sample moment is \(\bar{X}\)…hence: \[\begin{align*} \hat{\sigma}_{MoM} = \sqrt{\frac{\pi}{2}} \bar{X} \,. \end{align*}\]

(b) The second population moment is \[\begin{align*} E[X^2] = V[X] + (E[X])^2 = \sigma^2\left(1-\frac{2}{\pi}\right) + \sigma^2\frac{2}{\pi} = \sigma^2 \,. \end{align*}\] The second sample moment is \(\overline{X^2}\). Hence: \[\begin{align*} \hat{\sigma^2}_{MoM} = \overline{X^2} \,. \end{align*}\]
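
A short simulation sketch, assuming \(X\) follows the half-normal distribution implied by these moments (i.e., \(X = \vert Z \vert\) with \(Z \sim \mathcal{N}(0,\sigma^2)\)); the value \(\sigma = 2\) is illustrative.

set.seed(1)
sigma <- 2
x <- abs(rnorm(1e5, 0, sigma))
sqrt(pi/2) * mean(x)   # MoM estimate from the first moment, close to sigma = 2
sqrt(mean(x^2))        # estimate based on the second sample moment, also close to 2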


  • Problem 24

(a) The likelihood is \(f_X(x \vert \theta)\) and thus \[\begin{align*} \ell(x \vert \theta) &= \log \theta + (\theta-1) \log x - \theta \log 3 \\ \Rightarrow ~~~ \ell'(x \vert \theta) &= \frac{1}{\theta} + (\log x - \log 3) \\ \Rightarrow ~~~ &= \frac{1}{\theta} + \log (x/3) = 0 \\ \Rightarrow ~~~ \hat{\theta}_{MLE} &= -\frac{1}{\log (X/3)} \,. \end{align*}\]

(b) The likelihood-ratio test statistic is \[\begin{align*} \lambda_{\rm LR} = \frac{ {\rm sup}_{\theta \in \Theta_o} \mathcal{L}(\theta \vert \mathbf{x}) }{ {\rm sup}_{\theta \in \Theta} \mathcal{L}(\theta \vert \mathbf{x}) } \,, \end{align*}\] where \(\Theta_o = \{ \theta : \theta = \theta_o \}\) and \(\Theta = \{ \theta : \theta > 0 \}\). Thus in the numerator we plug in \(\theta_o\) and in the denominator we plug in \(\hat{\theta}_{MLE}\): \[\begin{align*} \lambda_{\rm LR} &= \frac{ (\theta_o/3^{\theta_o}) X^{\theta_o-1} }{ (\hat{\theta}_{MLE}/3^{\hat{\theta}_{MLE}}) X^{\hat{\theta}_{MLE}-1} } = \frac{\theta_o}{\hat{\theta}_{MLE}} \left(\frac{X}{3}\right)^{\theta_o-\hat{\theta}_{MLE}} \,. \end{align*}\]

(c) The test statistic is \(W = -2\log \lambda_{\rm LR}\): \[\begin{align*} W &= -2 \log \frac{\theta_o}{\hat{\theta}_{MLE}} \left(\frac{X}{3}\right)^{\theta_o-\hat{\theta}_{MLE}} = -2 \left[ \log \frac{\theta_o}{\hat{\theta}_{MLE}} + (\theta_o-\hat{\theta}_{MLE}) \log \frac{X}{3} \right] \,. \end{align*}\]

(d) For the constrained model, \(r_o = 0\). (\(\theta\) is fixed.) For the unconstrained model, \(r = 1\). Thus \(\Delta r = r - r_o = 1\) is the number of degrees of freedom of the chi-square distribution from which the statistic (stat in the code below) is sampled under the null:

1 - pchisq(stat,1)

  • Problem 25

(a) The model to the right is a negative binomial regression model.

(b) The AIC is lower to the left.

(c) To determine whether the model to the left is a viable representation of the data-generating process, we assume that under the null hypothesis that it is, the residual deviance is sampled from a chi-square distribution for its associated number of degrees of freedom. Hence the \(p\)-value here is

1 - pchisq(41.94,28)

That’s the final answer…but in real life we can take the additional step of computing the value: 0.044. We could decide to reject the null (and say that the Poisson model is not a good representation of the data-generating process), but given that the \(p\)-value is only approximate (because the residual deviance is only approximately chi-square-distributed under the null), and its value is \(\approx\) 0.05, this decision would be marginal at best.

Chapter 5

  • Problem 1

Here, the first population moment is \(\mu_1' = E[X] = \frac{3}{2} \theta\) and the first sample moment is \(m_1' = \frac{1}{n} \sum X_i = \bar{X}\). So the MoM estimator for \(\theta\), following from setting \(\mu_1' = m_1'\), is \(\hat{\theta}_{MoM} = \frac{2}{3} \bar{X}\).


  • Problem 2

We have that \[\begin{align*} P(X > a+b \vert X > b) = \frac{P(X > a+b \cap X > b)}{P(X > b)} = \frac{P(X > a+b)}{P(X > b)} \,, \end{align*}\] and \[\begin{align*} P(X > a+b) = \int_{a+b}^1 dx = 1-(a+b) ~~~ P(X > b) = \int_b^1 dx = 1-b \,. \end{align*}\] So \[\begin{align*} \frac{P(X > a+b)}{P(X > b)} = \frac{1 - (a+b)}{1-b} \,. \end{align*}\] The uniform distribution does not exhibit the memoryless property, as the ratio above does not depend just on \(a\). (If \(b\) cancelled out top and bottom, then the distribution would exhibit memorylessness.)


  • Problem 3

(a) It doesn’t matter when she arrives: \[\begin{align*} P(x_0 \leq X \leq x_0 + 10) = \int_{x_0}^{x_0 + 10} \frac{1}{70} dx = \frac{x_0 + 10}{70} - \frac{x_0}{70} = \frac{1}{7} \,. \end{align*}\]

(b) We have that \(Y =\) Binomial\((n=5,p=1/7)\), so \[\begin{align*} P(Y\geq 1) = 1 - P(Y = 0) = 1 - {5 \choose 0}\frac{1}{7}^0\left(1 - \frac{1}{7}\right)^5 = 1 - \left( \frac{6}{7}\right)^5. \end{align*}\]


  • Problem 4

We have that \[\begin{align*} P(X \leq 2u | X \geq u) = \frac{P(X \leq 2u \cap X \geq u)}{P(X \geq u)} = \frac{\int_u^{2u}dx}{\int_u^{1}dx} = \frac{2u - u}{1 - u} = \frac{u}{1-u} \,. \end{align*}\]


  • Problem 5

We have that \[\begin{align*} P(X_1 < 2X_2 \vert X_2 < 1/2) = \frac{P(X_2 > X_1/2 \cap X_2 < 1/2)}{P(X_2 < 1/2)} \,. \end{align*}\] We can approach this geometrically. The denominator is, by inspection, 1/2, so \[\begin{align*} P(X_1 < 2X_2 \vert X_2 < 1/2) = 2P(X_2 > X_1/2 \cap X_2 < 1/2) \,. \end{align*}\] The remaining expression evaluates as the area of the triangle with vertices (0,0), (1,1/2), and (0,1/2), which is (1/2)(1)(1/2) = 1/4. Thus \[\begin{align*} P(X_1 < 2X_2 \vert X_2 < 1/2) = 2 \cdot 1/4 = 1/2 \,. \end{align*}\]
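
A Monte Carlo check, assuming (as the geometric argument above does) that \(X_1\) and \(X_2\) are independent Uniform(0,1) random variables:

set.seed(1)
x1 <- runif(1e6); x2 <- runif(1e6)
mean((x1 < 2 * x2)[x2 < 1/2])   # approximately 0.5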


  • Problem 6

(a) Since \(\theta\) is a lower bound, the sufficient statistic is \(X_{(1)}\), the minimum observed datum, by inspection.

(b) The cdf \(F_{(1)}(x)\) is \[\begin{align*} F_{(1)}(x) = 1 - [1 - F_X(x)]^n \,. \end{align*}\] The cdf \(F_X(x)\) is \[\begin{align*} F_X(x) = - \int_{\theta}^x \frac{1}{\theta} dy = - \frac{1}{\theta} \int_{\theta}^x dy = - \frac{1}{\theta} (x - \theta) = 1 - x/\theta \,. \end{align*}\] Thus \[\begin{align*} F_{(1)}(x) = 1 - [1 - (1 - x/\theta)]^n = 1 - \left(\frac{x}{\theta}\right)^n \,, \end{align*}\]

(c) If the null is true, we cannot observe a value of \(X_{(1)}\) that is smaller than \(\theta_o\). So the “trivial rejection region” is \(X_{(1)} < \theta_o\). This is “trivial” because we can write it down via inspection (and it does not depend on \(\alpha\)).

(d) We can only reject the null if \(X_{(1)} > \theta_o\), so we have to put all of the \(\alpha\) on that side of \(\theta_o\): \(q = 1 - \alpha\).

(e) We have that \[\begin{align*} 1 - \left(\frac{x_{RR}}{\theta_o}\right)^n &= 1 - \alpha ~~~ \Rightarrow ~~~ \left(\frac{x_{RR}}{\theta_o}\right)^n = \alpha ~~~ \Rightarrow ~~~ x_{RR} = \theta_o \alpha^{1/n} \,. \end{align*}\]


  • Problem 7

(a) By inspection, \(\hat{\theta}_{MLE} = X_{(n)}\).

(b) The cdf for \(X_{(n)}\) is \([F_X(x)]^n = (x/\theta)^{2n}\), and the pdf is thus \[\begin{align*} f_{(n)}(x) = \frac{d}{dx} F_{(n)}(x) = \frac{2n}{\theta^{2n}} x^{2n-1} \,. \end{align*}\] Thus \[\begin{align*} E[X_{(n)}] &= \int_0^\theta x f_{(n)}(x) dx = \frac{2n}{\theta^{2n}} \int_0^\theta x^{2n} dx = \frac{2n}{\theta^{2n}} \left. \frac{x^{2n+1}}{2n+1}\right|_0^{\theta} = \frac{2n}{2n+1}\theta \,. \end{align*}\]

(c) Since \[\begin{align*} E[X_{(n)}] &= \frac{2n}{2n+1}\theta \,, \end{align*}\] we have that \[\begin{align*} E\left[\frac{2n+1}{2n}X_{(n)}\right] &= \theta \end{align*}\] and thus \[\begin{align*} \hat{\theta}_{MVUE} = \frac{2n+1}{2n}X_{(n)} \,. \end{align*}\]
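
A simulation sketch: the cdf \(F_X(x) = (x/\theta)^2\) implies we can sample \(X\) via the inverse transform \(X = \theta\sqrt{U}\) with \(U \sim\) Uniform(0,1); the values \(\theta = 5\) and \(n = 8\) below are illustrative.

set.seed(1)
theta <- 5; n <- 8
xmax <- replicate(1e4, max(theta * sqrt(runif(n))))
mean((2*n + 1) / (2*n) * xmax)   # close to theta = 5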


  • Problem 8

(a) As \(\theta\) is a lower bound, \(X_{(1)}\) is a sufficient statistic.

(b) The MLE is the sufficient statistic in (a): \(\hat{\theta}_{MLE} = X_{(1)}\).

(c) The pdf for the minimum datum is \[\begin{align*} f_{(1)}(x) = n f_X(x) [1-F_X(x)]^{n-1} = n e^{-(x-\theta)} \left(e^{-(x-\theta)}\right)^{n-1} = n e^{-n(x-\theta)} \,. \end{align*}\]

(d) The expected value of \(X_{(1)}\) is \[\begin{align*} E[X_{(1)}] = \int_{\theta}^\infty x n e^{-n(x-\theta)} dx = n \int_{\theta}^\infty x e^{-n(x-\theta)} dx \,. \end{align*}\] Let \(u = n(x-\theta)\). Then \(du = n dx\), and if \(x = \theta\), \(u = 0\), and if \(x = \infty\), \(u = \infty\). Thus \[\begin{align*} E[X_{(1)}] &= n \int_0^\infty \left(\frac{u}{n}+\theta\right) e^{-u} \frac{du}{n} = \int_0^\infty \frac{u}{n} e^{-u} du + \int_0^\infty \theta e^{-u} du = \frac{1}{n} \int_0^\infty u e^{-u} du + \theta \int_0^\infty e^{-u} du \\ &= \frac{1}{n} \Gamma(2) + \theta \Gamma(1) = \frac{1}{n} \cdot 1! + \theta \cdot 0! = \theta + \frac{1}{n} \,. \end{align*}\] Hence \(\hat{\theta}_{MVUE} = X_{(1)} - 1/n\).
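
A simulation sketch (with illustrative values \(\theta = 3\) and \(n = 10\)), using the fact that the shifted exponential can be written \(X = \theta + E\) with \(E \sim\) Exp(1):

set.seed(1)
theta <- 3; n <- 10
xmin <- replicate(1e4, min(theta + rexp(n)))
mean(xmin)         # close to theta + 1/n = 3.1
mean(xmin - 1/n)   # the MVUE, close to theta = 3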


  • Problem 9

(a) Since \(X_1\) and \(X_2\) are independent, \(P(X_1 > 1/2 \vert X_2 < 1/2) = P(X_1 > 1/2) = 1/2\).

(b) We have that \[\begin{align*} P\left(X_1 > \frac12 \vert X_1 < \frac34\right) = \frac{P(X_1 > 1/2 \cap X_1 < 3/4)}{P(X_1 < 3/4)} = \frac{P(1/2 < X_1 < 3/4)}{P(X_1 < 3/4)} = \frac{0.25}{0.75} = \frac13 \,. \end{align*}\]

(c) \(X_1 < 3X_2\) is equivalent to \(X_2 > \frac13 X_1\), i.e., \(X_2\) lies above the line with intercept 0 and slope 1/3. The area of this region is 1 minus the area of the triangle with vertices (0,0), (1,0), and (1,1/3) or \(1 - 1/6\) = 5/6.

(d) We have that \[\begin{align*} P\left(X_2 < X_1 \vert X_2 < \frac12\right) = \frac{P(X_2 < X_1 \cap X_2 < 1/2)}{P(X_2 < 1/2)} = 2P(X_2 < X_1 \cap X_2 < 1/2) \,. \end{align*}\] The probability is the area of the polygon with vertices (0,0), (1,0), (1,1/2), and (1/2,1/2), or 3/8. So \(P\left(X_2 < X_1 \vert X_2 < \frac12\right) = 6/8\) or 3/4.
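
A Monte Carlo check, again assuming \(X_1\) and \(X_2\) are independent Uniform(0,1) random variables:

set.seed(1)
x1 <- runif(1e6); x2 <- runif(1e6)
mean((x2 < x1)[x2 < 1/2])   # approximately 0.75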


  • Problem 10

(a) The expected value indicates that we are on the “yes” line of the confidence interval reference table, hence we want to solve \[\begin{align*} 1 - e^{-n(y_{\rm obs}-\theta)} - (1-\alpha) = 0 \end{align*}\] for \(\theta\): \[\begin{align*} e^{-n(y_{\rm obs}-\theta)} &= \alpha \\ \Rightarrow ~~~ -n(y_{\rm obs}-\theta) &= \log(\alpha) \\ \Rightarrow ~~~ \theta - y_{\rm obs} &= \frac{1}{n}\log(\alpha) \\ \Rightarrow ~~~ \hat{\theta}_L &= y_{\rm obs} + \frac{1}{n}\log(\alpha) \,. \end{align*}\]

(b) Note that although this is a two-tailed test, it is impossible to sample a statistic value less than \(\theta\), so we derive the rejection region boundary as if we are performing an upper-tail test. We are on the “yes” line of the hypothesis test reference table, hence we want to solve \[\begin{align*} 1 - e^{-n(y_{\rm RR}-\theta_o)} - (1-\alpha) = 0 \end{align*}\] for \(y_{\rm RR}\): \[\begin{align*} e^{-n(y_{\rm RR}-\theta_o)} &= \alpha \\ \Rightarrow ~~~ -n(y_{\rm RR}-\theta_o) &= \log(\alpha) \\ \Rightarrow ~~~ y_{\rm RR} - \theta_o &= -\frac{1}{n}\log(\alpha) \\ \Rightarrow ~~~ y_{\rm RR} &= \theta_o - \frac{1}{n}\log(\alpha) \,. \end{align*}\]


  • Problem 11

(a) We utilize the Law of the Unconscious Statistician to derive the moment-generating function: \[\begin{align*} m_X(t) = E[e^{tX}] = \sum_x e^{tx}p_X(x) = \frac{1}{2}\left[ e^{t} + e^{2t}\right] = \frac{e^t}{2}\left[ 1 + e^{t}\right] \,. \end{align*}\]

(b) To derive the variance, we take the first two derivatives of \(m_X(t)\), set \(t\) to zero in each, and apply the shortcut formula: \[\begin{align*} E[X] &= \frac{d m_{X}(t)}{dt}\bigg|_0 = \frac{e^t}{2} e^t + \frac{e^t}{2}\left[ 1 + e^{t}\right]\bigg|_0 = \frac{1}{2}\cdot 1 + \frac{1}{2}(1+1) = \frac{3}{2} \\ E[X^2] &= \frac{d^2 m_{X}(t)}{dt^2}\bigg|_0 = \frac{2e^{2t}}{2} + \frac{e^te^t}{2} + \frac{e^t}{2}\left[ 1 + e^{t}\right] \bigg|_0 = 1 + \frac{1}{2} + 1= \frac{5}{2} \\ \Rightarrow ~~~ V[X] &= E[X^2] - E[X]^2 = \frac{5}{2}- \left(\frac{3}{2}\right)^2 = \frac{1}{4} \,. \end{align*}\]


  • Problem 12

(a) We utilize a general transformation here: \[\begin{align*} F_U(u) = P(U \leq u) = P(\sqrt{X} \leq u) = P(X \leq u^2) = \int_0^{u^2} f(x) dx = u^2 \,. \end{align*}\] Thus \[\begin{align*} f_U(u) = \frac{d}{du}F_U(u) = 2u \end{align*}\] for \(u \in [0,1]\).

(b) \(f_U(u) = 2u\) for \(0 \leq u \leq 1\) is a beta distribution: \(U \sim\) Beta(2,1).


  • Problem 13

(a) If \(x = 0\), then \(u = x^2 = 0\); if \(x = 1\), then \(u = x^2 = 1\). Also, the function is one-to-one, so we know that \(u\) does not stray outside these endpoints. Thus \(u \in [0,1]\).

(b) We utilize a general transformation to determine that \[\begin{align*} F_U(u) = P(U \leq u) = P(X^2 \leq u) = P(X \leq \sqrt{u}) = \int_0^{\sqrt{u}} 1 dx = \sqrt{u} \,. \end{align*}\]

(c) The probability density function is the derivative of the cumulative distribution function: \[\begin{align*} f_U(u) = \frac{d}{du} F_U(u) = \frac{1}{2} u^{-1/2} \,. \end{align*}\]

(d) The expected value is \[\begin{align*} E[U] = \int_0^1 u f_U(u) du = \frac{1}{2} \int_0^1 u^{1/2} du = \frac12 \frac23 \left. u^{3/2} \right|_0^1 = \frac13 \,. \end{align*}\]


  • Problem 14

(a) A probability density function for a continuous distribution is the derivative of its cumulative distribution function, so \[\begin{align*} f_X(x) = \frac{d}{dx} F_X(x) &= \frac{d}{dx} c\left(1-e^{-x/\theta}\right) = -ce^{-x/\theta}\left(-\frac{1}{\theta}\right) = \frac{c}{\theta}e^{-x/\theta} \,. \end{align*}\] This is a truncated exponential distribution.

(b) We sample a single datum, so our statistic is \(X\). We are constructing an upper-tail test where \(E[X]\) decreases with \(c\) (as \(c \rightarrow \infty\), \(-\theta\log(1-1/c) \rightarrow 0\), from the right). So we are on the “no” line of the hypothesis test reference table: \(q = \alpha\). \[\begin{align*} c_o\left(1-e^{-x_{\rm RR}/\theta}\right) - \alpha &= 0 \\ \Rightarrow ~~~ \left(1-e^{-x_{\rm RR}/\theta}\right) &= \frac{\alpha}{c_o} \\ \Rightarrow ~~~ e^{-x_{\rm RR}/\theta} &= 1-\frac{\alpha}{c_o} \\ \Rightarrow ~~~ -\frac{x_{\rm RR}}{\theta} &= \log\left(1-\frac{\alpha}{c_o}\right) \\ \Rightarrow ~~~ x_{\rm RR} &= -\theta\log\left(1-\frac{\alpha}{c_o}\right) \,. \end{align*}\]

(c) The power can be (essentially) read directly off of the hypothesis test reference table: \[\begin{align*} power(\theta) = F_Y(y_{\rm RR} \vert \theta) ~~~ \Rightarrow ~~~ power(c) = c\left(1 - e^{-x_{\rm RR}/\theta}\right) \,. \end{align*}\] There is no “trivial” rejection here; that could only arise if \(c < c_o\).

(d) If we observed a value of \(X\) in the range \((-\theta\log(1-1/c_o),\infty)\) (or \((x_c,\infty)\)), we would trivially reject the null: it is impossible to sample a value of \(X\) in this range if the null is correct.

(e) \(x_c = -\theta \log(1-1/c)\) must be larger than the datum with the largest value, so the sufficient statistic is \(X_{(n)}\).


  • Problem 15

(a) The maximum likelihood estimate is the sufficient statistic \(X\).

(b) We have that \[\begin{align*} E[X] &= \int_0^\theta x \frac{2}{\theta} \left( 1 - \frac{x}{\theta} \right) dx = \frac{1}{\theta} \left. x^2 \right|_0^\theta - \frac{1}{\theta^2} \left. \frac{2x^3}{3} \right|_0^\theta = \frac{1}{\theta} \theta^2 - \frac{1}{\theta^2} \frac{2\theta^3}{3} = \theta - \frac23 \theta = \frac13 \theta \,. \end{align*}\]

(c) The bias is \(E[\hat{\theta}_{MLE}-\theta] = E[X] - \theta = -2\theta/3\).

(d) Since \(E[X] = \theta/3\), \(E[3X] = \theta\)…and thus \(\hat{\theta}_{MVUE} = 3X\).

(e) One can compute this using the MLE result, or using the MVUE result: \[\begin{align*} {\rm MLE}&: (B[\hat{\theta}_{MLE}])^2 + V[\hat{\theta}_{MLE}] = \frac49\theta^2 + \frac{1}{18}\theta^2 = \frac12 \theta^2 \\ {\rm MVUE}&: (B[\hat{\theta}_{MVUE}])^2 + V[\hat{\theta}_{MVUE}] = 0 + V[3X] = 9V[X] = \frac{9}{18}\theta^2 = \frac12 \theta^2 \,. \end{align*}\]

Chapter 6

  • Problem 1

(a) We have that \[\begin{align*} \int_0^1 \left( \int_0^1 dx_2 k (x_1 + x_2^2) \right) dx_1 &= 1 = k \left[ \int_0^1 x_1 \left( \int_0^1 dx_2 \right) dx_1 + \int_0^1 \left( \int_0^1 x_2^2 dx_2 \right) dx_1 \right] \\ &= k \left[ \int_0^1 x_1 dx_1 + \int_0^1 \frac13 dx_1 \right] = k \left[ \left. \frac{x_1^2}{2} \right|_0^1 + \left. \frac{x_1}{3} \right|_0^1 \right] \\ &= k \left( \frac{1}{2} + \frac{1}{3} \right) = k \frac{5}{6} \,. \end{align*}\] Thus \(k = 6/5\).

(b) We have that \[\begin{align*} f_{X_1 \vert X_2}(x_1 \vert x_2) = \frac{f_{X_1,X_2}(x_1,x_2)}{f_{X_2}(x_2)} = \frac{k (x_1 + x_2^2)}{f_{X_2}(x_2)} \,. \end{align*}\] So we need to compute the marginal density: \[\begin{align*} f_{X_2}(x_2) = k \int_0^1 dx_1 (x_1+x_2^2) = k \left[ \int_0^1 x_1 dx_1 + \int_0^1 x_2^2 dx_1 \right] = k \left[ \left. \frac{x_1^2}{2} \right|_0^1 + x_2^2 (\left. x_1\right|_0^1) \right] = k \left( \frac{1}{2} + x_2^2 \right) \,. \end{align*}\] Thus \[\begin{align*} f_{X_1 \vert X_2}(x_1 \vert x_2) = \frac{k (x_1 + x_2^2)}{k ( \frac{1}{2} + x_2^2 )} = \frac{2x_1 + 2x_2^2}{1 + 2x_2^2} \,. \end{align*}\] This may initially appear strange (in that if \(x_1 > 1/2\), \(f_{X_1 \vert X_2}(x_1 \vert x_2) > 1\)), but we simply need to remind ourselves that \(f_{X_1 \vert X_2}(x_1 \vert x_2)\) is a conditional probability density function, not a probability itself.


  • Problem 2

(a) Cov(\(X_1,X_2\)) = \(E[X_1X_2] - E[X_1]E[X_2]\) = \(1 \cdot 1 \cdot 0.1 - (1 \cdot 0.4 + 1 \cdot 0.1)^2 = 0.1 - 0.25 = -0.15\).

(b) \(\rho\) = Cov(\(X_1,X_2\))/(\(\sigma_1\sigma_2\)), where \[\begin{align*} \sigma_1 = \sqrt{E[X_1^2] - (E[X_1])^2} = \sqrt{0.5 - (0.5)^2} = \sqrt{0.25} = 0.5 = \sigma_2 \,. \end{align*}\] So \(\rho = -0.15/0.5/0.5 = -0.15/0.25 = -0.6\).

(c) \(E[X_1 \vert X_2 < 1]\) is equivalent to \(E[X_1 \vert X_2 = 0]\), i.e., the expected value for data drawn from the first row of the given table. \[\begin{align*} E[X_1 \vert X_2 = 0] &= \sum_{x_1=0}^1 x_1 p(x_1 \vert x_2=0) = \sum_{x_1=0}^1 x_1 \frac{p(x_1,x_2=0)}{p_{X_2}(x_2=0)} \\ &= \frac{0 \cdot p(x_1=0,x_2=0) + 1 \cdot p(x_1=1,x_2=0)}{p(x_1=0,x_2=0)+p(x_1=1,x_2=0)} = \frac{0.4}{0.5} = 0.8 \,. \end{align*}\] This answer could be reasoned out by inspecting the table.

(d) We have that \[\begin{align*} V[X_2 \vert X_1=1] &= E[X_2^2 \vert X_1=1] - (E[X_2\vert X_1=1])^2 \\ &= \sum_{x_2=0}^1 x_2^2 p(x_2 \vert x_1=1) - \left[\sum_{x_2=0}^1 x_2 p(x_2 \vert x_1=1)\right]^2 \\ &= \sum_{x_2=0}^1 x_2^2 \frac{p(x_1=1,x_2)}{p_{X_1}(x_1=1)} - \left[\sum_{x_2=0}^1 x_2 \frac{p(x_1=1,x_2)}{p_{X_1}(x_1=1)}\right]^2 \\ &= 1 \cdot \frac{p(x_1=1,x_2=1)}{p(x_1=1,x_2=0)+p(x_1=1,x_2=1)} - \left[1 \cdot \frac{p(x_1=1,x_2=1)}{p(x_1=1,x_2=0)+p(x_1=1,x_2=1)}\right]^2 \\ &= \frac{0.1}{0.5} - \left(\frac{0.1}{0.5}\right)^2 = 0.2 - 0.04 = 0.16 \,. \end{align*}\] This answer could also be reasoned out by inspecting the table.


  • Problem 3

(a) We have that \(X \vert p\) \(\sim\) Bin(\(n,p\)) and that \(p \sim\) Uniform(0,0.1). The expected value of \(X \vert p\) is \(np\) and the expected value of \(p\) is \((0+0.1)/2 = 0.05\). Thus \[\begin{align*} E[X] = E[E[X \vert p]] = E[np] = nE[p] = 0.05n \,. \end{align*}\]

(b) We have that \[\begin{align*} V[X] &= E[V[X \vert p]] + V[E[X \vert p]] = E[np(1-p)] + V[np] = n(E[p] - E[p^2]) + n^2V[p] \\ &= n[E[p] - (V[p]+E[p]^2)] + n^2V[p] = n(n-1)V[p] + nE[p] - n(E[p])^2 \,. \end{align*}\]


  • Problem 4

(a) The area of integration lies between the \(x_1\) axis, the \(x_2\) axis, and the line \(x_2 = 1-x_1\), in the first quadrant. \[\begin{align*} 1 &= \int_0^1 \int_0^{1-x_1} k x_1^2 x_2 dx_2 dx_1 = k \int_0^1 x_1^2 \int_0^{1-x_1} x_2 dx_2 dx_1 = k \int_0^1 x_1^2 \left( \left. \frac{x_2^2}{2}\right|_0^{1-x_1}\right) dx_1 \\ &= \frac{k}{2} \int_0^1 x_1^2 (1-x_1)^2 dx_1 = \frac{k}{2} B(3,3) = \frac{k \Gamma(3) \Gamma(3)}{2 \Gamma(6)} = \frac{4k}{240} \,. \end{align*}\] So \(k = 60\).

(b) \(P(X_1 > 0.25 \vert X_2 = 0.5) = \int_{0.25}^{0.5} f(x_1 \vert x_2=0.5)dx_1\). \(f_{X_1 \vert X_2}(x_1 \vert x_2) = f_{X_1,X_2}(x_1,x_2)/f_{X_2}(x_2)\), and \[\begin{align*} f_{X_2}(x_2) = \int_0^{1-x_2} 60 x_1^2 x_2 dx_1 = 60 x_2 \left( \left.\frac{x_1^3}{3}\right|_0^{1-x_2}\right) = 20x_2(1-x_2)^3 \,. \end{align*}\] So: \[\begin{align*} f_{X_1 \vert X_2}(x_1 \vert x_2) = \frac{60x_1^2x_2}{20x_2(1-x_2)^3} = \frac{3x_1^2}{(1-x_2)^3} \,. \end{align*}\] Plugging in \(x_2 = 0.5\), we get \(24x_1^2\). Finally: \[\begin{align*} \int_{1/4}^{1/2} 24 x_1^2 dx_1 = \left.8 x_1^3\right|_{1/4}^{1/2} = 7/8 \,. \end{align*}\]

(c) The region over which \(f_{X_1 \vert X_2}(x_1 \vert x_2)\) is non-zero is not rectangular: \(X_1\) and \(X_2\) are dependent random variables.


  • Problem 5

(a) We sum over rows:

| \(x_2\) | 0 | 1 | 2 |
| --- | --- | --- | --- |
| \(p_{X_2}(x_2)\) | 0.4 | 0.4 | 0.2 |

(b) We have that

| \(x_2\) | \((-\infty,0)\) | \([0,1)\) | \([1,2)\) | \([2,\infty)\) |
| --- | --- | --- | --- | --- |
| \(F_{X_2}(x_2)\) | 0 | 0.4 | 0.8 | 1 |

(c) We have that \[\begin{align*} E[X_2] &= \sum_{x_2} x_2 p_{X_2}(x_2) = 0 \cdot 0.4 + 1 \cdot 0.4 + 2 \cdot 0.2 = 0.8 \\ E[X_2^2] &= \sum_{x_2} x_2^2 p_{X_2}(x_2) = 0^2 \cdot 0.4 + 1^2 \cdot 0.4 + 2^2 \cdot 0.2 = 1.2 \end{align*}\] so \(V[X_2] = E[X_2^2] - (E[X_2])^2 = 0.56\) and \(\sigma = \sqrt{0.56} = 0.748\).


  • Problem 6

The region of integration lies between the \(x_1\) axis, the \(x_2\) axis, and the line \(x_2 = 1-x_1\), in the first quadrant.

(a) \({\rm Cov}(X_1,X_2) = E[X_1X_2] - E[X_1]E[X_2]\)…so we need to compute \(E[X_1X_2]\): \[\begin{align*} E[X_1X_2] &= \int_0^1 \int_0^{1-x_1} x_1 x_2 60 x_1^2 x_2 dx_2 dx_1 = 60 \int_0^1 x_1^3 \int_0^{1-x_1} x_2^2 dx_2 dx_1 \\ &= 60 \int_0^1 x_1^3 \frac{(1-x_1)^3}{3} dx_1 = 20 B(4,4) = 20 \frac{3! 3!}{7!} = 1/7 \,. \end{align*}\] Hence Cov(\(X_1,X_2\)) = 1/7 - 1/2(1/3) = \(-1/42\).

(b) We have that \[\begin{align*} {\rm Corr}(X_1,X_2) = \frac{{\rm Cov}(X_1,X_2)}{\sigma_{X_1}\sigma_{X_2}} = \frac{-1/42}{\sqrt{2/7-1/4} \sqrt{1/7-1/9}} = \cdots = -1/\sqrt{2} \,. \end{align*}\]

(c) \(V[X_1-2X_2] = V[X_1] + 4V[X_2] - 2 \cdot 2 \cdot {\rm Cov}(X_1,X_2) = 1/28 + 8/63 + 4/42 = \cdots = 65/252\).


  • Problem 7

Let \(N\) be the number of laid eggs: \(N \sim\) Poi(\(\lambda\)), and \(E[N] = V[N] = \lambda\). Let \(X\) be the number of hatched eggs: \(X \vert N \sim\) Bin(\(N,p\)), and \(E[X \vert N] = Np\) and \(V[X \vert N] = Np(1-p)\).

(a) \(E[X] = E[E[X\vert N]] = E[Np] = pE[N] = \lambda p\).

(b) We have that \[\begin{align*} V[X] &= V[E[X\vert N]] + E[V[X \vert N]] = V[Np] + E[Np(1-p)] \\ &= p^2V[N] + p(1-p)E[N] = \lambda p^2 + \lambda p - \lambda p^2 = \lambda p \,. \end{align*}\]
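
A Monte Carlo check of both results, with illustrative values \(\lambda = 6\) and \(p = 0.3\):

set.seed(1)
lambda <- 6; p <- 0.3
N <- rpois(1e5, lambda)
X <- rbinom(1e5, size = N, prob = p)
mean(X); var(X)   # both approximately lambda * p = 1.8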


  • Problem 8

(a) \(X_1\), \(X_2\) are independent \(\Rightarrow f_{X_1,X_2}(x_1, x_2) = f_{X_1}(x_1) f_{X_2}(x_2) = \left[ k_1 x_1 e^{-x_1/2}\right]\left[k_2x_2(1-x_2) \right]\). \(X_1 \sim \text{Gamma}(2,2) \left[ \text{or } \chi^2_4\right]\), and \(X_2 \sim \text{Beta}(2,2)\), thus \[\begin{align*} k = k_1 k_2 = \frac{1}{\beta_1^{\alpha_1}\Gamma(\alpha_1)}\underbrace{\frac{\Gamma(\alpha_2 + \beta_2)}{\Gamma(\alpha_2)\Gamma(\beta_2)}}_{1/B(\alpha_2,\beta_2)} = \frac{1}{2^21!}\frac{3!}{1!1!} = \frac{3}{2} \,. \end{align*}\]

(b) We have that \[\begin{align*} V[X_1 - X_2] &= V[X_1] + V[X_2] = \underbrace{\alpha_1 \beta_1^2}_{\text{Gamma}} + \underbrace{\alpha_2\beta_2/\left[(\alpha_2 + \beta_2)^2(\alpha_2 + \beta_2 + 1)\right]}_{\text{Beta}}\\ &= 8 + 4/(16 \cdot 5) = 8 + \frac{1}{20} = \frac{161}{20} \,. \end{align*}\]


  • Problem 9

(a) \(X_1\) and \(X_2\) are uniformly distributed. In the \((X_1,X_2)\) plane, the region where \(f_{X_1,X_2}(x_1, x_2)>0\) is a right trapezoid, with long base running from \((0,0)\) to \((2,0)\) and short base running from \((0,1)\) to \((1,1)\). The slanted side connecting \((1,1)\) and \((2,0)\) lies on the line \(X_2 = -X_1 + 2\), or \(X_1 = 2 - X_2\). Thus \[\begin{align*} f_{X_1,X_2}(x_1, x_2) = \frac{1}{\text{area of the region for which }f_{X_1,X_2}(x_1, x_2)>0 }= \frac{1}{3/2} = \frac{2}{3} \,. \end{align*}\]

(b) \(f_{X_2}(x_2) = \int_{x_1 = 0}^{x_1 = 2-x_2} k \, dx_1 = k \left[ x_1|_0^{2-x_2}\right] = k(2-x_2)\) for \(x_2 \in [0,1]\).

(c) \(f_{X_1 | X_2}(x_1 | x_2) = f_{X_1,X_2}(x_1, x_2)/f_{X_2}(x_2) = 1/(2-x_2)\), for \(x_1 \in [0,2-x_2]\) and \(x_2 \in [0,1]\).

(d) We have that \[\begin{align*} E[X_1] =& \int_0^1 \int_{x_1 = 0}^{x_1 = 2-x_2} x_1 k dx_1 dx_2 = k \int_0^1 \left[\frac{x_1^2}{2}\bigg|_{0}^{2-x_2} \right] dx_2 \\ =& k \int_0^1 \frac{1}{2} (2-x_2)^2 dx_2 = \frac{k}{2} \int_0^1 (4 - 4x_2 + x_2^2) dx_2\\ =& \frac{k}{2} \left[ 4x_2\bigg|_0^1 - 2x_2^2\bigg|_0^1 + \frac{x_2^3}{3}\bigg|_0^1\right] = \frac{k}{2}\left(4 -2 +\frac{1}{3}\right) = \frac{7}{6}k = \frac{7}{6}\cdot\frac{2}{3} = \frac{7}{9} \,. \end{align*}\]

(e) They are dependent since the region over which \(f_{X_1,X_2}(x_1, x_2) >0\) is non-rectangular.


  • Problem 10

(a) \(p \sim {\rm Beta}(2,2)\), and \(X|p \sim {\rm Bin}(5,p)\), thus \[\begin{align*} E\left[ E\left[ X|p\right] \right] = E[5p] = 5E[p] = 5\frac{\alpha}{\alpha + \beta} = 2.5 \,. \end{align*}\]

(b) We have that \[\begin{align*} V[X] &= V\left[ E\left[ X|p\right] \right] + E\left[ V\left[ X|p\right] \right] = V[5p] + E[5p(1-p)] = 25V[p] + 5\left[E[p] - E[p^2] \right]\\ &= 25V[p] + 5E[p] -5\left[ V[p] + (E[p])^2 \right] = 20V[p] + 5E[p] -5(E[p])^2\\ &= 20 \frac{\alpha \beta}{(\alpha + \beta)^2(\alpha + \beta + 1)} + 5\frac{\alpha}{\alpha + \beta} - 5\left(\frac{\alpha}{\alpha + \beta}\right)^2\\ &= 20 \frac{4}{16 \cdot 5} + 2.5 -1.25 = 1 + 2.5 - 1.25 = 2.25 \,. \end{align*}\]
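
A Monte Carlo check of both moments, using the distributions specified in the problem:

set.seed(1)
p <- rbeta(1e5, 2, 2)
X <- rbinom(1e5, size = 5, prob = p)
mean(X); var(X)   # approximately 2.5 and 2.25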


  • Problem 11

The region where \(f_{X_1,X_2}(x_1, x_2)>0\) can be represented by a triangle with vertices (0,0), (1,0), and (1,1). Thus: \[\begin{align*} E[X_1 X_2] &= \int_0^1 \left( \int_0^{x_1} x_1 x_2 (3 x_1) dx_2 \right) dx_1 = \int_0^1 3x_1^2 \left( \int_0^{x_1} x_2 dx_2\right) dx_1 \\ &= \int_0^1 3x_1^2 \frac{x_1^2}{2} dx_1 = \frac{3}{2} \int_0^1x_1^4 dx_1 = \frac{3}{10} x_1^5\bigg|_0^1 = \frac{3}{10} \,. \end{align*}\] Therefore \[\begin{align*} \rho = \frac{{\rm Cov}(X_1,X_2)}{\sqrt{V[X_1] V[X_2]}} = \frac{E[X_1X_2] - E[X_1] E[X_2]}{\sqrt{V[X_1] V[X_2]}} = \frac{\frac{3}{10} - \frac{9}{32}}{\sqrt{\frac{3}{80}\frac{19}{320}}} = 0.397 \,. \end{align*}\]


  • Problem 12

The region of integration lies in the first quadrant, below the line \(x_2 = x_1\).

(a) We have that \[\begin{align*} \int_0^\infty \left[ \int_0^{x_1} k e^{-x_1} dx_2 \right] dx_1 = k \int_0^\infty e^{-x_1} \int_0^{x_1} dx_2 dx_1 = k \int_0^\infty x_1 e^{-x_1} dx_1 = k \Gamma(2) = 1 \Rightarrow k = 1 \,. \end{align*}\]

(b) We have that \[\begin{align*} P(X_2 < 1) &= \int_0^1 \left[ \int_{x_2}^\infty e^{-x_1} dx_1 \right] dx_2 = \int_0^1 \left(-\left.e^{-x_1}\right|_{x_2}^\infty \right) dx_2 \\ &= \int_0^1 e^{-x_2} dx_2 = -\left.e^{-x_2}\right|_0^1 = -(e^{-1}-1) = 1-e^{-1} \,. \end{align*}\]

(c) The marginal distribution is \[\begin{align*} f_{X_2}(x_2) = \int_{x_2}^\infty e^{-x_1} dx_1 = -\left.e^{-x_1}\right|_{x_2}^\infty = -(0-e^{-x_2}) = e^{-x_2} \,, \end{align*}\] or, in full, \[\begin{align*} f_{X_2}(x_2) = \left\{ \begin{array}{cl} e^{-x_2} & x_2 \geq 0 \\ 0 & \mbox{otherwise} \end{array} \right. \,. \end{align*}\]

(d) The conditional pdf is \[\begin{align*} f_{X_1 \vert X_2}(x_1 \vert x_2) = \frac{f_{X_1,X_2}(x_1,x_2)}{f_{X_2}(x_2)} = \frac{e^{-x_1}}{e^{-x_2}} = e^{x_2-x_1} \,, \end{align*}\] or, in full, \[\begin{align*} f_{X_1 \vert X_2}(x_1 \vert x_2) = \left\{ \begin{array}{cl} e^{x_2-x_1} & x_1 \geq x_2 \geq 0 \\ 0 & \mbox{otherwise} \end{array} \right. \,. \end{align*}\]

(e) We have that \[\begin{align*} P(X_1 > 2 \vert X_2 = 1) &= \int_2^\infty f_{X_1 \vert X_2}(x_1 \vert x_2=1) dx_1 = \int_2^\infty e^{1-x_1} dx_1 \\ &= e^1 \left(-\left.e^{-x_1}\right|_2^\infty\right) = e^1(-(0-e^{-2})) = e^{-1} \,. \end{align*}\]


  • Problem 13

We are given that \(X|\lambda \sim\) Poisson\((\lambda)\), and \(\lambda \sim\) NegBinom\((4,1/2)\). Thus \(E[X|\lambda] = \lambda\) and \(E[\lambda] = \frac{r(1-p)}{p} = 4\), and \(V[X|\lambda] = \lambda\) and \(V[\lambda] = \frac{r(1-p)}{p^2} = \frac{2}{1/4} = 8\).

(a) \(E[X] = E\left[ E\left[ X|\lambda\right] \right] = E[\lambda] = 4\).

(b) \(V[X] = E\left[ V\left[ X|\lambda\right] \right] + V\left[ E\left[ X|\lambda\right] \right] = E[\lambda] + V[\lambda] = 4 + 8 = 12\).


  • Problem 14

The region where \(f_{X_1,X_2}(x_1, x_2)\) is positive can be described as the union of two right triangles: the first with vertices \((-2,0)\), \((-2, 2)\), and \((0,0)\), and the second with vertices \((2,0)\), \((2, 2)\), and \((0,0)\).

(a) The region is not rectangular: \(X_1, X_2\) are not independent.

(b) Uniformity \(\Rightarrow k = \frac{1}{\text{geometric area}} = \frac{1}{4}\).

(c) We have that \[\begin{align*} E[X_2] &= \int_0^2 \left[\int_{-2}^{-x_2} \frac{x_2}{4} dx_1 + \int_{x_2}^{2} \frac{x_2}{4} dx_1 \right] dx_2 = 2\int_0^2 \left[\int_{x_2}^{2} \frac{x_2}{4} dx_1 \right] dx_2\\ &=\frac{1}{2}\int_0^2 x_2 \left[\int_{x_2}^{2} dx_1 \right] dx_2 =\frac{1}{2}\int_0^2 x_2(2-x_2) dx_2 = \frac{1}{2} \left[ x_2^2\bigg|_0^2 - \frac{x_2^3}{3}\bigg|_0^2\right] \\ &= \frac{1}{2} \left[ 4 - \frac{8}{3}\right]= \frac{2}{3} \,. \end{align*}\]


  • Problem 15

(a) We have that \[\begin{align*} \int_0^1 \int_0^1 f_{X_1,X_2}(x_1, x_2) dx_1 dx_2 = 1 = k \left[ \int_0^1 x_1 dx_1 \int_0^1 x_2 dx_2 \right] = k \left[ \frac{x_1^2}{2}\bigg|_0^1 \frac{x_2^2}{2}\bigg|_0^1 \right] = \frac{k}{4} \,. \end{align*}\] Therefore \(k = 4\).

(b) \(X_1, X_2\) are independent, so \[\begin{align*} P(X_2 > 1/2|X_1 = 1/2) &= P(X_2 > 1/2) = \int_{1/2}^1 \left[ \int_0^1 4 x_1 x_2 dx_1\right] dx_2\\ &= \int_{1/2}^1 x_2 \left[ \int_0^1 4 x_1 dx_1\right] dx_2 = \int_{1/2}^1 2 x_2 dx_2 = 2 \left( \frac{x_2^2}{2}\bigg|_{1/2}^1\right) \\ &= 1^2 - (1/2)^2 = \frac{3}{4} \,. \end{align*}\]

(c) \(X_1, X_2\) are independent, so Cov\((X_1, X_2) = 0\).


  • Problem 16

We have that \[\begin{align*} V[U] = a^T\Sigma a = \begin{bmatrix} 2 & -1 \end{bmatrix} \begin{bmatrix} 3 & 2\\ 2 & 3 \end{bmatrix} \begin{bmatrix} 2 \\ -1 \end{bmatrix} = \begin{bmatrix} 2 & -1 \end{bmatrix} \begin{bmatrix} 4 \\ 1 \end{bmatrix} = 8 - 1 =7 \,. \end{align*}\] Alternatively, \[\begin{align*} V[U] = a_1^2V[X_1] + a_2^2 V[X_2] + 2a_1a_2 \text{Cov}(X_1, X_2) = 4 \cdot 3 + 1 \cdot 3 + 2 (2)(-1)(2) = 15-8 = 7 \,. \end{align*}\]
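
We can verify the matrix arithmetic directly in R:

a <- c(2, -1)
Sigma <- matrix(c(3, 2, 2, 3), nrow = 2)
t(a) %*% Sigma %*% a   # 7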


  • Problem 17

Before we begin, we can write down that \(X \sim \text{Bernoulli}(1/2)\), so \(E[X] = \frac{1}{2}\) and \(V[X] = \frac{1}{2}\left( 1 - \frac{1}{2}\right) = \frac{1}{4}\), and that \(U \vert X \sim \text{Uniform}(X,2)\), so \(E[U \vert X] = \frac{X+2}{2}\) and \(V[U \vert X] = \frac{1}{12}\left( 2 - X \right)^2\).

We have that \[\begin{align*} V[U] &= E\left[V \left[ U|X \right] \right] + V\left[E \left[ U|X \right] \right] \\ &= E\left[\frac{1}{12}\left( 2 - X \right)^2 \right] + V\left[\frac{X+2}{2} \right] \,, \end{align*}\] where \[\begin{align*} V\left[\frac{X+2}{2}\right] = V\left[\frac{X}{2}\right] = \frac{1}{4}V[X] = \frac{1}{16} \end{align*}\] and \[\begin{align*} E\left[\frac{1}{12}\left( 2 - X \right)^2 \right] &= \frac{1}{12} E[4 - 4X + X^2] = \frac{1}{3} - \frac{1}{6} + \frac{1}{12}\left[V[X] + (E[X])^2 \right]\\ &= \frac{1}{6} + \frac{1}{12} \left[ \frac{1}{4} + \frac{1}{4}\right] = \frac{4}{24} + \frac{1}{24} = \frac{5}{24} \,. \end{align*}\] Thus \(V[U] = 5/24 + 1/16 = 13/48\).
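
A Monte Carlo check of the final answer, using the two distributions given in the problem:

set.seed(1)
X <- rbinom(1e6, size = 1, prob = 1/2)
U <- runif(1e6, min = X, max = 2)
var(U); 13/48   # both approximately 0.271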


  • Problem 18

(a) Is the region rectangular? Yes. And \(f_{X_1,X_2}(x_1,x_2) = f_{X_1}(x_1) f_{X_2}(x_2) = f_{X_1}(x_1)\), with \(f_{X_2}(x_2) = 1\), so \(X_1, X_2\) are independent random variables.

(b) \(f_{X_1,X_2}(x_1, x_2) =f_{X_1}(x_1) f_{X_2}(x_2) = 12 x_1^2(1-x_1) \cdot 1\), therefore \(f_{X_2}(x_2) = 1\) for \(x_2 \in [0,1]\), or \(X_2 \sim \text{Uniform}(0,1)\). Other ways to derive this result include \[\begin{align*} f_{X_2}(x_2) &= \underbrace{\int_0^1 12 x_1^2(1-x_1) dx_1}_{\text{Beta}(3,2)} = 1 \\ &= \int_0^1 12 x_1^2(1-x_1) dx_1 = 12 \left[ \frac{x_1^3}{3}\bigg|_0^1 - \frac{x_1^4}{4}\bigg|_0^1 \right] = \frac{12}{12} = 1 \,. \end{align*}\]

(c) \(X_1 \sim \text{Beta}(3,2)\), hence \(E[X_1] = \alpha/(\alpha + \beta) = \frac{3}{5}\). Another way to derive this result is \[\begin{align*} f_{X_1}(x_1) &= \int_0^1 12 x_1^2(1-x_1) dx_2 = 12 x_1^2(1-x_1) \\ E[X_1] &= \int_0^1 x_1 f_{X_1}(x_1) dx_1 = \int_0^1 12 x_1^3(1-x_1) dx_1 = 12 \left[ \frac{x_1^4}{4}\bigg|_0^1 - \frac{x_1^5}{5}\bigg|_0^1 \right] = \frac{12}{20} = \frac{3}{5} \,. \end{align*}\]


  • Problem 19

(a) The area where \(f_{X_1,X_2}(x_1,x_2)>0\) is a triangle with vertices at (0,0), (1,1), and (2,0). For a bivariate uniform, \(c = 1/\)(the area of the region), so \(c = 1\). Another way to derive this is via brute force: \[\begin{align*} \int_0^1 \left[\int_{x_2}^{2 - x_2} c dx_1\right] dx_2 &= c \int_0^1 (2- x_2) - x_2 dx_2\\ &= c \int_0^1 2(1- x_2) dx_2 = 2c \left[ x_2\bigg|_0^1 - \frac{x_2^2}{2}\bigg|_0^1 \right] = c = 1 \,. \end{align*}\]

(b) \(f_{X_2}(x_2) = \int_{x_2}^{2 - x_2} dx_1 = 2(1- x_2)\) for \(x_2 \in [0,1]\).

(c) \(f_{X_1|X_2}(x_1|x_2) =\frac{f_{X_1,X_2}(x_1,x_2)}{f_{X_2}(x_2)} = 1/[2(1-x_2)]\) for \(x_1 \in [x_2, 2-x_2]\) and \(x_2 \in [0,1]\).


  • Problem 20

(a) We can recognize immediately that \(X_1,X_2\) are independent, thus \(f_{X_1}(x_1) = 2x_1\) for \(x_1 \in [0,1]\). We can also derive this result via brute force: \[\begin{align*} f_{X_1}(x_1) = \int_0^1 4 x_1 x_2 dx_2 = 4x_1 \frac{x_2^2}{2}\bigg|_0^1 = 2x_1 \,. \end{align*}\]

(b) We have that \(f_{X_2|X_1}(x_2|x_1) = \frac{f_{X_1,X_2}(x_1,x_2)}{f_{X_1}(x_1)} = \frac{4 x_1 x_2}{2x_1} = 2x_2\) for \(x_1 \in [0,1]\) and \(x_2 \in [0,1]\).

(c) We have that \[\begin{align*} P(X_2 < 1/2 |X_1 = x_1) = \int_0^{1/2} f_{X_2|X_1}(x_2|x_1) dx_2 = \int_0^{1/2}2x_2dx_2 = x_2^2\bigg|_0^{1/2} = \frac{1}{4} \,. \end{align*}\]


  • Problem 21

We begin by noting that \(V[X_1] = E[X_1^2] - E[X_1]^2\), and that \[\begin{align*} E[X_1] &= \int_0^1 \int_0^{1-x_1} 2x_1 dx_2 dx_1 = 2 \int_0^1 x_1(1-x_1) dx_1 = 2B(2,2) = 2 \frac{\Gamma(2)\Gamma(2)}{\Gamma(4)} = 2 \cdot \frac{1!1!}{3!} = \frac{1}{3} \,. \\ E[X_1^2] &= \int_0^1 \int_0^{1-x_1} 2x_1 x_1 dx_2 dx_1 = 2 \int_0^1 x_1^2(1-x_1) dx_1 = 2B(3,2) = 2 \frac{2!1!}{4!} = \frac{1}{6} \,. \end{align*}\] Thus \(V[X_1] = 1/6 - \left( 1/3 \right)^2 = 1/18\).


  • Problem 22

We have that \(V[X_1 - X_2] = V[X_1] + V[X_2] - 2\)Cov\((X_1,X_2)\). So we need to compute every part of the formula above: \[\begin{align*} E[X_1] &= \sum \sum x_1 p_{X_1,X_2}(x_1,x_2) = 1 \cdot \frac{2}{9} + 1 \cdot \frac{2}{9} + 2 \cdot \frac{1}{9} = \frac{6}{9}\\ V[X_1] &= E[X_1^2] - E[X_1]^2 = \frac{8}{9} - \frac{36}{81} = \frac{4}{9}\\ E[X_2] &= \sum \sum x_2 p_{X_1,X_2}(x_1,x_2) = 1 \cdot \frac{3}{9} + 1 \cdot \frac{2}{9} + 1 \cdot \frac{1}{9} = \frac{6}{9}\\ V[X_2] &= E[X_2^2] - E[X_2]^2 = \frac{6}{9} - \frac{36}{81} = \frac{2}{9}\\ E[X_1X_2] &= \sum \sum x_1 x_2 p_{X_1,X_2}(x_1,x_2) = 1 \cdot 1 \cdot \frac{2}{9} + 2 \cdot 1 \cdot \frac{1}{9} = \frac{4}{9}\\ \mbox{Cov}(X_1, X_2) &= \frac{4}{9} - \left( \frac{6}{9}\right)^2 = 0 \,. \end{align*}\] Thus \(V[X_1 - X_2] = 6/9 = 2/3\).


  • Problem 23

We recognize that \(X|\beta \sim \text{Gamma}(\alpha, \beta)\) and that \(\beta \sim \text{Exp}(\gamma)\), with \(E[\beta] = \gamma, V[\beta] = \gamma^2\), and \(E[X|\beta] = \alpha \beta\), \(V[X|\beta] = \alpha \beta^2\).

(a) \(E[X] = E\left[ E\left[ X|\beta\right] \right] = E[\alpha \beta] = \alpha E[\beta] = \alpha \gamma\).

(b) We have that \[\begin{align*} V[X] &= V\left[ E\left[ X|\beta\right] \right] + E\left[ V\left[ X|\beta\right] \right] = E[\alpha \beta^2] + V[\alpha \beta]\\ &= \alpha E[\beta^2] + \alpha^2V[\beta]= \alpha \left[ V[\beta] + E[\beta]^2\right] + \alpha^2 \gamma^2\\ &= \alpha \left[ \gamma^2+\gamma^2\right] + \alpha^2 \gamma^2 = (2\alpha + \alpha^2) \gamma^2 \,. \end{align*}\]


  • Problem 24

The region of integration is a triangle with vertices at (0,0), (1,0), and (1,1).

(a) The area of the triangle is 1/2, so \(f_{X_1,X_2}(x_1,x_2) = 2\).

(b) Geometry will not directly help us here; we still have to integrate. The expected value \(E[X_1]\) is \[\begin{align*} E[X_1] &= \int_0^1 \left[ \int_0^{x_1} 2 x_1 dx_2 \right] dx_1 = 2 \int_0^1 x_1 \left[ \int_0^{x_1} dx_2 \right] dx_1 \\ &= 2 \int_0^1 x_1 x_1 dx_1 = 2 \left.\frac{x_1^3}{3}\right|_0^1 = \frac23 \,. \end{align*}\]

(c) We have that Cov[\(X_1,X_2\)] = \(E[X_1X_2] - E[X_1]E[X_2]\). So we need to compute \(E[X_1X_2]\): \[\begin{align*} E[X_1X_2] &= \int_0^1 \left[ \int_0^{x_1} 2 x_1 x_2 dx_2 \right] dx_1 = 2 \int_0^1 x_1 \left[ \int_0^{x_1} x_2 dx_2 \right] dx_1 \\ &= 2 \int_0^1 x_1 \left[ \left.\frac{x_2^2}{2}\right|_0^{x_1} \right] dx_1 = \int_0^1 x_1 x_1^2 dx_1 = \left.\frac{x_1^4}{4}\right|_0^1 = \frac14 \,. \end{align*}\] So the covariance is \(1/4 - (2/3)(1/3) = 9/36 - 8/36 = 1/36\).


  • Problem 25

(a) The long way to do this involves integration. The short way is to see by inspection that \(X_1\) and \(X_2\) are independent (the region of integration is “rectangular,” and \(x_2\) doesn’t directly appear in \(f_{X_1,X_2}(x_1,x_2)\), so \(f_{X_1,X_2}(x_1,x_2)\) can be trivially split into two functions \(g(x_1)h(x_2)\)), and to see that \[\begin{align*} X_1 \sim {\rm Exp}(\beta = 2) ~{\rm and}~ X_2 \sim {\rm Unif}(0,2) \,. \end{align*}\] We know \(f_{X_1,X_2}(x_1,x_2) = f_{X_1}(x_1)f_{X_2}(x_2) = (1/\beta){\rm exp}(-x_1/2) \cdot (1/2)\), so \(k = 1/(2\beta) = 1/4\).

(b) The key here is to realize that since \(X_1\) and \(X_2\) are independent, \(E[X_1]\) is just the expected value of \(f_{X_1}(x_1)\), i.e., it is the expected value of an exponential distribution with mean 2. So: \(E[X_1] = \beta = 2\).


  • Problem 26

The region of integration is a triangle with vertices (0,0), (2,0), and (1,1). The area of the region of integration is 1, so \(f_{X_1,X_2}(x_1,x_2) = 1\).

(a) We have that \[\begin{align*} f_{X_2}(x_2) &= \int_{x_2}^{2-x_2} dx_1 = 2(1-x_2) ~~~ x_2 \in [0,1] \,. \end{align*}\]

(b) We have that \[\begin{align*} f_{X_1 \vert X_2}(x_1 \vert x_2) &= \frac{f_{X_1,X_2}(x_1,x_2)}{f_{X_2}(x_2)} = \frac{1}{2(1-x_2)} ~~~ x_1 \in [x_2,2-x_2] ~~~ x_2 \in [0,1] \,. \end{align*}\]

(c) The new region of integration is a triangle with vertices at (0,0), (1,0), and (1/2,1/2). Since we are dealing with a bivariate uniform, we know that the probability of sampling data from this triangle is the ratio of its area to the area of the total region of integration for the pdf. Thus \(P(X_1 + X_2 \leq 1) = (1/2 \cdot 1 \cdot 1/2)/1 = 1/4\).


  • Problem 27

We are given that \(X \vert \mu \sim \mathcal{N}(\mu,1)\) and that \(\mu \sim \mathcal{N}(0,1)\). We thus know that \(E[X \vert \mu] = \mu\) and \(V[X \vert \mu] = 1\), while \(E[\mu] = 0\) and \(V[\mu] = 1\). Thus \[\begin{align*} V[X] &= V\left[E[X \vert \mu]\right] + E\left[V[X \vert \mu]\right] = V[\mu] + E[1] = 1 + 1 = 2 \,. \end{align*}\]
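
A Monte Carlo check, using the two normal distributions given in the problem:

set.seed(1)
mu <- rnorm(1e6)
X <- rnorm(1e6, mean = mu, sd = 1)
var(X)   # approximately 2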


  • Problem 28

(a) The conditional expected value is \[\begin{align*} E[X_1 \vert X_2=1] &= (x_1 = 0) \cdot p(x_1=0 \vert x_2=1) + (x_1 = 1) \cdot p(x_1=1 \vert x_2=1) \\ &= 1 \cdot \frac{p(1,1)}{p_2(1)} = 1 \cdot \frac{0.1}{0.3+0.1} = 0.25 \,. \end{align*}\]

(b) The conditional variance is \[\begin{align*} V[X_1 \vert X_2=1] &= E[X_1^2 \vert X_2=1] - (E[X_1 \vert X_2=1])^2 \,. \end{align*}\] We know the second term. The first term can be derived in a manner similar to above: \[\begin{align*} E[X_1^2 \vert X_2=1] &= \ldots = 1^2 \cdot \frac{p(1,1)}{p_2(1)} = 1 \cdot \frac{0.1}{0.3+0.1} = 0.25 \,. \end{align*}\] Thus the conditional variance is \(V[X_1 \vert X_2=1] = 0.25 - 0.25^2 = 1/4 - 1/16 = 3/16\).


  • Problem 29

(a) Because the boundary of the domain is not rectangular (see \(x_1 + x_2 \leq 2\)), \(X_1\) and \(X_2\) are not independent.

(b) We need the distribution to integrate to 1: \[\begin{align*} 1 &= \int_{x_1} \int_{x_2} c x_1^2 dx_2 dx_1 \,. \end{align*}\] The domain is a triangle with vertices (0,0), (2,0), and (0,2), and thus there is no real advantage gained by utilizing either order of integration, so we’ll keep the integral over \(x_2\) as our “inner” integral: \[\begin{align*} \int_0^2 \int_0^{2-x_1} c x_1^2 dx_2 dx_1 &= c \int_0^2 x_1^2 \left( \int_0^{2-x_1} dx_2 \right) dx_1 \\ &= c \int_0^2 x_1^2 (2 - x_1) dx_1 \\ &= c \left( 2 \int_0^2 x_1^2 dx_1 - \int_0^2 x_1^3 dx_1 \right) \\ &= c \left( 2 \left. \frac{x_1^3}{3}\right|_0^2 - \left. \frac{x_1^4}{4}\right|_0^2 \right) \\ &= c \left( \frac{16}{3} - 4 \right) = c\frac{4}{3} \,. \end{align*}\] Hence \(c = 3/4\).

(c) We actually already derived the marginal distribution above: \[\begin{align*} f_{X_1}(x_1) &= c x_1^2 \int_0^{2-x_1} dx_2 \\ &= c x_1^2 (2 - x_1) ~~~ x_1 \in [0,2] \,. \end{align*}\]

(d) The conditional distribution is \[\begin{align*} f_{X_2 \vert X_1}(x_2 \vert x_1) = \frac{f_{X_1,X_2}(x_1,x_2)}{f_{X_1}(x_1)} = \frac{cx_1^2}{cx_1^2(2-x_1)} = \frac{1}{2-x_1} ~~~ x_2 \in [0,2-x_1] \,. \end{align*}\]

(e) Given the domain and the flat pdf, we know that \(f_{X_2 \vert X_1}(x_2 \vert x_1)\) is a “uniform” distribution.


  • Problem 30

(a) We need to determine \(E[X_1]\), \(E[X_2]\), and \(E[X_1X_2]\): \[\begin{align*} E[X_1] &= 1 \cdot (0.1 + 0.5) = 0.6 \\ E[X_2] &= 1 \cdot (0.1 + 0.5) = 0.6 \\ E[X_1X_2] &= 1 \cdot 1 \cdot 0.5 = 0.5 \,. \end{align*}\] Thus Cov(\(X_1,X_2\)) = \(E[X_1X_2] - E[X_1]E[X_2] = 0.5 - 0.6^2 = 0.14\).

(b) Now we need \(E[X_1^2]\) and \(E[X_2^2]\): \[\begin{align*} E[X_1^2] &= 1^2 \cdot (0.1 + 0.5) = 0.6 \\ E[X_2^2] &= 1^2 \cdot (0.1 + 0.5) = 0.6 \,. \end{align*}\] Thus \[\begin{align*} V[X_1] &= E[X_1^2] - (E[X_1])^2 = 0.6 - 0.6^2 = 0.24 \\ V[X_2] &= E[X_2^2] - (E[X_2])^2 = 0.6 - 0.6^2 = 0.24 \,, \end{align*}\] and \(\sigma_1\sigma_2 = \sqrt{V[X_1]V[X_2]} = 0.24\), and \(\rho_{X_1,X_2} = 0.14/0.24 = 7/12\).

(c) We have that \[\begin{align*} V[Y] &= a_1^2 V[X_1] + a_2^2 V[X_2] + 2 a_1 a_2 \mbox{Cov}(X_1,X_2) \\ &= 4 \cdot 0.24 + 1 \cdot 0.24 - 2 \cdot 2 \cdot 1 \cdot 0.14 \\ &= 1.2 - 4 \cdot 0.14 = 0.64 \,. \end{align*}\]