---
title: "Solutions to Homework Assignment 9"
output: html_notebook
---

## Problem 9.48 (2 points)
Let \(Y_1,\dotsc,Y_n\) denote a random sample from a normal distribution
with mean \(\mu\) and variance \(\sigma^2\). In exercise \(9.30\)(b),
we showed that if \(\mu\) is known and \(\sigma^2\) is unknown then
\(U = \sum_{i=1}^{n}(Y_i - \mu)^2\) is sufficient for \(\sigma^2\).
By theorem \(7.2\), \(W = U/\sigma^2\) has a \(\chi^2\)-distribution
with \(\nu = n\) degrees of freedom, so
\[
E[U] = E[\sigma^2 W] = \sigma^2 E[W] = n\sigma^2.
\]
Thus \(\frac{1}{n}U = \frac{1}{n}\sum_{i=1}^{n}(Y_i - \mu)^2\) is an
unbiased estimator for \(\sigma^2\). Since we arrived at the sufficient statistic \(U\)
via the factorization criterion, \(U\) best summarizes the information about
\(\sigma^2\). Therefore \(\frac{1}{n}U\) is an MVUE for \(\sigma^2\).
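
The unbiasedness claim can be checked numerically. The following is a minimal Monte Carlo sketch, not part of the original solution; the function names `sigma2_hat` and `average_estimate` and the parameter values are assumptions chosen only for illustration.

```python
import random


def sigma2_hat(sample, mu):
    """Return U/n = (1/n) * sum((y - mu)^2), with the mean mu treated as known."""
    return sum((y - mu) ** 2 for y in sample) / len(sample)


def average_estimate(mu, sigma, n, reps, seed=0):
    """Average sigma2_hat over `reps` simulated normal samples of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        total += sigma2_hat(sample, mu)
    return total / reps


# Illustrative values only: mu = 1, sigma^2 = 4, n = 5.
print(average_estimate(mu=1.0, sigma=2.0, n=5, reps=20000))
```

If the estimator is unbiased, the printed average should be close to \(\sigma^2 = 4\), up to simulation noise.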

## Problem 9.50 (2 points)
In exercise \(9.32\), \(Y_1,\dotsc,Y_n\) denotes a random sample from a
Rayleigh distribution with parameter \(\theta\). Each \(Y_i\) has
density function
\[
f_Y(y \mid \theta) =
\begin{cases}
\frac{2 y}{\theta} e^{-y^2/\theta}, & y > 0 \\
0, & \textrm{elsewhere}
\end{cases}.
\]
The factorization criterion led to the sufficient statistic
\(U = \sum_{i=1}^{n} Y_i^2\). Note that if \(w = y^2\) then
\[
f_Y(y)\,dy = \frac{2 y}{\theta}e^{-y^2/\theta}\,dy
= \frac{1}{\theta} e^{-w/\theta}\,dw
= f_W(w)\,dw,
\]
which shows that \(W=Y^2\) is an exponential random variable
with mean \(\theta\), thus \(E[Y^2] = E[W] = \theta\).
We therefore have that 
\(E[U] = E[\sum_{i=1}^{n} Y_i^2] = \sum_{i=1}^{n} \theta = n\theta\),
which implies that \(\frac{1}{n}U\) is an unbiased estimator for
\(\theta\). Since \(U\) best summarizes the information about
\(\theta\),
\(\frac{1}{n}U = \frac{1}{n}\sum_{i=1}^{n} Y_i^2\) is an MVUE for \(\theta\).
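
As a sanity check on \(E[Y^2] = \theta\), the integral \(\int_0^\infty y^2 f_Y(y \mid \theta)\,dy\) can be approximated numerically. This is a rough sketch using a midpoint rule; the helper names, the truncation point, and the step count are assumptions made for illustration.

```python
import math


def rayleigh_density(y, theta):
    """Density (2y/theta) * exp(-y^2/theta) for y > 0, and 0 elsewhere."""
    return (2.0 * y / theta) * math.exp(-y * y / theta) if y > 0 else 0.0


def second_moment(theta, steps=200000):
    """Midpoint-rule approximation of E[Y^2] = integral of y^2 f(y | theta)."""
    # Truncate at 10*sqrt(theta); the density there carries a factor e^{-100}.
    upper = 10.0 * math.sqrt(theta)
    h = upper / steps
    total = 0.0
    for k in range(steps):
        y = (k + 0.5) * h
        total += y * y * rayleigh_density(y, theta)
    return total * h


# Illustrative value only: theta = 3.
print(second_moment(3.0))
```

The result should agree with \(\theta\) to several decimal places, which is consistent with \(W = Y^2\) having mean \(\theta\).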

## Problem 9.52 (10 points)
Let \(Y_1, Y_2,\dotsc,Y_n\) denote a random sample from the probability
distribution whose density function is
\[
f_Y(y \mid \theta) =
\begin{cases}
\theta\, y^{\theta - 1}, & 0 < y < 1 ; \theta > 0 \\
0, & \textrm{elsewhere}
\end{cases}.
\]

An exponential family of distributions has a density that can be written
in the form
\[
f(y \mid \theta) =
\begin{cases}
a(\theta)\,b(y)\, \exp\bigl({-c(\theta)\,d(y)}\bigr), & a < y < b \\
0, & \textrm{elsewhere}
\end{cases}.
\]
Applying the factorization criterion we showed, in exercise 9.37,
that \(U = \sum_{i=1}^{n} d(Y_i)\) is a sufficient statistic for
\(\theta\).
<ol type="a">
<li>
Since 
\[
\theta\,y^{\theta - 1} = \theta\,\exp\bigl((\theta - 1)\ln y\bigr) 
= \theta\,\exp\bigl(-(\theta - 1)(-\ln y)\bigr)
\]
we see that \(f_Y(y \mid \theta)\) belongs to an exponential family
with \(d(y) = - \ln y\). Therefore, \(U = -\sum_{i=1}^{n}\ln Y_i\) is 
a sufficient statistic for \(\theta\).
</li>
<li>
Let \(w = - \ln y\), which is equivalent to \(y = e^{-w}\). Then
\(dy = - e^{-w}\,dw\), so \(|dy| = e^{-w}\,dw\) and
\[
f_Y(y \mid \theta)\,|dy| =
\theta\,y^{\theta - 1}\,|dy| =
\theta\,e^{-(\theta - 1) w}\,e^{-w}\,dw =
\theta\,e^{-\theta\,w}\,dw =
f_W(w \mid \theta)\,dw.
\]
Therefore, \(W = -\ln Y\) has an exponential distribution with
expected value \(E[W] = 1/\theta\).
</li>
<li>
Let \(t = 2\theta\,w\), then \(dt = 2\theta\,dw\) and
\[
f_W(w \mid \theta)\,dw =
\theta\,e^{-\theta\,w}\,dw =
\frac{1}{2}e^{-t/2}\,dt =
f_T(t)\,dt.
\]
Thus, \(T = 2\theta\,W\) has a \(\chi^2\)-distribution with \(\nu = 2\)
degrees of freedom. Therefore
\[
2\theta\,\sum_{i=1}^{n} W_i = \sum_{i=1}^{n} T_i
\]
has a \(\chi^2\)-distribution with \(\nu = 2n\) degrees of freedom.
</li>
<li>
Exercise \(4.90\)(d) showed that if \(X\) has a \(\chi^2\)-distribution
with \(\nu > 2\) degrees of freedom
then \(E\left[X^{-1}\right] = 1/(\nu - 2)\). With \(\nu = 2n\) and \(n \ge 2\), this implies
\[
E\left[
\left(2\theta \sum_{i=1}^{n} W_i\right)^{-1}\right] =
\frac{1}{2(n-1)}.
\]
</li>
<li>
From part (d), we see that 
\(E\left[
(n - 1) / \sum_{i=1}^{n} W_i\right] =
\theta
\), making \((n - 1) / \sum_{i=1}^{n} W_i\) an unbiased estimator
for \(\theta\). Since \(\sum_{i=1}^{n} W_i = - \sum_{i=1}^{n} \ln Y_i\)
best summarizes the information about \(\theta\),
\[
\frac{n-1}{\sum_{i=1}^{n} W_i} =
\frac{n-1}{-\sum_{i=1}^{n} \ln Y_i}
\]
is an MVUE for \(\theta\).
</li>
</ol>
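
The unbiasedness of \((n-1)/\bigl(-\sum \ln Y_i\bigr)\) can be illustrated by simulation. The sketch below is not part of the original solution: it assumes inverse-CDF sampling (since \(F_Y(y) = y^{\theta}\) on \((0,1)\), \(Y = U^{1/\theta}\) for uniform \(U\)), and the function names and parameter values are hypothetical.

```python
import math
import random


def theta_hat(sample):
    """Return (n - 1) / (-sum(ln y_i)), the estimator from part (e)."""
    n = len(sample)
    return (n - 1) / (-sum(math.log(y) for y in sample))


def draw_sample(theta, n, rng):
    """Inverse-CDF draws: F(y) = y^theta on (0, 1), so Y = U^(1/theta)."""
    # 1 - rng.random() lies in (0, 1], which avoids taking log(0) later.
    return [(1.0 - rng.random()) ** (1.0 / theta) for _ in range(n)]


def average_estimate(theta, n, reps, seed=0):
    """Average theta_hat over `reps` simulated samples of size n."""
    rng = random.Random(seed)
    return sum(theta_hat(draw_sample(theta, n, rng)) for _ in range(reps)) / reps


# Illustrative values only: theta = 2, n = 10.
print(average_estimate(theta=2.0, n=10, reps=20000))
```

The average should land near \(\theta = 2\). Replacing `n - 1` by `n` in `theta_hat` would be expected to give an average near \(n\theta/(n-1)\) instead, which shows why the \(n - 1\) correction matters.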

## Problem 9.54 (2 points)
In exercise \(9.43\), the factorization criterion shows that
\(Y_{(1)} = \min(Y_1,Y_2,\dotsc,Y_n)\) is a sufficient statistic 
for \(\theta\), where \(Y_1,Y_2,\dotsc,Y_n\) denotes a random sample
taken from a probability distribution whose density function is
\[
f_Y(y \mid \theta) =
\begin{cases}
e^{-(y-\theta)}, & y \ge \theta \\
0 , & \textrm{elsewhere}
\end{cases}.
\]
Notice that
\[
1 - F_Y(y) = P(Y > y) =
\begin{cases}
e^{-(y-\theta)}, & y \ge \theta \\
1 , & \textrm{elsewhere}
\end{cases},
\]
which implies that
\[
P(Y_{(1)} > y) = [1 - F_Y(y)]^n =
\begin{cases}
e^{-n(y - \theta)}, & y \ge \theta \\
1 , & \textrm{elsewhere}
\end{cases}.
\]
Now, let \(X = Y_{(1)} - \theta\). Then
\[
P(X > x) = P(Y_{(1)} > x + \theta) =
\begin{cases}
e^{-n x}, & x \ge 0 \\
1, & \textrm{elsewhere}
\end{cases},
\]
which shows that \(X\) has an exponential probability distribution
with \(\beta = E[X] = 1/n\). Therefore
\[
E[Y_{(1)} - 1/n] = E[\theta + X - 1/n] = \theta + 1/n - 1/n = \theta,
\]
and since \(Y_{(1)}\) best summarizes the information about \(\theta\)
we conclude that \(\hat{\theta} = Y_{(1)} - 1/n\) is an MVUE for \(\theta\).
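
A short simulation can illustrate that subtracting \(1/n\) removes the bias of the sample minimum. This is a sketch under the assumption that \(Y = \theta + E\) with \(E\) a standard exponential variable, which matches the density above; the function names and parameter values are hypothetical.

```python
import random


def theta_hat(sample):
    """Return Y_(1) - 1/n, the bias-corrected sample minimum."""
    return min(sample) - 1.0 / len(sample)


def average_estimate(theta, n, reps, seed=0):
    """Average theta_hat over `reps` simulated samples of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        # Y - theta has density e^{-x} for x >= 0, so Y = theta + Exp(1).
        sample = [theta + rng.expovariate(1.0) for _ in range(n)]
        total += theta_hat(sample)
    return total / reps


# Illustrative values only: theta = 3, n = 5.
print(average_estimate(theta=3.0, n=5, reps=20000))
```

The printed average should be close to \(\theta = 3\), whereas the uncorrected minimum would be expected to average about \(\theta + 1/n = 3.2\).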

## Problem 9.56 (4 points)
Let \(Y_1, Y_2, \dotsc, Y_n\) be a random sample from a normal distribution with
mean \(\mu\) and variance \(\sigma^2 = 1\). 
<ol type="a">
<li>
From exercise \(9.30\)(a), \(\overline{Y}\) is a sufficient statistic that
best summarizes the information about \(\mu\).
Since \(E\left[\overline{Y}\right] = \mu\), \(\overline{Y}\) is an MVUE for \(\mu\).
Now, 
\[
E\left[\overline{Y}^2\right] = V\left[\overline{Y}\right] + \left(E\left[\overline{Y}\right]\right)^2 = 
\frac{1}{n}\sigma^2 + \mu^2 = \frac{1}{n} + \mu^2,
\]
which implies that \(E\left[\overline{Y}^2 - \frac{1}{n}\right] = \mu^2\).
Therefore, \(\widehat{\mu^2} = \overline{Y}^2 - 1/n\) is an unbiased
estimator for \(\mu^2\), and since it is a function of the sufficient statistic
\(\overline{Y}\), \(\overline{Y}^2 - 1/n\) is an MVUE for \(\mu^2\).
</li>
<li>
\[
V\left[\widehat{\mu^2}\right] = V\left[\overline{Y}^2 - 1/n\right] =
V\left[\overline{Y}^2\right] = E\left[\overline{Y}^4\right] - \left(E\left[\overline{Y}^2\right]\right)^2.
\]
Since \(\overline{Y}\) has a normal distribution with \(E[\overline{Y}] = \mu\) and
\(V[\overline{Y}] = 1/n\), the moment generating function for \(\overline{Y}\) is
\(m(t) = \exp[\mu t + t^2 / (2n)]\). With the aid of software, we calculate the 
fourth derivative of \(m(t)\) and evaluate at \(t = 0\) to obtain
\begin{align*}
E\left[\overline{Y}^4\right] &=
m^{(4)}(0) = \mu^4 + 6\mu^2/n + 3/n^2 \\
\implies 
V\left[\widehat{\mu^2}\right] &=
(\mu^4 + 6\mu^2/n + 3/n^2) - (\mu^2 + 1/n)^2 \\
\implies
V\left[\widehat{\mu^2}\right] &= 2 (2 n \mu^2 + 1) / n^2.
\end{align*}
</li>
</ol>
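
The closed form \(2(2n\mu^2 + 1)/n^2\) can be confirmed exactly without differentiating the moment generating function. The sketch below swaps in a different technique from the one used above: the standard recurrence \(E[X^k] = \mu E[X^{k-1}] + (k-1)\sigma^2 E[X^{k-2}]\) for the raw moments of a normal variable. The function names are hypothetical and the numerical inputs are chosen only for illustration.

```python
from fractions import Fraction


def normal_moments(mu, var, order):
    """Raw moments E[X^k], k = 0..order, of X ~ N(mu, var), using
    E[X^k] = mu * E[X^(k-1)] + (k - 1) * var * E[X^(k-2)]."""
    m = [Fraction(1), Fraction(mu)]
    for k in range(2, order + 1):
        m.append(mu * m[k - 1] + (k - 1) * var * m[k - 2])
    return m[: order + 1]


def variance_of_mu2_hat(mu, n):
    """V[Ybar^2 - 1/n] = E[Ybar^4] - (E[Ybar^2])^2 with Ybar ~ N(mu, 1/n)."""
    m = normal_moments(Fraction(mu), Fraction(1, n), 4)
    return m[4] - m[2] ** 2


def closed_form(mu, n):
    """The expression 2 (2 n mu^2 + 1) / n^2 from part (b)."""
    return Fraction(2) * (2 * n * Fraction(mu) ** 2 + 1) / n ** 2


# Illustrative values only: mu = 3, n = 5.
print(variance_of_mu2_hat(3, 5), closed_form(3, 5))
```

Because the arithmetic uses `Fraction`, agreement between the two printed values is exact rather than approximate.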

## Problem 9.57 (6 points)
Let \(Y_1, Y_2, \dotsc, Y_n\) be independent Bernoulli random variables with
\[
p(y_i \mid p) = p^{y_i} (1 - p)^{1 - y_i}, \quad y_i = 0, 1.
\]
<ol type="a">
<li>
Let
\[
T =
\begin{cases}
1 , & \textrm{if \(Y_1 = 1\) and \(Y_2 = 0\)} \\
0 , & \textrm{otherwise}
\end{cases}.
\]
Then
\begin{align*}
E[T] &= 0\cdot P(T=0) + 1\cdot P(T = 1) = P(T = 1) \\
&= P(Y_1 = 1, Y_2 = 0) = P(Y_1 = 1)\cdot P(Y_2 = 0) \\
&= p (1 - p).
\end{align*}
Therefore, \(T\) is an unbiased estimator for \(p (1-p)\).
</li>
<li>
Let \(W = \sum_{i=1}^{n} Y_i\).  Note that \(Y_1 = 1\) and \(Y_2 = 0\) implies that 
\(1 \le W \le n-1\). Thus, for \(1 \le w \le n - 1\)
\begin{align*}
P(T = 1 \mid W = w) &= \frac{P(T = 1, W = w)}{P(W = w)} \\
&= \frac{P(T = 1, \sum_{i=3}^{n} Y_i = w - 1)}{P(W = w)} \\
&= \frac{P(T = 1)\cdot P(\sum_{i=3}^{n} Y_i = w - 1)}{P(W = w)} \\
&= \frac{p (1 - p)\binom{n-2}{w-1}p^{w - 1}(1 - p)^{n - w - 1}}
{\binom{n}{w}p^{w}(1 - p)^{n-w}} \\
&= \frac{\binom{n-2}{w-1}}{\binom{n}{w}} \\
&= \frac{w (n - w)}{n (n-1)}.
\end{align*}
</li>
<li>
Now, for \(1 \le w \le n - 1\), 
\begin{align*}
E[ T \mid W = w] &= 0\cdot P(T = 0 \mid W = w) + 1\cdot P(T = 1 \mid W =w) \\
&= P(T = 1 \mid W = w) \\
&= \frac{w (n - w)}{n (n - 1)}.
\end{align*}
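For \(w = 0\) or \(w = n\), the event \(\{Y_1 = 1, Y_2 = 0\}\) is impossible, so
\(E[T \mid W = w] = 0\), which agrees with the same formula. Therefore,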
\[
E[T \mid W] = \frac{W (n - W)}{n (n - 1)} =
\frac{n}{n - 1} \overline{Y} (1 - \overline{Y}),
\]
since \(W = n \overline{Y}\).

In example 9.6 (page 437), the factorization criterion shows that
\(W = \sum_{i=1}^{n} Y_i\) is a sufficient statistic for \(p\) that
best summarizes the information about \(p\) contained in the sample
(i.e. \(W\) is minimal). The Rao-Blackwell theorem implies, therefore, that
\[
\frac{n}{n-1} \overline{Y} (1 - \overline{Y})
\]
is an MVUE for \(E[T] = p (1 - p)\).
</li>
</ol>
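
Both the conditional expectation and its unbiasedness can be verified exactly for small \(n\) by brute force. The sketch below relies on the fact that, given \(W = w\), all arrangements of \(w\) ones among \(n\) positions are equally likely; the function names are hypothetical and the values of \(n\) and \(p\) are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import comb


def conditional_mean_T(n):
    """Return [E[T | W = w] for w = 0..n], found by counting outcomes.
    Given W = w, every 0/1 sequence with w ones is equally likely."""
    hits = [0] * (n + 1)    # sequences with w ones, Y_1 = 1 and Y_2 = 0
    counts = [0] * (n + 1)  # all sequences with w ones
    for ys in product((0, 1), repeat=n):
        w = sum(ys)
        counts[w] += 1
        if ys[0] == 1 and ys[1] == 0:
            hits[w] += 1
    return [Fraction(hits[w], counts[w]) for w in range(n + 1)]


def mean_of_rao_blackwell(n, p):
    """E[W (n - W) / (n (n - 1))] for W ~ Binomial(n, p), computed exactly."""
    return sum(
        comb(n, w) * p ** w * (1 - p) ** (n - w) * Fraction(w * (n - w), n * (n - 1))
        for w in range(n + 1)
    )


# Illustrative values only: n = 6, p = 1/3.
print(conditional_mean_T(6))
print(mean_of_rao_blackwell(6, Fraction(1, 3)))
```

The first line of output should match \(w(n-w)/\bigl(n(n-1)\bigr)\) for each \(w\), and the second should equal \(p(1-p)\).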

## Problem 9.58 (4 points)
<ol type="a">
<li>
Let \(Y_1,Y_2,\dotsc,Y_n\) be a random sample from a Bernoulli
distribution with \(P(Y = y) = p^{y} (1 - p)^{1 - y}\), for \(y = 0, 1\). Example \(9.6\) shows that
\(L(y_1,y_2,\dotsc,y_n \mid p) = p^{\sum_{i=1}^{n}y_i} (1 - p)^{n - \sum_{i=1}^{n}y_i}\). Therefore,
\begin{align*}
\frac{L(x_1,x_2,\dotsc,x_n \mid p)}{L(y_1,y_2,\dotsc,y_n \mid p)}
&= p^{\sum_{i=1}^{n}x_i - \sum_{i=1}^{n}y_i} (1 - p)^{\sum_{i=1}^{n}y_i - \sum_{i=1}^{n}x_i} \\
&= \left(\frac{p}{1 - p}\right)^{\sum_{i=1}^{n}x_i - \sum_{i=1}^{n}y_i}.
\end{align*}
Differentiation with respect to \(p\) shows that this function is independent of \(p\) if and only if the exponent,
\(\sum_{i=1}^{n}x_i - \sum_{i=1}^{n}y_i\), is zero.

Letting \(g(x_1,x_2,\dotsc,x_n) = \sum_{i=1}^{n} x_i\), the Lehmann-Scheff&eacute; method
shows that
\[
U = g(Y_1,Y_2,\dotsc,Y_n) = \sum_{i=1}^{n} Y_i
\]
is a minimal sufficient statistic for \(p\). This is the same statistic we found in example
\(9.6\).
</li>
<li>
Let \(Y_1,\dotsc,Y_n\) be a random sample from the Weibull density in example
\(9.7\). Then 
\[
L(y_1,\dotsc,y_n \mid \theta) = \left(\frac{2}{\theta}\right)^{n}
\exp\left(-\frac{1}{\theta}\sum_{i=1}^{n} y_i^2\right) \prod_{i=1}^{n}y_i.
\]
Therefore,
\begin{align*}
\frac{L(x_1,\dotsc,x_n \mid \theta)}{L(y_1,\dotsc,y_n \mid \theta)}
&= \frac{\exp\left(-\frac{1}{\theta}\sum_{i=1}^{n} x_i^2\right)}
{\exp\left(-\frac{1}{\theta}\sum_{i=1}^{n} y_i^2\right)}
\cdot
\frac{\prod_{i=1}^{n}x_i}{\prod_{i=1}^{n}y_i} \\
&=
\left(\frac{\prod_{i=1}^{n}x_i}{\prod_{i=1}^{n}y_i}\right)
\exp\left[-\frac{1}{\theta}
\left(\sum_{i=1}^{n} x_i^2 - \sum_{i=1}^{n} y_i^2\right)\right].
\end{align*}
Differentiation with respect to \(\theta\) shows that the ratio
is independent of \(\theta\) if and only if
\(\sum_{i=1}^{n} x_i^2 - \sum_{i=1}^{n} y_i^2 = 0\).

Letting \(g(x_1,\dotsc,x_n) = \sum_{i=1}^{n} x_i^2\), the Lehmann-Scheff&eacute; method
shows that
\[
U = g(Y_1,\dotsc,Y_n) = \sum_{i=1}^{n} Y_i^2
\]
is a minimal sufficient statistic for \(\theta\).
</li>
</ol>
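
The criterion in part (a) can be illustrated with exact arithmetic: the likelihood ratio is constant in \(p\) when the two samples have equal sums and varies with \(p\) otherwise. This is a small sketch with hypothetical sample values and function names, not a proof.

```python
from fractions import Fraction


def bernoulli_likelihood(ys, p):
    """L(y_1, ..., y_n | p) = p^(sum y_i) * (1 - p)^(n - sum y_i)."""
    s = sum(ys)
    return p ** s * (1 - p) ** (len(ys) - s)


def ratio(xs, ys, p):
    """Likelihood ratio L(x | p) / L(y | p)."""
    return bernoulli_likelihood(xs, p) / bernoulli_likelihood(ys, p)


same_sum = ((1, 0, 1, 0), (0, 1, 1, 0))   # both samples sum to 2
diff_sum = ((1, 1, 1, 0), (0, 1, 0, 0))   # sums 3 and 1

for p in (Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)):
    print(p, ratio(*same_sum, p), ratio(*diff_sum, p))
```

For the first pair the ratio is 1 at every \(p\); for the second it equals \(\bigl(p/(1-p)\bigr)^2\), which changes with \(p\).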

## Problem 9.59 (2 points)
Let \(Y_1,\dotsc,Y_n\) denote a sample from a normal population with
mean \(\mu\) and variance \(\sigma^2\). Example \(9.8\) (page \(438\))
shows
\[
L(y_1,\dotsc,y_n \mid \mu, \sigma^2) =
\left(2\pi\sigma^2\right)^{-n/2}\cdot
\exp\left(-\frac{n \mu^2}{2\sigma^2}\right)\cdot
\exp\left[-\frac{1}{2\sigma^2}
\left(\sum_{i=1}^{n} y_i^2 - 2\mu\sum_{i=1}^{n} y_i\right)
\right],
\]
which implies that
\[
\frac{L(x_1,\dotsc,x_n \mid \mu, \sigma^2)}{L(y_1,\dotsc,y_n \mid \mu, \sigma^2)} =
\exp\left[
-\frac{1}{2\sigma^2}\left(\sum_{i=1}^{n} x_i^2 - \sum_{i=1}^{n} y_i^2\right)
\right]\cdot
\exp\left[
\frac{\mu}{\sigma^2}\left(\sum_{i=1}^{n} x_i - \sum_{i=1}^{n} y_i\right)
\right].
\]
A calculation shows that the partial derivatives of this ratio with respect to
\(\mu\) and \(\sigma^2\) are both zero if and only if 
\(\sum_{i=1}^{n} x_i - \sum_{i=1}^{n} y_i = 0\) and 
\(\sum_{i=1}^{n} x_i^2 - \sum_{i=1}^{n} y_i^2 = 0\). Hence, we see that
the ratio is independent of \(\mu\) and \(\sigma^2\) if and only if
\(\sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i\) and
\(\sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} y_i^2\). Therefore, the
Lehmann-Scheff&eacute; method shows that
\[
\sum_{i=1}^{n} Y_i,\quad \sum_{i=1}^{n} Y_i^2
\]
are jointly a minimal sufficient statistic for \(\mu\) and \(\sigma^2\).
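
A numerical illustration of this criterion: two samples that share both \(\sum x_i\) and \(\sum x_i^2\) give a log likelihood ratio of zero at every \((\mu, \sigma^2)\), while samples that differ in \(\sum x_i^2\) do not. The sample values and function names below are hypothetical, picked so the sums are easy to check by hand.

```python
import math


def normal_log_likelihood(ys, mu, sigma2):
    """Log of the N(mu, sigma2) likelihood for the sample ys."""
    n = len(ys)
    return (-0.5 * n * math.log(2.0 * math.pi * sigma2)
            - sum((y - mu) ** 2 for y in ys) / (2.0 * sigma2))


def log_ratio(xs, ys, mu, sigma2):
    """log L(x | mu, sigma2) - log L(y | mu, sigma2)."""
    return normal_log_likelihood(xs, mu, sigma2) - normal_log_likelihood(ys, mu, sigma2)


matched = ((0.0, 3.0, 3.0), (1.0, 1.0, 4.0))    # sums 6 and 6, sums of squares 18 and 18
unmatched = ((1.0, 2.0, 3.0), (2.0, 2.0, 2.0))  # sums 6 and 6, sums of squares 14 and 12

for mu, sigma2 in [(0.0, 1.0), (2.0, 0.5), (-1.0, 4.0)]:
    print(mu, sigma2, log_ratio(*matched, mu, sigma2), log_ratio(*unmatched, mu, sigma2))
```

For the unmatched pair the log ratio reduces to \(-1/\sigma^2\), so it depends on \(\sigma^2\) even though the sample sums agree. This is why \(\sum Y_i\) alone is not sufficient when \(\sigma^2\) is unknown.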

## Problem 9.60 (2 points)
Suppose that \(U\) is a minimal sufficient statistic for \(\theta\), and
\(g_1(U)\) and \(g_2(U)\) are both unbiased estimators for \(\theta\).
Suppose that \(f_U(u \mid \theta)\) is a complete family of density functions,
and suppose that \(g_1(u)\) and \(g_2(u)\) are continuous. Then 
\((g_1 - g_2)(u)\) is continuous, and for all \(\theta\)
\begin{align*}
E[(g_1 - g_2)(U)] &= 
E[g_1(U) - g_2(U)] \\
&= E[g_1(U)] - E[g_2(U)] \\
&= \theta - \theta \\
&= 0.
\end{align*}
Now, completeness implies \((g_1 - g_2)(u) = 0\) for all \(u\),
which implies that \(g_1(u) = g_2(u)\) for all \(u\) and therefore
that \(g_1(U) = g_2(U)\). Since \(U\) is minimal, the Rao-Blackwell
theorem implies that \(g_1(U)\) and \(g_2(U)\) are MVUE. So the 
additional property of completeness implies that the MVUE is unique.










