Problem 1.

(30 marks) (Adapted from Box, Hunter and Hunter) Ten hens were randomly allocated to two different diets, $A$ or $B$. Five were assigned to Diet $A$ and five to Diet $B$. The number of eggs each hen produced after one year is given in the table below. The hens are identified by the number in parentheses.
\begin{tabular}{cccccc}
\hline Diet A & 166(1) & 174(2) & 150(3) & 166(4) & 165(5) \\
\hline Diet B & 158(6) & 159(7) & 142(8) & 163(9) & 161(10) \\
\hline
\end{tabular}
The randomization distribution and related output for testing $H_{0}: \mu_{A}=\mu_{B}$ versus $H_{1}: \mu_{B}<\mu_{A}$ (there was a typo here!), where $\mu_{A}, \mu_{B}$ are the mean number of eggs for diets $A$ and $B$ respectively, were calculated. Some of the $\mathrm{R}$ code is shown below.

yA <- c(166,174,150,166,165)
yB <- c(158,159,142,163,161)
eggs <- c(yA,yB) #pool data
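N <- choose(10,5) #number of possible treatment assignments
index <- combn(1:10,5) #each column holds the 5 hens assigned to diet A
res <- numeric(N) #storage for the randomization distribution
for (i in 1:N)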
{
res[i] <- mean(eggs[index[,i]])-mean(eggs[-index[,i]])
}
hist(res,xlab="meanA-meanB", main="Randomization Distribution of difference in means")
observed <- mean(yA)-mean(yB) #store observed mean difference
abline(v=observed) #add line at observed mean diff

Answer the following questions based on this study.
(a) (5 marks) Is this study an experiment or observational study? Briefly explain your reasoning.

(b) (5 marks) Give two examples of possible treatment assignment for this study under the null hypothesis that are not the observed treatment assignment. Use the table below to fill in the treatment assigned to each hen. Use A for diet A and B for diet B.

(c) (5 marks) What is the propensity score in this study? What is the probability of treatment assignment? Is the treatment assignment ignorable? Briefly explain.

(d) (5 marks) What is the p-value of the randomization test? Is there evidence at the $5 \%$ significance level that diet $A$ is better than diet $B ?$ Briefly explain your reasoning.

(e) (10 marks) Suppose another investigator used a different set of 10 hens to compare the two diets, but would like to use a randomized paired design instead of the design described above. In one or two sentences describe how this study could have been designed as a randomized paired design. What are the treatments and experimental units? How would you randomize the treatments to the units? What is the propensity score and probability of treatment assignment in your paired design?

Proof.

(a) Experiment. The hens were randomly allocated to the two diets, so the mechanism by which treatments were assigned to units is known.

(b) Any two treatment assignments that assign 5 hens to diet A and 5 hens to diet B, other than the observed assignment. For example, diet A to hens 2, 4, 6, 8, 10 and diet B to hens 1, 3, 5, 7, 9.

(c) The propensity score is the probability that a hen is assigned to, say, diet $\mathrm{A}$, which is $5 / 10 = 1 / 2$.
The probability of a treatment assignment is $\frac{1}{\binom{10}{5}}=\frac{1}{252}=0.003968$.
The treatment assignment is ignorable since diet was assigned randomly, so the assignment is independent of the hens' potential egg production.
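The counts in part (c) can be checked with a few lines of base R. This is only a sketch; the variable names are illustrative and not part of the original solution.

```r
n_hens <- 10                            # total number of hens
n_A <- 5                                # hens assigned to diet A
n_assignments <- choose(n_hens, n_A)    # number of equally likely assignments
p_assignment <- 1 / n_assignments       # probability of any single assignment
propensity <- n_A / n_hens              # probability that a given hen gets diet A
print(c(n_assignments, p_assignment, propensity))
```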

(d) The p-value is $30 / 252=0.119$: 30 of the 252 possible assignments give a difference in means at least as large as the observed one. Since the p-value $>0.05$, there is no evidence at the $5 \%$ level that diet $\mathrm{A}$ yields more eggs than diet $\mathrm{B}$.
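The p-value in part (d) can be reproduced by enumerating all 252 assignments. The following is a minimal sketch in base R that follows the same idea as the code in the question. The use of `apply` instead of a loop and the small numerical tolerance are choices made here, not part of the original code.

```r
yA <- c(166, 174, 150, 166, 165)
yB <- c(158, 159, 142, 163, 161)
eggs <- c(yA, yB)                              # pool the data
index <- combn(length(eggs), length(yA))       # each column: hens assigned to A
# difference in means (A minus B) for every possible assignment
res <- apply(index, 2, function(idx) mean(eggs[idx]) - mean(eggs[-idx]))
observed <- mean(yA) - mean(yB)                # observed difference in means
# one-sided p-value; the tolerance guards against floating-point ties
n_extreme <- sum(res >= observed - 1e-8)
p_value <- n_extreme / length(res)
print(c(observed, n_extreme, p_value))
```

Because the alternative is $\mu_{B}<\mu_{A}$, only assignments with a difference at least as large as the observed one are counted.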

(e) The experimental units are the hens and the treatment is diet.
Here are two possible ways to obtain a randomized paired design:
Answer 1: Pair the hens on a common covariate, giving five pairs. Within each pair, randomly assign diet $\mathrm{A}$ to one hen and diet $\mathrm{B}$ to the other. Here the propensity score is $1 / 2$ and the probability of treatment assignment is $\frac{1}{2^{5}}$.
Answer 2: Randomly assign one of the two diets to each hen, say for 12 months. After a wash-out period, give the same hen the other diet for another 12 months. For example, if a fair coin toss shows heads, hen 1 is assigned diet $\mathrm{A}$ for the first 12 months and diet $\mathrm{B}$ for the 12 months after the wash-out period. Here the propensity score is $1 / 2$ and the probability of treatment assignment is $\frac{1}{2^{10}}$.
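The randomization in Answer 1 could be carried out as in the sketch below. The pairing of hens (1,2), (3,4), and so on is purely hypothetical, and the seed is arbitrary and only there to make the sketch reproducible.

```r
set.seed(2024)  # arbitrary seed, for reproducibility only
# hypothetical pairing: hens 1-2 form pair 1, hens 3-4 form pair 2, ...
hens <- data.frame(hen = 1:10, pair = rep(1:5, each = 2))
# within each pair, draw a random ordering of the two diets
hens$diet <- as.vector(replicate(5, sample(c("A", "B"))))
n_assignments <- 2^5               # one fair coin toss per pair
p_assignment <- 1 / n_assignments  # probability of any one assignment
print(hens)
```

Each pair receives exactly one A and one B, so every hen has probability 1/2 of receiving diet A.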

Problem 2.

(Adapted from Box, Hunter and Hunter) Paint used for marking lanes on highways must be very durable. Double yellow lines are used to mark the centre of a road to separate lanes of traffic travelling in opposite directions. In one trial, yellow paint from two suppliers, labelled $A$ and $B$, was randomly assigned to the two centre lines on six different highway sites, denoted $1,2,3,4,5,6$. After a considerable length of time, the average wear for the samples at the six sites was as follows:
\begin{tabular}{ccc}
\hline Site & Paint $A$ & Paint $B$ \\
\hline 1 & 59 & 69 \\
2 & 65 & 83 \\
3 & 64 & 74 \\
4 & 52 & 61 \\
5 & 71 & 78 \\
6 & 64 & 69 \\
\hline
\end{tabular}

paintA <- c(59,65,64,52,71,64)
paintB <- c(69,83,74,61,78,69)
#Two-sample t-test; equal variance
t.test (paintA, paintB, paired=FALSE, var.equal=TRUE)
##
## Two Sample t-test
##
## data: paintA and paintB
## t = -2.3971, df = 10, p-value = 0.0375
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -18.9735320 -0.6931347
## sample estimates:
## mean of x mean of y
## 62.50000 72.33333
#Two-sample t-test; unequal variance
t.test (paintA, paintB, paired=FALSE, var.equal=FALSE)
##
## Welch Two Sample t-test
##
## data: paintA and paintB
## t = -2.3971, df = 9.6661, p-value = 0.0383
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -19.0165158 -0.6501509
## sample estimates:
## mean of x mean of y
## 62.50000 72.33333
#Paired t-test
t.test (paintA, paintB, paired=TRUE)
##
## Paired t-test
##
## data: paintA and paintB
## t = -5.4176, df = 5, p-value = 0.002901
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -14.499095 -5.167572
## sample estimates:
## mean of the differences
## -9.833333

(a) (5 marks) What is the name of the study design? Justify your answer.

(b) (5 marks) Is this study an experiment or an observational study? Justify your answer.

(c) (5 marks) Is there a statistically significant difference between the mean wear of A and B at the

(d) (5 marks) Would it have been more appropriate to conduct a randomization test for this data?

Proof.

(a) Randomized paired design. Paint was randomized within each site.

(b) An experiment, since the treatment assignment mechanism was under the control of the researcher and the probability of treatment allocation was known before the experiment began.

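(c) Yes, since the p-value of the paired t-test, 0.003, is less than 0.01. The paired test is the relevant one because paint was randomized within each site.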
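The paired t statistic in the output can be reproduced from the site-by-site differences. This is a minimal sketch in base R; the names `d`, `t_stat` and `p_value` are illustrative.

```r
paintA <- c(59, 65, 64, 52, 71, 64)
paintB <- c(69, 83, 74, 61, 78, 69)
d <- paintA - paintB                          # within-site differences
n <- length(d)
t_stat <- mean(d) / (sd(d) / sqrt(n))         # paired t statistic
p_value <- 2 * pt(-abs(t_stat), df = n - 1)   # two-sided p-value
print(c(mean(d), t_stat, p_value))
```

The values should agree with the `t.test(paintA, paintB, paired=TRUE)` output shown in the question.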

(d) The Q-Q plot of the differences does not deviate much from a straight line, so the normality assumption of the paired t-test is reasonable and the paired t-test was appropriate for these data. A randomization test is an exact test that does not require the assumption of normality or independence. A randomization test would have produced similar results.
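For comparison, a randomization test for the paired design can be sketched by enumerating all $2^{6}=64$ ways of swapping the paint labels within sites, which amounts to flipping the sign of each difference. The code below is an illustration under that assumption and is not part of the original solution.

```r
paintA <- c(59, 65, 64, 52, 71, 64)
paintB <- c(69, 83, 74, 61, 78, 69)
d <- paintA - paintB                                   # within-site differences
# all 2^6 sign patterns, one row per possible within-site assignment
signs <- as.matrix(expand.grid(rep(list(c(-1, 1)), length(d))))
rand_means <- as.vector(signs %*% d) / length(d)       # mean difference per pattern
# two-sided p-value; the tolerance guards against floating-point ties
p_two_sided <- mean(abs(rand_means) >= abs(mean(d)) - 1e-8)
print(p_two_sided)
```

All six observed differences have the same sign, so only the observed pattern and its mirror image are as extreme as the data. With six sites, the exact two-sided p-value cannot be smaller than $2/64 \approx 0.031$.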

Problem 3.

(25 marks) A psychologist studying body language conducted an experiment on 20 subjects, and obtained a significant result from a two-sided z-test ($H_{0}: \mu=0$ vs. $H_{1}: \mu \neq 0$). Let’s call this experiment #1. The observed value of the z statistic from experiment #1 is $z=2.2$, so the p-value $=0.028$. In order to confirm the results, the psychologist is planning to run the same experiment on an additional 10 subjects (i.e., the same experiment will be done on 10 different subjects). Let’s call this experiment #2.
The following percentiles from the $N(0,1)$ distribution might be required to carry out some of the calculations in the questions below.
$$\begin{array}{ll} \hline \alpha & z_{\alpha} \\ \hline 0.468 & 0.08 \\ 0.100 & 1.28 \\ 0.050 & 1.64 \\ 0.025 & 1.96 \\ 0.020 & 2.05 \\ 0.010 & 2.33 \\ \hline \end{array}$$
$z_{\alpha}$ is the $100(1-\alpha)^{\text{th}}$ percentile of the $N(0,1)$ distribution. For example, the $90^{\text{th}}$ percentile is $z_{0.10}=1.28$.
(a) (5 marks) Assume that the true mean in experiment #2 is the sample mean obtained in the experiment $\# 1$. What is the probability that the results of experiment #2 will be significant at the $5 \%$ level by a one-tailed $z$ -test $\left(H_{1}: \mu>0\right) ?$ Provide a brief interpretation of this probability.

(b) (10 marks) The psychologist strongly believes in her theory, and would like a sample size formula for experiment #2 so she can calculate the sample size given $\alpha$ – type I error rate and $\beta$ – type II error rate. Derive such a sample size formula for the psychologist as a function of $\alpha, \beta$ and $\Phi(\cdot)$ the cumulative distribution function of the $N(0,1)$.

(c) (5 marks) The power function derived in part (b) was used to create a plot of sample size $n$ versus $\beta$, the probability of a type II error.

What does the plot tell you about the relationship between sample size and power for experiment #2? Use the plot to estimate how many subjects the psychologist would have to enrol so that experiment #2 will have $80 \%$ power at the $5 \%$ significance level. Should she revise her original design and enrol more than 10 subjects in experiment #2 if she wants to be more confident of rejecting $H_{0}$ when in fact $\mu>0$?

(d) (5 marks) Suppose the psychologist decided to change the significance level in experiment #2 from $5 \%$ to $10 \%$. The plot of $n$ versus $\beta$ is shown for three values of $\alpha,$ the type I error rate, but the statistician that created the graph forgot to label two of the three curves in the plot. Should she use the curve above or below the curve where $\alpha=0.05$ to estimate the sample size? Estimate the sample size required for experiment #2 to have $80 \%$ power at the $10 \%$ significance level. Briefly explain.

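Proof.

(a) In experiment #1, $z=\bar{x} /(\sigma / \sqrt{20})=2.2$, so the sample mean is $2.2 \sigma / \sqrt{20}$. Assume $\mu=2.2 \sigma / \sqrt{20}$.
Experiment #2 rejects when $\frac{\bar{x}}{\sigma / \sqrt{10}} \geq 1.64$, that is, when $\bar{x} \geq 1.64 \frac{\sigma}{\sqrt{10}}$. Now standardize
$$P\left(\bar{x} \geq 1.64 \frac{\sigma}{\sqrt{10}}\right)$$
by subtracting $\mu$ and dividing by $\sigma / \sqrt{10}$. This gives
$$\begin{aligned} P\left(\bar{x} \geq 1.64 \frac{\sigma}{\sqrt{10}}\right) &=P(Z \geq 1.64-2.2 \sqrt{10 / 20}) \\ &=P(Z \geq 0.08) \\ &=0.468 \end{aligned}$$
This means that experiment #2 has about $47 \%$ power: if the true mean equals the sample mean from experiment #1, the replication would be significant at the $5 \%$ level slightly less than half of the time.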
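The power calculation in part (a) can be checked numerically. This is a small sketch in base R; it uses `qnorm(0.95)` rather than the tabulated 1.64, so the result differs slightly from 0.468.

```r
z_obs <- 2.2    # z statistic observed in experiment #1
n1 <- 20        # sample size of experiment #1
n2 <- 10        # planned sample size of experiment #2
alpha <- 0.05   # one-sided significance level
# mean of experiment #2's z statistic when mu = 2.2 * sigma / sqrt(20)
shift <- z_obs * sqrt(n2 / n1)
power <- 1 - pnorm(qnorm(1 - alpha) - shift)
print(power)
```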

(b) The test rejects when $\bar{x} \geq z_{\alpha} \frac{\sigma}{\sqrt{n}}$, where $n$ is the sample size of experiment #2. Assuming $\mu=2.2 \sigma / \sqrt{20}$, the power of the test, $1-\beta$, can be calculated as
$$\begin{aligned} 1-\beta &=P\left(\bar{x} \geq z_{\alpha} \frac{\sigma}{\sqrt{n}} \text { when } \mu=\frac{2.2 \sigma}{\sqrt{20}}\right) \\ &=1-P\left(Z<z_{\alpha}-2.2 \sqrt{n / 20}\right) \\ \Rightarrow \Phi^{-1}(\beta) &=z_{\alpha}-2.2 \sqrt{n / 20} \\ \Rightarrow n &=\left[\frac{\left(z_{\alpha}-\Phi^{-1}(\beta)\right) \sqrt{20}}{2.2}\right]^{2} \end{aligned}$$
where $z_{\alpha}=\Phi^{-1}(1-\alpha)$.
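The sample size formula translates directly into R, with `qnorm` playing the role of $\Phi^{-1}$. The function name `sample_size` and its argument names are hypothetical, chosen for this sketch.

```r
# n as a function of alpha and beta, assuming mu = z_obs * sigma / sqrt(n1)
sample_size <- function(alpha, beta, z_obs = 2.2, n1 = 20) {
  z_alpha <- qnorm(1 - alpha)                       # upper-alpha percentile of N(0,1)
  ((z_alpha - qnorm(beta)) * sqrt(n1) / z_obs)^2
}
n_80 <- sample_size(alpha = 0.05, beta = 0.20)      # 80% power at the 5% level
print(n_80)
```

The formula returns a real number; in practice it would be rounded up to the next whole subject.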

(c) The plot shows that as $\beta$ decreases, $n$ increases, which implies that power, $1-\beta$, increases as $n$ increases.
From the plot, $n$ is approximately 25.
The psychologist should revise her original sample size of 10 to at least 25, since this would give her approximately $80 \%$ power.

(d) Using the curve below the $\alpha=0.05$ curve, $n$ is approximately 20. The lower curve is used because, for a fixed value of $\beta$, the required $n$ decreases as $\alpha$ increases.
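The ordering of the curves can be checked by evaluating the sample size formula from part (b) at both significance levels. This sketch assumes the same setup as the solution ($\mu=2.2 \sigma / \sqrt{20}$, $\beta=0.20$); the variable names are illustrative.

```r
alphas <- c(0.05, 0.10)   # the two significance levels being compared
beta <- 0.20              # type II error rate for 80% power
# sample size formula from part (b), vectorized over alpha
n_req <- ((qnorm(1 - alphas) - qnorm(beta)) * sqrt(20) / 2.2)^2
names(n_req) <- alphas
print(n_req)
```

The value for $\alpha=0.10$ is smaller than the value for $\alpha=0.05$ and comes out at roughly 19 subjects, close to the estimate read off the plot.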
