PSTAT 120C: Data analytic report 1
Due: Nov 2, 2021 before class
Please submit your report as a PDF, Word or image file. Please submit your R code in separate file(s). Please attach figures from R to illustrate your answers.

Problem 1.

1. (10 points) For the 2016 presidential election, explore the polls in Michigan, Georgia and North Carolina from August 1, 2016 to November 2, 2016. Use the data to answer the following questions.
a. Who is ahead in each of these three states? What is the percentage difference for each state?
b. Run a paired t test of the counts in the polls for each of the states. Which candidate is favored to win based on the test? Is the test significant? Are there potential problems with the test?
c. Run a Wilcoxon signed-rank test of the counts in the polls for each of the states. Which candidate is favored to win based on the test? Is the test significant? Are there potential problems with the test?
d. Fit a linear model of the percentage difference with respect to the date of the polls, separately for each of these states. Show a plot of the poll observations, the fitted values and the confidence interval of the fitted line for each of these states. From the linear model and observations, which state may have the closest election (in terms of percentage difference)?
e. From the actual results of the 2016 election, which state had the smallest margin (in terms of percentage difference)? Discuss at least two reasons why the actual results differ from what the polls indicated. (You may check Wikipedia for the 2016 US presidential election to find the actual voting results for each state.)
f. Did the polls correctly predict the candidate who won each of these states? Discuss the bias of the polls in these states. Name a few possible reasons.
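
As a rough starting point for parts (a) to (d), the steps can be sketched in R as below. This is only a sketch under stated assumptions: the data frame `polls`, its column names (`cand_a`, `cand_b`, `total`) and all of the numbers are made up for illustration and do not come from the course data set, so the real column names and the handling of missing values will differ.

```r
# Toy data only: every number here is simulated, NOT a real poll.
set.seed(1)
n <- 20
polls <- data.frame(
  date   = seq(as.Date("2016-08-01"), as.Date("2016-11-02"), length.out = n),
  cand_a = round(rnorm(n, mean = 460, sd = 25)),  # hypothetical counts for candidate A
  cand_b = round(rnorm(n, mean = 440, sd = 25)),  # hypothetical counts for candidate B
  total  = 1000                                   # hypothetical poll sample size
)

# (a) Percentage difference in each poll, and its average over the period
polls$pct_diff <- 100 * (polls$cand_a - polls$cand_b) / polls$total
mean_diff <- mean(polls$pct_diff)

# (b) Paired t test: each poll contributes one pair of counts
tt <- t.test(polls$cand_a, polls$cand_b, paired = TRUE)

# (c) Wilcoxon signed-rank test on the same pairs
#     (exact = FALSE uses the normal approximation, which tolerates ties)
wt <- wilcox.test(polls$cand_a, polls$cand_b, paired = TRUE, exact = FALSE)

# (d) Linear model of the percentage difference against time (days since first poll)
polls$day <- as.numeric(polls$date - min(polls$date))
fit <- lm(pct_diff ~ day, data = polls)
ci  <- predict(fit, newdata = polls, interval = "confidence", level = 0.95)

# Observations, fitted line and 95% confidence band for the fitted line
plot(polls$date, polls$pct_diff, xlab = "Poll date",
     ylab = "Percentage difference (A - B)", pch = 16)
lines(polls$date, ci[, "fit"])
lines(polls$date, ci[, "lwr"], lty = 2)
lines(polls$date, ci[, "upr"], lty = 2)
abline(h = 0, col = "grey")

print(tt)
print(wt)
```

The same code would be run once per state. Note that both tests treat the polls as independent pairs, which is one of the assumptions worth questioning in parts (b) and (c).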

Problem 2.

2. (10 points) Redo Question 1 (a)-(f) for the same three states using the presidential polls from August 1 to November 2, $\mathbf{2020}$. (You may check Wikipedia for the 2020 US presidential election to find the actual voting results for each state.)

Problem 3.

(10 points) Use the data to explore which states may switch their electoral votes to a candidate from a different party, and answer the following questions.
a. Use figures or tables to compare the state-level polls in 2016 and 2020.
b. Draw your conclusion and name 5 states that may change their electoral votes in 2020.
c. Are these 5 states Arizona, Georgia, Michigan, Pennsylvania and Wisconsin (the states that elected a candidate from a different party in 2020)? If not, please give your reasons. If so, based on the polls, name one or two other states that might also have elected a candidate from a different party in 2020 but did not in reality. Explain your reasoning.
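
One possible way to set up the comparison in part (a) is sketched below in R. It assumes you have already reduced each year's polls to one average margin per state. The state labels, the column names and the margins are all hypothetical placeholders, not real polling figures.

```r
# Toy margins only (party D minus party R, in percentage points); made-up values.
margin_2016 <- data.frame(
  state  = c("State A", "State B", "State C", "State D"),
  margin = c(-3.0, 1.5, -0.8, 6.2),
  stringsAsFactors = FALSE
)
margin_2020 <- data.frame(
  state  = c("State A", "State B", "State C", "State D"),
  margin = c(1.2, 2.0, -2.5, 5.0),
  stringsAsFactors = FALSE
)

# Join the two years on state and compare the average poll margins
cmp <- merge(margin_2016, margin_2020, by = "state", suffixes = c("_2016", "_2020"))
cmp$shift <- cmp$margin_2020 - cmp$margin_2016              # movement between elections
cmp$flips <- sign(cmp$margin_2016) != sign(cmp$margin_2020) # leader changed party

# List the closest 2020 races first; these are the likeliest to change hands
cmp <- cmp[order(abs(cmp$margin_2020)), ]
print(cmp)

flip_states <- cmp$state[cmp$flips]
print(flip_states)
```

Comparing poll margins with each other is only one choice. Comparing the 2020 polls with the actual 2016 result in each state is an equally reasonable baseline and may give a different list.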

