This is a successful case of R project assignment help for NUSW.

Stats Central – Online Short Course: Intro to R, May 2-3 2022


Course Overview

R is widely used and extremely powerful statistical software. This course assumes that you have never used R before. You will learn how to obtain and install R, which is open-source software, and RStudio, which is a versatile, user-friendly interface for using R. It is very useful to do this course before our introductory statistics course, Introductory Statistics for Researchers.

This course will be held over two half-days and will cover some basic features of R and lay the groundwork for you to improve your R skills independently. The course is self-paced and focused on developing practical skills.

Course Outline

This course will cover topics including:

  • Basics of interacting with R – calculations, saving variables so you can reuse them, data types and structures, organising R code in scripts
  • Tidyverse – a basic introduction to tidy R code
  • Data – reading in and organising data (from spreadsheets) with dplyr
  • Plotting – make beautiful figures with ggplot

Course Requirement: You will need a computer with administrator access (to install R and RStudio software before attending the course).

*This is a popular course and tickets are limited!*

Date: Monday 2 and Tuesday 3 May, 9.30am - 1.00pm each day

Location: Online

You will receive a certificate of completion for the course.

R programming assignment help | Coding in R for Data, Assignment 2, NYU

1. Assignment

2. Contents

  1. Compute sample size
  2. SiO2 analysis
  3. Automatic identification with ALD
     a) Create a dataframe with ALD
     b) Boxplot for the data
     c) Histogram for the data
     d) Density and comparison
     e) Hypothesis test
     f) Two-sided CI for average ALD
  4. Time needed to repair a rail break
     a) Probability plot
     b) Hypothesis test
     c) Type 2 error
  5. Nickel plates for test cells
     a) Hypothesis test
     b) Calculation of sample sizes
     c) Sample size

3. Compute sample size

4. SiO2 analysis

  • Null hypothesis: the percentage of SiO2 in a certain type of aluminous cement is 5.5.
  • Alternative hypothesis: the percentage of SiO2 in a certain type of aluminous cement is larger than 5.5.

sio2_sample = sample_size(alpha = 0.01, beta = 0.01, delta = 5.6 - 5.5, sigma = 0.3)

sio2_sample[1]

## [1] 195
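The `sample_size` helper is called above, but its definition is not shown in this write-up. A minimal sketch, assuming it uses the usual normal-approximation formula n = ((z_alpha + z_beta) * sigma / delta)^2 for a test on a mean and rounds up, might look like the following. The function body, the use of `ceiling`, and the second (two-sided) element are assumptions made to mirror `sample_size2` later in the document.

```r
# Hypothetical reconstruction of sample_size(); not taken from the source.
sample_size <- function(alpha, beta, delta, sigma) {
  z.alpha  <- qnorm(alpha, lower.tail = FALSE)      # one-sided critical value
  z.alpha2 <- qnorm(alpha / 2, lower.tail = FALSE)  # two-sided critical value
  z.beta   <- qnorm(beta, lower.tail = FALSE)       # quantile for the type 2 error
  n1 <- ceiling(((z.alpha  + z.beta) * sigma / delta)^2)  # one-tailed test
  n2 <- ceiling(((z.alpha2 + z.beta) * sigma / delta)^2)  # two-tailed test
  c(n1, n2)
}

sio2_sample <- sample_size(alpha = 0.01, beta = 0.01, delta = 5.6 - 5.5, sigma = 0.3)
sio2_sample[1]
```

Under these assumptions the first element reproduces the sample size of 195 reported above.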

5. Automatic identification with ALD

6. a) create a dataframe

7. b) Boxplot for the data

8. Boxplot for ALD

The boxplot gives only limited information, but nothing in it contradicts the assumption that ALD is normally distributed.

9. Histogram for ALD

The histogram alone cannot confirm the assumption of a normal distribution.

10. d) Density and comparison

ggplot(data = as.vector(data), aes(sample = X.ALD.)) + geom_qq(col = "blue") + geom_qq_line(col = "red") + labs(title = "Normal probability plot for ALD")

The Q-Q plot makes it plausible that ALD is normally distributed. However, normality is not strictly required for calculating the CI or testing hypotheses about the true average ALD, because the sample is large (n = 49). By the central limit theorem, the sample mean is approximately normally distributed, so large-sample methods can be used for both problems.

11. e) Hypothesis test

  • Null hypothesis: the true average ALD is equal to 1.0.
  • Alternative hypothesis: the true average ALD is less than 1.0.

t.test(data, mu = 1.0, alternative = "less")

##
##  One Sample t-test
##
## data:  data
## t = -5.7905, df = 48, p-value = 2.615e-07
## alternative hypothesis: true mean is less than 1
## 95 percent confidence interval:
##       -Inf 0.8222677
## sample estimates:
## mean of x
## 0.7497959

The p-value is 2.615e-07, far below 0.01, so we reject the null hypothesis: the data provide strong evidence that the true average ALD is less than 1.0.
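As a quick check of the decision, the p-value can be recomputed from the reported t statistic and degrees of freedom. This is only a sketch based on the rounded values printed in the output; the variable names are illustrative.

```r
# Values copied from the t.test output above (t is rounded to 4 decimals).
t_stat <- -5.7905
df     <- 48

# For alternative = "less", the p-value is the lower-tail probability of t.
p_value <- pt(t_stat, df)
p_value

# Decision at the 0.01 level
p_value < 0.01
```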

12. f) Two-sided CI for average ALD

t.test(data, mu = 1.0, alternative = "less", conf.level = 0.95)

##
##  One Sample t-test
##
## data:  data
## t = -5.7905, df = 48, p-value = 2.615e-07
## alternative hypothesis: true mean is less than 1
## 95 percent confidence interval:
##       -Inf 0.8222677
## sample estimates:
## mean of x
## 0.7497959
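The call above uses `alternative = "less"`, so the interval it prints is a one-sided upper bound rather than a two-sided interval. The raw ALD values are not listed in this write-up, so a two-sided 95% interval can only be sketched from the reported summary statistics (mean, t statistic, and degrees of freedom). The variable names below are illustrative assumptions.

```r
# Summary statistics copied from the t.test output above.
mean_x <- 0.7497959
t_stat <- -5.7905
df     <- 48
mu0    <- 1.0

# Recover the standard error from t = (mean - mu0) / se.
se <- (mean_x - mu0) / t_stat

# Two-sided 95% confidence interval for the true average ALD.
ci_two_sided <- mean_x + c(-1, 1) * qt(0.975, df) * se
ci_two_sided

# Sanity check: the one-sided upper bound should match the printed 0.8222677.
upper_one_sided <- mean_x + qt(0.95, df) * se
upper_one_sided
```

Under these assumptions the two-sided interval is roughly 0.66 to 0.84. With the raw data available, `t.test(data, conf.level = 0.95)` would give the two-sided interval directly, because `alternative` defaults to `"two.sided"`.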

13. Time needed to repair a rail break

a) probability plot

data4 = data.frame(x = c(159, 120, 480, 149, 270, 547, 340, 43, 228, 202, 240, 218))

ggplot(data = data4, aes(sample = x)) + geom_qq(col = "blue") + geom_qq_line(col = "red") + labs(title = "Normal probability plot for time for repair")

Figure: Normal probability plot for time for repair (x-axis: theoretical quantiles).

Apart from a few outliers, the Q-Q plot makes it plausible that the repair time is normally distributed.

14. b) Hypothesis test

  • Null hypothesis: the true mean repair time is equal to 200 min.
  • Alternative hypothesis: the true mean repair time is more than 200 min.

t.test(data4, mu = 200, alternative = "greater", conf.level = 0.95)

##
##  One Sample t-test
##
## data:  data4
## t = 1.1853, df = 11, p-value = 0.1304
## alternative hypothesis: true mean is greater than 200
## 95 percent confidence interval:
##  174.4174      Inf
## sample estimates:
## mean of x
##  249.6667

The p-value is about 0.13, larger than 0.05, so we fail to reject the null hypothesis: there is no compelling evidence that the mean repair time exceeds 200 min.
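To make the arithmetic behind this test explicit, the t statistic and p-value can be computed by hand from the twelve repair times listed earlier. This is a sketch of the standard one-sample t formula; the variable names are illustrative.

```r
# Repair times (minutes), as given in the data4 data frame above.
x   <- c(159, 120, 480, 149, 270, 547, 340, 43, 228, 202, 240, 218)
mu0 <- 200
n   <- length(x)

# One-sample t statistic: (sample mean - mu0) / (s / sqrt(n))
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))

# Upper-tail p-value for the alternative "greater"
p_value <- pt(t_stat, df = n - 1, lower.tail = FALSE)

c(t = t_stat, p = p_value)
```

These values agree with the `t.test` output above (t = 1.1853, p-value = 0.1304).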

c) type 2 error

power.t.test(n = nrow(data4), delta = 100, sd = 150, sig.level = 0.05, type = "one.sample", alternative = "one.sided")

##
##      One-sample t test power calculation
##
##               n = 12
##           delta = 100
##              sd = 150
##       sig.level = 0.05
##           power = 0.6981908
##     alternative = one.sided

From this result, the type 2 error probability of the hypothesis test above is 1 - 0.698 = 0.302.
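The same power can be sketched directly from the noncentral t distribution, which is the calculation behind a one-sided, one-sample power analysis. The inputs (n = 12, delta = 100, sd = 150, significance level 0.05) come from the call above; the variable names are illustrative.

```r
n          <- 12
delta      <- 100   # true mean minus hypothesised mean
sd_assumed <- 150   # assumed population standard deviation
sig_level  <- 0.05

df     <- n - 1
ncp    <- sqrt(n) * delta / sd_assumed          # noncentrality parameter
t_crit <- qt(sig_level, df, lower.tail = FALSE) # one-sided rejection cutoff

# Power: probability that the noncentral t statistic exceeds the cutoff.
power <- pt(t_crit, df, ncp = ncp, lower.tail = FALSE)

# Type 2 error probability
beta <- 1 - power
c(power = power, beta = beta)
```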

15. Nickel plates for test cells

16. a) Hypothesis test

The number of blistered plates can be modelled as binomial. Because n * p0 = 100 * 0.1 = 10 and n * (1 - p0) = 90 are both at least 10, the large-sample test for a proportion can be used.

  • Null hypothesis: the blister probability is equal to 0.1.
  • Alternative hypothesis: the blister probability is more than 0.1.

prop.test(x = 14, n = 100, p = 0.1, alternative = "greater", conf.level = 0.95, correct = FALSE)

##
##  1-sample proportions test without continuity correction
##
## data:  14 out of 100, null probability 0.1
## X-squared = 1.7778, df = 1, p-value = 0.09121
## alternative hypothesis: true p is greater than 0.1
## 95 percent confidence interval:
##  0.09237298 1.00000000
## sample estimates:
##    p
## 0.14

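At the 0.05 significance level, the p-value of 0.0912 exceeds 0.05, so we fail to reject the null hypothesis. Because the null hypothesis is not rejected, a type 1 error is impossible here, but we may have committed a type 2 error.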
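The large-sample test can also be written out by hand, which shows where the X-squared value and the p-value come from. This is a sketch of the standard one-sample z test for a proportion without continuity correction; the variable names are illustrative.

```r
x  <- 14    # blistered plates observed
n  <- 100   # plates tested
p0 <- 0.1   # null blister probability

p_hat <- x / n

# z statistic using the null standard error sqrt(p0 * (1 - p0) / n)
z <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Upper-tail p-value for the alternative "greater"
p_value <- pnorm(z, lower.tail = FALSE)

c(z = z, z_squared = z^2, p = p_value)
```

The squared z statistic equals the X-squared value reported by `prop.test` (1.7778), and the p-value matches 0.09121.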

b) Calculation of sample sizes

sample_size2 <- function(alpha, beta, p0, p) {
  z.alpha = qnorm(alpha, lower.tail = FALSE)
  z.alpha2 = qnorm(alpha / 2, lower.tail = FALSE)
  z.beta = qnorm(beta, lower.tail = FALSE)
  n1 = round(((z.alpha * sqrt(p0 * (1 - p0)) + z.beta * sqrt(p * (1 - p))) / (p0 - p))^2, digits = 0) # for a one-tailed test
  n2 = round(((z.alpha2 * sqrt(p0 * (1 - p0)) + z.beta * sqrt(p * (1 - p))) / (p0 - p))^2, digits = 0) # for a two-tailed test
  return(c(n1, n2))
}

Here n1 is the appropriate answer for the one-sided hypothesis test illustrated above.

c) Sample size

paste("Sample size of the plates for the test is", sample_size2(alpha = 0.05, beta = 0.10, p0 = 0.1, p = 0.1

## [1] "Sample size of the plates for the test is 362"
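The call above is cut off in the source after `p = 0.1`, so the alternative proportion actually used is not visible. As a sketch only, assuming the alternative proportion was p = 0.15 (an assumption, chosen because it reproduces the reported 362) and that the first element of the result was printed, the calculation can be run as follows.

```r
# Same formula as sample_size2 above, repeated so this block runs on its own.
sample_size2 <- function(alpha, beta, p0, p) {
  z.alpha  <- qnorm(alpha, lower.tail = FALSE)
  z.alpha2 <- qnorm(alpha / 2, lower.tail = FALSE)
  z.beta   <- qnorm(beta, lower.tail = FALSE)
  n1 <- round(((z.alpha  * sqrt(p0 * (1 - p0)) + z.beta * sqrt(p * (1 - p))) / (p0 - p))^2)
  n2 <- round(((z.alpha2 * sqrt(p0 * (1 - p0)) + z.beta * sqrt(p * (1 - p))) / (p0 - p))^2)
  c(n1, n2)  # one-tailed, two-tailed
}

# p = 0.15 is an assumed value; the source line is truncated.
n_plates <- sample_size2(alpha = 0.05, beta = 0.10, p0 = 0.1, p = 0.15)
paste("Sample size of the plates for the test is", n_plates[1])
```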
