
Math 450 Assignment 3: Linear Regression

Problem 1.

P3.1 (10 pts) Generate $N=200$ samples of the input variable $X \sim \mathcal{U}(-1,1)$ and corresponding samples of $Y$ following the rule
$$
Y=X\left(1-X^{2}\right)+\varepsilon, \quad \varepsilon \sim \mathcal{N}(0,0.01)
$$
Note that you can use NumPy routines to generate samples $Z \sim \mathcal{N}\left(\mu, \sigma^{2}\right)$ by running
$$
\begin{aligned}
&\texttt{N = 200} \\
&\texttt{mu = 0.0} \\
&\texttt{sigma = 0.5} \\
&\texttt{z = sigma * np.random.randn(N) + mu}
\end{aligned}
$$
and similarly generate samples $Z \sim \mathcal{U}(-1,1)$ using the following example
$$
\begin{aligned}
&\texttt{N = 200} \\
&\texttt{ul = -1.0} \\
&\texttt{ur = 1.0} \\
&\texttt{z = (ur - ul) * np.random.rand(N) + ul}
\end{aligned}
$$
Using these samples $\left(\mathfrak{x}_{i}, \mathfrak{y}_{i}\right)$ for $i=1, \ldots, N$ and the feature variables
$$
h_{1}(X)=X, \quad h_{2}(X)=X^{2}, \quad h_{3}(X)=X^{3}
$$
fit a linear model
$$
f(X)=\beta_{0}+\sum_{j=1}^{3} \beta_{j} h_{j}(X)
$$
by finding the least squares estimate $\hat{\beta}$.
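
As a rough illustration of the setup in P3.1, the following sketch generates the samples and computes a least squares fit. It assumes that $\mathcal{N}(0,0.01)$ denotes variance $0.01$ (standard deviation $0.1$), consistent with the $\mathcal{N}(\mu, \sigma^{2})$ notation above. The seeded generator `np.random.default_rng` and the use of `np.linalg.lstsq` are choices made here for convenience, not something the assignment prescribes.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed is arbitrary, used only for reproducibility

N = 200
x = rng.uniform(-1.0, 1.0, size=N)

# Assumption: N(0, 0.01) is read as variance 0.01, i.e. standard deviation 0.1
y = x * (1.0 - x**2) + 0.1 * rng.standard_normal(N)

# Design matrix with columns 1, h_1(X) = X, h_2(X) = X^2, h_3(X) = X^3
H = np.column_stack([np.ones(N), x, x**2, x**3])

# Least squares estimate: lstsq minimizes ||H beta - y||_2 over beta
beta_hat, *_ = np.linalg.lstsq(H, y, rcond=None)
print("beta_hat =", beta_hat)
```

Since $X(1-X^{2}) = X - X^{3}$, the estimate should land near $(0, 1, 0, -1)$ up to noise, which gives a quick sanity check on the fit.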

Problem 2.

P3.2 (6 pts) Generate $M=100$ sets of $N=200$ samples of the input variable $X \sim \mathcal{U}(-1,1)$ and corresponding samples of $Y$ following the rule
$$
Y=X+\varepsilon, \quad \varepsilon \sim \mathcal{N}(0,0.01)
$$
Then fit a linear model $f(X)=\beta_{0}+\beta_{1} X$ to each sample set by finding its least squares estimate $\hat{\beta}^{(i)}$, $i=1, \ldots, M$. Estimate the mean and the variance of $\hat{\beta}$ over the $M$ sets:
$$
\mathbb{E}[\hat{\beta}]=\frac{1}{M} \sum_{i=1}^{M} \hat{\beta}^{(i)}, \quad \operatorname{Var}[\hat{\beta}]=\mathbb{E}\left[|\hat{\beta}-\mathbb{E}[\hat{\beta}]|^{2}\right]
$$
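
One possible way to organize the repeated fits in P3.2 is sketched below. It again assumes a noise standard deviation of $0.1$ (variance $0.01$). The variance is reported both per coefficient and as a single scalar (the mean squared Euclidean distance from the mean), since the formula above can be read either way. The variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed for reproducibility

M, N = 100, 200
sigma = 0.1  # assumption: N(0, 0.01) means variance 0.01

betas = np.empty((M, 2))  # row i holds (beta_0, beta_1) for sample set i
for i in range(M):
    x = rng.uniform(-1.0, 1.0, size=N)
    y = x + sigma * rng.standard_normal(N)
    A = np.column_stack([np.ones(N), x])  # columns: intercept, X
    beta_i, *_ = np.linalg.lstsq(A, y, rcond=None)
    betas[i] = beta_i

# Sample mean over the M estimates
beta_mean = betas.mean(axis=0)

# Per-coefficient variance with 1/M normalization
beta_var = betas.var(axis=0)

# Scalar reading: mean of |beta_hat - mean|^2 with |.| the Euclidean norm
beta_var_total = np.mean(np.sum((betas - beta_mean) ** 2, axis=1))

print("mean:", beta_mean)
print("variance per coefficient:", beta_var)
print("total variance:", beta_var_total)
```

The mean should be close to $(0, 1)$, and the scalar variance equals the sum of the per-coefficient variances.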

Problem 3.

P3.3 (14 pts) We will consider a data set with a lot of random features that contain no information about the output variable. That is,
$$
\begin{aligned}
Y &=X_{j}+\varepsilon \quad \text { for } j=1, \ldots, k, \\
X_{j} &=\varepsilon \quad \text { for } j=k+1, \ldots, p,
\end{aligned}
$$
where $X_{j} \sim \mathcal{N}(0,1)$ and $\varepsilon \sim \mathcal{N}(0,1)$.
That is, the first $k$ variables of $X$ contain information about $Y$, but the remaining ones do not. We will construct a ridge regression model for these sets of features.
(a) Write a Python routine generating the data set $\left(\mathbf{X}_{\text{train}}, \mathbf{y}_{\text{train}}\right)$ of size $N_{\text{train}}=200$ with number of feature variables $p=40$ and number of relevant feature variables $k=20$.
(b) Compute a ridge regression model by estimating $\hat{\beta}^{\text {ridge }}$ with regularization parameter $\lambda=1 / 20$.
(c) Compute the test error of the regression model by generating the test set of size $N_{\text {test }}=20$ and computing the MSE for the test set
$$
\mathrm{MSE}_{\text{test}}=\frac{1}{\sqrt{N_{\text{test}}}}\left\|\mathbf{X}_{\text{test}} \hat{\beta}^{\text{ridge}}-\mathbf{y}_{\text{test}}\right\|_{\ell^{2}}
$$
(d) Compute the effective degrees of freedom of the inverse of $\mathbf{X}_{\text{train}}^{T} \mathbf{X}_{\text{train}}$ given by
$$
\operatorname{df}(\lambda)=\sum_{j=1}^{p} \frac{\sigma_{j}^{2}}{\sigma_{j}^{2}+1 / \lambda}
$$
where $\sigma_{j}$ are the singular values of the matrix $\left(\mathbf{X}_{\text{train}}^{T} \mathbf{X}_{\text{train}}\right)^{-1}$.
(e) Compute the MSE for the training set, the MSE for the test set, and the effective degrees of freedom $\operatorname{df}(\lambda)$ from part (d) for a varying total number of features $p=40,50,60, \ldots, 400$. Create a plot of these values as a function of $p$.
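
Parts (a) to (c) could be prototyped along the following lines. This is only a sketch under stated assumptions: the model is read as $Y=\sum_{j=1}^{k} X_{j}+\varepsilon$ with the remaining features being independent $\mathcal{N}(0,1)$ noise, the ridge estimate is taken in the closed form $\hat{\beta}^{\text{ridge}}=\left(\mathbf{X}^{T} \mathbf{X}+\lambda I\right)^{-1} \mathbf{X}^{T} \mathbf{y}$ without an intercept, and the helper names `make_data`, `ridge_fit`, and `error` are hypothetical. The effective degrees of freedom of part (d) are not computed here.

```python
import numpy as np


def make_data(n, p, k, rng):
    """Hypothetical helper: n samples with p features, only the first k affect y."""
    X = rng.standard_normal((n, p))
    # Assumed reading of the model: y is the sum of the first k features plus noise
    y = X[:, :k].sum(axis=1) + rng.standard_normal(n)
    return X, y


def ridge_fit(X, y, lam):
    """Closed-form ridge estimate (X^T X + lam * I)^{-1} X^T y, no intercept."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


def error(X, y, beta):
    """Error as written in part (c): ||X beta - y||_2 / sqrt(n)."""
    return np.linalg.norm(X @ beta - y) / np.sqrt(len(y))


rng = np.random.default_rng(2)  # arbitrary seed for reproducibility
p, k, lam = 40, 20, 1.0 / 20.0

X_train, y_train = make_data(200, p, k, rng)
X_test, y_test = make_data(20, p, k, rng)

beta_ridge = ridge_fit(X_train, y_train, lam)
train_err = error(X_train, y_train, beta_ridge)
test_err = error(X_test, y_test, beta_ridge)

print("train error:", train_err)
print("test error: ", test_err)
```

For part (e), the same three helpers can be called inside a loop over $p$, collecting the training and test errors in lists before plotting them.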