## Setup

Data and sample are synonymous
– We assume an independent and identically distributed (IID) sample
– Sample of observations drawn independently from the same distribution
– Random sample of observations from the same distribution
$\left(Y_{i}, X_{i}\right)$ represents $(Y, X)$ for the $i^{\text {th }}$ position in the sample
– IID: $\left(Y_{i}, X_{i}\right)$ are independent across $i$ and have the same distribution for all $i$
The $i^{\text {th }}$ position in the sample is typically called “agent $i$”

## Our Tools

Pair of random variables $(Y, X)$ is characterized by joint probability distribution $\operatorname{Pr}(Y=y, X=x)$
– From joint distribution, can obtain marginal distribution of $Y$ and $X$
– Sum $\operatorname{Pr}(Y=y, X=x)$ across $x$ to obtain $\operatorname{Pr}(Y=y)$
– Sum $\operatorname{Pr}(Y=y, X=x)$ across $y$ to obtain $\operatorname{Pr}(X=x)$
– Can also obtain conditional distribution of $Y$ given $X$ : $\operatorname{Pr}(Y=y \mid X=x)$
– Bayes’ rule
– Interpretation of conditional distribution
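The joint-to-marginal and joint-to-conditional calculations above can be sketched numerically. The joint distribution below is a made-up illustration, not from the notes:

```python
import numpy as np

# Hypothetical joint distribution Pr(Y=y, X=x) for Y in {0,1} (rows)
# and X in {0,1} (columns); the entries sum to 1.
joint = np.array([[0.30, 0.20],
                  [0.10, 0.40]])

# Marginal of Y: sum Pr(Y=y, X=x) across x (columns)
pr_y = joint.sum(axis=1)                # [0.50, 0.50]
# Marginal of X: sum Pr(Y=y, X=x) across y (rows)
pr_x = joint.sum(axis=0)                # [0.40, 0.60]

# Conditional distribution of Y given X=1: Pr(Y=y, X=1) / Pr(X=1)
pr_y_given_x1 = joint[:, 1] / pr_x[1]   # [1/3, 2/3]
```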

Mean
$$E[Y]=\sum_{y} \underbrace{\operatorname{Pr}(Y=y)}_{\text {weight }} y$$
$E[Y]$ is the “best” predictor of $Y$
Sample estimator of $E[Y]:$ Sample average
$$\bar{Y}=\sum_{i=1}^{n} \underbrace{\frac{1}{n}}_{\text {weight }} Y_{i}=\frac{1}{n} \sum_{i=1}^{n} Y_{i}$$
Sample average gives equal weight to each observation $Y_{i}$
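The equal-weighting point can be checked directly: with simulated IID draws (the distribution below is an assumption for illustration), the sample average puts weight $1/n$ on each $Y_i$ and approximates $E[Y]$:

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(loc=2.0, scale=1.0, size=10_000)  # IID draws with E[Y] = 2

y_bar = y.mean()                    # equal weight 1/n on each Y_i
same = (1.0 / len(y)) * y.sum()     # identical computation, written out
```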

Conditional mean
$$E[Y \mid X=x]=\sum_{y} \underbrace{\operatorname{Pr}(Y=y \mid X=x)}_{\text {weight }} y$$
$E[Y \mid X=x]$ is the “best” predictor of $Y$ as a function of $x$
Interpretation of conditional mean
– $E[Y \mid X=x]$ is $E[Y]$ only for those that satisfy $X=x$
Interpretation of conditional mean with more than one condition: e.g., $E[Y \mid X=x, Z=z]$
Sample estimator of $E[Y \mid X=x]$?
– Sample average only for those that satisfy $X=x$
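The conditional-mean estimator described above is a sample average restricted to observations with $X_i = x$. A minimal sketch, assuming a simulated binary $X$ with $E[Y \mid X=x] = 1 + 2x$:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.integers(0, 2, size=10_000)           # binary X
y = 1.0 + 2.0 * x + rng.normal(size=10_000)   # E[Y | X=x] = 1 + 2x

# Sample analog of E[Y | X=x]: average Y_i only over i with X_i = x
e_y_given_x0 = y[x == 0].mean()   # should be near 1
e_y_given_x1 = y[x == 1].mean()   # should be near 3
```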

Covariance
$\operatorname{cov}(Y, X)=E[(Y-E[Y])(X-E[X])]$
$\operatorname{cov}(Y, X)>0$: $Y$ and $X$ move in the same direction on average
$\operatorname{cov}(Y, X)<0$: $Y$ and $X$ move in opposite directions on average
$\operatorname{var}(Y)$ is $\operatorname{cov}(Y, Y)$
$\operatorname{sd}(Y)$ is $\sqrt{\operatorname{var}(Y)}$
$\operatorname{corr}(Y, X)=\frac{\operatorname{cov}(Y, X)}{\operatorname{sd}(Y) \operatorname{sd}(X)}$; corr is always between $-1$ and $1$
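The definitions above have direct sample analogs. A sketch with simulated data in which $Y$ depends positively on $X$ (the data-generating process is an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=10_000)
y = 0.5 * x + rng.normal(size=10_000)     # positively related to x

# Sample analog of cov(Y, X) = E[(Y - E[Y])(X - E[X])]
cov_yx = np.mean((y - y.mean()) * (x - x.mean()))

var_y = np.mean((y - y.mean()) ** 2)      # var(Y) is cov(Y, Y)
sd_y = np.sqrt(var_y)                     # sd(Y) is sqrt(var(Y))

corr_yx = cov_yx / (sd_y * x.std())       # always between -1 and 1
```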

## With Binary X

“Treated” vs. “non-treated”
Unlike lab experiments, difficult to control for “everything else”
Economics typically deals with observational data
Observational data: Agents can “select” what they do

Why did they “select” to be “treated”?
– A related example is adverse “selection”
– High-risk people “select” generous insurance policy
The “treated” and the “non-treated” may be inherently different
Difference in means captures the “treatment effect” plus something else
“Difference in differences” might capture the “treatment effect”
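The selection problem can be illustrated with a toy simulation (the data-generating process below is entirely hypothetical): an unobserved trait raises both treatment take-up and the outcome, so the difference in means overstates the true effect, while differencing out the pre-period gap recovers it when the selection term is stable over time:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10_000

# Unobserved trait drives both selection into treatment and the outcome
trait = rng.normal(size=n)
treated = (trait + rng.normal(size=n) > 0).astype(float)

effect = 1.0                                        # true treatment effect
y_before = trait + rng.normal(size=n)               # pre-treatment outcome
y_after = trait + effect * treated + rng.normal(size=n)

# Difference in means: treatment effect plus a selection term
diff_means = y_after[treated == 1].mean() - y_after[treated == 0].mean()

# Difference in differences: the pre-period gap nets out the selection term
pre_gap = y_before[treated == 1].mean() - y_before[treated == 0].mean()
did = diff_means - pre_gap
```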

Linear regression model
$$Y=\alpha_{0}+\beta_{0} X+U$$
Describes how we think $Y$ is generated
Interpretation of error term $U$
$X$ and $U$ are not necessarily independent
– Generalizes “the ‘treated’ and the ‘non-treated’ may be inherently different”
Interpretation of $\beta_{0}$

For binary $X$, the CEF $E[Y \mid X=x]$ is always linear in $x$
For general $X$, the form of $E[Y \mid X=x]$ is unknown
$$\begin{gathered} E[Y \mid X=x] \\ =E\left[\alpha_{0}+\beta_{0} X+U \mid X=x\right] \\ =\alpha_{0}+\beta_{0} x+\underbrace{E[U \mid X=x]}_{=?} \end{gathered}$$
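With binary $X$ the CEF is exactly the line through the two conditional means, $E[Y \mid X=x] = E[Y \mid X=0] + \left(E[Y \mid X=1] - E[Y \mid X=0]\right)x$. A sketch under the assumption $E[U \mid X=x]=0$, with illustrative values $\alpha_0 = 0.5$ and $\beta_0 = 1.5$:

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.integers(0, 2, size=10_000).astype(float)
y = 0.5 + 1.5 * x + rng.normal(size=10_000)   # alpha_0 = 0.5, beta_0 = 1.5

m0 = y[x == 0].mean()   # estimates E[Y | X=0] = alpha_0
m1 = y[x == 1].mean()   # estimates E[Y | X=1] = alpha_0 + beta_0

# Slope of the line through the two conditional means estimates beta_0
beta_hat = m1 - m0
```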
