## Introducing bivariate normal distribution

18 12 2018

This post is an introduction to the bivariate normal distribution.

Bivariate Normal Distribution

Consider the following probability density function (pdf):

.
(1)……..$\displaystyle f(x,y)=\frac{1}{2 \pi \ \sigma_X \ \sigma_Y \ \sqrt{1-\rho^2}} \ e^{-\frac{1}{2} \ W}$

……..
where
……..
$\displaystyle W=\frac{1}{1-\rho^2} \ \biggl[\biggl(\frac{x-\mu_X}{\sigma_X} \biggr)^2-2 \rho \biggl(\frac{x-\mu_X}{\sigma_X} \biggr) \biggl(\frac{y-\mu_Y}{\sigma_Y} \biggr) +\biggl(\frac{y-\mu_Y}{\sigma_Y} \biggr)^2\biggr]$
.

for all $-\infty<x<\infty$ and $-\infty<y<\infty$.

If the joint distribution of the random variables $X$ and $Y$ is described by the probability density function (1), $X$ and $Y$ are said to have the bivariate normal distribution with parameters $\mu_X$, $\sigma_X$, $\mu_Y$, $\sigma_Y$ and $\rho$.

The above definition alone does not provide much insight about the bivariate normal distribution. At this point, it is not clear whether (1) is actually a valid pdf, and the definition says nothing about the roles played by the five parameters. Digging a little deeper into the joint density function (1), facts about the conditional distribution of $Y$ given $X=x$ emerge; they are summarized in the following theorem.

……..

Theorem 1
Suppose that $X$ and $Y$ have the bivariate normal distribution as defined by the pdf (1). Then the following properties hold.

• The marginal distribution of the random variable $X$ is a normal distribution with mean $\mu_X$ and variance $\sigma_X^2$.
• The conditional distribution of $Y$ conditioning on $X=x$ is a normal distribution.
• The mean of the conditional distribution of $Y$ conditioning on $X=x$ is
.
……..$\displaystyle E[Y \lvert X=x]=\mu_Y+\rho \frac{\sigma_Y}{\sigma_X} (x-\mu_X)$.
• The variance of the conditional distribution of $Y$ conditioning on $X=x$ is
.
……..$Var[Y \lvert X=x]=\sigma_Y^2 (1-\rho^2)$.
• The parameter $\rho$ is the correlation coefficient of $X$ and $Y$.

Theorem 1 centers on the conditional distribution of $Y$ given $X=x$. We can also derive properties about the conditional distribution of $X$ given $Y=y$ as summarized in Theorem 2.
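Theorem 1 also suggests a direct way to simulate from the bivariate normal distribution: draw $X$ from its normal marginal, then draw $Y$ from the conditional normal distribution given $X=x$. The following is a minimal sketch in Python (standard library only; the parameter values are those of Example 1 later in the post, chosen here just for illustration) that checks the sample correlation comes out close to $\rho$.

```python
# Sketch: simulate (X, Y) by drawing X from its marginal, then Y | X = x
# from the conditional distribution stated in Theorem 1.
# Parameter values are illustrative only (those of Example 1).
import math
import random

mu_x, sigma_x = 60.0, 10.0
mu_y, sigma_y = 70.0, 5.0
rho = 0.8

random.seed(2018)
n = 100_000
xs, ys = [], []
for _ in range(n):
    x = random.gauss(mu_x, sigma_x)
    # Theorem 1: Y | X = x is normal with this mean and standard deviation
    cond_mean = mu_y + rho * (sigma_y / sigma_x) * (x - mu_x)
    cond_sd = sigma_y * math.sqrt(1 - rho ** 2)
    xs.append(x)
    ys.append(random.gauss(cond_mean, cond_sd))

# The sample correlation should be close to rho = 0.8
mean_x = sum(xs) / n
mean_y = sum(ys) / n
cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys)) / n
sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in xs) / n)
sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ys) / n)
corr = cov / (sd_x * sd_y)
print(round(corr, 2))
```

With 100,000 draws the sample correlation lands within a few thousandths of 0.8, consistent with the claim that $\rho$ is the correlation coefficient of $X$ and $Y$.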

……..

Theorem 2
Suppose that $X$ and $Y$ have the bivariate normal distribution as defined by the pdf (1). Then the following properties hold.

• The marginal distribution of the random variable $Y$ is a normal distribution with mean $\mu_Y$ and variance $\sigma_Y^2$.
• The conditional distribution of $X$ conditioning on $Y=y$ is a normal distribution.
• The mean of the conditional distribution of $X$ conditioning on $Y=y$ is
.
……..$\displaystyle E[X \lvert Y=y]=\mu_X+\rho \frac{\sigma_X}{\sigma_Y} (y-\mu_Y)$.
• The variance of the conditional distribution of $X$ conditioning on $Y=y$ is
.
……..$Var[X \lvert Y=y]=\sigma_X^2 (1-\rho^2)$.
• The parameter $\rho$ is the correlation coefficient of $X$ and $Y$.

We only need to prove Theorem 1. Interestingly, the properties in Theorem 1 also imply the joint density function (1) as summarized in the following theorem.

……..

Theorem 3
Suppose that the jointly distributed random variables $X$ and $Y$ satisfy the following properties:

• The conditional distribution of $Y$, given $X=x$, is a normal distribution.
• The mean of the conditional distribution of $Y$ given $x$, $E[Y \lvert x]$, is a linear function of $x$.
• The variance of the conditional distribution of $Y$ given $x$, $Var[Y \lvert x]$, is a constant, i.e. it is not a function of $x$.
• The marginal distribution of $X$ is a normal distribution.

Then the joint pdf of $X$ and $Y$ is the same as the one in (1), i.e. $X$ and $Y$ have a bivariate normal distribution.

.

Theorem 1 and Theorem 3 combined show that the definition of bivariate normal using the pdf (1) is equivalent to the conditions in Theorem 1. Thus we can define bivariate normal distribution using either the pdf (1) or the conditions in Theorem 1 or Theorem 2.

The Bivariate Normal Density

In order to prove the theorems, it is helpful to reformulate the bivariate normal pdf in (1). Before proving Theorem 1 and Theorem 3, we simplify the quantity $W$ that appears in the pdf (1), restated below.

.
$\displaystyle W=\frac{1}{1-\rho^2} \ \biggl[\biggl(\frac{x-\mu_X}{\sigma_X} \biggr)^2-2 \rho \biggl(\frac{x-\mu_X}{\sigma_X} \biggr) \biggl(\frac{y-\mu_Y}{\sigma_Y} \biggr) +\biggl(\frac{y-\mu_Y}{\sigma_Y} \biggr)^2\biggr]$
.

The quantity $W$ is equivalent to the following:

.
(2)……..$\displaystyle W=\frac{(y-b)^2}{\sigma_Y^2 (1-\rho^2)}+\frac{(x-\mu_X)^2}{\sigma_X^2}$.. where . $\displaystyle b=\mu_Y+ \rho \ \frac{\sigma_Y}{\sigma_X} (x-\mu_X)$
.

The fact (2) is established by the following.
.
\displaystyle \begin{aligned}(1-\rho^2) \ W&=\biggl(\frac{y-\mu_Y}{\sigma_Y} \biggr)^2 -2 \rho \biggl(\frac{x-\mu_X}{\sigma_X} \biggr) \biggl(\frac{y-\mu_Y}{\sigma_Y} \biggr)+\rho^2 \biggl(\frac{x-\mu_X}{\sigma_X} \biggr)^2 \\&\ \ \ \ +\biggl(\frac{x-\mu_X}{\sigma_X} \biggr)^2-\rho^2 \biggl(\frac{x-\mu_X}{\sigma_X} \biggr)^2 \\&=\biggl[\biggl(\frac{y-\mu_Y}{\sigma_Y} \biggr)-\rho \biggl(\frac{x-\mu_X}{\sigma_X} \biggr) \biggr]^2+(1-\rho^2) \biggl(\frac{x-\mu_X}{\sigma_X} \biggr)^2 \\&=\biggl[\frac{y-\mu_Y}{\sigma_Y}-\frac{\rho \frac{\sigma_Y}{\sigma_X} (x-\mu_X)}{\sigma_Y} \biggr]^2+(1-\rho^2) \biggl(\frac{x-\mu_X}{\sigma_X} \biggr)^2 \\&=\biggl[\frac{y-\biggl(\mu_Y+\rho \frac{\sigma_Y}{\sigma_X} (x-\mu_X) \biggr)}{\sigma_Y} \biggr]^2+(1-\rho^2) \biggl(\frac{x-\mu_X}{\sigma_X} \biggr)^2 \end{aligned}
.

It is clear that (2) follows from the last step. With the help of fact (2), the pdf $f(x,y)$ in (1) can be rewritten as follows:

.
(3)……..$\displaystyle f(x,y)=\frac{1}{2 \pi \ \sigma_X \ \sigma_Y \ \sqrt{1-\rho^2}} \ \ e^{\displaystyle -\frac{(y-b)^2}{2 \sigma_Y^2 (1-\rho^2)}-\frac{(x-\mu_X)^2}{2 \sigma_X^2}}$
.

The pdf in (3) can be rearranged as follows:

.
(4)……..\displaystyle \begin{aligned} f(x,y)&=\biggl[ \frac{1}{\sqrt{2 \pi} \sigma_X} \ e^{\displaystyle -\frac{(x-\mu_X)^2}{2 \sigma_X^2}}\biggr] \ \biggl[ \frac{1}{\sqrt{2 \pi} \sigma_Y \sqrt{1-\rho^2}} \ e^{\displaystyle -\frac{(y-b)^2}{2 \sigma_Y^2 (1-\rho^2)}}\biggr] \\&=C(x) \ D(y) \end{aligned}
.

For easier reference, the function in the first set of square brackets in (4) is called $C(x)$ and the function in the second set of square brackets in (4) is called $D(y)$. Note that $C(x)$ is the density function for the normal distribution with mean $\mu_X$ and variance $\sigma_X^2$. The function $D(y)$ is the density function for the normal distribution with mean $b$, as defined in (2), and variance $\sigma_Y^2 (1-\rho^2)$. Note that the $x$ in $D(y)$ is a fixed number. So $D(y)$ can be regarded as a conditional density of $Y$ given $x$.
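The factorization $f(x,y)=C(x) \ D(y)$ can also be verified numerically. Below is a quick sketch (not part of the original derivation; the parameter values are arbitrary) that evaluates the pdf in (1) directly and compares it with the product form in (4) on a grid of points.

```python
# Sketch: numerically confirm that the bivariate normal pdf (1)
# factors as C(x) * D(y) as in (4). Arbitrary parameter values.
import math

mu_x, sigma_x, mu_y, sigma_y, rho = 1.0, 2.0, -1.0, 1.5, 0.6

def joint_pdf(x, y):
    """Direct evaluation of the pdf in (1)."""
    zx = (x - mu_x) / sigma_x
    zy = (y - mu_y) / sigma_y
    w = (zx ** 2 - 2 * rho * zx * zy + zy ** 2) / (1 - rho ** 2)
    norm = 2 * math.pi * sigma_x * sigma_y * math.sqrt(1 - rho ** 2)
    return math.exp(-w / 2) / norm

def normal_pdf(t, mu, sigma):
    return math.exp(-((t - mu) ** 2) / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def factored_pdf(x, y):
    """C(x) * D(y) as in (4): marginal of X times conditional of Y given x."""
    b = mu_y + rho * (sigma_y / sigma_x) * (x - mu_x)
    cond_sd = sigma_y * math.sqrt(1 - rho ** 2)
    return normal_pdf(x, mu_x, sigma_x) * normal_pdf(y, b, cond_sd)

# compare the two forms on a grid covering several standard deviations
max_err = max(
    abs(joint_pdf(x / 2, y / 2) - factored_pdf(x / 2, y / 2))
    for x in range(-10, 11)
    for y in range(-10, 11)
)
print(max_err)  # essentially zero (floating-point noise)
```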

Proof of Theorem 1

We now use the pdf in (4) to prove Theorem 1. Immediately, we observe that $f(x,y)$ is a valid pdf.

.
……..\displaystyle \begin{aligned} \int_{-\infty}^\infty \int_{-\infty}^\infty f(x,y) \ dy \ dx&=\int_{-\infty}^\infty \int_{-\infty}^\infty C(x) \ D(y) \ dy \ dx\\&=\int_{-\infty}^\infty C(x) \int_{-\infty}^\infty D(y) \ dy \ dx \\&=\int_{-\infty}^\infty C(x) \ dx=1 \end{aligned}
.

The above double integral is 1 because each of $C(x)$ and $D(y)$ is a normal pdf. The next step is to show that the marginal distribution of $X$ is a normal distribution. The marginal density function $f_X(x)$ is the integral $\int_{-\infty}^\infty f(x,y) \ dy$. In this integral, the function $D(y)$ disappears since the integral of $D(y)$ with respect to $y$ is 1. Thus $f_X(x)=C(x)$, which, as mentioned above, is a normal density function with mean $\mu_X$ and variance $\sigma_X^2$. Thus the marginal distribution of $X$ is a normal distribution with mean $\mu_X$ and variance $\sigma_X^2$.

As a result of the preceding observations, (4) can be restated as follows:

.
(5)……..$\displaystyle f(x,y)=f_X(x) \ \biggl[ \frac{1}{\sqrt{2 \pi} \sigma_Y \sqrt{1-\rho^2}} \ e^{\displaystyle -\frac{(y-b)^2}{2 \sigma_Y^2 (1-\rho^2)}}\biggr]$
.

Consequently, the normal density function in the square brackets of (5) must be $f(y \lvert x)$, the density function for the conditional distribution for $Y$ given $X=x$. This means that the conditional distribution of $Y$ given $x$ is a normal distribution with the following mean and variance.

.
(6)……..$\displaystyle E[Y \ \lvert \ X=x]=\mu_Y+\rho \ \frac{\sigma_Y}{\sigma_X} \ (x-\mu_X)$
.
(7)……..$\displaystyle Var[Y \ \lvert \ X=x]=\sigma_Y^2 \ (1-\rho^2)$
.

According to Theorem 2 in this previous post, whenever the conditional mean $E[Y \lvert X=x]$ is a linear function, it must be of the form exactly as described in (6). Furthermore, the quantity $\rho$ in (6) must be the correlation coefficient of $X$ and $Y$. This concludes the proof of Theorem 1. $\square$

The proof of Theorem 2 would be similar (just switching the roles of $X$ and $Y$) and is not given here.

Proof of Theorem 3

We now assume the 4 conditions in Theorem 3 and derive the joint pdf as described in (1). As mentioned above, according to Theorem 2 in this previous post, since the conditional mean $E[Y \lvert x]$ is a linear function of $x$, $E[Y \lvert x]$ is of the same form as in (6). Using (6), we evaluate the variance of the conditional distribution $Y \lvert X=x$.

.
……..$\displaystyle \sigma_{Y \lvert x}^2=\int_{-\infty}^\infty \biggl[y-\mu_Y-\rho \frac{\sigma_Y}{\sigma_X} (x-\mu_X) \biggr]^2 \ f(y \lvert x) \ dy$
.

Then multiply both sides by $f_X(x)$.

.
……..\displaystyle \begin{aligned} \sigma_{Y \lvert x}^2 \ f_X(x)&=\int_{-\infty}^\infty \biggl[y-\mu_Y-\rho \frac{\sigma_Y}{\sigma_X} (x-\mu_X) \biggr]^2 \ f(y \lvert x) \ f_X(x) \ dy \\&=\int_{-\infty}^\infty \biggl[y-\mu_Y-\rho \frac{\sigma_Y}{\sigma_X} (x-\mu_X) \biggr]^2 \ f(x,y) \ dy \end{aligned}
.

Integrate both sides of the last expression with respect to $x$. Since $\sigma_{Y \lvert x}^2$ is assumed to be a constant, integrating a constant times a pdf gives that constant. Thus the left-hand side remains $\sigma_{Y \lvert x}^2$.

.

……..$\displaystyle \sigma_{Y \lvert x}^2=\int_{-\infty}^\infty \int_{-\infty}^\infty \biggl[y-\mu_Y-\rho \frac{\sigma_Y}{\sigma_X} (x-\mu_X) \biggr]^2 \ f(x,y) \ dy \ dx$
.

The right-hand side of the above is the following expectation.

.
……..$\displaystyle \sigma_{Y \lvert x}^2=E\biggl(\biggl[Y-\mu_Y-\rho \frac{\sigma_Y}{\sigma_X} (X-\mu_X) \biggr]^2\biggr)$
.

Further developing the right-hand side, we have the following derivation.

.
……..\displaystyle \begin{aligned} \sigma_{Y \lvert x}^2&=E\biggl(\biggl[Y-\mu_Y-\rho \frac{\sigma_Y}{\sigma_X} (X-\mu_X) \biggr]^2\biggr)\\&=E \biggl[(Y-\mu_Y)^2-2 \rho \frac{\sigma_Y}{\sigma_X} (X-\mu_X) (Y-\mu_Y)+ \rho^2 \frac{\sigma_Y^2}{\sigma_X^2} (X-\mu_X)^2 \biggr]\\&=\sigma_Y^2-2 \rho \ \frac{\sigma_Y}{\sigma_X} \ \rho \sigma_X \sigma_Y+\rho^2 \ \frac{\sigma_Y^2}{\sigma_X^2} \ \sigma_X^2\\&=\sigma_Y^2-2 \rho^2 \ \sigma_Y^2+\rho^2 \ \sigma_Y^2\\&=\sigma_Y^2- \rho^2 \ \sigma_Y^2\\&=\sigma_Y^2 (1-\rho^2)\end{aligned}
.

Thus the variance of $Y \lvert X=x$ is the constant $\sigma_Y^2 (1-\rho^2)$. Since the conditional distribution of $Y$ given $x$ is assumed to be normal, it is a normal distribution with the following mean and variance.

.
(8)……..$\displaystyle E[Y \ \lvert \ X=x]=\mu_Y+\rho \ \frac{\sigma_Y}{\sigma_X} \ (x-\mu_X)$
.
(9)……..$\displaystyle Var[Y \ \lvert \ X=x]=\sigma_Y^2 (1-\rho^2)$
.

The following shows the conditional pdf of $Y \lvert x$ and the marginal pdf of $X$. Note that the marginal distribution of $X$ is also assumed to be normal.

.
(10)……..$\displaystyle f(y \lvert x)=\frac{1}{\sqrt{2 \pi} \ \sigma_Y \sqrt{1-\rho^2} } \ e^{-\frac{1}{2} \frac{(y-b)^2}{\sigma_Y^2 (1-\rho^2)}} \ \ \ \ \ -\infty<y<\infty$
.
(11)……..$\displaystyle f_X(x)=\frac{1}{\sqrt{2 \pi} \ \sigma_X } \ e^{-\frac{1}{2} \frac{(x-\mu_X)^2}{\sigma_X^2}} \ \ \ \ \ \ \ \ -\infty<x<\infty$
.

Note that the $b$ in $f(y \lvert x)$ is the mean of $Y \lvert x$, which is the expression in (8). The joint pdf of $X$ and $Y$ is obtained by multiplying (10) and (11), i.e. $f(x,y)=f(y \lvert x) \ f_X(x)$. The result is identical to the expression in (4) above, which is equivalent to the joint pdf in (1). Thus the four conditions in Theorem 3 imply that the joint pdf is the bivariate normal pdf described in (1). This completes the proof of Theorem 3. $\square$

One More Theorem

The above discussion shows that there are two ways to define the bivariate normal distribution. One is to define it using the joint pdf (1). The pdf is hard to work with (e.g. it will be hard to evaluate probabilities using the pdf). Theorem 1 shows that the bivariate normal distribution satisfies the properties concerning the conditional distributions of $Y \lvert X=x$. The other way is to define the bivariate normal distribution using the properties concerning the conditional distributions of $Y \lvert X=x$ (as stated in Theorem 3). We can do so because these properties will lead to the same pdf in (1).

Whenever the random variables $X$ and $Y$ are independent, the covariance $\text{Cov}(X,Y)$ is zero and hence the correlation coefficient $\rho$ is zero. The converse is not true in general; examples of dependent random variables with zero covariance are given here. However, when $X$ and $Y$ are bivariate normal, zero covariance (equivalently, zero correlation) does imply independence.

Theorem 4
Suppose that $X$ and $Y$ have a bivariate normal distribution. Then $X$ and $Y$ are independent random variables if and only if the correlation coefficient $\rho$ is zero.

One direction does not require bivariate normality. As mentioned, if $X$ and $Y$ are independent, $\rho=0$. Suppose that $X$ and $Y$ are bivariate normal and that $\rho=0$. Then the pdf in (1) becomes the following.

.
……..\displaystyle \begin{aligned} f(x,y)&=\frac{1}{2 \pi \sigma_X \sigma_Y} \ e^{-\frac{1}{2} \biggl[ \frac{(x-\mu_X)^2}{\sigma_X^2}+\frac{(y-\mu_Y)^2}{\sigma_Y^2} \biggr] } \\&=\frac{1}{\sqrt{2 \pi} \ \sigma_X} \ e^{-\frac{1}{2} \ \frac{(x-\mu_X)^2}{\sigma_X^2}} \times \frac{1}{\sqrt{2 \pi} \ \sigma_Y} \ e^{-\frac{1}{2} \ \frac{(y-\mu_Y)^2}{\sigma_Y^2}} \end{aligned}

The above is a product of the marginal pdf of $X$ and the marginal pdf of $Y$. Thus the conditional pdf $f(y \lvert x)$ is simply the unconditional pdf $f_Y(y)$. Likewise, the conditional pdf $f(x \lvert y)$ is simply the unconditional pdf $f_X(x)$. The knowledge of $X=x$ is simply extraneous information.
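As a quick numerical check (a sketch, not part of the original argument; the marginal parameters are those of Example 1 below), the pdf (1) with $\rho=0$ agrees with the product of the two marginal normal pdfs:

```python
# Sketch: with rho = 0 the joint pdf (1) equals the product of the
# two marginal normal pdfs, the independence factorization above.
import math

mu_x, sigma_x, mu_y, sigma_y = 60.0, 10.0, 70.0, 5.0

def normal_pdf(t, mu, sigma):
    return math.exp(-((t - mu) ** 2) / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def joint_pdf_rho0(x, y):
    # pdf (1) with rho = 0: W reduces to the sum of two squared z-scores
    zx, zy = (x - mu_x) / sigma_x, (y - mu_y) / sigma_y
    return math.exp(-(zx ** 2 + zy ** 2) / 2) / (2 * math.pi * sigma_x * sigma_y)

pts = [(50.0, 65.0), (60.0, 70.0), (75.0, 80.0)]
assert all(
    abs(joint_pdf_rho0(x, y) - normal_pdf(x, mu_x, sigma_x) * normal_pdf(y, mu_y, sigma_y)) < 1e-15
    for x, y in pts
)
print("factorization holds at all test points")
```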

.

Examples

We now examine some examples.

Example 1
Consider the bivariate normal distribution with parameters $\mu_X=60$, $\sigma_X=10$, $\mu_Y=70$, $\sigma_Y=5$ and $\rho=0.8$. The following is the least squares regression line of $Y$ on $X$.

.
……..$\displaystyle y=70+0.8 \ \frac{5}{10} \ (x-60)=46+0.4 \ x$
.

This line gives the conditional mean $E[Y \lvert x]$ for each $x$. Because $X$ and $Y$ are positively correlated, the least squares line is increasing – the larger the $x$, the larger the mean of $Y$ given $x$. The following diagram is the graph of this least squares line.
.

Figure 1

.

The solid green line in Figure 1 is the least squares regression line $y=46+0.4x$. The vertical dotted line is the unconditional mean of $X$ and the horizontal dotted line is the unconditional mean of $Y$. Note that the least squares line always passes through the point $(\mu_X,\mu_Y)$.

The variance of the conditional distribution $Y \lvert x$ is constant regardless of $x$. It is $\sigma_Y^2 (1-\rho^2)=25 \cdot (1-0.8^2)=9$. Then the standard deviation of $Y \lvert x$ is 3.

Consider $x=30$. The conditional distribution of $Y \lvert 30$ is a normal distribution with mean $46+0.4 \cdot 30=58$ and standard deviation 3. About 99.7% of the probability in a normal distribution lies within 3 standard deviations of the mean. Thus about 99.7% of the observations of $Y$ given $X=30$ are expected to be within the interval (49, 67). When sampling from this normal distribution, it is rare to observe data outside of this range.
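The interval and probability just described can be checked with a few lines of Python (`normal_cdf` below is a helper built from `math.erf`, not a function from the post):

```python
# Sketch: the 99.7% range for Y | X = 30 in Example 1, using the
# standard normal cdf built from math.erf (no external libraries).
import math

def normal_cdf(t, mu, sigma):
    return 0.5 * (1 + math.erf((t - mu) / (sigma * math.sqrt(2))))

# Y | X = x is normal with mean 46 + 0.4x and standard deviation 3
x = 30
mean = 46 + 0.4 * x                    # 58
sd = 3
lo, hi = mean - 3 * sd, mean + 3 * sd  # the 3-standard-deviation range
prob = normal_cdf(hi, mean, sd) - normal_cdf(lo, mean, sd)
print(lo, hi, round(prob, 4))  # 49.0 67.0 0.9973
```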

As another example, consider $x=90$. The conditional distribution of $Y \lvert 90$ is a normal distribution with mean $46+0.4 \cdot 90=82$ and standard deviation 3. Then about 99.7% of the observations of $Y$ given $X=90$ are expected to be within the interval (73, 91). When sampling from this normal distribution, it is rare to observe data outside of this range.

As the mean of the normal distribution increases (along the green least squares line), the 99.7% range of the normal distribution moves up. This is illustrated in the following graph.

.

Figure 2

.

The two red lines in Figure 2 have the same slope as the least squares line $y=46+0.4x$, but one is 9 units above and the other is 9 units below (in terms of vertical distances). Of course, 9 is 3 times the standard deviation of the conditional distribution for $Y$ given $x$. As equations, the two red lines are $y=37+0.4x$ and $y=55+0.4x$.

For each $x$, observations of the distribution for $Y$ given $x$ fall in the vertical line that goes through the point $(x,0)$. About 99.7% of these observations lie in the line segment within the two red lines. Thus the strip formed by the two red lines contains essentially all of the observations of the conditional distributions for $Y \lvert x$. Consequently, the bivariate normal density $f(x,y)$ is concentrated in this strip around the least squares regression line. How narrow the strip is depends on the size of the constant variance of $Y \lvert x$.

We next calculate probabilities suggested by Figure 3 below.

.

Figure 3

.

The two blue horizontal lines, $y=65$ and $y=75$, are one standard deviation from 70, the mean of $Y$. The area in this horizontal strip is the probability $P[65<Y<75]$. Since the strip contains the probability within one standard deviation of the mean, $P[65<Y<75]$ would be around 0.68. Using a TI84+ calculator, this probability is $P[65<Y<75]=0.6827$.

Let’s calculate $P[65<Y<75 \ \lvert \ X=x]$ for several $x$ values (30, 40, 50, 60, 70, 80 and 90). This is further illustrated in the following figure.

.

Figure 4

.

Figure 4 looks very busy, but it is Figure 3 with short vertical bars added for several $x$ values (30, 40, 50, 60, 70, 80 and 90). Based on the discussion above, each short vertical bar is the range in which 99.7% of the corresponding conditional normal distribution occurs. For any vertical red bar that has a small (or even negligible) intersection with the horizontal strip formed by the two blue lines $y=65$ and $y=75$, the probability $P[65<Y<75 \ \lvert \ x]$ is small. Based on Figure 4, $P[65<Y<75 \ \lvert \ 60]$ should be large whereas $P[65<Y<75 \ \lvert \ 30]$ and $P[65<Y<75 \ \lvert \ 90]$ should be small. With this in mind, the following table shows the probabilities at the indicated $x$ values, all calculated using a TI84+ calculator.
.
Table 1

| $x$ | mean | st dev | $P[65<Y<75 \ \lvert \ x]$ |
|-----|------|--------|---------------------------|
| 30  | 58   | 3      | 0.009815 |
| 40  | 62   | 3      | 0.15865  |
| 50  | 66   | 3      | 0.62921  |
| 60  | 70   | 3      | 0.90442  |
| 70  | 74   | 3      | 0.62921  |
| 80  | 78   | 3      | 0.15865  |
| 90  | 82   | 3      | 0.009815 |
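Table 1 can be reproduced without a TI84+ calculator. The following sketch (Python standard library only; `normal_cdf` is a helper built from `math.erf`) computes $P[65<Y<75 \ \lvert \ x]$ for each $x$ value from the conditional mean $46+0.4x$ and standard deviation 3, agreeing with Table 1 up to rounding.

```python
# Sketch: reproduce Table 1 with an erf-based normal cdf
# instead of a TI-84.
import math

def normal_cdf(t, mu, sigma):
    return 0.5 * (1 + math.erf((t - mu) / (sigma * math.sqrt(2))))

results = {}
for x in (30, 40, 50, 60, 70, 80, 90):
    mean = 46 + 0.4 * x  # E[Y | X = x] from the least squares line
    results[x] = normal_cdf(75, mean, 3) - normal_cdf(65, mean, 3)

for x in sorted(results):
    print(x, round(results[x], 5))
```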

Example 2
Consider the same bivariate normal distribution discussed in Example 1. Suppose that for selected values of $x$, we sample the normal distribution $Y \lvert X=x$ four times. Compute the probability $P[65<\overline{Y}<75 \ \lvert \ X=x]$ for the $x$ values of 30, 40, 50, 60, 70, 80 and 90 where $\overline{Y}$ is the mean of the 4 sample items.

For each $x$, the mean of $\overline{Y}$ given $x$ is the same as $E[Y \lvert X=x]$. However, the standard deviation is smaller. It is $\sigma_{Y \lvert x} / \sqrt{4}=3/2=1.5$. The following table shows the normal probabilities, calculated using a TI84+ calculator. The probabilities $P[65<Y<75 \ \lvert \ x]$ from Table 1 are shown in the last column for comparison.

.
Table 2

| $x$ | mean | st dev of sample mean | $P[65<\overline{Y}<75 \ \lvert \ x]$ | $P[65<Y<75 \ \lvert \ x]$ |
|-----|------|-----------------------|--------------------------------------|---------------------------|
| 30  | 58   | 1.5                   | 0.0000015323 | 0.009815 |
| 40  | 62   | 1.5                   | 0.02275      | 0.15865  |
| 50  | 66   | 1.5                   | 0.74751      | 0.62921  |
| 60  | 70   | 1.5                   | 0.99914      | 0.90442  |
| 70  | 74   | 1.5                   | 0.74751      | 0.62921  |
| 80  | 78   | 1.5                   | 0.02275      | 0.15865  |
| 90  | 82   | 1.5                   | 0.0000015323 | 0.009815 |
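Table 2 can be reproduced the same way, replacing the conditional standard deviation 3 with $3/\sqrt{4}=1.5$ for the sample mean. A sketch (standard library only; `normal_cdf` is an erf-based helper, not from the post):

```python
# Sketch: reproduce Table 2; the conditional standard deviation of the
# mean of 4 draws is 3 / sqrt(4) = 1.5.
import math

def normal_cdf(t, mu, sigma):
    return 0.5 * (1 + math.erf((t - mu) / (sigma * math.sqrt(2))))

sd = 3 / math.sqrt(4)  # standard deviation of the sample mean, 1.5
probs = {}
for x in (30, 40, 50, 60, 70, 80, 90):
    mean = 46 + 0.4 * x  # same conditional mean as for a single draw
    probs[x] = normal_cdf(75, mean, sd) - normal_cdf(65, mean, sd)

for x in sorted(probs):
    print(x, round(probs[x], 5))
```

The smaller standard deviation concentrates the conditional distribution of $\overline{Y}$, which is why the probabilities near $x=60$ grow toward 1 while the tail values shrink compared with Table 1.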

To gain better insight, Figure 5 below shows the narrower strip around the least squares line (Figure 4 is repeated for comparison).

.

Figure 4 (repeated for comparison)

.

Figure 5

.

The two yellow lines in Figure 5 are 3 standard deviations from the least squares line. This time the standard deviation is 1.5, which is the standard deviation of $\overline{Y} \lvert X=x$. The strip formed by the yellow lines is narrower than the strip in Figure 4. The vertical yellow bars indicate the 99.7% range for the selected normal distributions. As before, the two blue horizontal lines are one standard deviation away from the mean of $Y$ (at 65 and 75). The size of the probability $P[65<\overline{Y}<75 \ \lvert \ X=x]$ depends on the intersection of the vertical yellow bar and the horizontal strip formed by the two blue lines. At $x=60$, the vertical yellow bar is entirely inside the horizontal strip, leading to a probability of 0.99914. At $x=30$ and at $x=90$, the vertical yellow bar lies entirely outside the horizontal strip. Thus both $P[65<\overline{Y}<75 \ \lvert \ X=30]$ and $P[65<\overline{Y}<75 \ \lvert \ X=90]$ are negligible.

At $x=40$ and at $x=80$, the vertical yellow bar intersects the horizontal strip in only a small segment. Thus the probability $P[65<\overline{Y}<75 \ \lvert \ x]$ is small for both of these $x$ values. On the other hand, the corresponding vertical red bar in Figure 4 intersects the horizontal strip in a larger segment, leading to a larger $P[65<Y<75 \ \lvert \ x]$.

The Next Post

The next post is a further discussion on bivariate normal distribution. Practice problems on bivariate normal distribution are available here.


$\copyright$ 2018 – Dan Ma

