An Introduction to Order Statistics

25 01 2018

This is an introduction to order statistics, focusing on basic notions and calculations.

Suppose that X_1,X_2,\cdots,X_n is a random sample drawn from a continuous distribution with cumulative distribution function F(x) and density function f(x). We can rank the sample items from smallest to largest. For convenience, the smallest sample item is denoted by X_{(1)}, the second smallest by X_{(2)} and so on. The largest sample item is then X_{(n)}. These ordered values are called the order statistics corresponding to the sample X_1,X_2,\cdots,X_n. Because the items in the sample are random, the order statistics are random variables too. The goal of this post is to discuss the probability distributions of the order statistics, both individually and jointly.

We focus only on samples drawn from a continuous distribution in order to avoid the situation in which two sample items are equal (i.e. a tie). Thus we assume that the sample items are all distinct and that the order statistics are strictly increasing, i.e. X_{(1)}<X_{(2)}<\cdots<X_{(n)}.

As mentioned, X_{(1)} is the minimum order statistic and X_{(n)} is the maximum order statistic. In general, X_{(j)} is called the jth order statistic where j=1,2,\cdots,n.

The Joint Density Function of Order Statistics

Given the population density f(x), we can derive the joint density function of the order statistics X_{(1)},X_{(2)},\cdots,X_{(n)}.

Fact 1
Suppose that X_1,X_2,\cdots,X_n is a random sample drawn from a distribution with density function f(x). Then the following is the joint density function of the order statistics X_{(1)},X_{(2)},\cdots,X_{(n)}.

    f_{X_{(1)},X_{(2)},\cdots,X_{(n)}}(x_1,x_2,\cdots,x_n)= n! \ f(x_1) \ f(x_2) \cdots f(x_n)

The support of the joint density is the n-dimensional region x_1<x_2<\cdots<x_n.

For any point (x_1,x_2,\cdots,x_n) in the support, any permutation of the numbers x_1,x_2,\cdots,x_n would lead to the same ordered values. There are n! such permutations. Furthermore, the density of any one such permutation is f(x_1) \ f(x_2) \cdots f(x_n). Thus Fact 1 follows.
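
As a concrete illustration (not part of the original argument), the order statistics of a simulated sample are obtained simply by sorting the sample items. The following is a minimal Python sketch; the Uniform(0, 2) population and the use of NumPy are assumptions made only for illustration.

    # A minimal sketch (illustrative only): the order statistics of a simulated
    # sample are obtained by sorting the sample items.
    import numpy as np

    rng = np.random.default_rng(seed=0)
    n = 7
    sample = rng.uniform(0, 2, size=n)   # X_1, ..., X_7 drawn from Uniform(0, 2)
    order_stats = np.sort(sample)        # X_(1) < X_(2) < ... < X_(7)

    print("sample:      ", np.round(sample, 3))
    print("order stats: ", np.round(order_stats, 3))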

With the joint density established, a great deal of information about order statistics can be derived from it. For example, we can integrate f_{X_{(1)},X_{(2)},\cdots,X_{(n)}}(x_1,x_2,\cdots,x_n) to sum out all the variables except one, thus producing the marginal density function of an order statistic X_{(j)}. We can sum out all the variables except two, thus producing the joint density of two order statistics X_{(i)} and X_{(j)}. We can also determine the probability distribution of the range of the sample R=X_{(n)}-X_{(1)}. The rest of the post presents these and other basic calculations.

Order statistics can of course be viewed as a topic in probability. They are also important in statistics since they can be applied in statistical inference. For example, they can be used to determine simple statistics such as the sample median (and other sample percentiles) and the sample range. Order statistics are often employed in non-parametric inference procedures.

The Distribution of an Order Statistic

We now discuss the distribution of a single order statistic. As mentioned, the density for the jth order statistic X_{(j)} can be obtained by integrating f_{X_{(1)},X_{(2)},\cdots,X_{(n)}}(x_1,x_2,\cdots,x_n) to sum out x_k for all k \ne j. However, there is a more direct and natural way of deriving the CDF and the density function of X_{(j)}.

Fact 2
Suppose that X_1,X_2,\cdots,X_n is a random sample drawn from a distribution with CDF F(x) and density function f(x). Then the following is the cumulative distribution function (CDF) of the order statistic X_{(j)} where j=1,2,\cdots,n.

    \displaystyle F_{X_{(j)}}(x)=P(X_{(j)} \le x)=\sum \limits_{k=j}^n \frac{n!}{k! (n-k)!} [F(x)]^k \ [1-F(x)]^{n-k}

The support of the CDF is identical to the support of the population CDF F(x).

Fact 2 is based on a binomial argument. The event X_{(j)} \le x occurs when at least j of the sample items are \le x. When observing each sample item X_i, focus on two distinct outcomes: X_i \le x or X_i>x. Consider the former a success; the probability of a success is F(x). Thus observing the random sample X_1,X_2,\cdots,X_n is like performing a series of n independent Bernoulli trials. Then P(X_{(j)} \le x) is the probability of having j or more successes in this binomial experiment.
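
The binomial argument translates directly into a computation. The following is a Python sketch of Fact 2, with the Uniform(0, 2) population, the sample size, and the simulation settings chosen only for illustration; it evaluates the binomial sum and compares it with a simulated estimate of P(X_{(j)} \le x).

    # A sketch of Fact 2 (illustrative assumptions: Uniform(0, 2) population,
    # NumPy available). The CDF of X_(j) is the binomial probability of j or
    # more "successes" (sample items <= x) in n trials.
    import numpy as np
    from math import comb

    def order_stat_cdf(x, j, n, F):
        """P(X_(j) <= x) = sum_{k=j}^{n} C(n,k) F(x)^k (1 - F(x))^(n-k)."""
        p = F(x)
        return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(j, n + 1))

    F = lambda x: x / 2                  # CDF of Uniform(0, 2)
    n, j, x0 = 7, 4, 1.2

    rng = np.random.default_rng(seed=0)
    samples = rng.uniform(0, 2, size=(100_000, n))
    jth = np.sort(samples, axis=1)[:, j - 1]   # j-th order statistic of each sample

    print("binomial sum:", order_stat_cdf(x0, j, n, F))
    print("simulation:  ", np.mean(jth <= x0))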

The density function of X_{(j)} can be derived by taking the derivative of the CDF.

Fact 3
Suppose that X_1,X_2,\cdots,X_n is a random sample drawn from a distribution with CDF F(x) and density function f(x). Then the following is the density function of the order statistic X_{(j)} where j=1,2,\cdots,n.

    \displaystyle f_{X_{(j)}}(x)=\frac{n!}{(j-1)! \ 1! \ (n-j)!} \ [F(x)]^{j-1} \ f(x) \ [1-F(x)]^{n-j}

The support of the density function is identical to the support of the population density f(x).

Mathematically, the density function f_{X_{(j)}}(x) can be derived from the CDF in Fact 2. However, there is a clear and natural way to view the density function in Fact 3. It can be viewed as a multinomial probability. Here’s the thought process for this idea. Think of the density function f_{X_{(j)}}(x) as the probability that the jth order statistic X_{(j)} is right around x. So there must be j-1 sample items less than x, exactly one sample item at x and n-j sample items above x. One way this can happen is:

    [F(x)]^{j-1} \ f(x) \ [1-F(x)]^{n-j}

The first term in the above expression is the probability that j-1 sample items are less than x. The second term is the probability that one sample item is right around x. The third term is the probability that n-j sample items are above x. But this is only one way. To capture all possibilities, we multiply by the multinomial coefficient. The result is the density function indicated in Fact 3.
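
The following is a small Python sketch of the density in Fact 3, again assuming a Uniform(0, 2) population for illustration; the numerical integration checks that the formula integrates to 1 over the support.

    # A sketch of the density in Fact 3 (illustrative assumptions: Uniform(0, 2)
    # population, SciPy available). The integral of the density over the support
    # should be 1.
    from math import factorial
    from scipy.integrate import quad

    def order_stat_pdf(x, j, n, F, f):
        """f_{X_(j)}(x) = n!/((j-1)!(n-j)!) F(x)^(j-1) f(x) (1-F(x))^(n-j)."""
        c = factorial(n) / (factorial(j - 1) * factorial(n - j))
        return c * F(x)**(j - 1) * f(x) * (1 - F(x))**(n - j)

    F = lambda x: x / 2       # CDF of Uniform(0, 2)
    f = lambda x: 0.5         # density of Uniform(0, 2)
    n, j = 7, 4

    total, _ = quad(lambda x: order_stat_pdf(x, j, n, F, f), 0, 2)
    print("integral of f_{X_(4)} over (0, 2):", round(total, 6))   # approximately 1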

When the order statistic X_{(j)} is used as an estimator for a parameter \theta of the population distribution, the CDF in Fact 2 and the density function in Fact 3 give us information on the sampling distribution of the estimator, potentially helping us determine the goodness of the estimator.

The Joint Distribution of Two Order Statistics

When we are only interested in the joint behavior of two order statistics, we can derive the joint density f_{X_{(i)},X_{(j)}}(x,y). Mathematically, the joint density can be derived by integrating the joint density in Fact 1 to sum out all variables except for x_i and x_j. However, the joint density function can be derived (and remembered) using a heuristic argument similar to the one in the preceding section for f_{X_{(j)}}(x).

Fact 4
Suppose that X_1,X_2,\cdots,X_n is a random sample drawn from a distribution with CDF F(x) and density function f(x). Then the following is the joint density function of the order statistics X_{(i)} and X_{(j)} where i<j and i,j=1,2,\cdots,n.

    \displaystyle \begin{aligned} f_{X_{(i)},X_{(j)}}(x,y)&=C \times [F(x)]^{i-1} \times f(x) \times [F(y)-F(x)]^{j-i-1} \\&\times f(y) \ [1-F(y)]^{n-j}  \end{aligned}

where C is the multinomial coefficient determined by

    \displaystyle C=\frac{n!}{(i-1)! \ 1! \ (j-i-1)! \ 1! \ (n-j)!}

The support of the density function is the region x<y in the two-dimensional xy-plane.

As mentioned, the joint density function can be derived using a heuristic argument, or memorization scheme, similar to the one in the preceding section. In this scheme, the joint density can be viewed as a multinomial probability of 5 different categories – X<x, X \approx x, x<X<y, X \approx y and y<X. When the n sample items are observed, we are interested in all the scenarios in which the ith ordered item is in the category X \approx x and the jth ordered item is in the category X \approx y. Count the number of items that fall into each category and multiply the respective probabilities. Of course, do not forget to multiply by the multinomial coefficient.
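
The following Python sketch implements the joint density in Fact 4 and checks numerically that it integrates to 1 over the region x<y. The Uniform(0, 2) population and the indices i=3 and j=5 are assumptions made only for illustration.

    # A sketch of the joint density in Fact 4, checked by numerical integration
    # over the region x < y. The Uniform(0, 2) population and the indices i = 3,
    # j = 5 are illustrative assumptions.
    from math import factorial
    from scipy.integrate import dblquad

    def joint_pdf(x, y, i, j, n, F, f):
        c = factorial(n) / (factorial(i - 1) * factorial(j - i - 1) * factorial(n - j))
        return (c * F(x)**(i - 1) * f(x) * (F(y) - F(x))**(j - i - 1)
                * f(y) * (1 - F(y))**(n - j))

    F = lambda x: x / 2
    f = lambda x: 0.5
    n, i, j = 7, 3, 5

    # dblquad integrates the inner variable (y) from x to 2 and the outer (x) from 0 to 2
    total, _ = dblquad(lambda y, x: joint_pdf(x, y, i, j, n, F, f),
                       0, 2, lambda x: x, lambda x: 2)
    print("double integral over 0 < x < y < 2:", round(total, 6))   # approximately 1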

The Range of a Sample

Another distribution that can be derived from order statistics is that of the range, which is defined to be R=X_{(n)}-X_{(1)}, i.e. the maximum statistic minus the minimum statistic. Mathematically, the CDF F_R(r)=P(R \le r) can be derived by integrating the joint density f_{X_{(1)},X_{(n)}}(x,y) over the region y-x \le r. From this idea, we can derive a useful form of F_R(r)=P(R \le r). First, the following is the joint density function f_{X_{(1)},X_{(n)}}(x,y).

    \displaystyle f_{X_{(1)},X_{(n)}}(x,y)=\frac{n!}{(n-2)!} \ f(x) \ [F(y)-F(x)]^{n-2} \ f(y) \ \ \ \ \ x<y

Consider the following derivation.

    \displaystyle \begin{aligned} P(R \le r)&=\iint_{y-x \le r} f_{X_{(1)},X_{(n)}}(x,y) \ dy \ dx\\&=\int_{-\infty}^\infty \int_{x}^{x+r} \frac{n!}{(n-2)!} \ [F(y)-F(x)]^{n-2} \ f(x) \ f(y) \ dy  \ dx  \end{aligned}

The inner integral can be evaluated by a change of variable with u=F(y)-F(x) and du=f(y) \ dy (with x held fixed).

    \displaystyle \begin{aligned} \int_{x}^{x+r} [F(y)-F(x)]^{n-2} \ f(y) \ dy&=\int_0^{F(x+r)-F(x)} u^{n-2} \ du \\&=\frac{1}{n-1} [F(x+r)-F(x)]^{n-1}  \end{aligned}

With the above integral, we have the following fact.

Fact 5
Suppose that X_1,X_2,\cdots,X_n is a random sample drawn from a distribution with CDF F(x) and density function f(x). Then the following integral gives the CDF of the range R=X_{(n)}-X_{(1)}.

    \displaystyle F_R(r)=P(R \le r)=\int_{-\infty}^\infty n \ [F(x+r)-F(x)]^{n-1} \ f(x) \ dx

where r>0; when the support of X is a finite interval, r is at most the length of that interval.

After F_R(r) is evaluated, the density function f_R(r) can be obtained by taking the derivative of F_R(r).

One comment about the integral in Fact 5. If the support of the distribution of X is an interval of finite length, the integral may have to be split into two integrals, because the CDF F(x+r) becomes 1 beyond some x value. If that is the case, one integral covers the x values for which F(x+r)<1 and a second integral has 1 in place of F(x+r). See the example below. If the support of X is unbounded, e.g. that of the exponential distribution, the integral does not have to be split up.
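
To illustrate the unbounded-support case, the following Python sketch evaluates the Fact 5 integral for an exponential population with rate 1 (an assumption made only for illustration) and compares the result with a simulated sample range; no splitting of the integral is needed since F(x+r) never reaches 1.

    # A sketch of the Fact 5 integral when the support of X is unbounded. The
    # Exponential(1) population is an illustrative assumption; since F(x + r)
    # never reaches 1, the integral does not need to be split.
    import numpy as np
    from scipy.integrate import quad

    F = lambda x: 1 - np.exp(-x)   # CDF of Exponential(1)
    f = lambda x: np.exp(-x)       # density of Exponential(1)
    n, r0 = 7, 1.5

    # the integral in Fact 5, taken over the support (0, infinity)
    cdf_range, _ = quad(lambda x: n * (F(x + r0) - F(x))**(n - 1) * f(x), 0, np.inf)

    rng = np.random.default_rng(seed=0)
    samples = rng.exponential(1.0, size=(100_000, n))
    ranges = samples.max(axis=1) - samples.min(axis=1)

    print("Fact 5 integral:", round(cdf_range, 4))
    print("simulation:     ", round(float(np.mean(ranges <= r0)), 4))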

Examples

It is helpful to go through examples demonstrating the calculations discussed here. More examples are shown in the next post. In the remainder of the post, we demonstrate how to set up the density functions.

Example 1
Suppose that the sample X_1,X_2,X_3,X_4,X_5,X_6,X_7 is drawn from a uniform distribution on the interval (0,2). Write the density functions for the following distributions.

  • The joint distribution of X_{(1)},\cdots,X_{(7)}.
  • The median X_{(4)}.
  • The joint distribution of X_{(1)} and X_{(7)}.
  • The joint distribution of X_{(3)} and X_{(5)}.
  • The range R=X_{(7)}-X_{(1)}.

All the density functions are derived from F(x)=\frac{x}{2} and f(x)=\frac{1}{2}, 0<x<2, the CDF and density function of the uniform distribution, respectively. The following gives the first two density functions.

    \displaystyle \begin{aligned} f_{X_{(1)},\cdots,X_{(7)}}(x_1,\cdots,x_7)&=7! \ \biggl(\frac{1}{2} \biggr)^7=\frac{315}{8} \\&\ \ \ \ \ \  0<x_1<x_2<\cdots<x_7<2  \end{aligned}

    \displaystyle \begin{aligned}  f_{X_{(4)}}(x)&=\frac{7!}{3! \ 1! \ 3!} \biggl[\frac{x}{2} \biggr]^3 \ \frac{1}{2} \ \biggl[1-\frac{x}{2} \biggr]^3 \\&=\frac{140}{2^7} x^3 \ (2-x)^3 \\&=\frac{140}{128} (8x^3-12x^4+6x^5-x^6) \ \ \ \ \ \ 0<x<2  \end{aligned}

The first one, the joint density of the 7 order statistics, is obtained based on Fact 1. The second one, the density function of the 4th order statistic, which is also the sample median, is obtained based on Fact 3.
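
As a quick numerical check (not part of the original example), the following Python sketch verifies that the density of the sample median X_{(4)} integrates to 1 and agrees with a simulated sample median; the probability evaluated at 1.2 and the number of replications are illustrative choices.

    # A numerical check of the Example 1 median density (illustrative settings).
    import numpy as np
    from scipy.integrate import quad

    # density of X_(4) for a sample of size 7 from Uniform(0, 2), from Fact 3
    f_median = lambda x: (140 / 128) * (8*x**3 - 12*x**4 + 6*x**5 - x**6)

    total, _ = quad(f_median, 0, 2)      # should be approximately 1
    prob, _ = quad(f_median, 0, 1.2)     # P(X_(4) <= 1.2) from the density

    rng = np.random.default_rng(seed=0)
    medians = np.median(rng.uniform(0, 2, size=(100_000, 7)), axis=1)

    print("integral over (0, 2):", round(total, 6))
    print("P(X_(4) <= 1.2):     ", round(prob, 4),
          "vs simulated", round(float(np.mean(medians <= 1.2)), 4))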

The following gives the next two density functions.

    \displaystyle \begin{aligned} f_{X_{(1)},X_{(7)}}(x,y)&=\frac{7!}{5!} \ \frac{1}{2} \ \biggl[ \frac{y}{2}-\frac{x}{2} \biggr]^5 \ \frac{1}{2} \\&=\frac{42}{2^7} \ (y-x)^5 \ \ \ \ \ \ 0<x<y<2  \end{aligned}

    \displaystyle \begin{aligned} f_{X_{(3)},X_{(5)}}(x,y)&=\frac{7!}{2! \ 2!} \ \biggl[\frac{x}{2} \biggr]^2 \ \frac{1}{2} \ \biggl[ \frac{y}{2}-\frac{x}{2} \biggr] \ \frac{1}{2} \ \biggl[1-\frac{y}{2} \biggr]^2 \\&=\frac{1260}{2^7} \ x^2 \ (y-x) \ (2-y)^2 \ \ \ \ \ \ 0<x<y<2  \end{aligned}

The function f_{X_{(1)},X_{(7)}}(x,y) is the joint density function of the minimum statistic and the maximum statistic and is obtained based on Fact 4. The function f_{X_{(3)},X_{(5)}}(x,y) is the joint density function of the 3rd order statistic and the 5th order statistic.

For the sample range R=X_{(7)}-X_{(1)}, we first determine its CDF by evaluating the integral indicated in Fact 5.

    \displaystyle F_R(r)=\int_0^2 7 \ [ F(x+r)-F(x) ]^6 \ f(x) \ dx

Because the CDF F(x+r) equals 1 beyond some point, we need to split this integral into two. The cutoff point is 2-r, since F(x+r)=1 when x \ge 2-r.

    \displaystyle \begin{aligned} F_R(r)&=\int_0^{2-r} 7 \ \biggl[\frac{x+r}{2}-\frac{x}{2} \biggr]^6 \ \frac{1}{2} \ dx+\int_{2-r}^2 7 \ \biggl[1-\frac{x}{2} \biggr]^6 \ \frac{1}{2} \ dx \\&=\frac{7}{2^7} \ r^6 \ (2-r)+\frac{1}{2^7} \ r^7 \\&=\frac{1}{2^7} (14 r^6 - 6 r^7) \ \ \ \ \ \ \ 0<r<2 \end{aligned}

    \displaystyle f_R(r)=\frac{1}{2^7} \ (84 r^5-42 r^6)  \ \ \ \ \ \ \ \ 0<r<2

The density function of the sample range is the derivative of its CDF. With the CDF and density function known, other distributional quantities can be derived.
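
The closed-form CDF of the range can also be checked against simulation. The following Python sketch compares F_R(r)=\frac{1}{2^7}(14r^6-6r^7) with the empirical distribution of the range of simulated Uniform(0, 2) samples; the number of replications and the evaluation points are arbitrary illustrative choices.

    # A check of the Example 1 range CDF against simulation (illustrative settings).
    import numpy as np

    F_R = lambda r: (14 * r**6 - 6 * r**7) / 2**7   # CDF of R for 0 < r < 2

    rng = np.random.default_rng(seed=0)
    samples = rng.uniform(0, 2, size=(100_000, 7))
    ranges = samples.max(axis=1) - samples.min(axis=1)

    for r in (0.5, 1.0, 1.5):
        print(f"r = {r}: formula {F_R(r):.4f}, simulation {np.mean(ranges <= r):.4f}")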

Additional examples are shown in the next post.

Practice problems on order statistics are found in this companion blog.


© 2018 – Dan Ma
