This is an introduction to order statistics, focusing on basic notions and calculations.
Suppose that $X_1, X_2, \ldots, X_n$ is a random sample drawn from a continuous distribution with cumulative distribution function $F(x)$ and density function $f(x)$. We can rank the sample items from smallest to largest. For convenience, the smallest sample item is denoted by $Y_1$, the second smallest sample item is denoted by $Y_2$, and so on. The largest sample item is then $Y_n$. These ordered values are called the order statistics corresponding to the sample $X_1, X_2, \ldots, X_n$. Because the items in the sample are random, the order statistics are random variables too. The goal of this post is to discuss the probability distributions of the order statistics both individually and jointly.
We focus only on samples drawn from a continuous distribution in order to avoid the situation in which two sample items are equal (i.e. a tie). For a continuous distribution, a tie occurs with probability zero. Thus we assume that the sample items are all distinct and that the order statistics are strictly increasing, i.e. $Y_1 < Y_2 < \cdots < Y_n$.
As mentioned, $Y_1$ is the minimum order statistic and $Y_n$ is the maximum order statistic. In general, $Y_i$ is called the $i$th order statistic where $1 \le i \le n$.
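The definitions above can be made concrete with a few lines of Python. This is only a minimal sketch: the helper name `order_statistics`, the sample size of 5 and the Uniform(0, 1) population are assumptions chosen for illustration, not part of the discussion above.

```python
import random


def order_statistics(sample):
    """Return the order statistics of a sample as an ascending list.

    Index i - 1 of the result holds the i-th order statistic Y_i.
    """
    return sorted(sample)


random.seed(0)  # fixed seed so that the run is repeatable
sample = [random.random() for _ in range(5)]  # assumed population: Uniform(0, 1)
ys = order_statistics(sample)

y_min, y_max = ys[0], ys[-1]  # Y_1 (minimum) and Y_n (maximum)
print(ys)
```

Sorting a fresh sample each time gives a different realization of the order statistics, which is why they are random variables in their own right.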
The Joint Density Function of Order Statistics
Given the population density $f(x)$, we can derive the joint density function of the order statistics $Y_1, Y_2, \ldots, Y_n$.
Fact 1 

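Suppose that $X_1, X_2, \ldots, X_n$ is a random sample drawn from a distribution with density function $f(x)$. Then the following is the joint density function of the order statistics $Y_1, Y_2, \ldots, Y_n$.
$$f_{Y_1, Y_2, \ldots, Y_n}(y_1, y_2, \ldots, y_n) = n! \, f(y_1) \, f(y_2) \cdots f(y_n)$$
The support of the joint density is the $n$-dimensional region $y_1 < y_2 < \cdots < y_n$.
For any point $(y_1, y_2, \ldots, y_n)$ in the support, any permutation of the numbers $y_1, y_2, \ldots, y_n$ would lead to the same ordered values. There are $n!$ such permutations. Furthermore, because the sample items are independent, the density of any one such permutation is $f(y_1) \, f(y_2) \cdots f(y_n)$. Thus Fact 1 follows.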
With the joint density established, a great deal of information about order statistics can be derived from it. For example, we can integrate out all the variables except one, thus producing the marginal density function of an order statistic $Y_i$. We can integrate out all the variables except two, thus producing the joint density of two order statistics $Y_i$ and $Y_j$. We can also determine the probability distribution of the range of the sample, $R = Y_n - Y_1$. The rest of the post presents these and other basic calculations.
Order statistics can of course be viewed as a topic in probability. Order statistics are also important in statistics since they can be applied in statistical inference. For example, they can be used to determine simple statistics such as the sample median (and other sample percentiles) and the sample range. Order statistics are often employed in nonparametric inference procedures.
The Distribution of an Order Statistic
We now discuss the distribution of a single order statistic. As mentioned, the density for the $i$th order statistic $Y_i$ can be obtained by integrating the joint density $f_{Y_1, Y_2, \ldots, Y_n}(y_1, y_2, \ldots, y_n)$ in Fact 1 to integrate out $y_k$ for all $k \ne i$. However, there is a more direct and natural way of deriving the CDF and the density function of $Y_i$.
Fact 2 

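Suppose that $X_1, X_2, \ldots, X_n$ is a random sample drawn from a distribution with CDF $F(x)$ and density function $f(x)$. Then the following is the cumulative distribution function (CDF) of the order statistic $Y_i$ where $1 \le i \le n$.
$$F_{Y_i}(y) = P(Y_i \le y) = \sum_{k=i}^{n} \binom{n}{k} \, F(y)^k \, [1 - F(y)]^{n-k}$$
The support of the CDF $F_{Y_i}(y)$ is identical to the support of the population CDF $F(x)$.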
Fact 2 is based on a binomial argument. The event $Y_i \le y$ occurs when at least $i$ of the sample items satisfy $X_k \le y$. When observing each sample item $X_k$, focus on two distinct outcomes: $X_k \le y$ or $X_k > y$. Consider the former as a success, so that the probability of a success is $F(y)$. Thus observing the random sample $X_1, X_2, \ldots, X_n$ is like performing a series of $n$ independent Bernoulli trials. Then $F_{Y_i}(y)$ is the probability of having $i$ or more successes in the binomial experiment.
The density function of $Y_i$ can be derived by taking the derivative of the CDF.
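The binomial argument translates directly into code. The following is a minimal sketch in which the function name `order_stat_cdf` and the argument `p` (standing for the population CDF evaluated at the point of interest) are names introduced here for illustration. For the minimum and the maximum, the sum should reduce to the familiar closed forms $1 - (1-p)^n$ and $p^n$.

```python
from math import comb


def order_stat_cdf(i, n, p):
    """CDF of the i-th order statistic at a point y, given p = F(y).

    This is the probability of at least i successes in n independent
    Bernoulli trials with success probability p.
    """
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(i, n + 1))


p = 0.3  # hypothetical value of the population CDF at some point y
cdf_min = order_stat_cdf(1, 5, p)  # minimum of 5 items
cdf_max = order_stat_cdf(5, 5, p)  # maximum of 5 items
print(cdf_min, cdf_max)
```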
Fact 3 

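Suppose that $X_1, X_2, \ldots, X_n$ is a random sample drawn from a distribution with CDF $F(x)$ and density function $f(x)$. Then the following is the density function of the order statistic $Y_i$ where $1 \le i \le n$.
$$f_{Y_i}(y) = \frac{n!}{(i-1)! \, (n-i)!} \, F(y)^{i-1} \, f(y) \, [1 - F(y)]^{n-i}$$
The support of the density function $f_{Y_i}(y)$ is identical to the support of the population density $f(x)$.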
Mathematically, the density function can be derived from the CDF in Fact 2. However, there is a clear and natural way to view the density function in Fact 3. It can be viewed as a multinomial probability. Here's the thought process for this idea. Think of the density function $f_{Y_i}(y)$ as the probability that the $i$th order statistic $Y_i$ is right around $y$. So there must be $i-1$ sample items less than $y$, exactly one sample item at $y$ and $n-i$ sample items above $y$. One way this can happen is:

$$F(y)^{i-1} \, f(y) \, [1 - F(y)]^{n-i}$$
The first term in the above expression is the probability that $i-1$ sample items are less than $y$. The second term is the probability that one sample item is right around $y$. The third term is the probability that $n-i$ sample items are above $y$. But this is only one way. To capture all possibilities, we multiply it by the multinomial coefficient $\frac{n!}{(i-1)! \, 1! \, (n-i)!}$. The result is the density function indicated in Fact 3.
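The multinomial view can be checked numerically. The sketch below assumes, purely for illustration, a Uniform(0, 1) population (CDF equal to the argument, density equal to 1) and a sample of size 7. The function names are hypothetical. It evaluates the density of the 4th order statistic and confirms with a midpoint-rule sum that the density integrates to approximately 1.

```python
from math import factorial


def order_stat_pdf(i, n, F_y, f_y):
    """Density of the i-th order statistic at y, given F_y = F(y) and f_y = f(y)."""
    # Multinomial coefficient n! / ((i-1)! 1! (n-i)!)
    coeff = factorial(n) // (factorial(i - 1) * factorial(n - i))
    return coeff * F_y ** (i - 1) * f_y * (1 - F_y) ** (n - i)


def median_pdf_uniform(y):
    """4th order statistic of 7 items from Uniform(0, 1) (assumed population)."""
    return order_stat_pdf(4, 7, y, 1.0)


# Midpoint-rule check that the density integrates to about 1 over (0, 1)
steps = 10_000
total = sum(median_pdf_uniform((k + 0.5) / steps) for k in range(steps)) / steps
print(median_pdf_uniform(0.5), total)
```

Here the coefficient is $7!/(3! \, 3!) = 140$, so the density at $0.5$ is $140 \cdot 0.5^3 \cdot 0.5^3 = 2.1875$.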
When the order statistic is used as an estimator for a parameter of the population distribution, the CDF in Fact 2 and the density function in Fact 3 give us information on the sampling distribution of the estimator, potentially helping us determine the goodness of the estimator.
The Joint Distribution of Two Order Statistics
When we are only interested in the joint behavior of two order statistics $Y_i$ and $Y_j$, we can derive the joint density $f_{Y_i, Y_j}(x, y)$. Mathematically, the joint density can be derived by integrating the joint density in Fact 1 to integrate out all variables except for $y_i$ and $y_j$. However, the joint density function can be derived (and remembered) using a heuristic argument similar to the one in the preceding section for $f_{Y_i}(y)$.
Fact 4 

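Suppose that $X_1, X_2, \ldots, X_n$ is a random sample drawn from a distribution with CDF $F(x)$ and density function $f(x)$. Then the following is the joint density function of the order statistics $Y_i$ and $Y_j$ where $1 \le i < j \le n$ and $x < y$.
$$f_{Y_i, Y_j}(x, y) = K \, F(x)^{i-1} \, f(x) \, [F(y) - F(x)]^{j-i-1} \, f(y) \, [1 - F(y)]^{n-j}$$
where $K$ is the multinomial coefficient determined by
$$K = \frac{n!}{(i-1)! \, 1! \, (j-i-1)! \, 1! \, (n-j)!}$$
The support of the density function is the region $x < y$ in the two-dimensional plane.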
As mentioned, the joint density function can be derived using a heuristic argument, or memorization scheme, similar to the one in the preceding section. In this scheme, the joint density can be viewed as a multinomial probability of 5 different categories: $X < x$, $X \approx x$, $x < X < y$, $X \approx y$ and $X > y$. When the sample items are observed, we are interested in all the scenarios such that the $i$th ordered item is in the category $X \approx x$ and the $j$th ordered item is in the category $X \approx y$. The counts in the five categories are then $i-1$, $1$, $j-i-1$, $1$ and $n-j$. Count the number of items that fall into each category and multiply the respective probabilities. Of course, do not forget to multiply by the multinomial coefficient.
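The five-category scheme can be written as a short function. This is a hedged sketch: the function name `joint_order_stat_pdf` and the Uniform(0, 1) population used to exercise it are assumptions made for illustration. The callables `F` and `f` stand for the population CDF and density.

```python
from math import factorial


def joint_order_stat_pdf(i, j, n, x, y, F, f):
    """Joint density of the i-th and j-th order statistics at (x, y)."""
    if not (1 <= i < j <= n):
        raise ValueError("need 1 <= i < j <= n")
    if not x < y:
        return 0.0  # outside the support x < y
    # Multinomial coefficient for category counts i-1, 1, j-i-1, 1, n-j
    K = factorial(n) // (
        factorial(i - 1) * factorial(j - i - 1) * factorial(n - j)
    )
    return (
        K
        * F(x) ** (i - 1)            # items below x
        * f(x)                       # one item right around x
        * (F(y) - F(x)) ** (j - i - 1)  # items strictly between x and y
        * f(y)                       # one item right around y
        * (1 - F(y)) ** (n - j)      # items above y
    )


# Assumed population for the demonstration: Uniform(0, 1)
def F(t):
    return min(max(t, 0.0), 1.0)


def f(t):
    return 1.0 if 0.0 < t < 1.0 else 0.0


value = joint_order_stat_pdf(1, 3, 3, 0.2, 0.7, F, f)
print(value)
```

For the minimum and maximum of three uniform items, the coefficient is $3!/(0! \, 1! \, 0!) = 6$, so the value at $(0.2, 0.7)$ is $6 \cdot (0.7 - 0.2) = 3$.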
The Range of a Sample
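Another distribution that can be derived from order statistics is that of the range, which is defined to be $R = Y_n - Y_1$, i.e. the maximum statistic minus the minimum statistic. Mathematically, the CDF $F_R(x) = P(R \le x)$, for $x \ge 0$, can be derived by integrating the joint density $f_{Y_1, Y_n}(u, v)$ over the region $u < v \le u + x$. From this idea, we can derive a useful form of $F_R(x)$. First, the following is the joint density function $f_{Y_1, Y_n}(u, v)$, which is Fact 4 with $i = 1$ and $j = n$.

$$f_{Y_1, Y_n}(u, v) = n (n-1) \, f(u) \, [F(v) - F(u)]^{n-2} \, f(v), \quad u < v$$
Consider the following derivation.

$$F_R(x) = P(Y_n - Y_1 \le x) = \int_{-\infty}^{\infty} \int_{u}^{u+x} n (n-1) \, f(u) \, [F(v) - F(u)]^{n-2} \, f(v) \, dv \, du = \int_{-\infty}^{\infty} n \, f(u) \left[ \int_{u}^{u+x} (n-1) \, [F(v) - F(u)]^{n-2} \, f(v) \, dv \right] du$$
The inner integral can be evaluated by a change of variable with $t = F(v) - F(u)$ and $dt = f(v) \, dv$.

$$\int_{u}^{u+x} (n-1) \, [F(v) - F(u)]^{n-2} \, f(v) \, dv = \int_{0}^{F(u+x) - F(u)} (n-1) \, t^{n-2} \, dt = [F(u+x) - F(u)]^{n-1}$$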
With the above integral, we have the following fact.
Fact 5 

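Suppose that $X_1, X_2, \ldots, X_n$ is a random sample drawn from a distribution with CDF $F(x)$ and density function $f(x)$. Then the following integral gives the CDF of the range $R = Y_n - Y_1$.
$$F_R(x) = n \int_{-\infty}^{\infty} f(u) \, [F(u+x) - F(u)]^{n-1} \, du$$
where $x \ge 0$ and the integrand is nonzero only for $u$ in the support of the population distribution. After $F_R(x)$ is evaluated, the density function $f_R(x)$ can be obtained by taking the derivative of $F_R(x)$.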
One comment about the integral in Fact 5. If the support of the distribution of $X$ is an interval of finite length, the integral may have to be split into two integrals. The CDF $F(u+x)$ may become 1 at some values of $u$. If that is the case, one integral reflects $F(u+x) < 1$ and a second integral has 1 in place of $F(u+x)$. See the example below. If the support of $X$ is unbounded above, e.g. like that of the exponential distribution, the integral does not have to be split up.
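The integral in Fact 5 can be approximated numerically when a closed form is not at hand. The sketch below is one possible way to do this with a plain midpoint rule. The function name `range_cdf`, the truncation of the integration range at 40 and the step count are all assumptions chosen for illustration. It uses an exponential population with rate 1, for which the integral works out to $(1 - e^{-x})^{n-1}$, so the numerical result can be compared with that value.

```python
from math import exp


def range_cdf(x, n, F, f, lo, hi, steps=40_000):
    """Midpoint-rule approximation of n * integral of f(u) [F(u+x) - F(u)]^(n-1) du
    over the interval (lo, hi)."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        u = lo + (k + 0.5) * h  # midpoint of the k-th subinterval
        total += f(u) * (F(u + x) - F(u)) ** (n - 1)
    return n * h * total


# Assumed population: exponential distribution with rate 1
def F_exp(t):
    return 1.0 - exp(-t) if t > 0 else 0.0


def f_exp(t):
    return exp(-t) if t > 0 else 0.0


n, x = 5, 1.0
approx = range_cdf(x, n, F_exp, f_exp, lo=0.0, hi=40.0)  # upper limit truncated at 40
exact = (1.0 - exp(-x)) ** (n - 1)
print(approx, exact)
```

Because the exponential support is unbounded above, $F(u+x)$ never reaches 1 and no splitting of the integral is needed in this case.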
Examples
It is helpful to go through examples demonstrating the calculations discussed here. More examples are shown in the next post. In the remainder of the post, we demonstrate how to set up the density functions.
Example 1
Suppose that the sample is drawn from a uniform distribution on the interval . Write the density functions for the following distributions.
 The joint distribution of $Y_1, Y_2, \ldots, Y_7$.
 The median $Y_4$.
 The joint distribution of $Y_1$ and $Y_7$.
 The joint distribution of $Y_3$ and $Y_5$.
 The range $R = Y_7 - Y_1$.
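All the density functions are derived from $F(x)$ and $f(x)$, the CDF and density function of the uniform distribution, respectively. The following gives the first two density functions.

$$f_{Y_1, Y_2, \ldots, Y_7}(y_1, y_2, \ldots, y_7) = 7! \, f(y_1) \, f(y_2) \cdots f(y_7), \quad y_1 < y_2 < \cdots < y_7$$
$$f_{Y_4}(y) = \frac{7!}{3! \, 1! \, 3!} \, F(y)^3 \, f(y) \, [1 - F(y)]^3$$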
The first one, the joint density of the 7 order statistics, is obtained based on Fact 1. The second one, the density function of the 4th order statistic, which is also the sample median, is obtained based on Fact 3.
The following gives the next two density functions.

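$$f_{Y_1, Y_7}(x, y) = \frac{7!}{0! \, 1! \, 5! \, 1! \, 0!} \, f(x) \, [F(y) - F(x)]^5 \, f(y) = 42 \, f(x) \, [F(y) - F(x)]^5 \, f(y), \quad x < y$$
$$f_{Y_3, Y_5}(x, y) = \frac{7!}{2! \, 1! \, 1! \, 1! \, 2!} \, F(x)^2 \, f(x) \, [F(y) - F(x)] \, f(y) \, [1 - F(y)]^2, \quad x < y$$
The function $f_{Y_1, Y_7}(x, y)$ is the joint density function of the minimum statistic and the maximum statistic and is obtained based on Fact 4. The function $f_{Y_3, Y_5}(x, y)$ is the joint density function of the 3rd order statistic and the 5th order statistic, also obtained based on Fact 4.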
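For the sample range $R = Y_7 - Y_1$, we first determine its CDF by evaluating the integral indicated in Fact 5.

$$F_R(x) = 7 \int_{-\infty}^{\infty} f(u) \, [F(u+x) - F(u)]^6 \, du$$
Because the CDF $F(u+x)$ can be 1 after some point, we need to split this integral into two. The cutoff point is the value of $u$ at which $u + x$ reaches the right endpoint of the interval.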

……
……
The density function of the sample range is the derivative of its CDF. With the CDF and density function known, other distributional quantities can be derived.
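The splitting of the integral can be sketched in code. The following assumes, purely for illustration, that the population is uniform on the unit interval $(0, 1)$, which is not necessarily the interval used in Example 1. The function name `uniform_range_cdf` is hypothetical. Under this assumption the cutoff is $u = 1 - x$: below it the bracket $F(u+x) - F(u)$ equals $x$, and above it $F(u+x)$ is replaced by 1.

```python
def uniform_range_cdf(x, n):
    """CDF of the sample range for n items from Uniform(0, 1) (assumed interval)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    cutoff = 1.0 - x  # for u > cutoff, F(u + x) = 1
    # First piece: n * integral over (0, cutoff) of x^(n-1) du
    first = n * x ** (n - 1) * cutoff
    # Second piece: n * integral over (cutoff, 1) of (1 - u)^(n-1) du = x^n
    second = x**n
    return first + second


value = uniform_range_cdf(0.5, 7)
print(value)
```

The two pieces add up to $n x^{n-1} - (n-1) x^n$. With $n = 7$ and $x = 0.5$ this gives $7/64 - 6/128 = 0.0625$.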
Additional examples are shown in the next post.
Practice problems on order statistics are found in this companion blog.
2018 – Dan Ma