The DEMOGRAPHY-STATISTICS-INFORMATION TECHNOLOGY Letter
#10 Sep 2016
The life table is a tool for describing levels and age patterns of mortality in a population using a standard set of descriptive statistics. Mastery of life table concepts and methods is essential for serious analysis of mortality. The importance of life table methods is more general, however, because the ideas that apply to the analysis of length of life apply in many other settings.
Life tables may present a stumbling block even to serious and determined students of demography. It is worth considering why this is so, and several possibilities may be suggested. One is the sheer volume of material to be absorbed. Reasonable mastery requires weeks of effort, not hours or days.
The notation used for life table columns and formulas requires a familiarity with basic mathematics that comes more easily for some than for others. A solid understanding of the distinction between period and cohort life tables requires experience using life table methods to study mortality in actual human populations.
A realistic appreciation of the effort required to learn life table methods may be the best way to avoid discouragement. That said, let’s begin.
A cohort life table describes the mortality experience of a birth cohort, a group of persons born during a particular year or other time period. A cohort life table cannot be constructed until all or nearly all of the persons in the cohort have died. This means, firstly, that the persons must have been born a very long time ago, and secondly, that the mortality risks described by the table amalgamate mortality conditions over a very long period of time, nearly a century.
If we are interested in mortality during a particular year or time period, we want a “period” life table. A period life table looks just like a cohort life table, but instead of describing what happened in a birth cohort over the past 100 or so years, it describes the possible future mortality of the cohort born during the period to which the life table refers. This description rests on the assumption that persons in this cohort experience the mortality risks described by the life table throughout their lives.
Most life tables are period life tables, partly because few countries have the data required to construct cohort life tables, and partly because there is more interest in current than in past mortality.
The importance of cohort life tables is mainly conceptual. Life table concepts are more easily explained and understood for cohort life tables. Learning how cohort life tables are defined and constructed is the best preparation for learning period life tables.
The following section presents an example of a life table. Like most life tables encountered in practice, it is a period table, but to explain the meaning of the table the reader is asked to imagine that it is a cohort life table constructed from a list of the ages at death of persons born during a year in the distant past.
Table 1 shows a life table for Austrian males based on mortality risks observed in 1992. Understanding the table requires a few definitions. A person’s age at any given time is the time elapsed since the person’s birth. Age in completed years (or “age at last birthday”) is the greatest integer less than or equal to age.
Age may be referred to as exact age when it is unclear whether age or age in completed years is intended. In practice, “age” tends to be used without qualification, the reader being expected to figure out from context what meaning is intended. This is almost always possible, and not difficult, but it requires experience and practice.
Rows of the table correspond to age intervals. The number in the x column is the (exact) age at which the age interval begins. The number in the n column is the width of the interval. The age intervals are understood to be what mathematicians call “left closed, right open”, meaning that they include persons whose exact age is x but exclude persons whose exact age is x + n. The last interval is nearly always “open-ended”, in this case age 85 years and over.
The subscript notation in the column headings was invented by actuaries and has been in use for well over a century. The following subscript x denotes exact age. If the values in the column refer to an age interval, the leading subscript, sometimes called a “prescript”, gives the width of the interval. The prescript is omitted if the width is one year. Putting the width of the open-ended interval to ∞ is a convenient formalism.
The nmx column is not strictly a life table column; rather, it is the primary input from which most period life tables are constructed. The nax column, used in section 5 below and discussed in detail in section 7, provides ancillary information used to construct the table. These two columns may or may not be shown, depending on how the life table was constructed and on how much detail the presenter chooses to show.
The remaining six columns of Table 1 tend to appear in all life tables. They are discussed in the following three sections.
To learn the meaning of the life table columns it is useful to imagine that Table 1 is a cohort life table for persons born during a year in the distant past and was constructed from a list of the exact ages at death of all persons born during this year. The nmx column is not needed if the life table is constructed in this way.
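The nqx, lx, and ndx columns are defined as follows.

$$ {}_nd_x = \frac{\text{deaths in the age interval } [x, x+n)}{\text{persons in the cohort}} $$

$$ l_x = \frac{\text{persons surviving to age } x}{\text{persons in the cohort}} $$

$$ {}_nq_x = \frac{\text{deaths in the age interval } [x, x+n)}{\text{persons surviving to age } x} \qquad \text{(1c)} $$

The lx and ndx columns are discussed in this section, the nqx column in the following section.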
If we were really calculating a cohort life table from a list of ages at death, we would tally the number of deaths in each age interval and divide by the number of persons in the cohort to get the ndx column values, cumulate the ndx column upward from the last row to get the lx column values, and divide the ndx values by the lx values, row by row, to get the nqx column values. If the life table were indeed constructed in this way it would be natural to put the columns in this order.
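The tally, cumulate, and divide steps described above can be sketched in a few lines of Python. This is only an illustration: the function name `cohort_columns`, the ten ages at death, and the choice of age intervals are all hypothetical, not data from Table 1.

```python
import math

def cohort_columns(ages_at_death, starts):
    """Return (ndx, lx, nqx) for age intervals beginning at `starts`.

    Intervals are left closed, right open; the last one is open-ended.
    """
    total = len(ages_at_death)
    ends = list(starts[1:]) + [math.inf]

    # Tally deaths in each interval [x, x + n).
    deaths = [sum(lo <= a < hi for a in ages_at_death)
              for lo, hi in zip(starts, ends)]

    # Cumulate deaths upward from the last row: survivors to each age x.
    survivors = []
    running = 0
    for d in reversed(deaths):
        running += d
        survivors.append(running)
    survivors.reverse()

    ndx = [d / total for d in deaths]
    lx = [s / total for s in survivors]
    # Rows with no survivors have no defined nqx.
    nqx = [d / s if s else math.nan for d, s in zip(deaths, survivors)]
    return ndx, lx, nqx


# Hypothetical ages at death for a ten-person cohort (not real data).
ages = [0.3, 2.5, 31.0, 58.2, 64.9, 71.4, 77.0, 80.1, 86.6, 93.2]
ndx, lx, nqx = cohort_columns(ages, starts=[0, 1, 5, 65, 85])
print(ndx)
print(lx)
print(nqx)
```

Working with integer counts and dividing only at the end keeps the proportions free of accumulated rounding error.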
The different order of the columns in Table 1 is due to its being, in fact, a period life table. As explained in section 8, construction of a period life table begins with calculation of the nqx from age-specific death rates nmx followed by calculation of the lx and ndx columns. This explains the order of the columns in Table 1.
The ndx and lx columns will be familiar to students of introductory statistics. The ndx column gives the frequency distribution of age at death for members of the birth cohort. The number in the row for given x and n is the proportion of persons in the cohort whose age at death xi satisfies x ≤ xi < x + n.
The lx column is the cumulative frequency distribution, except that the direction of cumulation is reversed: instead of the proportion of persons dying before age x, we have the proportion surviving to age x, which is the proportion dying at or after age x. All persons “survive to age 0”, so l0 = 1.
The solid dots in Figure 1 show the frequency distribution of deaths (ndx) in Table 1. A common procedure for plotting values that refer to intervals is to plot the point at the midpoint of the interval. A slightly refined alternative is to plot the point for an interval at the mean of the values in the interval. This has the advantage of providing an age at which to plot values for the open-ended interval, for which the midpoint is undefined. For the distribution in Table 1, the means of the values in each age interval are given by x + nax. The ndx in Figure 1 are plotted against these ages.
Figure 1 shows that many deaths occur at ages above 85 years, far too many to justify truncating the table at this age. Life tables that truncate mortality experience too early are common, partly because conventional upper age limits were adopted long ago when expectations of life were lower.
The hollow circles in Figure 1 show dx values from a life table constructed by extrapolating the age-specific death rates (nmx) in Table 1 to older ages by fitting the Human Mortality Database Log Quadratic model (Wilmoth et al. 2012) and then interpolating to single years of age. These points suggest that an open-ended group beginning at no less than 100 years of age is necessary to get a satisfactory picture of the frequency distribution of deaths.
The distribution of ages at death of members of a birth cohort is distinctive in that deaths occur at particular times, as well as at particular ages. For the study of mortality, the time is as important as the age. This characteristic of the distribution of cohort deaths by age is not shared by frequency distributions in general, and it is essential to understanding the meaning of the nqx values.
The formula defining nqx, (1c), shows that its numerator is the same as the numerator of ndx, the number of deaths in the age interval [x, x + n). The difference between nqx and ndx is the denominator, all persons in the cohort for ndx versus persons surviving to age x for nqx.
To see the significance of this difference, suppose for simplicity that persons in the cohort are all born at the same time t, so that deaths in the age interval [x, x + n) occur between time t + x and time t + x + n. What does ndx tell us about the level of mortality for this age group during this time period?
The answer turns out to be—nothing. The value of 5d60 (say) may be low because most cohort members have died at earlier ages (high mortality), or because most cohort members die at older ages (low mortality). The ndx values taken individually are not in general useful indicators of the level of mortality.
Dividing the number of deaths to the cohort in the age interval [x, x + n) by the number of survivors at age x controls for the history of mortality in the cohort by eliminating the dependence on the numbers of deaths below age x. The value of ndx cannot be larger than lx, the proportion of persons surviving to age x. The value of nqx may be any number between 0 and 1, whatever the number of deaths before age x, provided only that there are some survivors at age x. This is one way of understanding the significance of nqx.
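A small numerical illustration may help. The sketch below uses two hypothetical cohorts with the same value of 5d60 but very different proportions surviving to age 60; the numbers are invented for the purpose and the helper name `nqx_from` is likewise an assumption.

```python
def nqx_from(ndx, lx):
    """Probability of dying in the interval, given survival to its start."""
    return ndx / lx

# Both hypothetical cohorts have 5d60 = 0.05.
d_60 = 0.05

# Cohort A: heavy mortality at younger ages, only 10% reach age 60.
q_high = nqx_from(d_60, lx=0.10)

# Cohort B: light mortality at younger ages, 90% reach age 60.
q_low = nqx_from(d_60, lx=0.90)

print(q_high)  # half of the survivors die between ages 60 and 65
print(q_low)   # roughly one survivor in eighteen dies in the same interval
```

The same ndx value therefore corresponds to a risk of death about nine times higher in one cohort than in the other.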
Another way is to ask what is the maximum number of people in the cohort who could die in the interval [x, x + n). The most fundamental principle of demographic analysis is that numbers of events tend to be larger when the numbers of persons who might experience these events are larger. Dividing the number of deaths by the number of persons who might have died gives a relative number that controls for large differences in numbers of persons. The influence of small differences in numbers of persons may be outweighed by other influences on numbers of deaths.
The number of persons in the cohort who die in the interval [x, x + n) is ndx times the number in the cohort. The number who might die is the proportion lx who survive to age x times the number in the cohort. Dividing the first number by the second gives the definition in formula (1c). This is a second way of understanding nqx.
A third way to understand nqx is as a conditional probability. By definition, the conditional probability of event A given event B is

$$ P(A \mid B) = \frac{P(A \cap B)}{P(B)} $$
Let A be the event “died in the age interval [x, x + n)” and B the event “survived to age x”, which is the same as “died in the interval [x, ∞)”. The intersection of these two events is death in the interval [x, x + n), so nqx is the conditional probability of dying in the interval [x, x + n) given survival to age x.
To personalize this, suppose that I am a cohort member, that I have just reached age x, and that I want to know how likely it is that I will die before reaching age x + 5 years. The nqx column of the life table tells me. If I am 30 years old, the chance of dying before age 35 is 0.7%. If I am 70 years old, the chance of dying before age 75 is nearly 20%—this of course on the assumption that my mortality risks are represented by the life table shown in Table 1.
Given any one of the nqx, lx, and ndx columns, the remaining two columns may be calculated. This may be demonstrated by elementary algebra or by thinking through how one would get from any one of the columns to the other two. Given the ndx column, for example, cumulating from the bottom gives the lx column and dividing ndx by lx gives nqx. The information in the three columns is redundant in this narrow sense. As descriptive statistics, the columns provide different and useful information.
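The other two starting points can be sketched the same way. The function names and the three-row nqx column below are hypothetical; the sketch assumes l0 = 1 and a final open-ended interval in which everyone remaining dies.

```python
def lx_ndx_from_nqx(nqx):
    """Given nqx, return (lx, ndx), starting from l0 = 1."""
    lx = [1.0]
    for q in nqx[:-1]:
        # Survivors to the next age are those who do not die in this interval.
        lx.append(lx[-1] * (1.0 - q))
    ndx = [l * q for l, q in zip(lx, nqx)]
    return lx, ndx


def ndx_nqx_from_lx(lx):
    """Given lx, return (ndx, nqx); the last interval is open-ended."""
    # Deaths in an interval are the drop in survivors across it.
    ndx = [a - b for a, b in zip(lx, lx[1:])] + [lx[-1]]
    nqx = [d / l for d, l in zip(ndx, lx)]
    return ndx, nqx


# Hypothetical nqx column with three rows.
lx, ndx = lx_ndx_from_nqx([0.1, 0.5, 1.0])
ndx_again, nqx_again = ndx_nqx_from_lx(lx)
print(lx, ndx, nqx_again)
```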
Note finally that the nqx, lx, and ndx columns do not depend on the age distribution of deaths within age intervals. The nLx, Tx, and ex columns do depend on the age distribution of deaths within age intervals. This dependence is the basis for dividing the six basic life table columns into these two groups.
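The values in these columns are defined as follows.

$$ {}_nL_x = \frac{\text{person years lived in the age interval } [x, x+n)}{\text{persons in the cohort}} \qquad \text{(3a)} $$

$$ T_x = \frac{\text{person years lived at ages } x \text{ and over}}{\text{persons in the cohort}} $$

$$ e_x = \frac{\text{person years lived at ages } x \text{ and over}}{\text{persons surviving to age } x} $$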
The “Person years lived” in the numerator of (3a) is the sum of the time in years lived in the age interval by each person who survives to age x. This sum may be divided into two components.
(a) Persons who survive to age x + n contribute n person years to the sum. Multiplying the number of these persons by n and dividing by the total number of persons in the cohort gives n×lx+n.
(b) Persons who die at age x ≤ xi < x + n contribute xi − x person years to the sum. Multiplying the number of these persons by the average of the xi − x values, nax, and dividing by the total number of persons in the cohort gives nax×ndx.
Combining these two expressions gives

$$ {}_nL_x = n \cdot l_{x+n} + {}_na_x \cdot {}_nd_x \qquad \text{(4)} $$
This formula may be used to calculate the nLx column from the lx column if the nax column is available (noting that ndx = lx - lx+n).
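Person years lived is additive over age intervals, so the Tx column may be calculated from the nLx column using

$$ T_x = \sum_{y \ge x} {}_nL_y $$

where the sum runs over all age intervals beginning at age x or older.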
The value of ex = Tx / lx is expectation of life at age x, the average number of years lived after age x by persons who survive to age x. For x = 0, this becomes expectation of life at age 0 or life expectancy at birth. Life expectancy at birth for the cohort is the same as the mean age at death of cohort members.
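The calculation of these three columns, and the equality of life expectancy at birth with the mean age at death, can be checked with a short Python sketch. The ages at death, the age intervals, and the function names are hypothetical. The open-ended interval is handled on the assumption that nobody survives it, so only the nax × ndx term contributes.

```python
import math

def lx_and_nax(ages, starts):
    """lx and nax for each interval, straight from individual ages at death."""
    total = len(ages)
    ends = list(starts[1:]) + [math.inf]
    lx, nax = [], []
    for lo, hi in zip(starts, ends):
        lx.append(sum(a >= lo for a in ages) / total)
        lived = [a - lo for a in ages if lo <= a < hi]
        nax.append(sum(lived) / len(lived) if lived else 0.0)
    return lx, nax


def person_year_columns(starts, lx, nax):
    """Return (nLx, Tx, ex) from lx and nax."""
    k = len(starts)
    nLx = []
    for i in range(k):
        if i < k - 1:
            n = starts[i + 1] - starts[i]
            # Survivors contribute n years each, decedents nax years each.
            nLx.append(n * lx[i + 1] + nax[i] * (lx[i] - lx[i + 1]))
        else:
            # Open-ended interval: everyone who enters it dies in it.
            nLx.append(nax[i] * lx[i])
    # Cumulate nLx upward from the last row to get Tx.
    Tx, running = [], 0.0
    for person_years in reversed(nLx):
        running += person_years
        Tx.append(running)
    Tx.reverse()
    ex = [T / l if l > 0 else math.nan for T, l in zip(Tx, lx)]
    return nLx, Tx, ex


# Hypothetical ages at death for a ten-person cohort (not real data).
ages = [0.3, 2.5, 31.0, 58.2, 64.9, 71.4, 77.0, 80.1, 86.6, 93.2]
starts = [0, 1, 5, 65, 85]
lx, nax = lx_and_nax(ages, starts)
nLx, Tx, ex = person_year_columns(starts, lx, nax)

mean_age_at_death = sum(ages) / len(ages)
print(ex[0], mean_age_at_death)  # the two agree up to rounding error
```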
Readers familiar with introductory statistics may be puzzled by the definition of life expectancy at birth in the preceding section. If it is simply the mean age at death for cohort members, why not calculate it using one of the formulas for the mean found in introductory statistics textbooks? Why “person years lived” and the cumbersome and apparently unnecessary three column calculation?
The answer may surprise readers who grew up with personal computers and spreadsheet programs. Life tables go back more than 350 years, beginning with John Graunt’s Natural and Political Observations Made Upon the Bills of Mortality, a study of statistics of deaths in contemporary London. The life table in its modern form is newer, but the columns in Table 1 go back more than a century.
This means that life tables were developed when arithmetic was done using pencil and paper—no calculating machines, to say nothing of computers. When arithmetic is done by hand, addition is simplest and least error-prone. Subtraction, to say nothing of multiplication and division, is more difficult and to be avoided if at all possible.
Calculations are therefore organized to minimize the use of subtraction, multiplication and division. This explains the seemingly peculiar manner in which life expectancy is calculated. Manual calculation also explains the layout of the life table, the first purpose of which was not to present results, but to organize the calculations in a way that minimizes the chance of careless error.
The difficulties of manual arithmetic explain the origin of the life table columns, but two other factors explain their persistence. Insurance and annuity calculations use ex values at all ages, and if these are wanted, the traditional organization of their calculation is as good as any other. It has the advantage, moreover, of providing nLx values, which turn out to be useful in connection with stationary populations and survivorship problems.
Life expectancy at birth and mean age at death for a cohort are two expressions for the same quantity, but “mean age at death” for a period might reasonably be understood as the mean age at which deaths to the population occur during a time period. This mean reflects the population age distribution as well as the age-specific risks of mortality and is an extremely poor indicator of mortality. “Expectation of life at birth” avoids this confusion.
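The dependence of a period mean age at death on the population age distribution can be illustrated with invented numbers. In the sketch below, two hypothetical populations experience identical age-specific death rates but have different age structures. The rates, the population counts, and the single representative age assigned to each age group are all assumptions made for the illustration.

```python
def mean_age_at_death(ages, rates, population):
    """Mean age of the deaths produced by `rates` acting on `population`."""
    deaths = [m * p for m, p in zip(rates, population)]
    return sum(a * d for a, d in zip(ages, deaths)) / sum(deaths)


# Hypothetical representative ages and death rates for five age groups.
ages = [10, 30, 50, 70, 90]
rates = [0.001, 0.002, 0.006, 0.03, 0.15]

# Same rates, two hypothetical age structures (persons per age group).
young_population = [400, 300, 200, 80, 20]
old_population = [150, 200, 250, 250, 150]

mean_young = mean_age_at_death(ages, rates, young_population)
mean_old = mean_age_at_death(ages, rates, old_population)
print(mean_young, mean_old)
```

The mortality risks are identical by construction, yet the mean age at death is lower in the younger population by well over a decade.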
The nax column shows the average years lived in the age interval [x, x + n) by persons who die in this interval. It was used in section 5 to calculate nLx from lx using formula 4.
Substituting lx − lx+n for ndx in formula 4 and rearranging terms gives

$$ {}_na_x = \frac{{}_nL_x - n \cdot l_{x+n}}{l_x - l_{x+n}} \qquad \text{(6)} $$
If a life table shows all three columns, this formula provides a check of the consistency between them. If a life table does not show nax values, they may be calculated from the nLx and lx values.
Formula 4 also implies

$$ l_{x+n} = \frac{{}_nL_x - {}_na_x \cdot l_x}{n - {}_na_x} \qquad \text{(7)} $$
Since l0 = 1, this shows that the lx column may be iteratively calculated from the nLx and nax columns.
Formulas 4, 6, and 7 together show that the information in the lx, nLx, and nax columns is redundant in the sense that any one of these columns may be calculated from the other two.
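This redundancy can be checked numerically. The sketch below recovers nax from nLx and lx, and then recovers lx from nLx and nax, for a small hypothetical table. The column values and function names are illustrative assumptions. The open-ended interval is treated as a special case in which nLx = nax × lx, because the general expressions are not defined for an infinite interval width.

```python
import math

def nax_from_nLx_lx(widths, nLx, lx):
    """Solve nLx = n * l(x+n) + nax * (lx - l(x+n)) for nax."""
    out = []
    for i, n in enumerate(widths):
        if math.isinf(n):
            # Open-ended interval: nobody survives it.
            out.append(nLx[i] / lx[i])
        else:
            out.append((nLx[i] - n * lx[i + 1]) / (lx[i] - lx[i + 1]))
    return out


def lx_from_nLx_nax(widths, nLx, nax):
    """Iterate l(x+n) = (nLx - nax * lx) / (n - nax), starting from l0 = 1."""
    lx = [1.0]
    for i, n in enumerate(widths[:-1]):
        lx.append((nLx[i] - nax[i] * lx[i]) / (n - nax[i]))
    return lx


# Hypothetical columns for intervals starting at ages 0, 1, 5, 65, and 85.
widths = [1, 4, 60, 20, math.inf]
lx = [1.0, 0.9, 0.8, 0.5, 0.2]
nax = [0.3, 1.5, 46.0, 11.0, 4.9]
nLx = [0.93, 3.35, 43.8, 7.3, 0.98]

recovered_nax = nax_from_nLx_lx(widths, nLx, lx)
recovered_lx = lx_from_nLx_nax(widths, nLx, nax)
print(recovered_nax)
print(recovered_lx)
```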
Preceding sections discussed the life table columns in Table 1 as though they were calculated from a list of the ages at death of a cohort of persons born in the distant past, so distant that all of the cohort members have died. This pedagogical device facilitates understanding of the meaning of the columns and their interrelationships.
As noted in the introduction, however, most life tables encountered in practice are period life tables. They show what would happen to a cohort of persons born during a recent time period if these persons experienced the mortality risks observed during the period in which they were born. The idea is not that mortality risks will remain in fact constant, but that this assumption provides a useful representation of current mortality risks.
Construction of period life tables involves many technicalities. The methods used vary with the data available. A thorough discussion is far beyond the scope of this letter. It will be useful, however, to present a generic procedure for calculating a period life table from a set of age-specific death rates.
Current mortality risks are represented by nmx, defined as the number of deaths of persons in an age interval [x, x + n) during a time period divided by the person years lived by persons in the age group during the time period. The nmx column in Table 1 shows age-specific death rates for Austrian males for 1992.
The simplest way to calculate a period life table is to use a formula that translates age-specific death rates into nqx values. The nqx column values are then used to calculate the lx column values, and the lx column values are used to calculate the ndx column values.
Calculation of the nLx, Tx, and ex columns requires information on the age distribution of deaths within age intervals. This is not provided by the nqx, lx or ndx columns and so must be exogenously supplied. A set of nax values, calculated from available data or taken from another life table, may be used for this purpose (see the discussion in section 3.2 of Preston, Heuveline, and Guillot 2000).
Given a set of age-specific death rates nmx and a set of nax values, nqx values may be calculated as

$$ {}_nq_x = \frac{n \cdot {}_nm_x}{1 + (n - {}_na_x) \cdot {}_nm_x} $$
A derivation of this formula is given in section 3.1 of Preston, Heuveline, and Guillot (2000).
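The generic procedure can be sketched end to end in Python. The sketch assumes the commonly used conversion nqx = n × nmx / (1 + (n − nax) × nmx), sets nqx = 1 in the open-ended interval, and takes person years in the open-ended interval to be nax × lx. The function name, the age intervals, and the input values are hypothetical and are not the Austrian data in Table 1.

```python
import math

def period_life_table(starts, nmx, nax):
    """Sketch of a period life table from death rates nmx and nax values.

    The last interval is open-ended. Returns a dict of column lists.
    """
    k = len(starts)
    widths = [starts[i + 1] - starts[i] for i in range(k - 1)] + [math.inf]

    # Convert rates to probabilities; everyone dies in the open interval.
    nqx = [n * m / (1.0 + (n - a) * m)
           for n, m, a in zip(widths[:-1], nmx, nax)] + [1.0]

    # lx from nqx, starting from l0 = 1; then ndx = lx * nqx.
    lx = [1.0]
    for q in nqx[:-1]:
        lx.append(lx[-1] * (1.0 - q))
    ndx = [l * q for l, q in zip(lx, nqx)]

    # nLx = n * l(x+n) + nax * ndx; the first term vanishes in the open interval.
    nLx = [widths[i] * lx[i + 1] + nax[i] * ndx[i] for i in range(k - 1)]
    nLx.append(nax[-1] * ndx[-1])

    # Tx cumulates nLx from the last row upward; ex = Tx / lx.
    Tx, running = [], 0.0
    for person_years in reversed(nLx):
        running += person_years
        Tx.append(running)
    Tx.reverse()
    ex = [T / l for T, l in zip(Tx, lx)]
    return {"nqx": nqx, "lx": lx, "ndx": ndx, "nLx": nLx, "Tx": Tx, "ex": ex}


# Hypothetical inputs for intervals starting at ages 0, 1, 5, 65, and 85.
starts = [0, 1, 5, 65, 85]
nmx = [0.008, 0.0004, 0.004, 0.05, 0.18]
nax = [0.1, 1.5, 35.0, 10.0, 5.5]

table = period_life_table(starts, nmx, nax)
for name, column in table.items():
    print(name, [round(v, 4) for v in column])
```

In practice the age intervals would be much narrower than the ones used here, since the conversion from rates to probabilities becomes less reliable as intervals widen.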
Guillot, Michel. 2003. “Life tables” entry in Volume 2 of the Encyclopedia of Population, Ed. Paul Demeny and Geoffrey McNicoll. New York: Macmillan Reference USA.
King, George. 1902. Statistical applications of the mortality table. Excerpted in David P. Smith and Nathan Keyfitz, Mathematical Demography: Selected Papers, second, revised edition, Ed. Kenneth W. Wachter and Hervé Le Bras. Demographic Research Monographs: A Series of the Max Planck Institute for Demographic Research. New York: Springer. Available at www.demogr.mpg.de/books/, accessed 13 July 2016.
Preston, Samuel H., Patrick Heuveline, and Michel Guillot. 2000. Demography: Measuring and Modeling Population Processes. Oxford, United Kingdom: Blackwell Publishers Inc.
Wilmoth, John R., Sarah Zureick, Vladimir Canudas-Romo, Mie Inoue, and Cheryl Sawyer. 2012. A flexible two-dimensional mortality model for use in indirect estimation. Population Studies 66(2):1-28.