# International Futures Help System

## Cross-Sectional and Lognormal Formulations

This section provides basic information on the two formulations used for forecasting poverty in the IFs model. The next section elaborates the lognormal approach.

### Cross-Sectional Analysis of Change in Poverty

Looking first at the simpler and much less heavily used approach, the following figure shows a scatterplot of the countries for which there are data on which to build a cross-sectional formulation. The logarithmic curve fit to those data suggests that as countries reach about \$10,000 per capita at PPP, extreme poverty essentially disappears.

The cross-sectional formulation that IFs uses to fill holes for countries without surveys takes the logarithmic form of the figure above. (Do not confuse the logarithmic form with the lognormal approach, discussed below.) Statistical analysis added the Gini coefficient to the formulation, with the expected positive relationship between inequality and poverty, taking the adjusted R-squared to 0.62.

It should be noted, however, that an exponential or power curve can be fit to the same cross-sectional data with approximately the same R-squared. Such a curve exhibits slower decline with increasing GDP per capita and has a much longer tail of non-zero poverty. The next figure compares the two quite different functions. When the exponential form is substituted within IFs, the forecasts for poverty reduction are, of course, even less positive. The logarithmic form is normally used in IFs, however, because it appears visually to capture better both the higher levels of poverty at low levels of GDP per capita and the near elimination of extreme poverty by about \$10,000 per capita.
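The contrast between the two functional forms can be illustrated with a minimal sketch. All coefficients below are hypothetical, chosen only so that the logarithmic curve reaches zero near \$10,000 per capita for a Gini of 0.40; they are not the values estimated for IFs, and the function names are illustrative.

```python
import math

# Hypothetical coefficients for illustration only (not IFs estimates).
A_LOG, B_LOG, C_GINI = 20.0, -15.0, 35.0  # logarithmic form with a Gini term
A_EXP, K_EXP = 45.0, -0.25                # exponential form


def poverty_log(gdp_pc, gini):
    """Percent in extreme poverty; gdp_pc is in thousand dollars at PPP."""
    value = A_LOG + B_LOG * math.log(gdp_pc) + C_GINI * gini
    return max(0.0, value)  # the logarithmic curve is floored at zero


def poverty_exp(gdp_pc):
    """Exponential alternative: declines more slowly and never reaches zero."""
    return A_EXP * math.exp(K_EXP * gdp_pc)


for gdp_pc in (0.5, 1.0, 5.0, 10.0, 20.0):
    print(gdp_pc, round(poverty_log(gdp_pc, 0.40), 2), round(poverty_exp(gdp_pc), 2))
```

With these assumed coefficients the logarithmic form falls to zero just below \$10,000 per capita, while the exponential form still leaves a small positive poverty rate at \$20,000.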

In the process of forecasting with the cross-sectionally estimated function, it is necessary to recognize initial differences between the most recent survey-based data for each country and the expected values of the function. These differences could represent a variety of forces, including patterns of government transfers or patterns of social discrimination across ethnic or caste groupings. It is impossible to know whether such differences, or country-specific shifts relative to the function, will persist. IFs forecasts assume very slow erosion of the differences for individual countries from the general function over time, thus protecting the country differences (path dependencies) for many years. Specifically, the differences are captured by a multiplicative adjustment factor in the base year of the model. [1]
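A multiplicative adjustment of this kind might be sketched as follows. The linear erosion path, the function names, and the 100-year default are assumptions made for illustration; the source states only that the shift is multiplicative, is set in the base year, and erodes very slowly.

```python
def base_year_shift(survey_value, function_value):
    """Multiplicative factor reconciling the function with survey data."""
    return survey_value / function_value


def eroded_shift(shift0, years_elapsed, erosion_years=100.0):
    """Shift after some years, assuming linear erosion toward 1.0.

    Pass erosion_years=None to keep the country-specific shift intact.
    """
    if erosion_years is None:
        return shift0
    remaining = max(0.0, 1.0 - years_elapsed / erosion_years)
    return 1.0 + (shift0 - 1.0) * remaining


# Hypothetical country: survey shows 30 percent, the function expects 20.
shift0 = base_year_shift(30.0, 20.0)
for year in (0, 25, 50, 100):
    print(year, eroded_shift(shift0, year))
```

A forecast value in any year would then be the function value for that year multiplied by the eroded shift.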

### Lognormal Analysis of Change in Poverty

The use of distributions in forecasting begins with the distinction between a detailed distribution and the simpler parametric representation of such a distribution. By far the most widely used method for detailing distributions of income, wealth, or other quantities is the Lorenz curve (see again the figure of the Lorenz curve and the Gini coefficient and the discussion surrounding it). Any survey data on income or consumption for a society can be shown in Lorenz curve form with essentially complete accuracy. There is a clear relationship between the Lorenz curve and the expression of shares of income held by quintiles, deciles, or even percentiles of population.

Although it would be possible simply to project forward the quintile or decile shares of a Lorenz curve to specify future income distributions, doing so would have at least two significant weaknesses. First, it would largely freeze those distributions, which can be quite dynamic. Second, it would not directly facilitate the computation of key poverty indices such as the headcount of those with less than \$1 per day.

What we want instead is an analytic representation of the income distribution that can change in form in response to both changing average income levels and changing income distributions, in turn represented by something as simple as the Gini coefficient. Moreover, we want a representational form from which we can conveniently compute specific deciles or quintiles (thereby reconstructing the Lorenz curve) and also compute key poverty measures like the headcount.

Fortunately, there are a number of analytic formulations and estimation techniques that allow us to do exactly that. The most widely used is the lognormal formulation. Chapter 3 discussed it and portrayed it in Figures 3.2-3.3; Appendix 1 elaborated the relationship with the Gini coefficient. Although not all national income distributions have lognormal form, something very close to that form is typical. [2]

A lognormal distribution that fully represents the distribution of income in a society can be specified with only two parameters, average income and its standard deviation. [3] Very usefully for forecasting purposes, the Gini coefficient can be used in lieu of the standard deviation. The Lorenz curve and standard poverty measures are then easily computable from the lognormal equation with the two specific parameters. [4]
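As a minimal sketch of that computation, the code below uses the standard lognormal results that the Gini coefficient satisfies G = 2Φ(σ/√2) − 1 and that the Lorenz curve is L(p) = Φ(Φ⁻¹(p) − σ), where σ is the standard deviation of log income and Φ is the standard normal distribution function. The function names and example values are illustrative assumptions, not the IFs implementation.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def sigma_from_gini(gini):
    """Standard deviation of log income implied by a Gini in (0, 1)."""
    return math.sqrt(2.0) * STD_NORMAL.inv_cdf((1.0 + gini) / 2.0)


def headcount_ratio(mean_income, gini, poverty_line):
    """Share of the population with income below poverty_line."""
    sigma = sigma_from_gini(gini)
    # Mean of log income, so that the arithmetic mean equals mean_income.
    mu = math.log(mean_income) - sigma ** 2 / 2.0
    return STD_NORMAL.cdf((math.log(poverty_line) - mu) / sigma)


def lorenz(p, gini):
    """Cumulative income share held by the poorest fraction p."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    return STD_NORMAL.cdf(STD_NORMAL.inv_cdf(p) - sigma_from_gini(gini))


# Hypothetical society: mean consumption of 2.5 per day, Gini of 0.45.
print(headcount_ratio(2.5, 0.45, 1.0))
quintile_shares = [lorenz((i + 1) / 5, 0.45) - lorenz(i / 5, 0.45) for i in range(5)]
print(quintile_shares)
```

The mean income and the poverty line must be in the same units (here, dollars per day). Changing either the mean or the Gini coefficient changes the headcount and the quintile shares, which is what makes the form convenient for forecasting.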

Given its advantages, the IFs approach to forecasting poverty uses the lognormal formulation, driven by average consumption and the Gini coefficient. More concretely, the procedure in IFs requires specification and use of the general function below.

As with the cross-sectional formulation, the computed headcount of those living on less than \$1 per day in the base year (2000) is fit to initial conditions.

[1] Normally in IFs, “adjustment shifts” calculated in the first year are allowed to erode back to basic functional specifications over 50-100 years. For the analysis of this report, the country-specific poverty shifts were left intact over the forecast horizon.

[2] Bourguignon (2003: 7) noted that a log-normal distribution is “a standard approximation of empirical distributions in the applied literature.” He further decomposed the growth and distributional change effects in poverty reduction and explored the interaction between them.

[3] The lognormal is not the only possible parameterization of the income distribution. Other forms include polynomial functions (used by Dikhanov 2005), a generalized quadratic model (Villasenor and Arnold 1989), and the Beta model (Kakwani 1980b). Datt (1991) has derived formulations for computing the common aggregate poverty measures from multiple parameterizations of the Lorenz curve. In representing income distribution it is also possible to use non-parametric techniques, such as the Gaussian kernel density function (Sala-i-Martin 2002).

[4] Qu and Barney (2002) used the basic procedure for forecasting in the T21 model, and Kemp-Benedict, Heaps and Raskin (2002) used it in POLESTAR for the computation of malnutrition.