‘Two Class’ Distribution of Income in India
Results from the 2012–13 income tax data reveal that Indian incomes, particularly high salaries and wealth-related income, are statistically distributed into two distinct classes. The bulk of the reported income distribution can be explained by an exponential distribution, while a small fraction at the top follows a more unequal power law (Pareto) distribution. This distinction has important implications for inequality, and provides a point of comparison with similar statistical regularities observed in rich countries.
Information on tax returns by individuals, families and businesses has recently been released by the Central Board of Direct Taxes (CBDT). Drawing on the data, this article studies the shape of the distribution of reported incomes by individuals in India during 2012–13. Since independence, there was a tradition of regularly producing these national statistics under the All India Income Tax Statistics (AIITS), but this practice stopped around 1999–2000. The recent data released, although limited to a single fiscal year, sheds substantial light on the concentration of income amongst top earners as well as wealth-related sources such as interest, property, business and capital gains. The granular breakdowns of income thresholds provide important information about the distribution of individual Indian incomes, consistent with historically observed statistical regularities.
The general trend amongst economists and statisticians interested in the distribution of income is to use survey data, in the case of India usually through the various rounds of the National Sample Survey. These survey data can then be used to compute inequality measures such as the Gini or Theil coefficients. However, surveys tend to be top coded or do not receive adequate responses from extremely high-earning members of the society. The emerging and recently popularised alternative is to use administrative data, such as reports of the Income Tax Department. These high quality data are cheaper to collect (compared to expensive surveys) since people are required to automatically report their incomes to tax authorities.
Originally initiated by Simon Kuznets in the United States (US), today the effort to measure the shares of top income earners using administrative data is being led by the World Wealth and Income Database. For India, Banerjee and Piketty (2005) made the first systematic effort to measure top incomes between 1922 and 2000 using AIITS.
With the data on hand, income intervals can be used to compute the relative frequency of tax filers and thereby construct the shape of the income distribution. Typically, the complementary cumulative distribution function cCDF = P(Y>y) can offer insight into the upper tail of the distribution, that is, the proportion of filers above a given income threshold (y).
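As a rough illustration of this step, the sketch below turns binned filer counts into points of the cCDF. The bin edges and counts are hypothetical numbers chosen for readability, not CBDT figures, and the function name `ccdf_from_bins` is an assumption made for this example. Because the data are binned, each point is the share of filers at or above the bin's lower threshold.

```python
def ccdf_from_bins(lower_edges, counts):
    """Share of filers with income at or above each bin's lower edge."""
    total = sum(counts)
    remaining = total
    points = []
    for edge, n in zip(lower_edges, counts):
        # Everyone in this bin and in all higher bins earns at least `edge`.
        points.append((edge, remaining / total))
        remaining -= n
    return points


# Hypothetical bins for illustration only (not CBDT figures).
edges = [0, 250_000, 500_000, 1_000_000]  # lower income threshold of each bin, in rupees
counts = [600, 300, 80, 20]               # number of filers in each bin

ccdf = ccdf_from_bins(edges, counts)
for edge, share in ccdf:
    print(edge, share)
```

Plotting these points on a vertical log scale or on a log–log scale is what allows the exponential and Pareto shapes to be distinguished by eye.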
Empirical quantities often cluster around typical values (for example, the height of males in an urban city), which makes it useful to characterise their distributions using just the mean and variance. The upper tail diminishes rapidly as values rise to a few multiples of the mean. This, however, is not a necessity for economic variables: heavy-tailed distributions are very often found in income, wealth and rates of return to stock market activity. For example, the incomes of all males in an urban city could take wide-ranging values depending on profession, skills, education, etc.
Wealth distributions in particular have a long tail, extending perhaps to hundreds or thousands of times the typical range. Power laws are often used to characterise such distributions: the probability of observing a value is inversely proportional to a fixed power of that value (for example, $P(y) \propto y^{-n}$). Unlike thin-tailed distributions, the probability of an extreme observation decays slowly. The Pareto distribution is one such heavy-tailed power law, which originated in economics and was used by Vilfredo Pareto to describe the concentration of landownership amongst a small proportion of the Italian population.
In this article, I show the “two class” structure underlying Indian incomes reported to the tax authorities in the 2012–13 assessment year. This class structure is not the usual social class category, such as labour, capitalists, and entrepreneurs, but rather (and relatedly) applies to the “shape” of the Indian income distribution. The bulk of individual incomes follow a thin-tailed exponential distribution, which has a well-defined mean and dispersion along with the particular property that the mean and standard deviation are equal. Thus, the higher the mean, the greater the degree of disparity. Within this class, the probability of a person reporting income more than a few standard deviations above the mean is small, because the probability decays exponentially ($P(y) \propto e^{-y}$ for suitably scaled income $y$). Salaries reported by 96%–97% of the individuals can be well accounted for by this distributional law, and the theoretical Lorenz curve formula agrees with the data.
The upper tail of salaried income (the top few percentiles of the salaries class) and nearly all wealth-related income systematically deviate from this law and correspond to the Pareto distribution (a power law). The estimated parameter signifies infinite variance, implying that extreme incomes carry a significant (in population terms) probability. Within this group, we may see (and the data suggest as much) that individuals report a hundred or a thousand times the corresponding income threshold with significant probability. In terms of income concentration, Pareto or power laws exhibit scale-free properties or “self-similarity.” For example, if 40% of income is distributed amongst the top 10% of individuals with a Pareto law, then the top 10% of the top 10% will own 40% of the income share of the top 10% itself, and so on.
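The self-similarity example can be checked with a few lines of arithmetic. The sketch below assumes a pure Pareto law with exponent α > 1, for which the top fraction p of individuals holds the share p^(1 − 1/α) of total income (a standard property of the Pareto Lorenz curve). The function names are introduced here for illustration only.

```python
import math


def pareto_alpha_from_top_share(top_fraction, income_share):
    """Pareto exponent implied by 'top_fraction of people hold income_share of income'."""
    # For a Pareto law with alpha > 1, the top fraction p holds p ** (1 - 1/alpha).
    k = math.log(income_share) / math.log(top_fraction)  # k = 1 - 1/alpha
    return 1.0 / (1.0 - k)


def top_share(top_fraction, alpha):
    """Income share of the top `top_fraction` of individuals under a Pareto law."""
    return top_fraction ** (1.0 - 1.0 / alpha)


alpha = pareto_alpha_from_top_share(0.10, 0.40)
share_top_10pct = top_share(0.10, alpha)
share_top_1pct = top_share(0.01, alpha)

print(round(alpha, 3))                             # implied Pareto exponent
print(round(share_top_1pct / share_top_10pct, 3))  # share of the top 10% within the top 10%
```

Under these assumptions the implied exponent is roughly 1.66, and the top 1% hold 40% of what the top 10% hold, which is the scale-free pattern described in the text.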
Studies by Dragulescu and Yakovenko (2001) and Silva and Yakovenko (2004) have found similar results for wealth and incomes in the United Kingdom (UK) and the US, applicable to the top 1%–3% of incomes. In a similar tradition of applying statistical laws to India, Sinha (2006) and Jayadev (2008) report evidence for power-law tails for high net worth individuals using Forbes 400 and survey data, respectively.
The findings of this article are, to the best of my knowledge, a first attempt to approximate the shape of Indian income distributions across a large cross-section of tax brackets. I present summary statistics of reported incomes by category of incomes. Although the direct tax base is very limited, it covers about 30 million individuals over a vast range of incomes. Property and capital incomes are almost entirely concentrated amongst a small fractile (about 1%–2%) of the reporting population. By appropriately scaling income levels, identifying the exponential and power law characteristics of incomes becomes straightforward.
The implied statistical regularities suggest a two-class structure that differentiates Indian incomes, similar to the case for advanced economies such as the US or the UK. The properties of these distributions are important for income inequality in the formal sector and amongst those that report their incomes to the tax authorities. To gather inference on these highly aggregated complex systems, I conclude by drawing on the principle of maximum entropy,1 that is, the most likely income distribution that emerges once certain possibilities are ruled out by knowledge we already possess such as average incomes or the growth rate of incomes.
Exponential and Pareto Distributions
Logarithmic rescaling is commonly used to graphically assess the exponential or power law features of a probability distribution. If income (y) is exponentially distributed, then it has the probability distribution function (PDF) $P(y) = (1/T)\, e^{-y/T}$, where the parameter T is mean income.
The complementary cumulative probability distribution (cCDF), which gives the probability of persons reporting income above y, is $C(y) = e^{-y/T}$ and becomes parameter-free for normalised income $r = y/T$. Logarithmic transformation of C(y) results in $\ln C(y) = -y/T = -r$, thus the empirical distribution should collapse around a downward sloping linear trend on a vertical log scale (log probability). The mean (or expected value) T equals the standard deviation.
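The log-linear relationship suggests a simple way to recover T from cCDF points: regress ln C(y) on y and take T as minus the reciprocal of the slope. The sketch below is one possible implementation under stated assumptions, not the procedure used for the figures in this article. It fits a line through the origin (since C(0) = 1 for a pure exponential) and uses synthetic data generated with a hypothetical T of ₹400,000.

```python
import math


def fit_exponential_T(incomes, ccdf):
    """Least-squares fit of ln C(y) = -y/T through the origin; returns T."""
    num = sum(y * math.log(c) for y, c in zip(incomes, ccdf))
    den = sum(y * y for y in incomes)
    slope = num / den  # estimate of -1/T
    return -1.0 / slope


# Synthetic, exactly exponential cCDF points (hypothetical T, not an estimate from the data).
T_true = 400_000.0
ys = [100_000.0 * k for k in range(1, 11)]
cs = [math.exp(-y / T_true) for y in ys]

T_hat = fit_exponential_T(ys, cs)
print(round(T_hat))
```

With real binned returns the intercept need not be zero, for instance because of a filing threshold, so a fit with a free intercept may be more appropriate there.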
A typical measure of income concentration is the Lorenz curve which depicts the cumulative share of population on the horizontal axes and the cumulative share of income on the vertical axis. For a purely exponential distribution, Dragulescu and Yakovenko (2001) calculated this relationship as
A = B+(1–B) ln (1–B)
where A is the cumulative share of income and B is the cumulative share of population.
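To make the formula concrete, the sketch below evaluates the exponential Lorenz curve and integrates it numerically to obtain the Gini coefficient. The helper names and the midpoint-rule integration are choices made for this example. For a purely exponential distribution the area under the Lorenz curve is 1/4, so the Gini coefficient is 1/2.

```python
import math


def lorenz_exponential(B):
    """Cumulative income share A held by the bottom fraction B of the population."""
    if B >= 1.0:
        return 1.0  # limit of B + (1 - B) ln(1 - B) as B -> 1
    return B + (1.0 - B) * math.log(1.0 - B)


def gini_from_lorenz(lorenz, steps=20_000):
    """Gini = 1 - 2 * (area under the Lorenz curve), using the midpoint rule."""
    h = 1.0 / steps
    area = sum(lorenz((i + 0.5) * h) for i in range(steps)) * h
    return 1.0 - 2.0 * area


bottom_half_share = lorenz_exponential(0.5)
gini = gini_from_lorenz(lorenz_exponential)

print(round(bottom_half_share, 4))  # income share of the bottom 50%
print(round(gini, 4))               # Gini coefficient of the exponential class
```

In this idealised case the bottom half of the exponential class receives roughly 15% of its income, a benchmark against which the empirical Lorenz curve can be compared.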
For a Pareto distribution, the probability of observing very high incomes decays slowly relative to exponential distributions and the cCDF is given by $C(y) = (y/y_m)^{-\alpha}$, where α is the so-called Pareto coefficient and $y_m$ is the minimum income to which the Pareto distribution applies. On a vertical log scale (log probability), the empirical scatterplot of a Pareto distributed variable still shows a heavy tail rather than a straight line. On the log–log scale, the cCDF transforms to $\ln C(y) = -\alpha \ln(y/y_m)$, resulting in a downward sloping linear trend. Low values of α imply higher inequality; the mean is defined only for α > 1, while the variance is defined only if α > 2.
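A Pareto exponent can be read off the log–log plot as minus the slope of a fitted line. The sketch below uses ordinary least squares on (ln y, ln C) pairs, which is one simple option and not necessarily the estimator behind the figures here. The data are synthetic points generated from an exact power law, using the salary exponent and ₹1 million threshold quoted in this article purely as illustrative inputs; the 3.7% tail share is likewise an assumption for the example.

```python
import math


def fit_pareto_alpha(incomes, ccdf):
    """OLS slope of ln C(y) on ln y; the Pareto exponent is minus that slope."""
    xs = [math.log(y) for y in incomes]
    zs = [math.log(c) for c in ccdf]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_z = sum(zs) / n
    cov = sum((x - mean_x) * (z - mean_z) for x, z in zip(xs, zs))
    var = sum((x - mean_x) ** 2 for x in xs)
    return -cov / var


# Synthetic tail: exact power law above y_m (illustrative values only).
alpha_true = 1.913
y_m = 1_000_000.0
tail_share = 0.037  # assumed share of filers above y_m
ys = [y_m * 2 ** k for k in range(10)]
cs = [tail_share * (y / y_m) ** (-alpha_true) for y in ys]

alpha_hat = fit_pareto_alpha(ys, cs)
print(round(alpha_hat, 3))
print("finite mean:", alpha_hat > 1, "finite variance:", alpha_hat > 2)
```

An exponent between 1 and 2, as in this example, corresponds to a distribution with a finite mean but no finite variance.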
Data and Results
The Income Tax Department published its return statistics in a similar manner to previous AIITS reports. The number of filers is listed for various income thresholds and pre-processed into income “bins” with corresponding total and average income within the bin. Income tax returns are reported for different categories: all taxpayers, individuals, Hindu Undivided Families (HUF), firms, association of persons (AoP) or body of individuals (BoI) and companies.
These categories, to an extent, reflect the various institutional sectors underlying national accounts. However, for the purposes of distribution, the most relevant category is individual taxpayers. The HUF section is also important, but a complete analysis needs to encompass inheritance and allocations within family members (besides the fact that this category is more important in measurement of wealth concentration and inheritance). Other categories reflect the economic activities of firms and businesses and not wage-earners.
A cursory look at the number of returns filed in 2012–13 by individual taxpayers highlights the low tax base, no doubt exacerbated by the limited labour force in the formal sector, tax evasion as well as high tax exemptions. As many as 17 million individuals reported no salary income, while 11.7 million individuals reported at least some salary income. Other income categories are more closely related to the ownership of assets which generate capital income (property, equities, businesses, fixed income claims, etc). In these income sources, the proportion of filers who report at least some positive income is much lower, since wealth is more concentrated than income. Table 1 summarises these statistics by income category.
As Table 1 indicates, salaries- and business-related income are the only two categories where a significant number of filers reported positive income (that is, at least ₹1). All positive capital-related incomes (property, long-term and short-term capital gains, interest income) are reported by a very small fraction of the total filers. In fact, less than 0.2 million individuals reported positive long-term capital gains. Although there is no microdata to determine the exact shares in these reports, it is quite evident that capital gains from a positive revaluation of existing assets are extremely concentrated.
Table 2 shows the number of people above major income thresholds. We see that salaries are the most broadly applicable category: over 35% of individuals reported income above ₹1.5 million. By contrast, only 0.24% reported the same income from short-term capital gains. A very small fractile (0.00065%) earned salaries of over ₹100 million, although almost double that proportion of persons reported similar earnings from long-term capital gains. Given the paucity of returns filed, tax evasion and under-reporting are quite likely, although many newspapers and dailies in their assessment of this report have confused stocks and flows.2 The numbers filed in these reports list only flows of income.
Moreover, capital incomes are extremely concentrated. While 50% of total salaries accrued to 11% of total individuals, over 80% of property income is received by 1%. As Table 2 shows, for long- and short-term capital gains, almost all income in this category was made by the top 0.5% to 1%. This is not exactly surprising since wealth is likely to be very concentrated, hence income drawn from returns on wealth (interest or capital gains) will also flow to very few persons. The number of filers reporting billions of rupees of annual income tends to range in the hundreds for salaries and business income. For capital gains and interest income, the corresponding number of returns is at most in single or double digits.
Shape of Income Distribution
Figure 1 plots the cCDF for various incomes (salaries, property incomes, business, capital gains and interest income) over the entire range (₹1 to ₹5 billion) on the log–log scale, for proportionality. The vertical axis indicates the cumulative “probability” of tax filers above the income threshold, which is measured on the horizontal axis. Given that a “flatter” slope indicates more inequality (the tail is longer), two noticeable trends emerge.
First, salaries and business incomes are relatively less concentrated as opposed to capital incomes (interest, property and capital gains). Salaries are a broadly applicable category while many small business owners (such as shopkeepers, self-employed professionals) are counted in business incomes.
Second, for salaries the tail diverges into a straight line at ₹1 million. To the left of this threshold, the datapoints cluster around the range ₹0.1–₹1 million with decaying probability (a downward slope decreasing at an increasing rate). Figure 2 isolates the right tail for wealth-related income categories. Depending upon the income category, this tail captures the top 2%, 0.3%, 0.2% and 2% of the population for property incomes, long-term capital gains, short-term capital gains and interest incomes, respectively. The chart is log–log scaled and the points collapse into a straight line as predicted by the Pareto law.
‘Two Class’ Salary Distribution
Since salaries have the broadest application, the entire range is plotted in Figure 3 on the log–log scale. The exponential trend (bowed) fits the income range up to ₹1 million (the point that separates the exponential bulk from the Pareto tail), and the Pareto portion falls on the straight line. The exponential fit captures about 96.3% of the salaried population. In Figure 4A, the exponential bulk is plotted separately on a vertical log scale; in accordance with the logarithmic transformation of an exponential distribution, the datapoints scatter around a downward sloping straight line. Figure 4B shows the Pareto tail for salaries, with an estimated slope of –1.913 (a Pareto exponent α = 1.913). Since α < 2, the distribution has no finite variance.
In both cases (Figures 4A and 4B), the fits are good, with an R² of 99% for the exponential bulk and 99.8% for the Pareto tail. Recall that the slope for salaries is steeper (a higher absolute value of the exponent) than for the other income categories. Since even the salary exponent is below 2, the variance is not defined for capital gains, interest, etc, either.
Figure 5 plots the Lorenz curve for persons who fall within the exponential category side-by-side with the theoretical formula for a purely exponential distribution (A = B+(1–B) ln (1–B)), with the former explaining 98% of the values in the latter.
This article has shown that statistical regularities seen in the distribution of incomes in international and historical perspective are found in reported incomes for India in 2012–13. A complete description of income disparities can be explained as a combination of (at least) two different kinds of probability distributions. In a fictional experimental sense, it is as if individuals with different skills and a few members of India’s Forbes 400 have been packed into the same room. This is the core of the two class theory of income distribution—people’s income generating process determines their distribution.
Although data is limited, one way of rationalising these statistical regularities is on the basis of the maximum entropy principle. This is a common way of describing complex aggregated systems (such as energy levels of gas particles in a cylinder) in statistical mechanics and Bayesian inference (Jaynes 2003). A seminal paper by Foley (1994) argues that complex economic systems tend to display statistical (or maximum entropy) equilibriums rather than general (or market clearing) equilibriums.
Interestingly, the development of utility maximisation in economics is closely linked to the principle of maximum entropy in physical systems.3
If a variable has the probability distribution function f(x), then its entropy is simply $-\sum_i f(x_i) \log f(x_i)$. This line of reasoning argues that the maximum entropy distribution of any complex system is the distribution that can be achieved in the largest number of ways (or combinations), whilst satisfying some prior knowledge or information about the system (such as the expected value). Maximising the entropy subject to known constraints gives the macro-level outcome corresponding to the largest number of micro-level combinations. Many familiar probability distributions can be derived from this line of reasoning. Thus, for example, the income distribution amongst persons most likely to be observed can be derived by maximising entropy constrained by some known income statistic such as the mean and (or) variance of income.
The maximum entropy distribution under a constraint only on the arithmetic mean of a non-negative variable x results in the exponential form $f(x) \propto e^{-\lambda x}$, where λ is the weight (or in economics, the shadow price) on the constraint.4 Silva and Yakovenko (2004) argue this class to be in “equilibrium inequality” because decentralised exchange in capitalist economies distributes the conserved variable (wages or salaries) exponentially.
A Pareto distribution is the maximum entropy distribution when a constraint is placed on the expected value of the logarithm of the variable (the density then has the form $f(x) \propto x^{-\lambda}$). This can be rationalised through the income source for individuals falling into the Pareto class (top salaries, capital gains, investment income). These incomes are closely related to the rate of return (where the logarithm implies proportionality of returns), either through the fates of wealthy portfolios or compensation (salary) dependent on profitability management.5
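For readers who want the intermediate steps, the derivation can be sketched as follows. This is a standard textbook argument written in continuous form, with the multiplier on the mean constraint denoted λ and an additional multiplier μ (introduced here) for the normalisation constraint; T denotes mean income as in the earlier section.

```latex
\begin{aligned}
&\max_{f}\; -\int_0^\infty f(x)\,\ln f(x)\,dx
\quad\text{subject to}\quad
\int_0^\infty f(x)\,dx = 1,\qquad \int_0^\infty x\,f(x)\,dx = T. \\[4pt]
&\mathcal{L} = -\int_0^\infty f\ln f\,dx
 \;-\; \mu\Big(\int_0^\infty f\,dx - 1\Big)
 \;-\; \lambda\Big(\int_0^\infty x f\,dx - T\Big). \\[4pt]
&\frac{\delta\mathcal{L}}{\delta f} = -\ln f(x) - 1 - \mu - \lambda x = 0
\;\;\Longrightarrow\;\;
f(x) = e^{-1-\mu}\,e^{-\lambda x}. \\[4pt]
&\text{Normalisation gives } e^{-1-\mu} = \lambda,
\quad\text{and the mean constraint gives } \frac{1}{\lambda} = T, \\[4pt]
&\text{so that}\quad f(x) = \frac{1}{T}\,e^{-x/T}.
\end{aligned}
```

The multiplier on the mean constraint is therefore the reciprocal of mean income, and the result coincides with the exponential PDF used to describe the bulk of salaries.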
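The corresponding steps for the Pareto case can be sketched in the same way. The notation is an assumption for this illustration: the variable is restricted to x ≥ y_m, the constraint fixes the mean of ln x at some value m, and μ and λ are the multipliers on normalisation and on the logarithmic constraint.

```latex
\begin{aligned}
&\max_{f}\; -\int_{y_m}^\infty f(x)\,\ln f(x)\,dx
\quad\text{subject to}\quad
\int_{y_m}^\infty f(x)\,dx = 1,\qquad \int_{y_m}^\infty \ln x\; f(x)\,dx = m. \\[4pt]
&\frac{\delta\mathcal{L}}{\delta f} = -\ln f(x) - 1 - \mu - \lambda \ln x = 0
\;\;\Longrightarrow\;\;
f(x) = e^{-1-\mu}\,x^{-\lambda}. \\[4pt]
&\text{Normalisation requires } \lambda > 1 \text{ and gives }
f(x) = (\lambda - 1)\,y_m^{\lambda-1}\,x^{-\lambda}. \\[4pt]
&C(y) = \int_y^\infty f(x)\,dx = \Big(\frac{y}{y_m}\Big)^{-(\lambda-1)},
\qquad\text{so that}\quad \alpha = \lambda - 1.
\end{aligned}
```

Under this notation the Pareto coefficient of the cCDF is one less than the exponent of the density, which is worth keeping in mind when comparing exponents reported for densities and for cumulative distributions.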
Pareto distributions also result from multiplicative shocks, so that changes in the variable are proportional to its value. These shocks may be a windfall gain such as when entrepreneurs’ salaries are linked to equity and a new IPO (initial public offering) is announced, or a coal mine owner enjoying a productive year or a real estate mogul profiting from a spike in real estate prices. The Classical or Marxian conjecture of capital seeking (and turbulently equalising) the rate of return across sectors holds resonance. The Pareto class may be defined as a relative capitalist class, while the exponential bulk consists of wage earners.
To conclude, India’s income distribution statistics have statistical properties strikingly similar to those observed in rich and advanced economies. The distribution of incomes for all but a small percentage can be described with one parameter (mean income). For the remaining few, the distribution has no well-defined measure of variance and invites a sharp disparity of incomes.
The divergence of the upper-income tail at the 97th percentile and the concentration of wealth-related incomes are in close agreement with the finding of Silva and Yakovenko (2004) for the US. Naturally, the question arises: why do we find similar statistical properties for two very different economies? Such political economy aspects of these findings are important, although we need to aggregate more years of data to establish persistent facts.
There are still challenges in fully applying this kind of analysis to India. First, the tax base is very low, hence not many people report incomes to tax authorities in the first place. Second, the pervasiveness of tax evasion and black money sources (particularly in property and real estate) needs accounting, to gather a more complete picture of income inequality. Finally, the CBDT should regularise these reports and also start reporting similar statistics on collection of wealth and property taxes. This will allow economists and social statisticians to start similar studies on the distribution of Indian wealth, about which very little is known.
1 Also known as the Principle of Insufficient Reason. See Foley (1994) for a discussion of these issues in economics.
2 The wealth of high net worth persons (such as the many dollar millionaires) pertains to a stock. Only when such people earn returns on these assets does such wealth result in income flows.
3 See Mirowski (1992) for the related history of neoclassical economic theory and physics.
4 In statistical physics, this exponential distribution is known as the Gibbs–Boltzmann distribution.
5 For an exposition of the power law wealth distribution based on maximising entropy, an excellent proof is given in Castaldi and Milakovic (2007).
Banerjee, A and T Piketty (2005): “Top Indian Incomes, 1922–2000,” World Bank Economic Review, Vol 19, pp 1–20.
Castaldi, C and M Milakovic (2007): “Turnover Activity in Wealth Portfolios,” Journal of Economic Behaviour and Organization, Vol 63, pp 537–52.
Dragulescu, A and V M Yakovenko (2001): “Exponential and Power-law Probability Distributions of Wealth and Income in the United Kingdom and the United States,” Physica A: Statistical Mechanics and Its Applications, Vol 299, pp 213–21.
Foley, D K (1994): “A Statistical Equilibrium Theory of Markets,” Journal of Economic Theory, Vol 62, pp 321–45.
Jayadev, A (2008): “A Power Law Tail in India’s Wealth Distribution: Evidence from Survey Data,” Physica A: Statistical Mechanics and Its Applications, Vol 387, pp 270–76.
Jaynes, E T (2003): Probability Theory: The Logic of Science, Cambridge: Cambridge University Press.
Mirowski, P (1992): More Heat Than Light: Economics as Social Physics, Physics as Nature’s Economics, Cambridge: Cambridge University Press.
Silva, A C and V M Yakovenko (2004): “Temporal Evolution of the Thermal and Superthermal Income Classes in the US during 1983–2001,” Europhysics Letters, Vol 69, p 304.
Sinha, S (2006): “Evidence for Power-law Tail of the Wealth Distribution in India,” Physica A: Statistical Mechanics and Its Applications, Vol 359, pp 555–62.