ISSN (Print) - 0012-9976 | ISSN (Online) - 2349-8846

Rethinking India’s Employment Data Architecture

Radhicka Kapoor is senior fellow, Indian Council for Research on International Economic Relations.

Following the recommendations of the task force set up to revamp the employment data architecture, India’s employment statistics are undergoing significant revision. The effort is largely directed at producing real-time high-frequency data in the formal sector. However, given that employment is dominated by low-wage and low-productivity jobs in the unorganised sector, this exercise will serve little purpose and only deflect attention from the enormity of the employment challenge.

The author is grateful to R Nagaraj for providing insightful comments and suggestions. I would also like to express my thanks to participants at the ICRIER-Ford Foundation, “Samwaad” on “Revamping the Statistical Architecture for Robust Data on Jobs in India—A Dialogue on India’s Employment Data.” Surbhi Ghai provided excellent research assistance.


India’s employment data architecture is in the midst of a massive overhaul. Historically, employment estimates have been generated using household and establishment surveys. The importance of analysing both these sources to get a comprehensive picture of India’s dualistic labour market, characterised by dominance of informal jobs, cannot be overemphasised. Household surveys capture both the organised and unorganised sectors, particularly the self-employed. In that sense, they largely satisfy the requirements of completeness. The quinquennial household surveys on employment and unemployment conducted by the National Sample Survey Office (NSSO), the last of which was conducted in 2011–12, have been the primary source of various labour market indicators since 1972–73. The Labour Bureau since 2009–10 has also started conducting annual household employment and unemployment surveys. Establishment surveys, on the other hand, compile data from worksites and provide a more detailed picture of the industry structure of employment and characteristics of establishments.

The key establishment surveys in India include the Ministry of Statistics and Programme Implementation’s Annual Survey of Industries (ASI) and the NSSO’s “Unincorporated Non-agricultural Enterprises (Excluding Construction).” While the ASI database is an annual one compiling information on the growth, composition and structure of “registered” or formal sector firms,1 the latter is a quinquennial survey which provides data on unregistered firms (own-account enterprises which operate without any hired worker and establishments with fewer than 10 workers).

As much of India’s employment data is generated with a considerable time lag and largely restricted to the organised sector, there has been an increasing clamour for more comprehensive and real-time employment data that can accurately assess the extent of job creation. To address this issue, the government set up a task force (henceforth, TF) to revamp the employment data architecture in May 2017. The report of the TF made some critical recommendations, several of which are now underway. Significant amongst these is the use of administrative data relating to payrolls, social security systems and provident fund for compiling high-frequency real-time information on the labour market. The TF makes the case for exploiting data collected from sources such as Employees’ Provident Fund Organisation (EPFO), Employees’ State Insurance Corporation (ESIC) and the Pension Fund Regulatory and Development Authority (PFRDA) to gauge the extent of formalisation of the labour force. It argues that to the extent enrolment in these databases signifies extension of safety nets to workers, it can be viewed as formalisation of jobs. 

Significant Initiative

Following up on this, the government has started reporting age-wise monthly payroll data for EPFO, ESIC and PFRDA. This is a significant initiative. Payroll databases are an important indicator of the employment scenario of a country and provide a count of formal jobs. The Current Employment Statistics (CES) in the United States (US) produce detailed industry estimates of non-farm employment, hours worked and earnings of workers on payrolls. Each month, CES surveys approximately 1,49,000 businesses and government agencies, representing approximately 6,51,000 individual worksites. The key figure—the change in the total number of non-farm payrolls compared to the previous month—is used to gauge the employment situation.

The interpretation and analysis of the payroll data released in India is, however, not so simple. As noted by the TF, it is an exercise fraught with challenges. First, there is significant overlap and duplication across the various schemes. We need a common identifier across the multiple data sets to avoid double-counting. Attempts are being made to circumvent this problem using the unique Aadhaar ID. While the EPFO and PFRDA data sets have been Aadhaar seeded, the ESIC data has not. This makes it difficult to analyse the latter. Second, these data sources are even more partial in coverage than enterprise surveys like ASI and their frame is likely to be incomplete. For instance, it has been observed in the power loom industry in Solapur that owners have chosen to operate via multiple small units from the same premises under different names, hiring only six to seven workers in each unit to avoid paying Employees’ Provident Fund (EPF) to workers. Third, the enrolment numbers on administrative databases need to be interpreted with caution as new entries do not necessarily reflect new jobs. This can best be explained with the example of the EPFO data set. Between September 2017 and February 2018, 32.7 lakh new additions across all age groups were made to the EPFO subscriber base.2 These additions are interpreted as being reflective of new jobs and it is claimed that 32.7 lakh jobs were created during this period. Such an assumption is flawed. Given that the EPF Act applies to all factories in classes of industry specified in Schedule 1 of the act where 20 or more persons are employed, the addition of even one more worker to an establishment employing 19 persons will result in all the workers of this establishment getting added to the database. While this is a widely acknowledged caveat, there are several other reasons as to why we need to be cautious in drawing such inferences.
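The threshold effect noted above can be made concrete with a minimal sketch. It assumes, purely for illustration, a hypothetical establishment that was not previously covered and that grows from 19 to 20 workers; the function name and the figures are assumptions and are not drawn from the EPFO data, and the coverage rule is simplified to the 20-person threshold alone.

```python
EPF_THRESHOLD = 20  # coverage threshold of the EPF Act as cited in the text


def new_epfo_enrolments(workers_before: int, workers_after: int) -> int:
    """Additions to the subscriber base under a simplified rule: an
    establishment is covered only once it employs EPF_THRESHOLD or more."""
    covered_before = workers_before if workers_before >= EPF_THRESHOLD else 0
    covered_after = workers_after if workers_after >= EPF_THRESHOLD else 0
    return max(covered_after - covered_before, 0)


# Hypothetical establishment: 19 workers, then one more worker is hired.
jobs_created = 20 - 19
enrolments = new_epfo_enrolments(19, 20)
print(jobs_created, enrolments)  # one new job, twenty new enrolments
```

Under these assumptions a single new job shows up as 20 new subscribers, which is why additions to the database cannot be read directly as jobs created.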

The last few years have seen a concerted effort to increase enrolment on the EPFO database. In August 2016, the government launched the Pradhan Mantri Rojgar Protsahan Yojana (PMRPY) to incentivise creation of new jobs in the formal sector. Under this scheme, the government pays the 8.33% Employee Pension Scheme contribution of the employer for all new employees enrolling in EPFO for the first three years of their employment. A new employee is defined as an employee earning less than (and including) ₹15,000 per month, who was not working in any establishment registered with the EPFO in the past and did not have a Universal Account Number prior to 1 April 2016. The objective of this scheme is not only to encourage employers to recruit unemployed persons, but also to bring informal employees into the books. As of 6 March 2018, a total of 26.9 lakh employees were enrolled onto the EPFO database under the PMRPY (Lok Sabha Starred Question No 2161). It would not be unreasonable to assume that a fairly large share of the 32.7 lakh additions reported in the recently released data were under the PMRPY scheme. If this is indeed the case, then these people are earning ₹15,000 or less on a monthly basis and this itself reflects the quality of jobs being created. Further, given that the scheme will be in operation for a period of three years and all new eligible employees will be covered only till 2019–20, the question of whether the programme will simply give a temporary boost to creating formal jobs and whether these jobs will be sustainable in the long run needs to be addressed. It also needs to be noted that the implementation of goods and services tax (GST) has given an impetus to formalisation and pushed firms to register on the EPFO database driving up the enrolment numbers. 

Additionally, having found laxity on the part of the contractors in depositing the EPF of contract workers after claiming huge sums from the principal employers, the EPFO has fixed the responsibility of remittance with the principal employer. This has resulted in increased enrolment of contract workers on the database. Similarly, in the above-mentioned case of power loom units in Solapur, the EPFO has directed that EPF be given to workers in multiple power loom units operating under a single shed, and having the same trademark and logo, but being operated as separate units. It has formed special squads since August 2017 to carry out inspections of power loom units to check on the implementation (Indian Express 2017). Such attempts to increase the outreach of the EPFO database suggest that we need to be more prudent in identifying whether additions to the databases truly represent new formal jobs or are simply capturing better enrolment of existing employed individuals.

In addition to the introduction of payroll data, the TF has also recommended the discontinuation of the NSSO’s quinquennial employment and unemployment surveys. These are in the process of being replaced with a more robust household survey with a quarterly module in urban areas, namely the Periodic Labour Force Survey (PLFS). This move requires us to revisit a key issue plaguing our employment statistics for the last few decades. India’s household surveys produce multiple employment measures: usual status, current weekly status (CWS) and current daily status (CDS). Presently, the debate on employment is anchored around usual status measures. Given that the PLFS will produce annual estimates at the national and rural levels, and quarterly estimates in urban areas, the usual status measures may no longer be the most appropriate to examine.

Being inherently smoother, usual status measures are unable to capture the changes in activity pattern caused by seasonal fluctuations. On the other hand, as noted by the NSSO Employment Unemployment Report (1999–2000), current status measures obtained by adopting the CWS or CDS approaches reflect the overall effect caused by the intermittent changes in the activity pattern during the year. The CDS in fact reflects the changes that take place even during the week. The estimate of the employed based on CDS gives an average daily picture of employment. Thus, employment measures based on the current status approach would be more meaningful for analysing the PLFS results than those based on the usual status approach.
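The difference between counting persons and counting person-days can be illustrated with a short sketch. The three workers and their weekly activity below are hypothetical, and the two functions are deliberately simplified stand-ins for the weekly and daily approaches, not the NSSO’s operational definitions: the weekly measure is assumed to count anyone with any work in the reference week as employed, while the daily measure is assumed to count full and half days worked.

```python
from typing import Dict, List

# Hypothetical reference week: 1.0 = full day worked, 0.5 = half day worked,
# 0.0 = no work. All figures are illustrative assumptions.
week: Dict[str, List[float]] = {
    "regular_worker": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0],
    "casual_worker": [1.0, 0.0, 0.5, 0.0, 0.0, 1.0, 0.0],
    "seasonal_worker": [0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
}


def weekly_employed_share(activity: Dict[str, List[float]]) -> float:
    """Share of persons with any work at all during the reference week."""
    employed = sum(1 for days in activity.values() if any(d > 0 for d in days))
    return employed / len(activity)


def daily_employed_share(activity: Dict[str, List[float]]) -> float:
    """Person-days worked divided by total person-days in the week."""
    worked = sum(sum(days) for days in activity.values())
    total = sum(len(days) for days in activity.values())
    return worked / total


print(weekly_employed_share(week))           # every person counts as employed
print(round(daily_employed_share(week), 2))  # well under half of person-days
```

In this made-up week all three workers count as employed on the weekly measure, yet only 9 of 21 person-days are actually worked, which is the kind of underemployment a person-count measure conceals.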

Gainful Employment

Another compelling reason for examining the current status measure is that India’s challenge is not just one of unemployment, but also of underemployment. A special group set up in the Planning Commission (2000) also suggested that compared to the usual status, the CDS is a better measure to estimate the unemployment gap in terms of absence of gainful employment opportunities. This is because under the usual status measure, a large number of those who are unemployed and underemployed, that is, not having any gainful employment, are in fact declared employed. This results in an overestimation of the level of employment. The International Labour Organization (1990) also notes that the choice of reference period depends on national conditions. Typically, most national labour force surveys choose a weekly reference period. However, in economies such as India where casual and intermittent work, and part-time and temporary jobs are widespread, the daily measure would give a clearer snapshot of the employment scenario.

The TF has also recommended the introduction of a time-use survey and an annual enterprise survey using enterprises registered with the Goods and Services Tax Network (GSTN) as the sample frame. A subset of enterprises would be tracked at higher frequency to produce monthly or quarterly estimates. The absence of a comprehensive sample frame for existing establishment surveys is a well-known problem. It has been noted that the frame for ASI is becoming increasingly patchy as nearly two-thirds of factories employing more than 10 workers do not get registered under the Factories Act (Nagaraj 2018). It is, however, unclear whether the GSTN will in fact provide an appropriate frame to address this issue. A GSTN registration identifies an enterprise paying the indirect tax. A factory is a location-specific, employment-determined identity of a production unit. There need not be any unique one-to-one correspondence between the two. A factory may be selling its output via many enterprises to minimise its tax burden. Conversely, an enterprise may own or operate many factories whose identities could be dubious and used selectively by the entrepreneur to hide or divert profits, inflate costs, or circumvent rules to maximise return on investment for the entire group. Thus, using the enterprise as the unit of analysis to estimate employment may be misleading.
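The lack of a one-to-one correspondence between tax registrations and factories can be sketched as a simple check on a list of links. All identifiers below are invented for illustration, and the function is an assumption about how such a check might be written, not a description of any GSTN or ASI procedure.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Hypothetical (enterprise, factory) links; every identifier is invented.
links: List[Tuple[str, str]] = [
    ("ENTERPRISE_A", "FACTORY_1"),  # FACTORY_1 sells through two enterprises
    ("ENTERPRISE_B", "FACTORY_1"),
    ("ENTERPRISE_C", "FACTORY_2"),  # ENTERPRISE_C operates two factories
    ("ENTERPRISE_C", "FACTORY_3"),
]


def is_one_to_one(pairs: List[Tuple[str, str]]) -> bool:
    """Return True only if every enterprise maps to exactly one factory
    and every factory maps to exactly one enterprise."""
    factories_of: Dict[str, Set[str]] = defaultdict(set)
    enterprises_of: Dict[str, Set[str]] = defaultdict(set)
    for enterprise, factory in pairs:
        factories_of[enterprise].add(factory)
        enterprises_of[factory].add(enterprise)
    return all(len(f) == 1 for f in factories_of.values()) and all(
        len(e) == 1 for e in enterprises_of.values()
    )


print(is_one_to_one(links))
```

Where the check fails, as it does for these made-up links, employment recorded against an enterprise cannot be attributed unambiguously to a production unit.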

The revamp of employment data needs to be viewed against the backdrop of the fact that labour markets in India are dualistic. With 46% of the workforce being self-employed and less than 2% of total establishments hiring 10 or more workers, collecting data on the unorganised sector, though challenging, is of enormous significance. Whilst there is a proposal to have a separate “Annual Survey of Enterprises excluded from GSTN” for own-account enterprises and those with a turnover below ₹2 million, details have not been fleshed out. Much of the TF’s effort to revamp the data has been directed towards the formal sector. In fact, it is interesting to note that the TF draws on global practices of two developed countries (the US and the United Kingdom), where the nature of the employment challenge and, consequently, employment data requirements are substantially different. In the developed world, the employment challenge is one of open unemployment and not disguised unemployment as in India.

While the TF has placed much emphasis on producing high-frequency employment estimates, one does wonder whether the fetish for producing real-time employment statistics is misplaced and what the policy implications of this data would be in the Indian context. In the US, the Bureau of Labor Statistics produces monthly employment statistics that, amongst other things, feed into monetary policy. It is unclear if this will indeed be the case in India, that is, whether these numbers will enable a more informed monetary policy and how effective monetary policy will be in addressing employment challenges such as informality. Given the large amount of resources likely to be expended on generating high-frequency employment data in a diverse and complex economy such as India, we need to step back and analyse what the objective and strategic value of this data is, and what the optimal frequency of employment data should be. Simply producing high-frequency data without providing insights into the quality of employment would serve little purpose.


1 These are firms covered by Sections 2m(i) and 2m(ii) of the 1948 Factories Act, that is, those that use electricity and employ 10 or more workers, and those that do not use electricity but employ 20 or more workers.

2 Given that the EPFO has a large number of accounts receiving no contribution, these numbers correspond to only those accounts which received non-zero contribution.


International Labour Organization (1990): “Surveys of Economically Active Population, Employment, Unemployment and Underemployment: An ILO Manual on Concepts and Methods.”

Indian Express (2017): “Squads Formed to Help Power Loom Workers in Maharashtra Get EPF Benefits,” 17 October,

Kapoor, R (2017): “Waiting for Jobs,” ICRIER Working Paper No 348.

Lok Sabha Starred Question No 2161, 14 March 2018, /245/Au2161.pdf.

Nagaraj, R (2018): “Of ‘Missing Middle’ and Size-based Regulation: A New Frontier in the Labour Market Flexibility Debate,” _Missing_Middle.pdf.

National Sample Survey Office (1999): “NSS Report No 458: Employment and Unemployment Situation in India, 1999–2000,” Ministry of Statistics and Programme Implementation.

Papola, T S (2014): “An Assessment of the Labour Statistics System in India,” International Labour Organization, Country Office for India.

Planning Commission (2000): “Report of the Special Group on Targeting Ten Million Employment Opportunities Per Year over the Tenth Plan Period.”

Task Force on Improving Employment Data (2017): “Report of the Task Force on Improving Employment Data,” NITI Aayog.

Updated On : 8th Oct, 2018

