Arjun Kumar, P C Mohanan
The daunting challenges concerning data posed by the CAA-NPR-NRC debate have been exacerbated by the COVID-19 pandemic and the recession. The reluctance of the government to release official data needs to be resolved urgently. As India unlocks, postponing the NPR and conducting only the first phase of the 2021 census work would be a logical step to ensure that the census does not suffer from the limited window during which the entire work is expected to be completed.
In light of the prevailing COVID-19 crisis, the Government of India has deferred the National Population Register (NPR) update and the first phase of census 2021, which was scheduled to start on 1 April 2020. Given the unrelenting spread of the novel coronavirus, it is impractical and inhumane to send census enumerators into the field to collect and validate information from households. Generally, statistical exercises take a backseat during calamities.
In fact, the conduct of census 2021 and the NPR had been under considerable debate and criticism over the past year, primarily owing to the Citizenship (Amendment) Act (CAA) enacted on 11 December 2019. These became intertwined with the communalization of the National Register of Citizens (NRC) issue, over which protests swept across the country.
Recent years have seen a massive upsurge in demand for official data, for instance, socioeconomic indicators including those supporting the Sustainable Development Goals (SDGs), censuses, and large-scale national surveys. This phenomenal rise in the need for data is mainly due to the availability of high-volume data in various forms, owing to the use of mobile communications, increased social media interactions, and digitally enabled transactions. The use of, and dependence on, conventional data generated by government agencies for public use has also become widespread.
Unprecedented Deferment of Census
The census operation in India, governed by the Census Act, 1948, is one of the largest administrative exercises undertaken anywhere in the world. The 2011 census was the 15th in an unbroken chain of decennial censuses beginning in 1872 and the 7th since India’s independence. The only exceptions have been the withdrawal of operations in the state of Assam during 1981 and in the state of Jammu and Kashmir during 1991, both due to law and order problems.
The actual census now comprises two phases. First, the house listing operations, wherein all houses (structures), irrespective of use, are identified and listed, and information on houses, household amenities, and assets is recorded. Second, population enumeration, wherein more detailed information on each individual residing in the houses is collected and validated. The first phase takes about six months and the second is completed in a short period of three weeks to ensure more accurate counting of people.
Unlike surveys, the census asks only a limited number of questions (the house listing form for census 2021 has 31 questions), usually on a single page or a few pages. Census data are protected to ensure the confidentiality of information furnished by respondents, and individual information is not revealed for any purpose other than aggregation to administrative domains. A novelty of the 2021 census operations is the introduction of handheld devices for recording data from households and individuals.
These devices are to be used for collecting new information as well as for updating the NPR. This is in keeping with the practice of shifting most data collection operations to electronic devices, which allows prompt scrutiny and transfer of the data to secure centralized servers. It is hoped that this will take care of the huge delays in the publication of census results, a feature that has proved to be a major bane of all previous Indian censuses.
Politics of Data and Trust Deficit
Controversy over the 2021 census and the updating of the NPR along with the census house listing operations does not come as a surprise, given the government’s open commitment to the NRC. Those opposing the joint exercise have added fuel to the debate and urged people to give false names like “Ranga Billa”, with slogans of “kaagaz nahin dikhaenge” (will not show the papers). Apart from these, what worries many researchers is that this joint operation could witness instances of violence against the enumerators and ultimately compromise the quality of census data.
In a recent article in Economic and Political Weekly, ‘Citizenship (Amendment) Act: How do we move forward?’ (March 21, 2020), Kundu and Mohanan argued the need to go beyond politics to address various concerns, including technical and operational questions raised by the linking of the census and the NPR. The article points out that the population census with which the NPR is tagged suffers from a serious coverage error, missing 2.3% of the population, as determined through the post-enumeration checks in 2001 and 2011.
The NPR tagged to the house-listing schedule in 2011 prepared the electronic database for only 1,180 million persons, against the census figure of 1,210 million. Considering the undercount of another 30 million in the census itself, it is evident that 60 million people were missed in this NPR exercise. Given the present level of decadal migration, roughly 12% of the people would not be found in the place where they were enumerated in 2011.
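The arithmetic behind these figures can be laid out as a short sketch. It uses only the numbers quoted above, in millions; the function name is hypothetical, and applying the 12% migration share to the 2011 census count is an illustrative assumption rather than a calculation taken from the article.

```python
# All figures are in millions and come from the paragraph above.
CENSUS_COUNT_2011 = 1210        # census population figure
NPR_RECORDS_2011 = 1180         # persons in the NPR electronic database
CENSUS_UNDERCOUNT = 30          # people the census itself is estimated to have missed
DECADAL_MIGRATION_SHARE = 0.12  # rough share no longer at their 2011 place of enumeration


def npr_shortfall(census_count, npr_records, census_undercount):
    """People missing from the NPR: its gap against the census plus the census's own undercount."""
    gap_against_census = census_count - npr_records
    return gap_against_census + census_undercount


missed = npr_shortfall(CENSUS_COUNT_2011, NPR_RECORDS_2011, CENSUS_UNDERCOUNT)

# Assumption for illustration: the 12% share is applied to the 2011 census count.
moved = DECADAL_MIGRATION_SHARE * CENSUS_COUNT_2011

print(f"Missed by the NPR: about {missed} million")
print(f"Likely moved since 2011: about {moved:.0f} million")
```

Under these assumptions, the number of people who have likely moved is more than twice the number the NPR missed outright, which is why an update anchored to 2011 addresses is problematic.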
An important question then persists about the utility of NPR data collected at enormous physical and financial cost, when Aadhaar-based biometric identification numbers, which have higher coverage and are linked to almost all welfare programs, are available. The NPR is certainly not required for targeting purposes. The utility of the NPR database is therefore yet to be established for any stated purpose.
The large-scale exclusion that is inevitable in the NPR exercise will therefore be a nightmare for all usual residents of the country. Given that the government has not opened the existing NPR to public scrutiny, any statement on its accuracy remains a matter of conjecture.
The apprehension that the NPR would be the base document for the NRC would compromise the reliability of the 2021 census data. Recent migrants, who are likely to have difficulties in obtaining identity or address proofs at their current place of residence, would not identify themselves as migrants. There have been suggestions to delink the NPR exercise from the census operations. In that case, concerns are being raised about the loss of the economies of scale achieved in a combined exercise.
Amid the current challenges in dealing with the COVID-19 pandemic, the government has justifiably postponed both the census and NPR operations. Coupled with other controversies relating to major government data, this decade looks disastrous and warrants an urgent response to uphold the integrity and credibility of India’s statistical institutions, which have hitherto enjoyed worldwide respect and appreciation.
Political Economy of Official Data
Experience with official data in recent years suggests that the government itself does not accept or believe in various official statistics, citing the failure of the institutions to produce credible reports on grounds of method, comparability and, interestingly, consistency with the macro story.
Obviously, the findings from these official statistics portray a grim picture of the country and become an embarrassment for the government. There is no sector, whether employment, consumption, or sanitation, whose data has been left untouched by questions of credibility, raised especially by the Government, NITI Aayog, and departments.
When blamed for the declining growth of the economy, the government has on several occasions produced counter-claims from non-traditional indicators and statistics such as MUDRA, Foreign Direct Investment, Jan Dhan Yojana, Goods and Services Tax (GST), EPFO, and so on. Although concern over the unavailability and quality of key statistics on employment and unemployment, the unorganized sector, migration, and so on has been around for several years, not much has been done to fix this lacuna. This demonstrates official complacency about building a robust statistical system.
Despite the criticisms, there can be no denying that official economic statistics remain the most valued, appropriate, rich, and reliable source for informed research and policymaking. Nonetheless, these surveys do not give estimates at the district level and are therefore not used by implementing agencies on the ground. Experts and researchers have unequivocally demanded that, irrespective of the methods and results, the data and reports must be released so that further analysis can be carried out for course correction.
For statisticians, experts, and the entire system involved in the elaborate official process of sample design, data collection, and reporting, the new condition of making the results congruent with the macro story by matching the numbers has become the norm. This has inevitably led to the concealment and postponement of reports, which are often released with multiple disclaimers that render them virtually unusable and incomparable.
This has been further amplified by a blatant reluctance to share, and the concealment and withholding of, administrative data such as dashboards of program MIS data (physical and financial progress, monthly reporting). These are often managed by big private consultants acting as project monitoring units, which bring a “corporate style target achievement attitude” to all the departments and ministries. This trend has persisted for the past few years and has dwarfed the possibility of detailed analysis or research on these data. In the absence of credible statistics, the recent slowdown of the Indian economy cannot be confidently attributed to business, seasonal, structural, natural, or cyclical factors.
The bureaucratic machinery’s callousness in failing to acknowledge the real reasons behind the slowdown flattens growth prospects, fuels speculation, and erodes trust and confidence in the economy and the government.
The era of Industry 4.0, AI, blockchain, and the gig economy has further accentuated the need for new forms of data. In a business-as-usual scenario, it will be difficult to adapt to the needs of changing methodology. This is because information and communication technology (ICT) will be extensively used in the days to come, requiring new sources of data such as payroll data, night lights data, GIS data, mobile phone data, and big data. This calls for an urgent revamp of the entire statistical system to align it with the objectives of “Digital India”.
Amid the gloomy statistical situation, it is important to highlight the commendable contribution of the Ministry of Statistics and Programme Implementation (MoSPI) in producing the Swachh Survekshan 2016 rapid assessment survey to understand the impact of the Swachh Bharat Mission (SBM). This demonstrates the caliber and capacity to produce robust data and the implicit potential of an empowered statistical architecture. While several outcomes of the Swachh Survekshan can be contested, the timeliness with which it sought to understand the implications of the government scheme cannot be underrated. When MoSPI exhibits such a proactive approach, its program implementation arm demonstrates vibrancy and dynamism.
One of the key initiatives in reforming the Indian statistical system was the setting up of the Rangarajan Commission in the year 2000. Though this now looks like part of history, some of the outcomes of this report are reasonably fresh. In its report submitted to the government in 2001, the Commission had recommended the establishment of a permanent National Statistical Commission (NSC).
The NSC would be responsible for formulating policies, priorities, and standards on statistical matters. However, in the absence of any legislative framework, the NSC has faced challenges in bringing effective reforms. Considering this, the present government prepared the Draft National Statistical Commission (NSC) Bill 2019 to adhere to the Rangarajan Commission’s recommendations in totality, which can be disputed.
The period between the establishment of the NSC and the Draft NSC Bill 2019 has witnessed conflicts between the NSC and the office of the Chief Statistician of India (CSI), who, apart from being the secretary to the MoSPI, is also the secretary to the NSC.
Exacerbated by a leadership crisis, the declining credibility of the Indian statistical system has rendered several reform proposals moribund. The peak of this conflagration was witnessed in recent years, as the government influenced the publication of various NSSO reports and junked the findings citing methodological concerns.
Prior to this, in 2018, the MoSPI had invited public comments on the Draft Policy: National Policy on Official Statistics. News that this policy may be announced soon is also doing the rounds. MoSPI spearheaded a Five-Year Vision 2019–2024 for the Transformation of the National Statistical System and highlighted that official statistics are a public good and an essential part of the development architecture of India. The vision envisages extensive use of digital technology to provide holistic and coherent data on a real-time basis, and commits to addressing the existing institutional, organizational, and technical challenges for policy and to stronger dissemination practices for the public.
The NITI Aayog released a National Data and Analytics Platform (NDAP) Vision Document in January 2020. Proposed to be launched in 2021, it is a flagship initiative that lays down the aim of democratizing access to public government data through a world-class user experience by standardizing data, providing flexible analytics, and ensuring ease of accessibility of data.
There have been several initiatives in the recent past to organize hackathons and competitions to crack big data, and so on. However, in doing so, the enormity of the purpose of making statistical architecture more open and credible must not be forgotten. This is because official data is used primarily by researchers and policymakers, and hence it becomes important to instill trust and confidence between the two so that experts/specialists committed to the cause of harnessing official data for the greater good are effectively utilized by the government.
Amid the challenges posed by COVID-19, India must take a cue from the experience of many countries in tackling data-related exigencies (with similar concerns over citizens’ registries). One such country is the United States (US), where Census 2020 is currently in process and the census operations have been made completely online, catering to dynamic and responsive needs, ensuring accountability and credibility, and, most importantly, making the exercise open, inclusive, and participatory for citizens in these tough times. No wonder the census exercise in the US looks more like a celebration on social media, where citizens are proudly enrolling and actively participating, whereas in India the whole exercise displays apprehensions and a policy vacuum.
The census operation in the US is almost over, and some extra time has been allotted to the Census Bureau, owing to the COVID-19 pandemic, for fixing various errors and ensuring data quality. China, having brought the pandemic under effective control, has started its decadal Census 2020 on time to enumerate the most populous country in the world.
There is no scope to lose any further time. The existing initiatives by the Government of India to harness ICT, such as Pro-Active Governance and Timely Implementation (PRAGATI), Digital India, and the JAM Trinity, which use MIS and dashboards, tax data, and so on, need to be leveraged to undertake the historic exercise of census 2021 and to simultaneously restore the credibility of the statistical institutions.
We contend that the current pandemic presents an enormous opportunity to utilize real-time information and data to combat and contain the coronavirus. The use of the Internet of Things (IoT), phones, ICT, location data, and so on, along with the potential of aggregated data and cutting-edge technology for contact tracing and surveillance, has been paramount in combating the spread of the coronavirus and enforcing the necessary social distancing and isolation, as well as in enabling the functioning of health care delivery, the economy, society, and welfare.
Understandably, civil society and citizens who were skeptical because of the enormous individual privacy (rights) concerns associated with the role of the state and private sector in the use of such data now welcome the adoption of these measures as early as possible by the government and tech companies. During this time of crisis, the use of technology and data (harmonized from various sources: medical, location data, contact tracing, administrative data, mobility, phone, social media) has become more of a life support, especially in the lockdown, which has serious implications for the fight against the coronavirus.
For now, the fallout of continued state control over data and surveillance remains a valid yet distant worry. Nevertheless, the urgent requirement of mandatory surveillance of each citizen in the unlock phase, to enforce social distancing and isolation, to trace and treat the virus, and to ensure delivery of essential welfare including health and medical facilities, warrants a full-fledged, handholding digital literacy program under Digital India. This would help put the pandemic-induced work from home and additional personal time to use for the benefit of each citizen.
We are certain that each citizen, community, private sector, and civil society will provide enormous constructive support. Thus, universal enumeration and registry (big data and real-time statistics) exercise by the government during the pandemic will be worth attempting (the Aadhaar experience being bitter and unsatisfactory). In fact, the time for ensuring universal digital literacy, learning, and application is now or never. Such a universal digital India push, in a post-COVID-19 crisis situation, can lay the foundation of real-time data and modern architecture envisaged for “New India.”
The National Digital Health Mission (NDHM) was announced by the Prime Minister on Independence Day, August 15, 2020, to tackle the pandemic and promote the well-being of citizens, harnessing digital and health technology towards #AtmaNirbharBharat. Under the NDHM, a Unique Health ID will be provided to every citizen, holding details of diseases, diagnoses, reports, medication, etc., in a common database through a single ID. However, the access, experience, and effectiveness of the Ayushman Bharat and Pradhan Mantri Jan Arogya Yojana (PMJAY) scheme in the delivery of health services have been limited, and therefore the timely and at-scale implementation of the NDHM has to be monitored and acted upon.
The Aarogya Setu mobile app, an open-source COVID-19 contact tracing, syndromic mapping, and self-assessment digital service developed by the National Informatics Centre (NIC), was launched by the government in April 2020 and now has more than 158 million users (as of September 2020).
With the Indian statistical system being severely underfunded and understaffed, large investments in financial and human resources are urgently needed to empower it. In the fight against the coronavirus, through the national lockdown, the unlock phases, and containment-zone lockdowns, the pandemic has taught lessons on the importance of data and statistical systems and on the need for a robust ecosystem for innovation and accountability.
In the absence of important statistics or an updated registry, the most vulnerable and excluded sections, such as the floating migrant population, workers in the unorganized sector, and the underemployed and unemployed, who are most affected by the pandemic, are excluded from any support from the government. A real-time and dynamic database or registry of unemployed, migrant, and informal sector workers would help immensely in tackling the crisis in the medium term. In this regard, the National Migrant Information System (NMIS), a central online repository on migrant workers, has been developed by the National Disaster Management Authority (NDMA) to facilitate their seamless movement across states.
Further, various state governments are mapping migrant workers’ skills and linking them with industry requirements through online job portals, for possible re-skilling and to harness the opportunity of ICT for employment matching. However, during the winter session of the parliament, the government reported that it has no data on migrants, though it acknowledges that more than 10 million migrant workers took shelter in various camps and shelter homes run by the government, civil society organizations, and private employers. The government’s indicative estimate of the number of migrant workers in the country, as reported several times, is about 80 million, which is considerably higher than the estimates of several researchers.
The census is one exercise that provides migration data at the district and town levels. India does not have any proper intercensal estimates of the population except the population projections attempted by experts assembled by the Registrar General of India, a job earlier done by the Planning Commission. Organizing an intercensal survey as done in many countries can mitigate this deficiency in the Indian data system to a large extent.
Since there are several changes and reform measures being undertaken in the country, and as the country is progressing towards the SDGs, we propose holding the census every five years instead of the present decadal system. It will keep the information database updated and vibrant. Informed policy planning and decision-making can be ensured through this change.
The road to improving the credibility of official statistics will be long, and much depends on effective leadership. Disentangling the overlapping functions of various statistical bodies, creating coherence and cohesion in their duties and responsibilities, and empowering the NSC are the best steps forward to ensure credible, accessible, and legible statistics.
The system of official statistics in India now faces the challenge of adapting to ICT at par with global standards, and therefore, the emphasis must be on strengthening our statistical institutions to enable evidence-based policymaking, planning, and most importantly implementation. This will go a long way in realizing the vision of “New India” and a US$ 5 trillion economy.
The daunting challenges concerning data posed by the CAA-NPR-NRC debate have been exacerbated by the COVID-19 pandemic and the reluctance of the government to release official data. This must be resolved urgently.
Further, postponement of the first phase of the 2021 census operations and the data collection for preparing the NPR indefinitely is sure to upset the timelines for the population count in March 2021. This is because a lot of intervening activities are required to be performed after the house listing before the population enumeration.
The postponement of NPR and conducting the first phase of the 2021 census work will be a logical step to ensure that the census does not suffer from the limited window during which the entire work is expected to be completed.
Lastly, to seize this opportunity to become the “Vishwa Guru”, the NSC, comprising eminent professionals from various fields of practice, should be established as an independent and dynamic body to ensure the availability of credible, accessible, and legible statistics to empower evidence-based policy and impact.
Disastrous Decade for Data