Empowering India’s Linguistic Diversity through Innovation
Policy Update
Urvashi Singhal
Introduction
India is a multilingual country with 22 official languages and 12 scripts. Only about 5–10% of people in India know English, and the rest are deprived of the benefits of advances in information technology. These benefits can reach the common man only when software tools and human-machine interfaces are available in people’s own languages. The Technology Development for Indian Languages (TDIL) Programme of the Ministry of Electronics & Information Technology, Government of India, is an ongoing research programme with the following mission statement:
- Proliferation of Language Technology
- Research and Development of Language Technology
- Development of Standards related to Language Technology
Vision statement: Digital Unite & Knowledge for All.
The Eighth Schedule of the Constitution of India lists 22 languages that together cover almost 97% of the country's population. According to a study (Uma Maheshwar Rao, 2017), 82.73% of India's GDP (2004-05 at constant prices) comes through the use of Indian languages. The study argues that language planning in India should be based on empirical facts of this kind. The National Education Policy (NEP) 2020 offers a significant and welcome assurance in this regard.
Under NEP 2020, an Indian Institute of Translation and Interpretation (IITI) will be established, and the IITI shall make extensive use of technology to aid its translation and interpretation efforts (NEP 2020, p. 55). In summary, NEP 2020 proposes mother tongue or home language instruction in primary education, and the use of technology to give young children from various linguistic backgrounds access to high-quality learning material in both spoken and written form.
Functioning
Research and development of technology for Indian languages is largely promoted by TDIL. Its main areas of work include:
- Unification of encoding standards: Unicode has become the standard encoding for Indian-language scripts across all language technology standardization efforts.
- The barrier of script: Uniform script conversion across all Indian scripts has become a common resource, making it possible to read and write any Indian language in any script.
- Mobile applications: The use of Indian-language applications on mobile devices has appreciably improved communication.
- Corpora initiatives: Monolingual digital corpora of about 3 million words, along with raw and enriched corpora of various sizes, are available in a number of major and minor languages.
- Lexical databases: Electronic lexica covering major Indian languages and English have been developed as part of various projects.
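The script-conversion point above rests on a property of Unicode: the blocks for the major Indian scripts follow a parallel layout derived from ISCII, so corresponding letters generally sit at the same offset within each 128-code-point block. The following is a minimal sketch in Python of offset-based conversion under that assumption. It is not TDIL's implementation; the names BLOCK_START and convert_script are hypothetical, chosen only for illustration.

```python
# Illustrative sketch only: naive script conversion by Unicode block offset.
# BLOCK_START and convert_script are hypothetical names, not part of any TDIL tool.

# First code point of the Unicode block for each script (each block holds 128 code points).
BLOCK_START = {
    "devanagari": 0x0900,
    "bengali": 0x0980,
    "gurmukhi": 0x0A00,
    "gujarati": 0x0A80,
    "oriya": 0x0B00,
    "tamil": 0x0B80,
    "telugu": 0x0C00,
    "kannada": 0x0C80,
    "malayalam": 0x0D00,
}
BLOCK_SIZE = 0x80

# The danda and double danda (U+0964, U+0965) are shared punctuation across
# these scripts, so they are left unchanged rather than shifted.
SHARED_PUNCTUATION = {0x0964, 0x0965}


def convert_script(text: str, source: str, target: str) -> str:
    """Shift every character in the source script's block to the same
    offset in the target script's block; leave everything else as-is."""
    src = BLOCK_START[source]
    dst = BLOCK_START[target]
    converted = []
    for ch in text:
        cp = ord(ch)
        in_source_block = src <= cp < src + BLOCK_SIZE
        if in_source_block and cp not in SHARED_PUNCTUATION:
            converted.append(chr(cp - src + dst))
        else:
            converted.append(ch)
    return "".join(converted)


if __name__ == "__main__":
    # Convert the word "Bharat" from Devanagari to Telugu and Bengali.
    word = "\u092d\u093e\u0930\u0924"
    print(convert_script(word, "devanagari", "telugu"))
    print(convert_script(word, "devanagari", "bengali"))
```

This approach is deliberately simplistic. Not every letter has a counterpart in every script (Tamil, for example, has far fewer consonants), so a shifted code point may be unassigned in the target block; a practical converter would need script-specific fallback rules on top of the offset mapping.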
Performance
Over the last three decades, TDIL has recorded several notable achievements:
- Machine Translation Systems: TDIL has supported systems like Anuvadaksh and Sampark that translate between Indian languages and from English to Indian languages. These tools have been integrated into platforms like the National Digital Library and e-Gov services.
- Digital Resource Availability: Through initiatives like BharatiyaBhasha and the TDIL Data Centre, large repositories of linguistic resources have been created and shared with researchers and developers.
- Collaborative Projects: TDIL has nurtured a rich ecosystem of collaboration. Institutions such as the IITs, IIITs, C-DAC, and other universities have partnered to create tools that serve both the government and the public.
Impact
- Speech-to-Speech Machine Translation (SSMT) system for major Indian languages: Translation would initially require minimal human involvement, and that involvement would decrease further over time as the machine learns on its own. The system would be adapted to domains within the broad areas of science & technology, education, healthcare, governance, law & justice, etc.
- Text-to-Text Machine Translation system for major Indian languages: As with the SSMT system, translation would initially require minimal human involvement, and that involvement would decrease further over time as the machine learns on its own. The system would be adapted to the same broad domains.
- Deployment of language technology-based applications through more than 100 start-ups.
- National Platform for Language Technology: offerings of linguistic resources and language technology-based services at a competitive cost that will decrease progressively.
- Five-fold increase in content in Indian languages on the Internet.
- A portal for showcasing technology and carrying out assessment of technology.
- Hackathons and grand challenges in the area of language technology.
- Repository of standards, best practices and benchmarks for Indian languages.
Emerging Issues
Despite the progress, several challenges persist:
- Data scarcity for low-resource languages: Many Indian languages and dialects remain underrepresented in digital corpora, hampering model accuracy and functionality.
- Technological lag: Indian-language tools remain behind global benchmarks in accuracy, speed, and coverage, especially for complex tasks like sentiment analysis, question answering, or summarization.
- Limited adoption among citizens.
- Lack of public awareness, leading to underutilisation of the resources and tools developed under TDIL.
Way Forward
A few suggestions that can be taken up to strengthen TDIL are as follows:
- Promote Multilingual EdTech Materials: Create EdTech resources in various Indian languages, particularly for early education, vocational training, and rural areas.
- Policy and Funding Support: Provide stable funding, inter-ministerial coordination, and integration of language technology into national digital plans.
- Support for Endangered Languages and Dialects: Extend the ambit of TDIL to cover not only the 22 scheduled languages but also minor dialects and tribal languages, in order to preserve linguistic heritage.
- Bhashini as a Flagship Platform: Utilize the National Language Translation Mission (Bhashini) as a flagship platform and leverage the legacy of TDIL to develop a shared, citizen-oriented language technology infrastructure.
Conclusion
The Technology Development for Indian Languages programme represents a commendable effort by the Indian government to ensure that no citizen is left behind in the digital revolution because of language barriers. By fostering innovation, creating tools, and building datasets, TDIL has laid a strong foundation for inclusive technology. However, to meet the demands of a rapidly evolving digital society, the programme must embrace new paradigms in AI, expand its collaborative ecosystem, and drive real-world integration. With consistent vision and support, TDIL can be the cornerstone of a truly multilingual digital India, where every language finds voice and value in the nation’s technological future.
References
Uma Maheshwar Rao, G. (2017). Development of Technology for Indian Languages. University of Hyderabad. https://www.education.gov.in/shikshakparv/docs/Umamaheshwar-Rao.pdf
Technology Development for Indian Languages. https://lt4all.elra.info/media/papers/P5/79.pdf
Srivastava, Sunil Kumar. Development of Technology for Indian Languages: Indian Government Initiatives. https://lt4all.elra.info/proceedings/lt4all2019/pdf/2019.lt4all-1.91.pdf
IIT Madras. Indic TTS. https://www.iitm.ac.in/donlab/tts/
Ministry of Electronics and Information Technology (MeitY). (2023). Technology Development for Indian Languages (TDIL). https://www.meity.gov.in/content/technology-development-indian-languages-tdil
About the contributor
Urvashi Singhal is a master’s student at DTU, simultaneously pursuing actuarial science. She is currently working as a research intern on an ICSSR project focused on menstrual leave policy.
Acknowledgment
The author sincerely thanks the IMPRI team for their valuable support.
Disclaimer: All views expressed in the article belong solely to the author and not necessarily to the organisation.