Indian languages

The government of India recognizes 112 mother tongues that have more than 10,000 speakers each, and by one count India has a total of 1,652 different languages and dialects. The study of language, linguistics, has been developing into a science since the early grammatical descriptions of particular languages, such as Panini's grammar of Sanskrit, composed roughly 2,500 years ago. Sanskrit is one of the oldest attested languages in the world; its earliest form, Vedic Sanskrit, is generally dated to the second millennium BCE, well before the earliest surviving Latin texts.

History of Indian languages

The history of the Indic (Indo-Aryan) branch of languages is often divided into three main stages:

  1. Old, comprising Vedic and classical Sanskrit;
  2. Middle (from about the 3rd century BCE), which embraces the vernacular dialects of Sanskrit called Prakrits, including Pali; and
  3. New (from about the 10th century CE), which comprises the modern languages of the northern and central portions of the Indian subcontinent.

Vedic Sanskrit

Rig Vedic Sanskrit is one of the oldest attested members of the Indo-European language family. The encounter of early European scholars in India with Sanskrit led to the development of comparative philology. Scholars of the 18th century were struck by the far-reaching similarity of Sanskrit, in both grammar and vocabulary, to the classical languages of Europe.

Vedic Sanskrit, the language used in the Vedas, the sacred Hindu scriptures, is the earliest form of Sanskrit. A later variety of the language, classical Sanskrit (from about 500 BCE), was a language of literary and technical works. It is still widely studied in India today and functions as a sacred and learned language.

In 1786, Sir William Jones observed: “The Sanskrit language, whatever be its antiquity, is of a wonderful structure; more perfect than the Greek, more copious than the Latin, and more exquisitely refined than either, yet bearing to both of them a stronger affinity, both in the roots of verbs and the forms of grammar, than could possibly have been produced by accident; so strong indeed, that no philologer could examine them all three, without believing them to have sprung from some common source, which, perhaps, no longer exists; there is a similar reason, though not quite so forcible, for supposing that both the Gothic and the Celtic, though blended with a very different idiom, had the same origin with the Sanskrit…”

The Middle Prakrits

Prakrit existed in many regional varieties, which eventually developed literatures of their own. Pali, the language of the Buddhist canonical writings, is the oldest literary Prakrit. It remains in literary use in Sri Lanka, Myanmar (formerly known as Burma), and Thailand.

The Prakrits continued in everyday use until about the 12th century CE, but by about the 10th century the modern Indo-Aryan vernaculars had already begun to develop. The number of languages spoken in India today is difficult to specify. Roughly 35 are of some significance, particularly Hindi, Urdu, Bengali, Bihari, Gujarati, Kannada, Malayalam, Marathi, Oriya, Punjabi, Rajasthani, Tamil, and Telugu, each of which has at least 10 million speakers.

New or modern

Two major varieties of Hindi are spoken: Western Hindi, which originated in the area around Delhi and is the basis of Hindustani, and Eastern Hindi, which is spoken mainly in central Uttar Pradesh and eastern Madhya Pradesh and whose most important literary works are in the Awadhi dialect. The dialect that has been chosen as India’s official language is Khariboli, written in the Devanagari script. Other dialects of Hindi are Brajbhasa, Bundeli, Awadhi, Marwari and Bihari (Bhojpuri, Maithili and Magadhi).

Despite their separate names, Hindi and Urdu are slightly different varieties of the same language. Hindi vocabulary derives mainly from Sanskrit, while Urdu contains many words of Persian and Arabic origin; Hindi is written in the Devanagari script, and Urdu in a Perso-Arabic script. Hindi is spoken mainly by Hindus; Urdu is used predominantly by Muslims.

Bengali is spoken in West Bengal and by almost the entire population of Bangladesh. It is descended from Sanskrit and has the most extensive literature of any modern Indian language. Oriya, Bengali and Assamese are considered to be sister languages.

Punjabi is spoken in the Punjab, a region covering parts of northwestern India and eastern Pakistan. The sacred teachings of the Sikh religion are recorded in Punjabi in the Gurmukhi script, which was devised by a Sikh guru.

Sindhi is an offshoot of some of the dialects of Vedic Sanskrit. Sindh was often the first region to bear the onslaught of successive invaders, and Sindhi thus absorbed words from Hindi, Persian, Arabic, Turkish, English and even Portuguese. Sindh is where Persian and Indian cultures blended.

Other significant Indic languages include Sinhalese, the official language of Sri Lanka, and Romani, the language of the Roma (Gypsies), which originated in India and spread throughout the world. The Indic origin of Romani is apparent in its sounds and grammar.

The origin of most scripts for the Indic languages can ultimately be traced to Brahmi. Devanagari, a development of Brahmi, is used for Nepali and Marathi, and by Kashmiri Hindus for Kashmiri, as well as for Hindi, Sanskrit, and the Prakrits. Gujarati, Bengali, Assamese, and Oriya all have individual writing systems related to Devanagari.

Dravidian Languages

About 23 Dravidian languages are spoken by an estimated 169 million people, mainly in southern India. The four major Dravidian tongues are recognized as official state languages: Tamil in Tamil Nadu, Telugu in Andhra Pradesh, Kannada (Kanarese) in Karnataka (formerly Mysore), and Malayalam in Kerala. They have long literary histories and are written in their own scripts. Telugu is spoken by the largest number of people; Tamil has the richest literature, is thought to be extremely ancient, and is spoken over the widest area, including northern and eastern Sri Lanka. Other Dravidian languages have fewer speakers and are, for the most part, not written. The Dravidian languages have acquired many loan words from the Indic languages, especially from Sanskrit. Conversely, the Indic languages have borrowed Dravidian sounds and grammatical structures.

Official Languages of India

Hindi and English are the two official languages of India's central government. In addition, the Indian constitution recognizes 18 state languages, which are used in schools and in official transactions. These are Assamese, Bengali, Gujarati, Hindi, Kannada (Kanarese), Kashmiri, Konkani, Malayalam, Meithei (Manipuri), Marathi, Nepali, Oriya, Punjabi, Sanskrit, Sindhi, Telugu, Tamil, and Urdu. The regional languages have been recognized as the official languages of the states. In many cases, state boundaries are drawn along linguistic lines.

Other Language Groups

The 12 or so Munda languages are spoken by people in scattered pockets of northeastern and central India. Of these, Santali is the most important, having the largest number of speakers and being the only Munda tongue that is written. Like the Dravidian languages, the Munda languages are known to have existed in India before the southward migration of people from the Indus valley.

Linguists consider the Munda languages to be related to the Mon-Khmer languages of Southeast Asia in a larger grouping called the Austro-Asiatic family. One Mon-Khmer language, Khasi, is spoken within India, in Meghalaya (formerly part of Assam). A few Sino-Tibetan languages are also spoken along India’s borders, from Tibet to Myanmar.

List of languages by total number of speakers, 2014

Source: SIL (Summer Institute of Linguistics) International, Dallas, TX, USA

All figures are in millions of speakers. Blank cells are missing in the source.

Language      L1 speakers   L2 speakers   Total
Mandarin      850           500           1350
English       400           800           1200
Spanish       470           80            550
Hindi         250           250           500
Arabic        340           50            390
French        80            194           274
Malay         70            200           270
Russian       180           80            260
Portuguese    250           5             255
Bengali       250           250
Urdu          15            160           175
German        95            50            145
Japanese      122           1             123
Javanese      84            NA
Telugu        74            5
Wu            77            NA
Korean        77            NA
Tamil         80            72
Marathi       72            3
Turkish       71            0.4
Vietnamese    68            NA
Italian       64
W. Punjabi    63            NA
Yue           62            NA
Kannada       60            4.5