ISSN (Print) - 0012-9976 | ISSN (Online) - 2349-8846

Life to Indian Languages

A Linguist Responds to Javed Majeed’s Study of Grierson’s Linguistic Survey of India

Ayesha Kidwai teaches at the Centre for Linguistics at Jawaharlal Nehru University.

This paper presents a linguist’s response to the main themes that run through Majeed’s (2019) comprehensive and thought-provoking two-volume study of the Linguistic Survey of India and its Editor, George A Grierson. It argues that an important source of the complexity of Grierson’s subject position and of the intellectual ambiguities in the LSI lies in the fact that the LSI is an unprecedented exercise in modern linguistics. Proffering a reading of the LSI’s methodology, the paper explores the ways in which linguists may participate in an interdisciplinary recovery of this important historical exercise and its afterlife.

Buried in a note on the language Lahnda to the first statement in the 2011 Census of India’s chapter on Indian languages is a mention of Grierson’s Linguistic Survey of India (1895–1928) (henceforth, LSI), but the solitariness of this reference belies its centrality to over a century of the decennial census operations in India. Since the 1901 Census, this tome of 11 volumes in 21 parts has been the touchstone by which language names returned by the census enumerators are rationalised, grouped, and classified, before politico-legal constructs like Scheduled Languages are used to make distinctions between “Languages” and “Other Mother Tongues.”

The LSI has had a long afterlife in other respects too. Lauded by linguists for its ambitious and scholarly achievements and referenced by historians of colonial language policy, the LSI is also still occasionally cited by citizens’ groups, usually by way of proof of claims to the autonomy and/or existence of a particular language. Yet, until Javed Majeed’s two-volume tour de force on George A Grierson and his LSI, comprising Nation and Region in Grierson’s Linguistic Survey of India (2018a) and Colonialism and Knowledge in Grierson’s Linguistic Survey of India (2018b), there has been little of depth in terms of a study of the survey itself, either by linguists or by historians.

This paper responds to the main themes that run through Majeed’s study (which deserves the appellation of “monumental” as much as the survey it studies does) by charting out the ways in which linguists may contribute to its study, as well as proffering some thoughts on how the postcolonial state’s refusal to critically engage with the LSI has proved harmful to the development of India’s languages.

The Colonial State and the LSI

The LSI began its formal operations in 1894, three years after Grierson submitted to the Home Department a proposal for how the survey, intended to be “a collection of specimens of every language and dialect spoken in India,” was to be conducted. Over the next three decades, the LSI collated information about 723 speech varieties spoken in South Asia, with detailed lexical and/or grammatical information for 368 of them, as well as gramophone recordings of 97. Although the coverage of the LSI fell far short of its intended goal (neither the Madras Presidency nor Burma was covered, and it also failed to collect specimens for a number of speech varieties elsewhere), the LSI remains an exercise without parallel in either its time or any other.

In general, the colonial state’s investment in the LSI derived ultimately from what Goswami (2004) has characterised as the production and perpetuation of a “modern colonial state-space” in the post-1857 era. An important strategy in the creation of this colonial state-space was the use of state-generated classificatory schemes, constructed by a “rule of difference” that mandated the “continual performance of the nonidentity between the state and its colonised subjects” (Goswami 2004: 64). The colonial state endorsed the LSI primarily because language was a parameter by which it classified “nationality.” It expected the LSI’s results to dovetail with the findings of its census with respect to the naming, territorialisation, and enumeration of Indian languages/dialects. The LSI would therefore bring clarity to a long-standing cause of colonial bewilderment: ever since the first attempt at a decennial Census in 1870–71, it had become apparent that the “natives” could not simply name the language(s) they spoke. In response to what was a simple question about the language ordinarily spoken in the household, census enumerators were provided hundreds of names, many of which were not language names at all. As J A Baines, Commissioner of the 1891 Census, observed:

The first impulse, in many cases, is to return the name of the caste as that of the language. For example, the potter gives ‘potterish,’ the tanner ‘tannerish’ or the weaver ‘weaverish’ as his mother-tongue, especially if he be either a member of a large caste or a stranger to the locality where he is being enumerated. (1893: 130)

Grierson himself, however, cited more than this one objective for the LSI. As Majeed (2018b: Ch 1) notes, he gave varying emphases to what they were on distinct occasions. On some, the LSI’s usefulness was cited as providing a manual for officials and magistrates; on others, it was presented as an academic reference work for language scholars and a transcontinental educated public. The LSI was also championed as pedagogically useful, as it could be used to measure levels of officials’ competence in Indian languages. In fact, one of the motivations for the Gramophone or Phonetic Survey (Majeed 2018b: Ch 5) was that it would buttress a colonial auditory order by training officials to speak Indian languages correctly.

That the colonial state was interested in only one of what its editor thought the LSI’s objectives to be is but one signifier of the “loose and flexible” relationship between it and the LSI. Majeed demonstrates that the LSI was neither completely independent of the census nor a subordinate division of it: Grierson’s manoeuvring of the project carved out a distinct space for the LSI. Even as its editor was left free to operate in “the grey area between officialdom and ‘private’ scholarship” (Majeed 2018b: 69), the LSI retained the imprimatur of colonial government right up to its culmination.

When the LSI was initiated in 1895 (Majeed 2018b: Ch 2), Grierson was not even officially associated with it and was given permission to work on it only in the time he could spare from his duties as the Opium Agent of Bihar. The budget allocated to the LSI was a paltry ₹2,000 per annum for three years when by Grierson’s own estimates the survey demanded a total budget of ₹3.5 lakh plus expenses. To carry out what Grierson knew to be a “thousand-men job” (Majeed 2018b: 55), not one civil servant was put in charge of the LSI, which never had a fixed office of its own, and its work of specimen collection was simply added to the routine duties of district officials and political agents. When Grierson made a specific request in 1896–97 to be designated as “officer on special duty” to the LSI, it was summarily turned down.

It was only in 1898, when Grierson declared that his failing health required him to return to England and abandon all association with the LSI, that the India Office gave in to his requests to be assigned to the LSI full-time. From 1899 until his retirement in 1903, Grierson worked on the survey from England, subsisting on his pension and a meagre annual allowance of £400 for expenses (reduced after his retirement to £300). Nevertheless, this thrust and parry with the colonial bureaucracy did not affect Grierson’s involvement with the LSI or ever lead to a rupture of his ties with the government or the census, whose cooperation, he knew, was crucial if the data were to be collected and the LSI’s results were to see the light of day.

The survey formally began in 1895 and had three stages: First, in order to compile lists of the known languages, Grierson sent out forms to government officials for information about names and estimates of speaker strength. Using this information, in the second stage, officials were tasked with gathering three distinct types of specimens, so as to assemble a comparable corpus across all the languages surveyed: one specimen translation of the Parable of the Prodigal Son,1 a second of free narrative text, and a third, translations of a schedule questionnaire of words and sentences. The last stage was the classification and analysis of specimens, the writing up of skeleton grammars, and finally, the composition and printing of the volumes.

Of these three stages, the first two were completed while Grierson was in India. The bulk of the specimens, many thousands of them, came in within three years, by 1900. The editing of the specimens, which began in 1898, and the preparation of the LSI’s volumes for publication are what largely consumed the nearly three decades that followed. That this language documentation exercise, perhaps the first and most massive of its kind, was completed at all was down to just one man: its editor, George A Grierson.

The Editor of the LSI

One of the greatest achievements of Majeed’s study is the portrait that he sketches of Grierson, whose persona and single-minded pursuit of the goals he set for the LSI were critical in determining its scope and character. Majeed’s comparison of the LSI to a planned but failed Linguistic Survey of Burma (2018b: Ch 2) is illuminating because the failure of the Burmese Survey to even get off the ground ultimately boiled down to one factor alone: it lacked a Grierson who could “cultivate his aura and his public relations skills ... [to] manage his complex position and his persona as an amalgam of semi-official colonial servant in retirement while on ‘Special Duty,’ survey superintendent, academic philologist and author in his own right” (2018b: 61), often assuming the role of a “go-between” and “knowledge broker” (2018b: 94) for Indian languages and the categories of grammatical thought.

While Grierson’s semi-detached status accorded him a great degree of latitude and autonomy, it also presented several challenges. The formal status of official designation aside, Majeed (2018b: 55) observes that if Grierson was to be assured of the continued backing of the colonial state for getting the LSI’s work done by its officials, he would have to “create an aura around his name.” That this stature was achieved is evident from the exceedingly broad network of professional relations in which Grierson embedded himself. He was the authority about Indian languages and dialects for the 1901 Census, authoring its chapter on languages. Even though the LSI grew apart from the census thereafter, Grierson and census officials remained in loose collaboration through the subsequent two censuses, with Grierson incorporating census information and maps into the LSI’s volumes, and the censuses making a sincere (but often unsuccessful) effort to map the LSI’s classifications on to their own enumerations (Majeed 2018b: 56–60).

Grierson’s “aura” as an authority on Indian languages was enhanced by his own prolific scholarship. Having widely published on the grammatical properties, translations, and literatures of Indian languages even before the LSI, Grierson was highly regarded by British and international scholars. He was already a member of various learned societies like the Asiatic Society, the British Academy, the Congress of Orientalists, the Society for Authors, to name just a few; in fact, the original proposal regarding the conduct of the LSI was first formally tabled at the Vienna Congress of Orientalists in 1886.

The remarkable fact that Majeed foregrounds is that Grierson’s own sense of scholarly community was far from restricted to British/European Sanskritists/Aryan, but included Indian and Asian scholars as well. Grierson had a long-lasting intellectual and professional relationship with the likes of S K Chatterji, D R Bhandarkar, A Coomaraswamy, as well as scores of others. An academic referee for many Indian linguists, Grierson spent a lifetime of genuine scholarly exchange with them, helping them to shape and publicise their work. For perhaps the first time, South Asian scholars came to be so included as a part of a transcontinental enquiry (Majeed 2018b: Ch 6).

Perhaps of most significance in cementing Grierson’s prestige amongst South Asian scholars was the manner and extent to which he formally collaborated with them, both in the LSI and beyond (Majeed 2018b: Ch 8). Not merely the invisible scribes of the LSI, “Indians and their work feature heavily in the LSI’s culture of learning and its transcontinental epistolary culture of textual circulation and exchange” (Majeed 2018b: 224), with the Bibliographies of several languages citing works by Indians as part of the list of authorities and with several analyses in the LSI being directly attributed to their composition, collection or analysis. Many Indian grammarians are acknowledged as having edited and proofread volumes of the LSI, in a collaborative process so “marked by diverse hands and by a dialogical process of query and responses” (Majeed 2018b: 217) that Grierson’s own name becomes “a coalescence of authoring names ...” in which the functions of an author are shared by multiple persons, both “British” and “Indian” (Majeed 2018b: 231).

Grierson’s prestige was also nurtured by his willingness to be the advocate of India’s languages beyond his role as a philologist. Spanning several decades, Grierson engaged with a host of actors in the newly emerging subnationalisms amongst Maithili, Kashmiri, Asamiya, Bengali, Konkani, Marathi, and Telugu speakers. For example, in the upper-caste Maithili movement, which was inextricably linked with the restoration of the Maharaja of Mithila’s powers, Grierson was far from loath to publicly ally himself with the Maharaja; for Asamiya, he acted as an arbitrator of its claims to an independent language—“as the speech of a distinct nationality, and ... a standard of its own ... whether its grammar resembles that of Bengali or not” (Majeed 2018a: 19)—allying with popular organisations like Assam Student’s Welfare League and the Assam Sahitya Sabha. In the Kashmiri cause, he championed the perspective of his most constant collaborators, the Kashmiri pandits, presenting the language as an aspect of “an upper-caste Hindu Kashmiri culture beleaguered by the Muslim majority population” (Majeed 2018a: 35).

Grierson as a Cross-border Figure

This last affiliation to Hindu upper-caste male elites was actually an abiding political choice that Grierson made throughout his association with India. Majeed argues that the India Grierson knows, and the LSI presents, is civilisationally Aryan, with “all other non-Aryan groups [being] positioned around this term” (2018a: 128). In general, Grierson’s thinking was far from free of colonial categories of race and anthropometry (2018b: Ch 1), and irrespective of his sympathies for certain subnationalisms, Grierson’s views were consistently arrayed on the side of Empire and against nationalist critiques of colonial rule. As Majeed argues, although Grierson’s embrace of imperial ethnology may well have been mediated by his triply-hyphenated “Anglo-Irish-Indian” identity (2018b: Ch 3), in that his Anglo-Irish provenance and Protestant religion predisposed him to a wariness of the “cultural and linguistic assimilation accompanying British rule” (2018b: 91), he can only be read at best as a “cross-border figure,” whose “moving between Britishness, Irishness, and Indianness” (2018b: 11, et passim) is simultaneously both enabling and disabling.

This is, perhaps, most evident from Majeed’s stellar discussion of Grierson’s code-mixed private language (2018b: 81–84) and its suggestion that in the self that Grierson inhabited, no absolute bounds between Englishness and Indianness existed: his was “a linguistic self split between colonial guardedness in public and a hybridised English-Indianness in private” (2018b: 84). However, there is also no indication that Grierson saw the selves of colonial subjects as similarly multilingual, and not to be located in a “nationality” defined by a unique mother tongue, but in an ecology of languages. Ironically for an endeavour that relied so crucially on the group and individual multilingual capability of its workforce and informants for its existence, the LSI does not record the “other” languages its speakers know and use.

In this sketch of Grierson’s complex persona, Majeed suggests, lies at least part of what makes the LSI the document that it is: simultaneously an expression of colonial categories of differentiation, territorialisation, and fragmentation, as well as the countervailing narratives to these categories. In the next section, I turn to Majeed’s insightful discussion of the ways in which Grierson’s LSI undermines a reproduction of the analytic categories and command narrative of colonial power.

Countervailing Narratives in the LSI

Majeed’s central thesis is that the LSI fails to oblige the colonial government with the deliverables it had initially promised: unitary and unique names of Indian languages and dialects, grammatical descriptions that “fix” their properties into discrete entities that are clearly distinguishable from each other, and a cartographic plotting of where each was spoken in India.

The manner in which the LSI grapples with the plethora of names that it received for languages/dialects (Majeed 2018a: Ch 3) is exactly the obverse of what the Census officials would have expected from it: rather than reifying them, “it goes out of its way to call attention to their plenitude” (Majeed 2018a: 73), and only complicates, rather than reduces, the initial list of language names that was compiled in 1897. When the specimens for the LSI arrived, similar or identical specimens were often reported under different names. In its final “Classified List of Indian Languages” (Grierson 1927: 389–420), the LSI often chooses to leave this issue unresolved, thereby “dramatising the proliferation of language names as a principle” (Majeed 2018a: 75). The list contains numerous instances of doubly or triply named languages, besides referring to notions like “nicknames,” “by-names” and “ghost names” (Majeed 2018a: 76). Further, even fixed and well-known language names are defamiliarised because the variety referred to by the name was frequently different: “the gloss for the entry of Hindi is a swirl of names” (Majeed 2018a: 77), including in its referents Hindustani, Kanauji, Multani and Lahnda. Thus, rather than fixing names for Indian languages by fiat, the LSI presents India as a “fertile space of multiple naming” (2018a: 82), a space that it makes no effort to close off. Indeed, it adds to this list by including invented names like “Rajasthani” and “Bihari,” amongst others, confessing freely to their “concocted” nature, and justifying their inclusion in the list because the referent of each such term is a group of varieties that share grammatical properties.

The Census’s confidence that the LSI would plot languages onto the colonial state’s imperial geography was also belied by the survey’s ultimate results. The LSI, Majeed points out, stresses “the fictiveness of drawing boundaries in linguistic maps” (2018a: 56), and the difficulties inherent to fixing boundaries between languages. In a letter of 22 January 1902 that Majeed reproduces (2018a: 57–58), Grierson puts it quite well:

Of course to talk of boundary lines at all is really absurd, for languages are not divided by a sharp line, but merge into each other, and if, for the sake of definiteness, we do put down boundary lines, the language for a considerable distance on each side must be of a very indefinite character, and people will always differ as to where, exactly, the lines should be put on a map.

When the LSI does draw boundary lines between languages (and their dialects), these lines are usually qualified as provisional, based not on the linguistic facts, but on the vantage point of the observer. This is because in the borders between languages, “on each side of the conventional line there is a border tract of greater or less extent, the language of which may be classed at will with one or other. Here we often find that two different observers report different conditions as existing in one and the same area, and both may be right” (Grierson 1927: 30–31). For Grierson, the truth that the map cannot capture is that the correct visual descriptor for the language map of India is “shading,” by which India is essentially “an unbroken chain of dialects, all imperceptibly shading off into each other” (Majeed 2018a: 59).

Furthermore, because almost all the language families discussed in the LSI have linkages to the families outside the subcontinent, “the enormous dialectal continuum that is India cannot be framed by the internal versus external distinction of British imperial mapping and its later nationalist incarnations” (Majeed 2018a: 62). As the languages that Grierson studies traverse political frontiers, the philologist who studies them must cross borders along with them too, moving far beyond what Grierson called “India Proper” to “Further India” and “Greater India,” etc. The LSI’s reconstructed linguistic history of India as waves of migration from these other “Indias” (2018a: Ch 5), and its depiction of it as a space where, for most communities, home is always mobile, render an alliance with the colonial cartographic project impossible.

Even the knowledge archive produced by the LSI, Majeed rightly observes (2018b: Ch 7), makes no claims to “confident colonial or epistemological mastery” (2018b: 206). Rather, knowledge about linguistic phenomena is often presented as provisional and incomplete. Many of the LSI’s descriptions are replete with Grierson’s frank admissions of limitations imposed by the quality of data, the inconclusiveness of particular analyses, and the gaps in knowledge that need to be filled if definitive conclusions are to be reached. In an empirical field, such as descriptive linguistics by fieldwork in languages unknown to researchers, such confessions are far from defects, as language data cannot simply be invented. If anything, the use of language that glossed over logical gaps in reasoning would have left the LSI short of credibility altogether.

Reading the LSI with Majeed, as a Linguist

Such is the rich seam of themes on the persona, career, and opinions of Grierson and his survey contained in Majeed’s exquisitely written and pioneering study that it is difficult for any reviewer to engage meaningfully with the depth and range of his observations on several disciplines in the humanities and social sciences. Read just from the discipline of linguistics, this masterful critical recovery of the LSI and Grierson is not merely informative; it offers an opportunity for linguists to engage with the study of the premises, methods and analyses offered in the LSI. While Majeed’s discussions illumine key aspects of the LSI’s basic methodology, his strokes are (understandably) broad when it comes to its innovativeness. One way that linguists may contribute to an interdisciplinary recovery of the LSI is to inquire into the ways that its linguistics interfaces with colonial categories of thought.

The proposition I would like to press is that linguistics plays a pre-eminent role in creating the intellectual ambiguities Majeed notes in the LSI. As Majeed himself mentions, the project of the LSI came into being at a point when race had ceased to be a sufficiently convincing “master category” for differentiating groups of people (“nowhere are there presented stronger warnings against basing ethnological theories on linguistic facts than in India”; Grierson 1927: 28) and developments in the field of linguistics had displaced (comparative) philology as the dominant mode in which the variation between languages was to be discussed (see Majeed 2018b: Ch 3; 2018a: Ch 4). Grierson’s LSI was unequivocal in its embrace of this modern position, declaring itself concerned with the study of Indian languages alone, and not the urheimats of the “races” that speak them. In doing so, the LSI resolves to adopt as its explanandum a fairly abstract object of inquiry, by which what needs to be studied in a “language” is the linguistic phenomena its grammar instantiates, for example, the linguistic forms that express tense, subjects, verb agreement, definiteness, and the like, and the order it arranges its words in, and not what its speakers name that language, or the kinship relations by which those speakers organise themselves, etc. As a result, what can count as a “language” for the LSI is a cluster of linguistic properties, and claims about the distinctness or relatedness of languages must be formulated as the sharing or divergence of those linguistic properties alone.

However, for the project of the LSI, this notion of “language” could not be adopted as the solely operative one, not least because the survey could only be successfully conducted in alliance with the colonial state, which used an admixture of definitions, ranging from notions like nationality to genealogy to similarity in name, to define what a language is. Furthermore, given the reality that at the point that the survey was initiated it was simply not known what the possible clusters of linguistic properties could be, or even the numbers of possible abstract languages that were “out there,” Grierson had to proceed with a language’s social name as the putative unit of investigation and use linguistic principles to create the abstract object that linguistics can typologise and classify. However, this decision, in turn, produces an internal tension in the LSI: while a variety’s inclusion in its database is made on the basis of its salience in terms of popular categories, its classification has to be justified exclusively on its linguistic properties.

In order to resolve this tension, Grierson seems to have taken the view that decisions that derive from social causes cannot be determined by anything but subjective criteria. For example, in an engaging discussion on the difficulties of distinguishing which variety counts as a “language” and which a “dialect” (Grierson 1927: 22–24), Grierson lists two extralinguistic, sufficient (but not necessary) conditions that may be used: the imagined notion of “nationality,” and the existence of a wealth of literature. These subjective criteria do not require the same systematic application as linguistic criteria do, so Grierson finds no contradiction in relaxing them from time to time across the LSI. For example, Grierson feels free to use religion to define nationality to characterise Hindi, and not Urdu, as “genuinely Indian” (Majeed 2018a: Ch 7), and to privilege Lahnda (Siraiki) over Punjabi as a “language” by applying a novel criterion of Siraiki’s imagined proximity to a widely-spoken ancestor. Yet, in the linguistic descriptions proper of these four languages, each variety is described using purely linguistic argumentation, and the subjective criteria play absolutely no role in the depth of analysis each receives.

What the LSI has to say about a speech variety is thus often cleaved into two distinct portions: a “core” that describes the linguistic facts through verifiable data and grammatical argumentation, and a “penumbra” that is a composite expression of Grierson’s alliances with civil servants and census officials, scholars and language nationalists, as well as his political perspectives and allegiances. I would like to suggest that the polyphonic nature of the LSI’s reports on Indian languages, where both colonial and countervailing narratives seem to sound equally, is actually a percept created by reading its core and penumbra together. It is then worth exploring how the same individual who writes the core can simultaneously author the penumbra as well, leading minimally to the diagnosis of a complex subjectivity, but also raising questions as to how such a dissociative disorder is sustainable, in which the premises of the mind doing linguistics are housed in a different silo from the one in which political beliefs are. Or is the penumbra of the LSI more of a gesture of accommodation of the colonial discourse, whose patronage the project needs in order to continue, but into which its core does not quite fit?

I would like to think that the answers to these questions could be found in a study of the linguistics that Grierson and his colleagues do at the core of the LSI, because in almost every crucial respect, Grierson’s linguistics is the countervailing narrative. In the next section, based on my own studies of the LSI,2 I offer some programmatic speculations as to how Grierson’s and the LSI’s archive could be explored to develop this understanding.

The LSI’s Linguistic Methodology

In their time, Grierson and his LSI were hailed as “monumental” by linguists across the world (Majeed 2018b: 179–82), but perhaps the most telling citation is to be found in a letter to Grierson, written by Professor Bani Kanta Kakati of Cotton College Assam, who described the LSI as illuminating “with a pillar of fire the entire landscape of Indian linguistics” (Majeed 2018b: 180). For Indian linguistics at that time, and ever since then, the LSI has indeed been the torch that reveals to Indian linguists both the field of their activities and the importance of the field itself. Inspired by the LSI’s exposition of the sheer scale of India’s linguistic diversity, Indian linguists have long held that an exercise like the LSI needs to be reproduced (not necessarily replicated) by a contemporary India (Kidwai 2017, 2019), proceeding on a belief that a collaborative exercise of Indian linguists working in tandem is bound to succeed. A historical recovery of the way Grierson’s methodology evolved would be greatly useful, especially in the wake of Majeed’s study, as it illumines the ways in which the LSI may be studied at a global level.

Very little is known about exactly how Grierson’s conception of the LSI as a collection of specimens arose, and the process by which decisions about how these specimens would be presented evolved. As Pandit (1975: 73) has recorded, the initial proposal for the LSI in 1888 did not propose specimen collection at all. Rather, the suggestion was that “a skeleton grammar suitable at once for all the Indian languages of Aryan family should be concocted at first, be concocted at a meeting ... and then the whole of India with Aryan languages be mapped out probably (and also conveniently) district wise.” Individual grammars and lexica for each language could then be compiled. A similar exercise would be done for the Dravidian languages, and as for the other languages, Grierson suggested the help of missionaries, with the provision for awarding them “handsome honoraria” (Singh 1969: 117).

Had the initial proposal gone through, the LSI would certainly not have been able to yield the language groupings of the non-Aryan/Dravidian languages or demonstrate the extent of language contact and convergence that it does. It was only when this proposal was turned down by the government, presumably because an execution of the LSI along these lines demanded a level of linguistic knowledge that was simply unavailable to the survey’s enumerators (schoolteachers, officials, missionaries, and “native gentlemen”), that the proposal for a collection of specimens was made, and accepted.

Problems in Specimen Collection

Grierson (1927: 19–21) discusses at length the challenges that the LSI process of specimen collection and creation faced. Despite copious instructions on how they were to be collected, specimens were of variable quality and often contaminated by the fancies of the specimen collectors. They were also received in a multitude of scripts and in widely varying quality of translation, often having to be rendered into Roman script via other intermediary scripts (Majeed 2018b: Ch 7), from which Grierson and his collaborators had to prepare a romanisation for publication. Romanisation, as Majeed points out (2018b: Ch 4), presented a troublesome challenge, as the LSI did not use the International Phonetic Alphabet, which was still being developed. Therefore, specimens often needed to be corrected via months of correspondence. A study of this correspondence would be interesting both for what it reveals about the development of Grierson’s transcription system (which had to be in a form that would allow him to extract generalisations about the phonetics, phonology and morphology over the entire set of speech varieties the LSI covers) and for the light it would shed on how Grierson’s notion of the LSI “specimen” developed.

A linguist’s examination of the development of the format of specimen presentation would be especially illuminating. Right from the first published volume, the LSI’s specimens were standardly presented in a format that gave a word-level interlinear gloss as well as a free translation. Consider, as an example, a Khasi specimen from the volume on the Mon-Khmer and Siamese-Chinese families, published in 1904.

A great deal of linguistic analysis is needed in order to generate the English interlinear glossing we see, because English is used here not to provide a translation of the sense (which the separate free translation provides), but as a metalanguage that represents the linguist’s word- and morpheme-level analyses of the specimen’s structure (see, for example, the gloss given to katto-katue in line four of the Khasi text above). Although the use of interlinear glossing itself was far from an innovation, having been used in European translations from historical languages and in missionary grammars, Grierson’s use of it as the standard format for the presentation of thousands of texts is unprecedented. It signals the LSI’s claim to the verifiability of its hypotheses, and therefore its openness to correction, a claim that is underscored by the LSI’s attribution of each specimen by name to the person who prepared it.

Interlinear glosses are, however, not enough in themselves to reveal the grammatical properties of the speech variety. To see this, consider the treatment given to the morpheme ka- in Khasi above, which is marked as separate in every word in which it occurs. In just the first three lines, it is glossed variously as “she,” “the,” “that” and “it,” and also left unglossed in five instances. In order to understand what exactly the function of ka- is in all its uses in the specimen, the reader must be able to refer to a linguistic description of the variety, that is, the set of hypotheses about the linguistic properties of the language. The skeletal grammars that preface the specimens are, as a result, based on the extant grammars of the language, as reinventing the wheel was not a luxury that the LSI could afford (although, in instances in which the hypothesis space contains more than one analysis, it must choose which one to adopt); hence its air of collaboration and its extensive referencing. Original linguistic analysis was reserved for varieties that had been hitherto unstudied. Again, by listing the sources of these hypothesised grammatical properties, the LSI makes a simultaneous claim to their falsifiability.

At its very minimum, Grierson’s skeletal grammar aims to characterise the specimens as completely as possible; in other words, the necessary condition a linguistic property must meet in order to be included in the skeletal grammar is that it must be attested in the specimens. For a number of languages, invariably the ones about which little was known before, the skeleton is extremely attenuated, but for others the description is much richer, because greater understanding leads to the perception of connections between linguistic properties. In both types of skeletal grammars, however, the goal appears to have been to achieve a level of descriptive adequacy. Given the issues with the quality of specimens mentioned earlier, the incompleteness of knowledge, and the uncertainty or dispute amongst various authorities about the correct linguistic analysis, this goal is often not attained, but the sincerity of the effort to answer the question “What do we know, and how can we describe it by rule?” in each case is without doubt. It is because the LSI asks this question every time it tries to describe a specimen via a skeletal grammar that it is suffused with a self-reflexive evaluation of its own completeness, as well as an awareness of the open-endedness and contingency of its claims.

Whether minimal or extended, the skeletal grammars of the LSI also define the core linguistic parameters by which speech varieties come to be grouped into language families and groups as well as dialects. All aspects of the language description (pronunciation, lexis, morphology, syntax) are used to differentiate between varieties, but it is chiefly morphological properties (morphological type, morphosyntactic properties of word classes, and morphological exponence of syntactic relations) that function as the macro-parameters of Grierson’s typology. Dialectal variation is described in terms of varying expression of these parameters, but as a variation based mainly on differences in pronunciation and lexis. In most if not all cases, no attempt is made to construct arguments as to why a particular variety counts as the “standard,” because no real linguistic justification is possible; the LSI usually just names the variety it considers to be the “standard” dialect, and does not make any attempt to weight the linguistic parameters so that a few count as more significant than others in identifying the standard.

For the most part, Grierson’s parameters work exceedingly well for the purposes of classification of language, yielding only a few languages that he must remain non-committal about. This is not to imply that Grierson’s groupings and classifications were correct in every respect—they were not—but this apparently simple typological scheme could achieve full coverage of the database because, in Grierson’s scheme, the skeletal grammars (implicitly or explicitly) record each language’s values for certain macro-parameters. They also enable him to make significant generalisations about the family/group as a whole, as well as attend to areal phenomena of language contact and convergence.

Grierson’s LSI is thus a pioneering initiative in developing the then-nascent disciplines of descriptive linguistics, language documentation (that too using a collaborative methodology), and language typology, facts that have rarely been acknowledged in the history of linguistic thought, either in South Asia or in the world. Of this method, a linguist can ask many questions. Just to cite a few: Was this the conceptualisation of method and final product that Grierson had in his mind when he started? Who were his precursors and interlocutors in finalising this method? What criteria were employed to select specimens for analysis? Given that the principal data collection for nearly all the languages was completed in the initial period of the Survey, what kinds of questions were asked to correct and improve the specimens?3 Were the skeletal grammars inspired wholly or partly by the grammatical exegeses cited in the list of authorities, and if so, to what extent? Which were the Indian traditions of grammar-writing that proved most influential? That scores more can be added to this list is without doubt, as is the fact that in formulating all of them, Majeed’s recovery of Grierson and the LSI will always be an important landmark.

The Afterlife of the LSI

In this final section, I would like to turn to the political legacy of the LSI, a topic which Majeed also briefly discusses in the conclusion to Nation and Region in Grierson’s Linguistic Survey of India. Making the point that the LSI never really had much of an impact on the language policies of the colonial state, essentially because “by drawing attention to India as a linguistic region per se, the Survey was at odds with the colonial state’s conceptualisation of India, in which religious and caste differences were key in its understanding of Indian society” (Majeed 2018a: 200), Majeed suggests that the chief political consequences of the LSI were felt by the national movement and expressed in the policies of the postcolonial state.

For the national movement and the state that resulted from it, the fact that the LSI “played an important role in the creation of affectively charged fields around some regional languages” posed the political problem of recognising languages as “emerging and newly charged regionalised entities in the subcontinent” (Majeed 2018a: 201) and of balancing these with the supremacy of the Indian Union. The postcolonial state’s caution in using language as a deciding factor for the reorganisation of states, and its invocation of Grierson’s stress on the indistinct boundaries between languages to deny certain demands, is, however, more of an instrumental use, as the leaders of the Congress, hostile to Grierson’s sympathies with Hindu nationalism (Majeed 2018a: 203–05), were also generally sceptical of the LSI, which they tended to see as a counterpoint to their own assertions of a fundamental political and cultural unity of India. As Majeed points out, Jawaharlal Nehru was scornful in his Autobiography (1936: 453–54) and in The Discovery of India (1946: 169) of the colonial tendency to overstate India’s diversity, which he dismissed as a fiction of the minds of “the philologist and the census commissioner,” who took every dialectal variation and every “petty hill-tongue” with a few speakers to count as “a separate language.”

While it is true that the nationalist reaction was more a rejection of an “epistemological balkanisation” of India into a conglomeration of small nationalities than a rejection of the very idea of India’s linguistic (and scriptal) diversity per se, it is also worth considering whether the hostile nationalist reaction to Grierson and the LSI may have had a more potent afterlife. I would like to suggest that it did, and that it is reflected in the manner in which the postcolonial state failed at the very outset to guarantee equal treatment to the emerging linguistic subnationalisms and those that were not so fortified. The source of the problem, I would like to argue (like several linguists before me; to name just a few, Gupta and Abbi [1995], Agnihotri [2015], Babu [2017]), is to be found in specific provisions of the Indian Constitution relating to its languages.

On the face of it, the Constitution and the state it produces are committed to guarantees for the maintenance and perpetuation of India’s linguistic diversity; however, nearly three-quarters of a century later, it is amply clear that far from producing a society in which all languages thrive, the linguistic configuration remains quite akin to a colonial one. In the words of Agnihotri (2020: 2):

In each of the South Asian countries, we notice that the dominating Multilinguality consisting of English and some national/official/regional language acquires so much power as to become the source of serious discourse in all domains of activity … Groups speaking tribal/ethnic languages, nomadic and peripatetic groups of various kinds, migrant labour, immigrants, people displaced because of urbanisation and development, and women and persons with disability suffer a great deal under the domination of languages of power. Each one of these groups has its own verbal repertoire; it is just that it belongs to the category of the dominated Multilinguality. ... The most effective way of marginalising a community is to ignore or silence its voice.

Although there is more than one cause for why this situation has come to pass (and a discussion of all the causes and events would take us too far afield), I would suggest that even at its inception, a deep scepticism about its own ability to deal with India’s linguistic diversity infected the Constituent Assembly, resulting in a constitutional frame that has consistently produced conditions that have privileged Agnihotri’s “dominating multilinguality.”

Despite the fact that the Constitution is careful not to use language that records bias, such as “national language,” “dialect,” “vernacular,” etc, or to legitimise discrimination on the basis of language, its provisions arrange Indian languages in a hierarchy of prestige, as Babu (2017) powerfully argues. Certainly, the inclusion of Schedule VIII (then Schedule VIIA) in the Constitution, which purports to be a list of the nationally important languages of India (though simply entitled “Languages”), is the most glaring example of this hierarchisation,4 which has then been reflected in policies that dedicate the overwhelming majority of union funds and patronage just to these languages.

Hierarchy of Languages

Even within the scheduled languages, however, Hindi has an asymmetric power: not only is it the official language of the union (Article 343), it is also the only scheduled language that the state is required to promote (Article 351). Languages adopted by the states as their official languages (Article 345) may have recognition, but a language that “enjoys ‘official’ status at the state may not have any special recognition at the level of the union” (Babu 2017: 114). Within the hierarchy of languages, a non-scheduled state language is therefore positioned at the third rung.

There are also more subtle ways in which a hierarchy of languages is implied even in provisions that refer to the protection of the rights of linguistic minorities. In two instances, the term “mother tongue” is used, but both times in terms that invoke a contrast with the languages higher up in the ranking: Article 120 allows the exceptional use of the mother tongue in Parliament for a member “who cannot adequately express himself in Hindi or in English,” whereas Articles 350/350A provide for mother tongue instruction at the primary level for “children belonging to linguistic minority groups.” While the latter is in principle a provision aimed at the maintenance of linguistic diversity as well as accessible early childhood education, it is rendered fruitless because the Constitution does not even define what a linguistic minority or a mother tongue is (and includes minority languages like Sanskrit in Schedule VIII); as a result, almost all languages of India fall through the sieve of this so-called protection. Finally, the onus for the development of all non-scheduled languages lies on the communities that speak them: Articles 29 and 30 guarantee non-discrimination by the state against any section of citizens engaged in conserving their “distinct language, script or culture” or against the institutions they run, but no promises are made regarding their development. (In fact, the Constituent Assembly rejected amendments supported by members that the word “develop” be added.)

It appears that the LSI was not mentioned in the Constituent Assembly’s debates in any significant way, yet I would like to suggest that it was in the room, as the only official record, public document, and popular legend of India as a land of many mother tongues. It was certainly there in the person of Jaipal Singh Munda, who, as he argued for the inclusion of Mundari, Gondi, and Oraon in Schedule VIII,5 asked and answered the simplest and most important question of them all: “What is a language? A language is that which is spoken.” It was also there in the room in the choice of a constitutional scheme that casts plurality and difference as disparity. Instead of a critical engagement with the LSI, one that simultaneously critiqued its colonial categories and strengthened its acknowledgement of Indian linguistic diversity,6 the postcolonial Indian state chose to encourage the formulation of policies that respond largely to political heft alone. While such heft may have forced the elite club of Schedule VIII to open its doors once every two decades or so to let a couple of languages in, for the vast majority of languages, which are either not large enough or not organised enough to undertake such jostling effectively, a loss of voice and marginalisation has been the normal consequence.

Another route was always available to the makers of our Constitution. Suppose that instead of a select list of languages, Schedule VIII had been declared “open” to listing as many of the mother tongues identified by the LSI (or any survey building on its results) as national languages, with a commitment made to their promotion and development? What if, instead of mimicking the colonial state’s response of vague gestures at linguistic diversity, the Constitution had thought of mother tongue commissions dedicated to this purpose? What if all these changes were made to flow from constitutional commitments to equality, the substantive actualisation of cultural rights, and a political commitment to affirmative action, goals that colonial philologists and census commissioners did not subscribe to? It is very likely that a very different linguistic landscape for India would have resulted, in which decade after decade, the number of raw mother tongue returns would not fluctuate in the manner that they do (10,400 for the 1991 Census, only 6,661 for the 2001 Census, and a humongous 19,569 for the 2011 Census), for it is perhaps only languages that remain unseen and unheard by the state that need to clamour for recognition under the guise of so many names.

One thing is certain though: had the Indian Constitution not enshrined what Babu (2017) has evocatively termed the “Chaturvarna” system at the base of the postcolonial state’s language policy, the celebrations of the “genius of Indian mother-tongues” in India’s 2020 New Education Policy would not have rung so hollow.7 When the last time Indians got an official count of the number and names of the mother tongues they speak (1652) was nearly 60 years ago, and when government documents inform us that only between 69 and 72 languages are used in schools in India, that radio programmes are aired in 146 languages and dialects but newspapers and magazines are printed in only 101 languages, the suggestion that school and university education in them is suddenly supposed to be possible is laughable.8 Without a change in language policy, and a reinterpretation of the constitutional frame to ensure equality, scripts, and prestige to all languages, encomiums to the richness and complexity and “genius” of Indian languages are not worth the paper they are written on.


1 The choice of the parable as the template text for translation was, it must be emphasised, driven by pragmatic considerations relating to the actual data collection process of the LSI, rather than an orientation to proselytism in the survey. As Grierson (1927: 19) explains, the basic premise that the LSI had to proceed with was that almost none of its fieldworkers or informants would be English-speakers. As the parable was the single most accurately and widely translated English text (as part of Bible translations created by local missionaries), the LSI staff were provided a compilation of 65 known translations of it, so that “whoever might have to prepare a specimen, even if he did not know English, would find in this book, at least one version from which he could make a translation.” This method proved largely successful, as thousands of specimens of the parable were received, but nowhere in the LSI is this particular specimen given any special weight, either in terms of obligatory reproduction for every variety, or in grammatical analysis.

2 The speculations on the architecture of the LSI presented in this section were developed in a seminar course taught in the Winter Semester 2019 at Jawaharlal Nehru University. I thank all the students enrolled in the course, and particularly Sakshi Singh, Aseema Karandikar, and Opala Hajeya, for extensive discussion on several of the issues presented here. I am also grateful to Rama Kant Agnihotri and Karthick Narayana for discussion on many of the issues in this paper. In all cases, needless to state, the usual disclaimers apply.

3 We already have a preliminary sense of how dialogic the process of specimen preparation and editing was. Amin (2011: 12–14) presents interesting examples of this process in his discussion of Ram Gharib Chaube’s preparation of the specimens of Chattisgarhi, Laria, Bagheli, and Sadri Korwa varieties, which are full of marginal notes addressed to Grierson, proffering alternative suggestions for the interlinear glosses informed by ethnography. Most of these suggested translations—99%, Amin claims—were ignored by Grierson, who opted for “meaningless literal meaning” (p 14) in the gloss, no doubt in observance of the strictly linguistic objectives that the interlinear translations were supposed to serve.

4 The reason why this schedule was included in the first place was entirely opaque, as at the point of its adoption, no special benefits were supposed to accrue to the languages included in it. (In all likelihood, the schedule came into existence to accommodate the “hurt sentiments” of the vocal subnationalisms slighted by the adoption of Hindi as an official language, as Agnihotri [2015: 54–55] indicates.)

5 Constituent Assembly Debates on 14 September, available online at

6 The 1961 Census represents a notable departure from the official lack of engagement with the LSI in the way in which it gave a reasoned and transparent account of its classification of the 1652 mother tongues it found in India, both drawing on and departing from Grierson’s taxonomy, as well as in its new initiatives at enumerating bilingualism and trilingualism. However, even the census’s attempts to critically engage with colonial language enumeration did not last, and from 1981, the Census of India became firmly committed to carving “Languages out of mother tongues,” not revealing the methodology by which it rationalises the language name returns it receives and how exactly it uses the LSI taxonomy.

7 Available online at

8 Source:


Agnihotri, Rama Kant (2015): “Constituent Assembly Debates on Language,” Economic & Political Weekly, Vol 50, No 8, pp 47–56.

— (2020): “Linguistic Diversity and Marginality in South Asia,” Handbook of Education Systems in South Asia, P Sarangapani and R Pappu (eds), Singapore: Springer, pp 1–37.

Amin, Shahid (2011): “The Marginal Jotter: Scribe Chaube and the Making of the Great Linguistic Survey of India c. 1890–1920,” IIC Occasional Publication 27, New Delhi: India International Centre.

Babu, Hany (2017): “Breaking the Chaturvarna System of Languages: The Need to Overhaul the Language Policy,” Economic & Political Weekly, Vol 52, No 23, pp 112–19.

Baines, Jervoise A (1893): General Report on the Census of India 1891, London: HMSO.

Goswami, Manu (2004): Producing India: From Colonial Economy to National Space, Chicago: University of Chicago Press.

Grierson, George A (1927): Linguistic Survey of India, Vol 1, Pt 1: Introductory, Calcutta: Government of India Central Publications Branch.

Gupta, R S and Anvita Abbi (1995): “The Eighth Schedule: A Critical Introduction,” Language and the State: Perspectives on the Eighth Schedule, R S Gupta, Anvita Abbi and Kailash S Aggarwal (eds), New Delhi: Creative Books, pp 1–7.

Kidwai, Ayesha (2017): “Languages or Mother Tongues? India’s Linguistic Diversity,” The JNU Lectures on Nationalism, Janaki Nair (ed), Delhi: HarperCollins India.

— (2019): “The People’s Linguistic Survey of India Volumes: Neither Linguistics, Nor a Successor to Grierson’s LSI, but Still a Point of Reference,” Review essay in Social Change, Vol 49, No 1, pp 154–59.

Majeed, Javed (2018a): Nation and Region in Grierson’s Linguistic Survey of India, New Delhi: Routledge India.

— (2018b): Colonialism and Knowledge in Grierson’s Linguistic Survey of India, New Delhi: Routledge India.

Nehru, Jawaharlal (1936): An Autobiography, New Delhi: Jawaharlal Nehru Memorial Fund, 1982.

— (1946): The Discovery of India, New Delhi: Oxford University Press, 1989.

Nigam, R C (1964): “Introductory Note” to Language Tables, A Mitra, Census of India, 1961, Vol I, part II c(ii), Delhi: Manager of Publications.

Pandit, Prabodh B (1975): “The Linguistic Survey of India—Perspectives on Language Use,” Language Surveys in Developing Nations: Papers and Reports on Sociolinguistic Surveys, Sirarpi Ohannessian, Charles A Ferguson and Edgar C Polome (eds), Arlington, Va: Center for Applied Linguistics, pp 71–85.

Singh, Ram Adhar (1969): Inquiries into the Spoken Languages of India from Early Times to Census of India 1901, Language Monographs, No 1, Census of India 1961, Vol 1, P XI-C(i), Delhi: Manager of Publications.


Updated On : 20th Oct, 2020

