How an Indian student made Sanskrit’s ‘language machine’ work for the first time in 2,500 years

A grammatical problem which has defeated Sanskrit scholars since the 5th Century BC has finally been solved by an Indian PhD student at the University of Cambridge.

Rishi Rajpopat (St John’s College) made the breakthrough by decoding a rule taught by “the father of linguistics” Pāṇini.

The discovery makes it possible to ‘derive’ any Sanskrit word – to construct millions of grammatically correct words including ‘mantra’ and ‘guru’ – using Pāṇini’s revered ‘language machine’, which is widely considered to be one of the greatest intellectual achievements in history.

Stamp issued by India in 2004 to commemorate Pāṇini. Credit: Government of India

Leading Sanskrit experts have described Rajpopat’s discovery as ‘revolutionary’, and it could now mean that Pāṇini’s grammar can be taught to computers for the first time.

While researching for his PhD thesis, published on December 15, Rajpopat decoded a 2,500-year-old algorithm which makes it possible, for the first time, to accurately use Pāṇini’s ‘language machine’.

Pāṇini’s system – 4,000 rules detailed in his renowned work, the Aṣṭādhyāyī, which is thought to have been written around 500 BC – is meant to work like a machine. Feed in the base and suffix of a word and it should turn them into grammatically correct words and sentences through a step-by-step process.

Page from an 18th-century copy of the Dhātupāṭha of Pāṇini (MS Add.2351) held by Cambridge University Library.

Until now, however, there has been a big problem. Often, two or more of Pāṇini’s rules are applicable at the same step, leaving scholars to agonise over which one to choose.

Solving so-called ‘rule conflicts’, which affect millions of Sanskrit words including certain forms of ‘mantra’ and ‘guru’, requires an algorithm.

Pāṇini taught a metarule – termed by Rajpopat ‘1.4.2 vipratiṣedhe paraṁ kāryam’ – to help us decide which rule should be applied in the event of ‘rule conflict’. But for the last 2,500 years, scholars have misinterpreted this metarule, meaning that they often ended up with a grammatically incorrect result.

In an attempt to fix this issue, many scholars laboriously developed hundreds of other metarules, but Rajpopat shows that these are not just incapable of solving the problem at hand – they all produced too many exceptions – but also completely unnecessary. Rajpopat shows that Pāṇini’s ‘language machine’ is ‘self-sufficient’.

Pāṇini had an extraordinary mind and he built a machine unrivalled in human history. He didn’t expect us to add new ideas to his rules. The more we fiddle with Pāṇini’s grammar, the more it eludes us.
— Rishi Rajpopat

Traditionally, scholars have interpreted Pāṇini’s metarule as meaning: in the event of a conflict between two rules of equal strength, the rule that comes later in the grammar’s serial order wins.

Rajpopat rejects this, arguing instead that Pāṇini meant something different: between rules applicable to the left and right sides of a word respectively, we should choose the rule applicable to the right side.

Employing this interpretation, Rajpopat found Pāṇini’s language machine produced grammatically correct words with almost no exceptions.

Take ‘mantra’ and ‘guru’ as examples.

In the sentence ‘devāḥ prasannāḥ mantraiḥ’ (‘The Gods [devāḥ] are pleased [prasannāḥ] by the mantras [mantraiḥ]’) we encounter ‘rule conflict’ when deriving mantraiḥ ‘by the mantras’.

The derivation starts with ‘mantra + bhis’. One rule is applicable to the left part ‘mantra’ and the other to the right part ‘bhis’. We must pick the rule applicable to the right part ‘bhis’, which gives us the correct form ‘mantraiḥ’.

And in the sentence ‘jñānaṁ dīyate guruṇā’ (‘Knowledge [jñānaṁ] is given [dīyate] by the guru [guruṇā]’) we encounter rule conflict when deriving guruṇā ‘by the guru’.

The derivation starts with ‘guru + ā’. One rule is applicable to the left part ‘guru’ and the other to the right part ‘ā’.

We must pick the rule applicable to the right part ‘ā’, which gives us the correct form ‘guruṇā’.
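The difference between the two readings of the metarule can be sketched in a few lines of Python. This is only an illustration, not Rajpopat’s or Pāṇini’s actual formulation: the `Rule` class, the rule labels and the serial-order numbers are all hypothetical, and the numbers are invented so that the two readings disagree. The sketch covers only the case described above, where one rule targets the left part and the other the right part.

```python
from dataclasses import dataclass
from typing import Literal, Sequence


@dataclass(frozen=True)
class Rule:
    name: str                       # hypothetical label, not a real sutra number
    serial_order: int               # invented position in the grammar's rule sequence
    side: Literal["left", "right"]  # which part of "base + suffix" the rule targets


def pick_traditional(candidates: Sequence[Rule]) -> Rule:
    """Traditional reading: the rule that comes later in serial order wins."""
    return max(candidates, key=lambda r: r.serial_order)


def pick_right_side(candidates: Sequence[Rule]) -> Rule:
    """Rajpopat's reading: between a left-side and a right-side rule,
    the rule applicable to the right side wins."""
    right = [r for r in candidates if r.side == "right"]
    if len(right) != 1:
        # Other kinds of conflict are not described in the article,
        # so this sketch refuses to guess.
        raise ValueError("expected exactly one right-side rule")
    return right[0]


# 'mantra + bhis': one rule targets the left part, the other the right part.
conflict = [
    Rule("rule_on_mantra", serial_order=2, side="left"),
    Rule("rule_on_bhis", serial_order=1, side="right"),
]

print(pick_traditional(conflict).name)  # rule_on_mantra
print(pick_right_side(conflict).name)   # rule_on_bhis
```

With these invented positions, the traditional reading selects the rule on ‘mantra’ while the right-side reading selects the rule on ‘bhis’, the choice that leads to ‘mantraiḥ’ in the example above.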

Cambridge University Library holds the Pārameśvaratantra, a scripture of the Śaiva Siddhānta. Written on palm leaf around 828 CE, it is one of the oldest known dated Sanskrit manuscripts (MS Add.1049.1).

Eureka moment

As Rajpopat struggled to make progress, his supervisor at Cambridge, Vincenzo Vergiani, Professor of Sanskrit, gave him some prescient advice: “If the solution is complicated, you are probably wrong.”

“Six months later, I had a eureka moment,” Rajpopat says. “I was almost ready to quit, I was getting nowhere. So I closed the books for a month and just enjoyed the summer, swimming, cycling, cooking, praying and meditating.

“Then, begrudgingly, I went back to work, and, within minutes, as I turned the pages, these patterns started emerging, and it all started to make sense.

“At that moment, I thought to myself, in utter astonishment: For over two millennia, the key to Pāṇini’s grammar was right before everyone’s eyes but hidden from everyone’s minds!”

“There was a lot more work to do but I’d found the biggest part of the puzzle. Over the next few weeks I was so excited, I couldn’t sleep and would spend hours in the library, including in the middle of the night, to check what I’d found and solve related problems. That work took another two and a half years.”

The Vākyapadīya of Bhartṛhari (5th century CE), a treatise on the philosophy of language belonging to the Pāṇinian school of grammar. Credit: Cambridge University Library (MS Add.876).

Significance

Sanskrit is an ancient and classical Indo-European language from South Asia. It is the sacred language of Hinduism but also the medium through which much of India’s greatest science, philosophy, poetry and other secular literature has been communicated for centuries.

While only spoken in India by an estimated 25,000 people today, Sanskrit has growing political significance in India and has influenced many other languages and cultures around the world.

Some of the most ancient wisdom of India has been produced in Sanskrit, and we still don’t fully understand what our ancestors achieved.
— Rishi Rajpopat

“We’ve often been led to believe that we’re not important, that we haven’t brought enough to the table. I hope this discovery will infuse students in India with confidence, pride and hope that they too can achieve great things.”

Vincenzo Vergiani, Professor of Sanskrit at the University of Cambridge, says: “My student Rishi has cracked it – he has found an extraordinarily elegant solution to a problem which has perplexed scholars for centuries. This discovery will revolutionise the study of Sanskrit at a time when interest in the language is on the rise.”

A page from the Bhagavad Gītā. Credit: The British Library

A major implication of Rajpopat’s discovery is that, now that we have the algorithm that runs Pāṇini’s grammar, we could potentially teach this grammar to computers.

“Computer scientists working on Natural Language Processing gave up on rule-based approaches over 50 years ago,” Rajpopat says.

“So teaching computers how to combine the speaker’s intention with Pāṇini’s rule-based grammar to produce human speech would be a major milestone in the history of human interaction with machines, as well as in India’s intellectual history.”

Pāṇini is thought to have lived in a region in what is now north-west Pakistan and south-east Afghanistan.

Rishi Rajpopat was born in a suburb of Mumbai in 1995. He learnt Sanskrit in high school and, whilst pursuing his Bachelor’s in Economics in Mumbai, studied Pāṇini’s Sanskrit grammar informally with a retired Indian professor who taught him at no charge.

Rishi Rajpopat was awarded his doctorate in January. Credit: Rahil Rajpopat

Following a Master’s at Oxford, for which he raised money by writing to hundreds of potential donors, Rajpopat started his PhD at St John’s College and Cambridge’s Faculty of Asian and Middle Eastern Studies in 2017 on a full scholarship funded by the Cambridge Trust and the Rajiv Gandhi Foundation. He was awarded his doctorate in January. He recently joined the School of Divinity at the University of St Andrews.

Cambridge has a long history of studying Sanskrit, and Cambridge University Library holds a significant collection of Sanskrit manuscripts.

This article first appeared on the website of the University of Cambridge.

