In 1997, Babar Ali Mandal migrated from Nadia district in West Bengal to Mumbai in search of work.
In the three decades since, he has made Mumbai his home. He runs a small footwear workshop in Govandi, a slum cluster in the east of the Maharashtra capital.
In his free time, he often scrolls through social media sites.
What he sees is distressing. “A lot of people from my state are being targeted for speaking Bengali,” he said. The fear of being labelled Bangladeshi has intensified in the last one year, he added.
As Scroll has reported, the police in several states ruled by the Bharatiya Janata Party, including Maharashtra, have over the last year accused Indian citizens of being Bangladeshis and thrown them across the border, without giving them time or opportunity to prove their citizenship – in violation of the Centre’s own rules.
Maharashtra Chief Minister Devendra Fadnavis has gone a step further. Last week, at a media event, Fadnavis said that the state government is building an artificial intelligence tool to help it detect Bangladeshi immigrants.
According to a report in Hindustan Times, which cited officials aware of the project, the language-based verification AI tool “will analyse speech patterns, tone and linguistic usage to help identify suspected illegal Bangladeshi nationals and Rohingyas in the state”.
It is being developed by the state information technology department in collaboration with IIT Bombay. However, Maharashtra IT minister Ashish Shelar told Scroll that the project is being handled by the chief minister’s office and he did not have the “finer details” about it.
The IIT Bombay spokesperson refused to answer Scroll’s queries about the tool, as did IT secretary Virendra Singh.
But experts warn that the idea of a language-based AI tool to help detect a person’s nationality is inherently flawed and may lead to more harassment of working-class Bengali-speaking migrants in Maharashtra.
‘Accent from geography, not religion’
To start with, experts questioned the existence of two distinct forms of Bengali or Bangla – one spoken in India and the other in Bangladesh, given the history of Partition and the creation of Bangladesh in 1971.
Gorky Chakraborty, faculty at the Institute of Development Studies in Kolkata, pointed out that both events led to millions crossing over into India.
“Millions who migrated from that part of Bangladesh to this part of India still have the same dialect and diction,” Chakraborty said. “The social continuum has not been ruptured by political borders.”
Others pointed out that Bengali has a wide variety of dialects and accents.
Arijit Mukherjee, an artificial intelligence expert and principal scientist at a multinational research organisation, pointed out that while the Bengali language has “regional variations”, they are too broad for a technological tool to distinguish.
“There are no unique markers to distinguish the language spoken in India from the one spoken in Bangladesh,” Mukherjee said.
He added: “In order to make such a determination, the tool has to be fed a huge volume of data. It will need thousands of audio clips from each region to be able to give credible results.”
The IIT Bombay team has so far not disclosed what data it is using to train the AI model.
Adil Hossain, assistant professor at Azim Premji University, said it is impossible to use technology to detect the nationality of people who belong to the same geography. “Be it a Hindu or a Muslim, people will have the same dialect on both sides of the border,” he said.
Hossain, a political anthropologist, was referring to the nine districts of West Bengal that share a border with Bangladesh – North 24 Parganas, Murshidabad, Cooch Behar, Malda and Nadia, among them. For example, a person from Nadia will have the same manner of speaking as someone from Chuadanga district in Bangladesh, which is across the border from Nadia. “The accent and dialect comes from geography not religion,” Hossain noted.
This phenomenon is not limited to Bengal. The large Bengali-speaking population in Assam’s Barak Valley and in other states of the North East speaks Sylheti, a dialect spoken in Bangladesh’s Sylhet district, Chakraborty pointed out.
Perhaps aware of the challenges, Fadnavis has said that the AI tool being designed by IIT Bombay has so far reached 60% accuracy. It will take another six months to be rolled out.
Tool for harassment?
Mukherjee, the scientist, contested the perception that AI is “unerring” and that its decisions are beyond question. “The government is using that belief about AI to push its narrative,” he said. “But this seems like a politically motivated agenda and a case of AI being used for political gains.”
This, he said, is what digital experts have been warning against – the unethical use of AI to discriminate against marginalised people.
As Bengali is spoken in a large region spanning West Bengal and states in the North East such as Meghalaya, Tripura and Assam, a language-based AI tool could become a “tool to terrorise people” from these regions, Mukherjee said.
Chakraborty, too, said he worried that the tool would be disproportionately biased against Bengali Muslims and Bangla-speaking migrant workers. “A labourer who migrates from [Assam or Bengal] will speak in a certain dialect,” he said. “He may not know the local languages in Maharashtra. The chances of harassment are high, because it gets difficult for them to prove their identity.”
Mandal is fearful of exactly this. “My village is 50 km from the Bangladesh border. Our way of speaking Bengali is similar to those across the border,” he said. “We have already seen so many cases of wrongful deportation of Bengalis.”
Last June, four men from West Bengal, living in Mira Bhayander, were picked up by the Maharashtra police and pushed into Bangladesh on suspicion of being illegal immigrants. They were brought back later.
Mandal added: “It could be me one day.”