After the personal data protection bill, India must focus on utilising non-personal data

Individuals increasingly interact with businesses online, leaving behind a trail of digital data. So far, much of the debate around how this data should be collected and used by enterprises has been about protecting personal data. This includes information like name, address, financial details, or medical history that can be traced to the individuals who generate it.

India may soon get a personal data protection law, with the union cabinet approving a bill on December 4.

While the personal data of customers is protected by privacy laws across the world, non-personal data, sometimes referred to as NPD, remains largely unregulated. NPD includes anonymised data such as climate trends collected by a weather app or commuter patterns gathered by a cab aggregator. A framework for NPD may be far more complex to create than a personal data protection law.

In India, a committee chaired by Kris Gopalakrishnan, co-founder of the IT services giant Infosys, is currently examining issues relating to NPD.

Suggestions for a law to govern NPD stem from two key ideas. First, non-personal data has economic value, which should be leveraged for the financial benefit of Indian companies. Second, aggregated data is a collective resource, which should be unlocked for better governance. Traffic patterns gathered by cab aggregators, for instance, can help in better traffic management.

Consequently, a slew of government documents in the last two years have underscored the need to make NPD more accessible.

Official developments

In August 2017, India’s telecom regulator sought to unlock the economic value of data through a consultation paper on privacy and ownership of data gathered by telecom companies.

The paper sought stakeholder inputs on creating a “data sandbox,” a mechanism through which entities can contribute anonymised data sets that would help others develop new products. Anonymised data is sourced from individuals, but all identifiers that could be used to trace those individuals are removed or masked.

A year later, the government think tank NITI Aayog, in a discussion paper titled National Strategy for Artificial Intelligence, suggested that the concentration of data in the hands of a few players was an entry barrier for startups and recommended marketplaces to spur data sharing.

Besides the economic argument, there have been suggestions that data must be shared for good governance and planning. NITI Aayog’s AI strategy also suggested that corporates may be required to share data for social good.

The justice BN Srikrishna-led committee, which recommended India’s draft personal data protection bill, had also flirted with the idea of “community data.”

In its final report to the government, the committee suggested that data sourced from multiple individuals and aggregated by an entity takes on a different character, and that the resultant community data may be worthy of protection separately from personal data. At the same time, the report acknowledged that the entity that has aggregated the data has intellectual property rights over it.

NPD panel

The Gopalakrishnan committee for NPD was formed by India’s ministry of electronics and information technology, or MeitY, in September this year. The ministry’s circular highlighted both the economic and social aspects of the arguments.

Recognising the economic importance of data, it mentions aggregated data, anonymised data, e-commerce data, and AI training data – a sign that the Gopalakrishnan committee will likely explore what each of these categories means and ways to improve data sharing. The circular also notes that “privately collected digital data” could be necessary for policy making, governance, and public service delivery.

This raises the question: will free data access generate economic value? And how much value would datasets, collected and curated for a particular purpose, hold when shared with government bodies for unrelated purposes?

The Gopalakrishnan committee should examine the risks of asking companies to share NPD with anyone seeking access to it. Exchange of datasets already takes place through voluntary data marketplaces. To better identify useful approaches to data sharing and access, the committee could first consider examining existing data marketplaces and open data initiatives.

The risks

In encouraging sharing and access, one must also not lose sight of privacy and security risks.

Though any data that cannot identify individuals is understood as NPD, this could cover a vast array of information, including companies’ intellectual property or confidential information. Nor is all anonymised data secure. Researchers have been able to re-identify individuals from seemingly anonymised data by combining datasets or using new techniques.

Re-identification risks have emerged with open data initiatives as well. Anonymised datasets released by governments for research have been processed to identify individuals. More focused data sharing or collaborations for particular policy goals may help protect privacy.

The governance of NPD presents complex, new considerations that are distinct from the concerns relevant to personal data regulation. Given this complexity, the committee should consider holding a wide public consultation that will help bring different perspectives to the table.

With inputs from Nehaa Chaudhari, Director of Public Policy at Ikigai Law.

This article first appeared on Quartz.
