On March 17, news broke that Cambridge Analytica, a data analytics firm that worked with Donald Trump’s election campaign, had harvested over 50 million users’ Facebook data without permission. The firm combined a range of users’ data, including “non-sensitive” information such as status updates and Facebook likes, with other sources of information (such as electoral rolls) to build psychographic profiles and target people based on their preferences and susceptibilities. As Christopher Wylie, a former data scientist at Cambridge Analytica and the whistle-blower behind the revelations, explained, the firm combined “micro-targeting” with constructs from psychology so that “we would not only be targeting you as a voter, we would be targeting you as a personality”.

These revelations raise significant questions about how data-mining techniques affect society, especially given their use during campaigns in the 2016 United States presidential election and the United Kingdom’s Brexit referendum. They also highlight foundational questions around the regulation of data in the age of big data – an issue that is currently the subject of a national conversation in India.

In particular, we are confronting a reality where information about us from various sources is increasingly linked and easily shareable, serving to reveal aspects of ourselves that we may not intend to disclose. As these data-mining techniques become reality, do we need to change how we think about what is sensitive data and what isn’t?

Inadequate regulation

The current regulatory approach to protecting personal data in India (under the Information Technology Act, 2000) follows a “list-based” approach. It identifies six types of “sensitive personal data”: (1) passwords, (2) financial information (such as bank account and payment instrument details), (3) health condition, (4) sexual orientation, (5) medical records and history, and (6) biometric information. This list is set out in rules passed under the Information Technology Act.

The rules require entities handling such data to have “reasonable security practices and procedures” in place before collecting the information. Parties are free to agree on their own rules relating to such data, including any security standards or privacy policy. Entities must also obtain prior written consent from a person whose sensitive personal data they are collecting, and use it only for a specified purpose. In practice, however, prior written consent can be quite meaningless – as anyone who has mindlessly clicked an “I Agree” button while signing up to lengthy terms and conditions can testify. Users often have neither the choice nor the bandwidth to do anything other than sign away their consent for their information to be used for purposes that are often unclear at the time of entering into the contract.

In any case, this old list-based approach appears incongruous with the world today, where we confront data-mining techniques like the one used by Cambridge Analytica. No doubt, in this case, there are also questions around the impropriety of collecting the data of Facebook users without their knowledge and using it to deploy covert advertising techniques for messages that are in effect election canvassing. But the case also shows that the lines are blurring when it comes to identifying the types of information that are sensitive enough to expose a person to a vulnerability.

Several service providers around the world and across sectors are using similar data analytics to understand their users better. More and more providers actively aspire to a “single customer view” that provides a deep profile of all the data they hold on a particular individual. Our “digital breadcrumbs” – Facebook likes, browsing histories – would not typically be considered sensitive information in our current legal understanding. However, new data analysis techniques allow sensitive details such as religious beliefs, sexual orientation, ethnic or caste information to be revealed from other proxies or based on patterns gleaned from non-sensitive data of individuals without their knowledge or consent. Most of the data points used in such techniques would not be considered “sensitive personal data” under the current law or under the provisional view of an expert committee expressed in a white paper on a future data protection regime for India.

The Information Technology Act identifies six types of personal data as sensitive – passwords, financial information, health condition, sexual orientation, medical records and history, and biometrics. (Credit: Reuters)

New model for data protection regulation

The time has come to reconsider our dogma around personal data regulation. The objective of data protection law is to protect individuals themselves, not merely a subset of their personal data. To uphold this core objective, we can and should ensure that protection is offered to all personal information by which an individual is identified or through which they are identifiable. Doing so would focus regulation more appropriately on the use of personal information in the context of the provision of a service, rather than blindly forcing organisations to “ring fence” certain types of data. The use of your medical records, for instance, would be necessary and proportionate to the service your hospital provides you, but their use by a third-party recruitment firm discussing your employment with a potential employer would not be.

The legitimacy of any use of personal information should be tested in the context of that use. This is both possible and important, as the model set out in our response to the public consultation on the white paper on data protection explains. We have proposed that a future law protect all personally identifiable information. Whether such personal data can be used in a given situation would then hinge on a “legitimate purpose” test, which would allow entities to use personal data where doing so is lawful, necessary and proportionate, and does not override the individual’s interests (see page 42 of our full response). This test is not the European Union’s legitimate interest standard, but one we believe can go beyond it to fix some of its shortcomings. Finally, these principles should be backed by an enforcement framework based on responsive regulation, which would make the protections more meaningful than they are under the current state of affairs. Such an approach has the benefit of allowing new types of data created by individuals to be used for legitimate purposes that are clearly identified ex ante. It would also incentivise the use of de-identification techniques such as anonymisation, which allow entities to retain personal data without infringing privacy or data protection rights.

We must explore new ways to think about these emerging issues as we change how we interpret our increasingly digital lives. India has a unique opportunity to learn from others’ mistakes, especially as episodes like the Cambridge Analytica case and our own stories of data breaches come to light. We must apply these lessons to our industry’s data practices and our national laws in the months and years ahead.

Malavika Raghavan works on emerging issues for regulation around consumer protection and finance. She is currently project head, Future of Finance Initiative at Dvara Research.