The recent announcement by the British Library of the “Two Centuries of Indian Print” project is both a beginning and an end. The beginning – of an ambitious programme to digitise all unique South Asian print titles in its collection till 1914 and make them freely accessible. But also the end of a much humbler project begun over a decade ago.

How it all began

In 2003, the newly set-up School of Cultural Texts and Records at Jadavpur University, Kolkata, began creating a database of all Bengali books in print till 1867. Researchers went to a dozen or so important depositories of Bengali books and started physically checking every title from that period. And wrote out full title-page transcriptions. By hand. And measured the pages with rulers. And wrote complex binding formulae.

All this took a while. Several years, in fact. Researchers came and went, occasionally grumbling at the tedium of the work, and the dust in the stacks. Gradually, the project crept past the watershed of 1867 and pottered towards the uplands of the 19th century and the foothills of the 20th.

As it did, the titles increased exponentially. If there were approximately 2,500 titles till 1867, then the number would double in the next two decades. From chemistry to cookery, economics to erotica, geography to gymnastics – there was no subject under the sun that the world of Bengali print had left untouched.

Building scale

In 2011, talks began at India-UK government levels to digitise large corpora of Indian printed materials. It was decided to pilot with the Bengali language, since the School of Cultural Texts and Records (SCTR) project had ensured bibliographic control. In other words, we knew which depositories held unique titles and which copy was the best for digitisation.

But this was not all that the project would be about. Plain-vanilla imaging was easy-peasy: the challenge was to devise reliable OCR software for Indic script, and generate robust metadata. In addition, a number of stakeholders had to be brought on the same page, or at least, volume.

Finally, there was the question of funding, since there had to be some heavy-duty imaging done to get the project off the ground. It is a testament to the British Library’s commitment to the project that over four years after it was first mooted, the project is finally ready to roll.

Metadata, not serendipity

One of the much-vaunted joys of research is the chance discovery, and much has been made of fortuitous finds by scholars lucky enough to come across something like a first draft of a literary classic tucked away in a dusty archive. However, digitisation yields such volumes of data that it is humanly impossible to sift through it all. Thus the (far less romantic) solution is metadata – machine- and human-readable descriptions that make content searchable and discoverable, and which have now migrated from the card catalogue to the database.
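To make the idea concrete, here is a minimal sketch in Python of what a searchable metadata record might look like. It is an illustration only, not the project's actual schema. The `BookRecord` fields, the `search` helper and the sample entries are all assumptions invented for this example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class BookRecord:
    """A hypothetical, minimal metadata record for one digitised title."""
    title: str
    publisher: str
    place: str
    year: int
    subjects: Tuple[str, ...]


def search(records, *, subject=None, year_from=None, year_to=None):
    """Return the records that satisfy every filter that was supplied."""
    hits = []
    for rec in records:
        if subject is not None and subject not in rec.subjects:
            continue
        if year_from is not None and rec.year < year_from:
            continue
        if year_to is not None and rec.year > year_to:
            continue
        hits.append(rec)
    return hits


# Invented placeholder records, not drawn from any real catalogue.
RECORDS = [
    BookRecord("Placeholder Title A", "Example Press", "Calcutta", 1860, ("cookery",)),
    BookRecord("Placeholder Title B", "Sample Printers", "Calcutta", 1872,
               ("chemistry", "education")),
    BookRecord("Placeholder Title C", "Example Press", "Calcutta", 1881,
               ("cookery", "household")),
]

# Find cookery titles printed after the 1867 watershed.
cookery_after_1867 = search(RECORDS, subject="cookery", year_from=1868)
print([rec.title for rec in cookery_after_1867])
```

The point of the sketch is simply that once each title carries structured fields, a question such as "cookery books after 1867" becomes a filter rather than a trawl through the stacks.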

A unique feature of this particular project, however, is that it already has pages of copperplate metadata at its disposal, which have for years been quietly languishing in leather-bound ledgers in the basement of the British Library. These are the handwritten Quarterly Lists, a colonial legacy that enumerated all the books published in the region, along with the identities and locations of their publishers – a wealth of data that this digital initiative will help unlock.

This will form the basis of a Bengali book trade index (similar lists exist for the Scottish and the British publishing industries), and these forgotten names and long-defunct establishments will then be plotted on three temporally discrete maps from the period (using the same technologies that underpin applications like Google Earth) to demonstrate how the Bengali printing industry evolved in the nineteenth century.

Printing and publishing firms in the time period often clustered in the same areas – in the case of Calcutta, the famed Battala area in the north by the river encompassing a warren of print-houses, block-makers and binderies, their proximity to water being essential for transport and paper-making.

Plotting the growth of the industry on these maps, and cross-referencing them with information such as revenue collection and civil administration wards and proximity to educational and governmental institutions, immediately provides context for this information. The digital tools that will be created by the project will allow researchers to arrange and interrogate the data in ways that were not previously possible in non-digital environments.
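As a rough illustration of this kind of cross-referencing, the Python sketch below sorts publishers into three period buckets and counts the firms in each administrative ward. Everything in it is an assumption made for the example: the period boundaries, the ward labels and the publisher entries are invented placeholders, and this is not the project's own tooling.

```python
from collections import Counter

# Hypothetical boundaries for three maps; the real periods are not stated here.
PERIODS = [("early", 1800, 1833), ("middle", 1834, 1866), ("late", 1867, 1899)]

# Invented placeholder entries: (publisher, ward, year first recorded).
PUBLISHERS = [
    ("Example Press", "Ward A", 1825),
    ("Sample Printers", "Ward A", 1850),
    ("Placeholder Works", "Ward B", 1851),
    ("Demo Press", "Ward A", 1880),
    ("Test Printers", "Ward A", 1885),
]


def period_for(year):
    """Return the name of the period containing `year`, or None if outside all."""
    for name, start, end in PERIODS:
        if start <= year <= end:
            return name
    return None


def firms_per_ward_by_period(publishers):
    """Count firms per ward separately for each period."""
    counts = {name: Counter() for name, _, _ in PERIODS}
    for _publisher, ward, year in publishers:
        period = period_for(year)
        if period is not None:
            counts[period][ward] += 1
    return counts


counts = firms_per_ward_by_period(PUBLISHERS)
for name, _, _ in PERIODS:
    print(name, dict(counts[name]))
```

Each period's tally could then feed one map layer, with the ward counts set beside other ward-level information such as revenue records.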

That the imperial condition shaped the contours of the incipient Bengali industry is significant, for its nature was determined by interventions of colonial power that manifested themselves through legislation, audience and industrial expertise. Recent scholarship in the field offers conflicting accounts of the impact of imperial print culture on the region.

Although it is undeniable that the technology introduced into the region irrevocably changed how text was received, there are varying opinions on the degree to which contemporary print culture was bound to the colonial agenda. What this project will offer is an organised dataset that can be manipulated or sliced into at any point by academics working in any related discipline, whether textual scholars, social historians or literary critics. It will also offer definitive evidence of the reciprocal nature of the knowledge exchange and dissemination between India and the empire.

The pilot of "Two Centuries of Indian Print" will run through 2017-'18, and aims to digitise all Bengali printed titles held by the British Library and the libraries of the School of Oriental and African Studies. The project is helmed by Nur Sobers-Khan and Catherine Eagleton from the British Library; the chief Indian partners are the two authors of this article.

Abhijit Gupta teaches at Jadavpur University, Kolkata. Padmini Ray Murray teaches at the Srishti Institute of Art, Design and Technology, Bengaluru.