The crisis caused by the second wave of the Covid-19 pandemic has led to a revival of interest in mathematical models for the progression of the pandemic. Several modellers have made public predictions for the size and duration of the second wave. These modellers suggest that they can predict the future, and their models carry the imprimatur of science. So it is natural that people are very interested in what they have to say.

However, we would like to explain why such predictions are often misleading. Mathematical models are used successfully in many branches of science. But the models that have been constructed for the pandemic suffer from serious theoretical problems and have a poor empirical record of predicting its course over the past year.

Modellers in India not only uniformly failed to anticipate the emergence of the second wave; there is also no evidence that they now have the tools to understand its future trajectory. In some cases, not only do these models represent substantively poor science, they are also promoted by modellers who have chosen to discard essential steps of the scientific process.

Compartmental models

The most commonly used models are called “compartmental models” and are constructed by dividing the population into groups. For instance, in a very simple model, one might divide the population into a group that is susceptible, a group that is infected and a third that has recovered from the infection. It is, of course, possible to embellish this model by introducing further categories.

The modellers then postulate some rules for the rate of change of the number of people in each group. For instance, the “infected” group might grow at a rate controlled by the number of interactions between those who are infected and those who are susceptible.
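As a point of reference, the simplest textbook version of such a model, the “SIR” model, expresses rules of this kind as a small set of equations; the rate parameters β and γ below are generic illustrations, not values drawn from any specific model discussed in this article:

```latex
\frac{dS}{dt} = -\frac{\beta S I}{N}, \qquad
\frac{dI}{dt} = \frac{\beta S I}{N} - \gamma I, \qquad
\frac{dR}{dt} = \gamma I
```

Here S, I and R are the numbers of susceptible, infected and recovered people, N = S + I + R is the total population, β sets the rate of infectious contact and γ the rate of recovery.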

Simplifications are an important aspect of science, where we invariably seek to isolate the “essential aspects” of the dynamics of a system and discard the inessential details. However, the simplifications made by modellers are too coarse to capture the essential dynamics of a phenomenon as complex as the pandemic.

Most importantly, Indian society cannot be accurately approximated as a set of a few homogeneous units. Even the few details that modellers focus on reflect their own class biases. While some models attempt to stratify the population by age, most models neglect income inequalities. But in a country like India, unequal access to healthcare and the fact that most people do not have adequate resources to practise “social distancing” are not “irrelevant details” that can be ignored in the “first approximation.”

This indifference to social realities is what led modellers to recommend and then subsequently “welcome” the Indian government’s ill-planned lockdown last year.

Even biological phenomena such as mutations in the virus, which may be important for the second wave, are not accounted for in most models.

These constitute basic flaws in the paradigm of the models used for the pandemic. In practice, these models are beset by additional problems.

State of the art

First, modellers sometimes use words like “state of the art” to describe their models. But this is a disingenuous technique used to make their work sound more impressive than it is. In reality, the equations used in such models are just a simple set of “ordinary differential equations”. Students are taught how to solve such equations on a computer in any undergraduate physics or mathematics course. So, from a technical viewpoint, models that receive publicity as being “advanced” are often no more advanced than an undergraduate science project.
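To make the point concrete, here is a short sketch, written purely for illustration and not taken from any modeller's actual code, that integrates the simple SIR equations above with a standard scientific Python routine; the population size and rate values are arbitrary:

```python
# Illustrative sketch only: solving the textbook SIR equations numerically.
# All numbers below are arbitrary and chosen purely for demonstration.
import numpy as np
from scipy.integrate import odeint

def sir(y, t, beta, gamma, N):
    S, I, R = y
    new_infections = beta * S * I / N   # susceptible people becoming infected
    recoveries = gamma * I              # infected people recovering
    return [-new_infections, new_infections - recoveries, recoveries]

N = 1_000_000                           # illustrative population
y0 = [N - 100, 100, 0]                  # start with 100 infections
t = np.linspace(0, 180, 181)            # simulate 180 days
beta, gamma = 0.3, 0.1                  # arbitrary rate parameters

S, I, R = odeint(sir, y0, t, args=(beta, gamma, N)).T
print(f"Peak infections: {I.max():.0f} on day {t[I.argmax()]:.0f}")
```

A few lines of standard library code suffice; there is nothing “state of the art” about the mathematics itself.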

Second, modellers usually do not have direct data even to fix the parameters within their simplified descriptions. So they try to “back-calculate” these parameters from the available data on the spread of the pandemic. But the available data is very sparse. Therefore, modellers who boast of accounting for many effects and introduce many parameters into their model do not have enough data to uniquely fix their parameters. They are forced to make subjective choices for their parameters, and so their predictions are also subjective.
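The difficulty can be seen even in the illustrative SIR notation introduced above. Early in an outbreak, when almost everyone is still susceptible (S ≈ N), the equations reduce to

```latex
\frac{dI}{dt} \approx (\beta - \gamma)\, I
\quad\Longrightarrow\quad
I(t) \approx I(0)\, e^{(\beta - \gamma)\, t}
```

so the early case curve fixes only the difference β − γ, not β and γ separately. Very different parameter choices reproduce the same initial data while predicting very different peaks, and models with many more parameters face a correspondingly worse ambiguity.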

Third, the data available in India has serious problems. The true number of cases is difficult to know since so many infections are never captured by a test. But it is clear that even deaths have been undercounted. It is almost impossible for modellers to correct this flawed data, and so these flaws propagate through to their conclusions.

All this means that while mathematical models might be useful as tools to gain some qualitative intuition, neither the theory nor the available data is sophisticated enough to make any detailed predictions about the pandemic. Moreover, the “policy prescriptions” that emerge from such models involve subjective choices. So they are influenced by the social and class biases of the modellers and should not be treated as “objective” or “neutral”.

Instead of coming to terms with these facts, some members of the modelling community have instead chosen to discard basic scientific norms.

A time-tested element of the scientific method is that if empirical data repeatedly contradicts a theory or a model, the model is discarded. But modellers have been left unfazed by their failure to predict the course of the pandemic over the past year.

One example is the so-called “supermodel” developed under the aegis of the Indian government. This model divided the first wave of the pandemic into six phases and introduced four parameters for each phase, leading to a total of 24 parameters. By adjusting the values of these parameters, the modellers were able to fit the existing data. But it is well known that with enough parameters, one can fit any data set; this provides no indication of the strength of the model.
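The phenomenon is easy to demonstrate with a toy example that has nothing to do with the “supermodel” itself: a polynomial with as many free coefficients as there are data points fits those points essentially exactly, yet its “prediction” beyond the data is meaningless:

```python
# Toy illustration of overfitting, unrelated to any actual pandemic model:
# with as many free parameters as data points, any data set can be fit,
# but the fit says nothing about the future.
import numpy as np

rng = np.random.default_rng(0)
days = np.arange(6.0)                        # six observed "days"
cases = 100 * np.exp(0.3 * days)             # underlying trend
cases += rng.normal(0, 10, size=days.size)   # noisy observations

coeffs = np.polyfit(days, cases, deg=5)      # six free coefficients, one per point
fit = np.polyval(coeffs, days)
print("Largest error on observed days:", np.abs(fit - cases).max())  # essentially zero

# Extrapolating the same polynomial to day 10 gives a number dominated by
# the noise, not by the underlying trend.
print("Polynomial 'prediction' for day 10:", np.polyval(coeffs, 10))
print("Underlying trend at day 10:        ", 100 * np.exp(0.3 * 10))
```

A good fit to past data, by itself, is therefore no evidence that a model can anticipate what comes next.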

Inaccurate predictions

Not surprisingly, the predictions made by these modellers were completely wrong. In one paper, the modellers boasted that “India is perhaps the only major economy that has managed to get the strategy right” and that “the decisions taken have led to the avoidance of multiple peaks”. This was what the government wanted to hear, but the second wave shows that it is far from the truth. In spite of this clear falsification of the model, Manindra Agrawal, one of the authors of this model, has continued to make detailed predictions on social media. These successive predictions continue to meet a poor fate.

Another important convention in science is that before results are communicated to the broader public, they are first written up and shown to other scientists. Here, we are not referring to conventional “peer review” which may take unacceptably long in a pandemic. We are simply referring to the practice of putting out scientific preprints that contain enough details for other scientists to reconstruct and check the model’s predictions.

In this light, consider the widely publicised prediction, made by Gautam Menon of Ashoka University in an interview with Karan Thapar, that the second wave “should peak in mid-May with between 500,000 and 600,000 cases per day”. Menon did not specify which model he used to make these predictions. We also encountered several media references to “calculations by … Menon and his team at Ashoka University” regarding the course of the pandemic.

Although Menon has been active on social media and has granted numerous media interviews, we were unable to locate any academic papers or preprints authored by Menon where we could read about the details of such calculations.

Perhaps Menon was referring to a model called “Indscisim”, developed by a small group of scientists from different institutions. The group behind Indscisim has also announced several results to the media without writing up a preprint describing their model. The basic structure of the model is known: Indscisim is a compartmental model with about 14 parameters. The model’s website itself shows how this large number of parameters can be adjusted to obtain a wide range of predictions. However, these details are insufficient to reconstruct Menon’s projections since we do not know what data was fed to the model, or the procedure that was used to fit its parameters.

To see why this is important, one only has to revisit predictions released by this group for the course of the pandemic in Chennai last July in the form of a “slide presentation”. Although the slides did not contain many details, it was immediately obvious that the calculations contained a serious and basic error: the population of Chennai had been taken to be 4.6 million, whereas its true population is significantly larger.

Evidence of this error can still be seen on the model’s website (see slides labelled v1.3a) where the incorrect population is mentioned in the main text, and a correction is provided in brackets. In this case as well, the group rushed to issue a press release before subjecting its results to scrutiny, and this press release with incorrect figures continues to be available for public view.

In the absence of transparent details about Menon’s latest predictions, it is impossible to rule out the possibility that they involve a similar error.

Many people look to scientists as a source of reliable information. But the truth is that mathematical modellers are just guessing when they make predictions about the future. Any other informed person can examine the available data and trends and make a guess that is likely to be as good or bad. Those modellers who have taken on the role of “scientific soothsayers” in India are not only misleading people, they are damaging the credibility of science and undermining the good work that other scientists are doing to combat the pandemic.

Alok Laddha is a physicist at the Chennai Mathematical Institute and Suvrat Raju is a physicist at the International Centre for Theoretical Sciences (Bengaluru). The authors would like to emphasise to readers that they are not modellers themselves, and this article is written from the point of view of concerned scientists and citizens who are outside the field.