Cancer Care

What Netflix can teach us about treating cancer

Cancer researchers dream of offering personalised treatments to patients. Can they get there using the same math that drives Netflix recommendations?

Two years ago, former President Barack Obama announced the Precision Medicine initiative in his State of the Union Address. The initiative aspired to a “new era of medicine” where disease treatments could be specifically tailored to each patient’s genetic code.

This resonated soundly in cancer medicine. Patients can already manage their cancer with therapies that target the specific genes that are altered in their particular tumor. For example, women with a type of breast cancer caused by the amplification of the gene HER2 are often treated with a therapeutic called Herceptin. Because these targeted therapeutics are specific to cancer cells, they tend to have fewer side effects than traditional cancer treatments with chemotherapy or radiation.

However, such treatments are not available for most cancer patients. In many cancers, the specific genetic alterations that are responsible for a cancer remain unknown. To create individualized cancer treatments, we must know more about the functional genetic alterations.

With data on cancer genetics growing rapidly, mathematics and statistics can now help unlock the hidden patterns in this data to find the genes that are responsible for an individual’s cancer. With this knowledge, physicians can select appropriate treatments that block the action of these genes to personalize therapies for individual patients. My research aims to improve precision medicine in cancer – by building on the same methods that have been used to find patterns in Netflix movie ratings.

Sifting through the data

Today, there is unprecedented public access to cancer genetics data. These data come from generous patients who donate their tumor samples for research. Scientists then apply sequencing technologies to measure the mutations and activity in each of the 20,000 genes in the human genome.

All these data are a direct result of the Human Genome Project in 2003. That project determined the sequence for all the genes that make up healthy human DNA. Since the completion of that project, the cost of sequencing the human genome has more than halved every year, surpassing the growth of computing power described in Moore’s Law. This cost reduction enables researchers to collect unprecedented genetics data from cancer patients.

Most scientific studies on cancer genetics performed worldwide release their data to a centralised, public database provided by the US National Institutes of Health National Library of Medicine. The NIH National Cancer Institute and National Human Genome Research Institute have also freely released genetic data from over 11,000 tumors in 33 cancer types through a project called The Cancer Genome Atlas.

Every biological function – from extracting energy from food to healing a wound – results from activity in different combinations of genes. Cancers hijack the genes that enable people to grow to adulthood and that protect the body from the immune system. Researchers dub these the “hallmarks of cancer.” This so-called gene dysregulation enables a tumor to grow uncontrollably and form metastases in distant organs from the original tumor site.

Researchers are actively using these public data to find the set of gene alterations that are responsible for each tumor type. But this problem is not as simple as identifying a single dysregulated gene in each tumor. Hundreds, if not thousands, of the 20,000 genes in the human genome are dysregulated in cancer. The group of dysregulated genes varies in each patient’s tumor, with smaller sets of commonly reused genes enabling each cancer hallmark.

Precision medicine relies on finding the smaller groups of dysregulated genes that are responsible for biological function in each patient’s tumor. But, genes may have multiple biological functions in different contexts. Therefore, researchers must uncover a set of “overlapping” genes that have common functions in a set of cancer patients.

Linking gene status to function requires complex mathematics and immense computing power. This knowledge is essential to predict the outcome of therapies that would block the function of these genes. So, how can we uncover those overlapping features to predict individual outcomes for patients?

What Netflix can teach us

Fortunately for us, this problem has already been solved in computer science. The answer is a class of techniques called “matrix factorization” – and you’ve likely already interacted with these techniques in your everyday life.

In 2009, Netflix held a challenge to personalise movie ratings for each Netflix user. On Netflix, each user has a distinct set of ratings of different movies. While two users may have similar tastes in movies, they may vary wildly in specific genres. Therefore, you cannot rely on comparing ratings from similar users.

Instead, a matrix factorization algorithm finds movies with similar ratings among a smaller group of users. The group of users will vary for each movie. The computer associates each user with a group of movies to a different extent, based upon their individual tastes. The relationships among users are referred to as “patterns.” These patterns are learned from the data, and may find common rankings unforeseen by movie genre alone – for example, users may share a preference for a particular director or actor.
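
To make the idea concrete, here is a minimal sketch of one matrix factorization technique, nonnegative matrix factorization with multiplicative updates, applied to a tiny made-up ratings table. It is an illustration only, not the method used in the Netflix challenge: it assumes every user has rated every movie, and the ratings, the choice of two patterns and all function names are hypothetical.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def squared_error(a, b):
    """Sum of squared differences between two equally sized matrices."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def factorize(ratings, n_patterns, n_iter=500, seed=0):
    """Approximate ratings (users x movies) by W times H, where
    W is users x patterns and H is patterns x movies, all nonnegative."""
    rng = random.Random(seed)
    n_users, n_movies = len(ratings), len(ratings[0])
    w = [[rng.random() + 0.1 for _ in range(n_patterns)] for _ in range(n_users)]
    h = [[rng.random() + 0.1 for _ in range(n_movies)] for _ in range(n_patterns)]
    eps = 1e-9  # guards against division by zero
    for _ in range(n_iter):
        # Update H elementwise: H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num = matmul(wt, ratings)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_movies)]
             for k in range(n_patterns)]
        # Update W elementwise: W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num = matmul(ratings, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_patterns)]
             for i in range(n_users)]
    return w, h

# Hypothetical ratings: rows are users, columns are movies.
# Users 0-1 favor the first two movies, users 2-3 the last two.
ratings = [
    [5, 4, 1, 1],
    [4, 5, 1, 1],
    [1, 1, 5, 4],
    [1, 1, 4, 5],
]

user_weights, movie_patterns = factorize(ratings, n_patterns=2)
approx = matmul(user_weights, movie_patterns)
print("reconstruction error:", squared_error(ratings, approx))
```

Each row of `movie_patterns` is one learned pattern across movies, and each row of `user_weights` says how strongly a user is associated with each pattern. Nothing in the code is told which movies belong together; the grouping comes out of the ratings alone.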

Genevieve Stein-O'Brien, CC BY

The same process can work in cancer. In this case, the measurements of gene dysregulation are analogous to movie ratings, movie genres to biological function and users to patients’ tumors. The computer searches across patient tumors to find patterns in gene dysregulation that cause the malignant biological function in each tumor.

From movies to tumors

The analogy between movie ratings and cancer genetics breaks down in the details. Unless they are minors, Netflix users are not constrained in the movies they watch. But, our bodies instead prefer to minimise the number of genes used for any single function. There are also substantial redundancies between genes. To protect a cell, one gene may easily substitute for another to serve a common function. Gene functions in cancer are even more complex. Tumors are also highly complex and rapidly evolving, depending upon random interactions between the cancer cells and the adjacent healthy organ.

To account for these complexities, we have developed a matrix factorization approach called Coordinated Gene Activity in Pattern Sets – or CoGAPS for short. Our algorithm accounts for biology’s minimalism by incorporating as few genes as possible into the patterns for each tumor.

Different genes can also substitute for one another, each serving a similar function in a different context. To account for this, CoGAPS simultaneously estimates a statistic for the so-called “patterns” of gene function. This allows us to compute the probability of each gene being used in each biological function in a tumor.
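
The sketch below illustrates the two ideas in the preceding paragraphs on a toy example. It is not the CoGAPS algorithm. As a stand-in, it assumes a simple nonnegative factorization in which a penalty term pushes gene weights toward zero, so each pattern uses few genes, and it then rescales each gene's weights into relative shares across patterns. Those shares are a rough analogue of, not a substitute for, the probabilities CoGAPS estimates. The expression values, the penalty size and all names are hypothetical.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def sparse_factorize(data, n_patterns, penalty=0.5, n_iter=300, seed=1):
    """Approximate data (genes x tumors) by A times P, where
    A is genes x patterns and P is patterns x tumors, all nonnegative.
    The penalty in the update for A shrinks gene weights toward zero."""
    rng = random.Random(seed)
    n_genes, n_tumors = len(data), len(data[0])
    a = [[rng.random() + 0.1 for _ in range(n_patterns)] for _ in range(n_genes)]
    p = [[rng.random() + 0.1 for _ in range(n_tumors)] for _ in range(n_patterns)]
    eps = 1e-9  # guards against division by zero
    for _ in range(n_iter):
        # Update the patterns across tumors.
        at = transpose(a)
        num = matmul(at, data)
        den = matmul(matmul(at, a), p)
        p = [[p[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_tumors)]
             for k in range(n_patterns)]
        # Rescale each pattern to sum to 1 and move the scale into A,
        # so the penalty on A cannot be dodged by inflating P.
        for k in range(n_patterns):
            scale = sum(p[k]) or 1.0
            p[k] = [v / scale for v in p[k]]
            for i in range(n_genes):
                a[i][k] *= scale
        # Update the gene weights; the penalty enlarges the denominator.
        pt = transpose(p)
        num = matmul(data, pt)
        den = matmul(a, matmul(p, pt))
        a = [[a[i][k] * num[i][k] / (den[i][k] + penalty + eps)
              for k in range(n_patterns)] for i in range(n_genes)]
    return a, p

# Hypothetical gene activity: rows are genes, columns are tumors.
# Genes 0-2 are active in tumors 0-1, genes 3-4 in tumors 2-3,
# and gene 5 is active everywhere, as a gene with two functions might be.
expression = [
    [9, 8, 1, 0],
    [8, 9, 0, 1],
    [7, 9, 1, 1],
    [1, 0, 9, 8],
    [0, 1, 8, 9],
    [4, 4, 4, 4],
]

gene_weights, tumor_patterns = sparse_factorize(expression, n_patterns=2)

# Relative share of each gene's weight that falls in each pattern.
gene_shares = []
for row in gene_weights:
    total = sum(row) or 1.0
    gene_shares.append([v / total for v in row])

for gene, shares in enumerate(gene_shares):
    print("gene", gene, [round(s, 2) for s in shares])
```

Raising `penalty` should drive more gene weights toward zero at the cost of a looser fit to the data, which is the trade-off behind using as few genes as possible in each pattern.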

For example, many patients take a targeted therapeutic called cetuximab to prolong survival in colorectal, pancreatic, lung and oral cancers. Our recent work found that these patterns can distinguish gene function in cancer cells that respond to cetuximab from those that do not.

The future

Unfortunately, cancer therapies that target genes usually cannot cure a patient’s disease. They can only delay progression for a few years. Most patients then relapse, with tumors that are no longer responsive to the treatment.

Our own recent work found that the patterns that distinguish gene function in cells that are responsive to cetuximab include the very genes that give rise to resistance. Emerging immunotherapies are promising and appear to cure some cancers. Yet, far too often, patients with these treatments also relapse. New data that track cancer genetics after treatment are essential to determine why patients no longer respond.

Along with these data, cancer biology also requires a new generation of scientists who can bridge mathematics and statistics to determine the genetic changes occurring over time in drug resistance. In other fields of mathematics, computer programs are able to forecast long-term outcomes. These models are used commonly in weather prediction and investment strategies.

In these fields and my own previous research, we have found that updates to the models from large datasets – such as satellite data in the case of weather – improve long-term forecasts. We have all seen the effect of these updates, with weather predictions improving the closer that we are to a storm.

Just as tools from computer science can be adapted to both movie recommendations and cancer, the future generation of computational scientists will adopt prediction tools from an array of fields for precision medicine. Ultimately, with these computational tools, we hope to predict tumors’ response to therapy as commonly as we predict the weather, and perhaps more reliably.

Elana Fertig, Assistant Professor of Oncology Biostatistics and Bioinformatics, Johns Hopkins University.

This article first appeared on The Conversation.

Sponsored Content by BASF

India’s urban water crisis calls for an integrated approach

We need solutions that address different aspects of the water eco-system and involve the collective participation of citizens and other stakeholders.

According to a UN report, around 1.2 billion people, or almost one fifth of the world’s population, live in areas where water is physically scarce and another 1.6 billion people, or nearly one quarter of the world’s population, face economic water shortage. They lack basic access to water. The criticality of the water situation across the world has in fact given rise to speculations over water wars becoming a distinct possibility in the future. In India the problem is compounded, given the rising population and urbanization. The Asian Development Bank has forecast that by 2030, India will have a water deficit of 50%.

Water challenges in urban India

For urban India, the situation is critical. In 2015, about 377 million Indians lived in urban areas and by 2030, the urban population is expected to rise to 590 million. Already, according to the National Sample Survey, only 47% of urban households have individual water connections and about 40% to 50% of water is reportedly lost in distribution systems due to various reasons. Further, as per the 2011 census, only 32.7% of urban Indian households are connected to a piped sewerage system.

Any comprehensive solution to address the water problem in urban India needs to take into account the specific challenges around water management and distribution:

Pressure on water sources: Rising demand for water means rising pressure on water sources, especially in cities. In a city like Mumbai, for example, 3,750 Million Litres per Day (MLD) of water, including water for commercial and industrial use, is available, whereas 4,500 MLD is needed. The primary sources of water for cities like Mumbai are lakes created by dams across rivers near the city. Distributing the available water means providing 386,971 connections to the city’s roughly 13 million residents. When distribution becomes challenging, the workaround is to tap ground water. According to a study by the Centre for Science and Environment, 48% of urban water supply in India comes from ground water. Ground water exploitation for commercial and domestic use in most cities is leading to a reduction in ground water levels.

Distribution and water loss issues: Distribution challenges, such as water loss due to theft, pilferage, leaky pipes and faulty meter readings, result in unequal and unregulated distribution of water. In New Delhi, for example, water distribution loss was reported to be about 40% as per a study. In Mumbai, where most residents get only 2-5 hours of water supply per day, the non-revenue water loss is about 27% of the overall water supply. This strains the municipal body’s budget and impacts the improvement of distribution infrastructure. Factors such as difficult terrain and legal issues over buildings also affect water supply to many parts. According to a study, only 5% of piped water reaches slum areas in 42 Indian cities, including New Delhi. A 2011 study also found that 95% of households in slum areas in Mumbai’s Kaula Bunder district, in some seasons, use less than the WHO-recommended minimum of 50 litres per capita per day.

Water pollution and contamination: In India, almost 400,000 children die every year of diarrhea, primarily due to contaminated water. According to a 2017 report, 630 million people in the South East Asian countries, including India, use a faeces-contaminated drinking water source, becoming susceptible to a range of diseases. Industrial waste is also a major cause of water contamination, particularly antibiotic ingredients released into rivers and soils by pharma companies. A Guardian report talks about pollution from drug companies, particularly those in India and China, resulting in the creation of drug-resistant superbugs. The report cites a study which indicates that by 2050, the total death toll worldwide due to infection by drug-resistant bacteria could reach 10 million people.

A holistic approach to tackling water challenges

Addressing these challenges and improving access to clean water for all needs a combination of short-term and medium-term solutions. It also means involving the community and various stakeholders in implementing the solutions. This is the crux of the recommendations put forth by BASF.

The proposed solutions, based on a study of water issues in cities such as Mumbai, take into account different aspects of water management and distribution. Backed by a close understanding of the cost implications, they can make a difference in tackling urban water challenges. These solutions include:

Recycling and harvesting: Raw sewage water which is dumped into oceans damages the coastal eco-system. Instead, this could be used as a cheaper alternative to fresh water for industrial purposes. According to a 2011 World Bank report, 13% of total freshwater withdrawal in India is for industrial use. What’s more, the industrial demand for water is expected to grow at a rate of 4.2% per year till 2025. Much of this demand can be met by recycling and treating sewage water. In Mumbai for example, 3000 MLD of sewage water is released, almost 80% of fresh water availability. This can be purified and utilised for industrial needs. An example of recycled sewage water being used for industrial purpose is the 30 MLD waste water treatment facility at Gandhinagar and Anjar in Gujarat set up by Welspun India Ltd.

Another example is the proposal by Navi Mumbai Municipal Corporation (NMMC) to recycle and reclaim sewage water treated at its existing facilities to meet the secondary purposes of both industries and residential complexes. In fact, residential complexes can similarly recycle and re-use their waste water for secondary purposes such as gardening.

Also, alternative rain water harvesting methods such as harvesting rain water from concrete surfaces using porous concrete can be used to supplement roof-top rain water harvesting, to help replenish ground water.

Community initiatives to supplement regular water supply: Initiatives such as community water storage and decentralised treatment facilities, including elevated water towers or reservoirs and water ATMs, based on a realistic understanding of the costs involved, can help support the city’s water distribution. Water towers or elevated reservoirs with onsite filters can also help optimise the space available for water distribution in congested cities. Water ATMs, which are automated water dispensing units that can be accessed with a smart card or an app, can ensure metered supply of safe water.

Testing and purification: With water contamination being a big challenge, the adoption of affordable and reliable multi-household water filter systems that are electricity-free and easy to use can, to some extent, improve access to safe drinking water at a domestic level. Also, the use of household water testing kits and the installation of water quality sensors on pipes that send out alerts on water contamination can create awareness of water contamination and drive suitable preventive steps.

Public awareness and use of technology: Public awareness campaigns, tax incentives for water conservation and the use of technology interfaces can also go a long way in addressing the water problem. For example, measures such as water credits can be introduced with tax benefits as incentives for efficient use and recycling of water. Similarly, government water apps, like that of the Municipal Corporation of Greater Mumbai, can be used to spread tips on water saving, report leakage or send updates on water quality.

Collaborative approach: Finally, a collaborative approach like the adoption of a public-private partnership model for water projects can help. There are already examples of best practices here. For example, in the Netherlands, water companies are incorporated as private companies, with the local and national governments being majority shareholders. Involving citizens through social business models for decentralised water supply, treatment or storage installations like water ATMs, as well as appointing water guardians who can report on various aspects of water supply and usage, can help in efficient water management. Grass-roots organizations could be enlisted as partners in programmes to spread awareness of water safety and conservation.

For BASF, the proposed solutions are an extension of their close engagement with developing water management and water treatment solutions. The products developed specially for waste and drinking water treatment, such as Zetag® ULTRA and Magnafloc® LT, focus on ensuring sustainability, efficiency and cost effectiveness in the water and sludge treatment process.

BASF is also associated with operations of Reliance Industries’ desalination plant at Jamnagar in Gujarat. The thermal plant is designed to deliver up to 170,000 cubic meters of processed water per day. The use of inge® ultrafiltration technologies allows a continuous delivery of pre-filtered water at a consistent high-quality level, while the dosage of the Sokalan® PM 15 I protects the desalination plant from scaling. This combination of BASF’s expertise minimises the energy footprint of the plant and secures water supply independent of the seasonal fluctuations. To know more about BASF’s range of sustainable solutions and innovative chemical products for the water industry, see here.

This article was produced by the Scroll marketing team on behalf of BASF and not by the Scroll editorial team.