Can Machine Learning Calculate Unreported COVID-19 Cases

How can unidentified COVID-19 cases be tracked?

Researchers and provider organisations have increasingly embraced artificial intelligence (AI) and machine learning (ML) tools to reduce and track the spread of COVID-19 and to improve their surveillance efforts.

Big data analytics systems have helped health experts stay ahead of the pandemic, from predicting patient outcomes to anticipating future hotspots, resulting in more efficient care delivery.

However, a healthcare organisation's pandemic preparedness is only as good as the data available to it. Although the industry is well aware of its data issues, the COVID-19 pandemic has brought a host of unique challenges to the forefront of care delivery.

The nature of SARS-CoV-2 has led to significant gaps and inconsistencies in COVID-19 data, leaving officials uncertain of the effectiveness of public health interventions.

"Asymptomatic infections are a common phenomenon in the spread of coronavirus," said Lucy Li, PhD, a data scientist at the Chan Zuckerberg Biohub. "And it's very important to understand that phenomenon because depending on how many asymptomatic infections there are, public health interventions might be different."

Researchers at the Chan Zuckerberg Biohub are working to address this problem. Using machine learning and cloud computing, Li estimated the number of undetected infections at 12 locations across Asia, Europe, and the U.S. over the course of the pandemic. The results showed that a large share of infections remained undetected in these parts of the world, with the rate of unidentified cases exceeding 90% in Shanghai.

Additionally, when the virus first arrived in these 12 locations, more than 98% of cases went unreported during the first few weeks of the outbreak. This indicates that the pandemic was already well underway by the time intensive testing began.

Such findings have crucial implications for public health policy and provider organisations, Li noted.

"For disease outbreaks where you can identify every single infection, rapid testing and a tiny amount of contact tracing is enough to get the epidemic under control," stated Li. "But for coronavirus, there are so many asymptomatic cases out there, and testing alone will not help control the pandemic."

"It is because usually when you do testing, you are testing only symptomatic patients, who are a subset of the total number of infections out there," explained Li. "You're missing a lot of people who are spreading the infection without their knowledge, hence they are not quarantined. Being able to get a sense of what that number might be is helpful for allocating resources."

Li's research was backed by the AWS Diagnostic Development Initiative, a global effort to stimulate diagnostic research and innovation during the coronavirus pandemic and to mitigate future disease outbreaks.

The data Li uses are viral genomes, the genetic material of the virus. She elaborated, "As the viral genomes spread through the population, they accumulate mutations. These mutations are generally not good or bad; they're just changes in the genome." She added, "Every time the virus infects a new individual, it could accumulate new mutations. So, if we know how fast the virus mutates, we can infer how many missing transmission links there were in between the observed genomes."
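The reasoning Li describes can be illustrated with a back-of-the-envelope calculation. The sketch below is not Li's model; it is a minimal illustration that assumes a constant mutation rate and a fixed time between successive infections. The substitution rate and generation time are illustrative assumptions rather than figures from her study, and the function names are hypothetical.

```python
# Toy "molecular clock" estimate: how many transmission events could
# separate two sampled viral genomes that differ by a given number of mutations?
# All parameter values below are illustrative assumptions.

GENOME_LENGTH = 29_903            # nucleotides in the SARS-CoV-2 reference genome
SUBS_PER_SITE_PER_YEAR = 1e-3     # assumed substitution rate (illustrative)
GENERATION_TIME_DAYS = 5.0        # assumed time between successive infections


def mutations_per_day(genome_length=GENOME_LENGTH,
                      rate=SUBS_PER_SITE_PER_YEAR):
    """Expected number of new mutations per genome per day."""
    return genome_length * rate / 365.0


def estimated_transmissions(n_mutations,
                            generation_time_days=GENERATION_TIME_DAYS):
    """Rough number of transmission events implied by n_mutations."""
    elapsed_days = n_mutations / mutations_per_day()
    return elapsed_days / generation_time_days


def missing_links(n_mutations):
    """Transmission events between two sampled genomes that were not observed.

    One transmission is assumed to connect the two sampled cases directly,
    so anything beyond that counts as an unobserved intermediate link.
    """
    return max(0.0, estimated_transmissions(n_mutations) - 1.0)


if __name__ == "__main__":
    for d in (1, 2, 4, 8):
        print(f"{d} mutations -> about {missing_links(d):.1f} unobserved links")
```

Under these assumptions the virus gains well under one mutation per transmission on average, so even a handful of differences between two sampled genomes points to several intermediate infections that were never sequenced.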

Li said, "Many different scenarios could explain what we see in the viral genomes. I have to leverage machine learning and cloud computing to test all of those hypotheses and to see which one can explain the observed changes in viral genomes."
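To make the idea of testing competing scenarios concrete, here is a minimal sketch that scores a set of candidate hypotheses against an observation and keeps the one that explains it best. It assumes, purely for illustration, that the number of mutations between two genomes is Poisson distributed with a mean proportional to the number of transmission events between them. The value of `mutations_per_transmission` and the function names are hypothetical, and Li's actual analysis is far richer than this.

```python
import math


def poisson_log_likelihood(observed, expected):
    """Log-probability of seeing `observed` events when `expected` are predicted."""
    return observed * math.log(expected) - expected - math.lgamma(observed + 1)


def best_scenario(observed_mutations, candidate_transmissions,
                  mutations_per_transmission):
    """Score each candidate number of transmissions; return the best and all scores."""
    scores = {
        k: poisson_log_likelihood(observed_mutations,
                                  k * mutations_per_transmission)
        for k in candidate_transmissions
    }
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Hypothetical input: 4 mutations observed, 0.4 expected per transmission.
    best, scores = best_scenario(
        observed_mutations=4,
        candidate_transmissions=range(1, 31),
        mutations_per_transmission=0.4,
    )
    print("Most plausible number of transmissions:", best)
```

Each candidate is scored independently of the others, which is why this kind of search spreads naturally across many machines once the number of scenarios grows large.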

She pointed out that these analytics tools are well-suited to meeting the challenges brought by COVID-19.

ML tools allow researchers to explore different explanations of the data they see, so they can test many hypotheses. With ML and cloud computing technologies, it is possible to streamline a previously time-consuming task.

With access to more computational resources in the cloud, analysis time can be cut from months to days, because the additional memory and processing capacity allows the analysis to be parallelised more effectively.

This research may help health officials monitor the rate of under-reporting in real time, which could indicate how well current surveillance systems are operating. Given the data available on the COVID-19 pandemic, analytics tools are essential for bringing new insights and potential solutions.

Analytics Insight
www.analyticsinsight.net