Using Artificial Intelligence to Forecast COVID-19
Despite efforts throughout the United States last spring to suppress the spread of the novel coronavirus, states across the country have experienced spikes in the past several weeks. The number of confirmed COVID-19 cases in the nation has climbed to more than 3.5 million since the start of the pandemic.
Public officials in many states, including California, have now started to roll back the reopening process to help curb the spread of the virus. Eventually, state and local policymakers will have to decide for a second time when and how to reopen their communities. A pair of researchers in UC Santa Barbara’s College of Engineering, Xifeng Yan and Yu-Xiang Wang, have developed a novel forecasting model inspired by artificial intelligence (AI) techniques. It is designed to provide timely, more localized information that officials and members of the public can use in their decision-making.
“We are all overwhelmed by the data, most of which is provided at national and state levels,” said Yan, an associate professor who holds the Venkatesh Narayanamurti Chair in Computer Science. “Parents are more interested in what is happening in their school district and if it’s safe for their kids to go to school in the fall. However, there are very few websites providing that information. We aim to provide forecasting and explanations at a localized level with data that is more useful for residents and decision makers.”
The forecasting project, “Interventional COVID-19 Response Forecasting in Local Communities Using Neural Domain Adaption Models,” received a Rapid Response Research (RAPID) grant for nearly $200,000 from the National Science Foundation (NSF).
“The challenges of making sense of messy data are precisely the type of problems that we deal with every day as computer scientists working in AI and machine learning,” said Wang, an assistant professor of computer science and holder of the Eugene Aas Chair. “We are compelled to lend our expertise to help communities make informed decisions.”
Yan and Wang developed an innovative forecasting algorithm based on a deep learning model called the Transformer. The model is driven by an attention mechanism that learns how to forecast by identifying which past time periods to look at and which data are most important and relevant.
“If we are trying to forecast for a specific region, like Santa Barbara County, our algorithm compares the growth curves of COVID-19 cases across different regions over a period of time to determine the most-similar regions. It then weighs these regions to forecast cases in the target region,” explained Yan.
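The idea Yan describes, comparing growth curves and weighting the most similar regions, can be illustrated with a small sketch. This is not the researchers' Transformer model; it is a simplified, attention-style weighting written for illustration only. The function names, the `window` and `temperature` parameters, the region names, and the case counts are all hypothetical.

```python
import math

def softmax(scores):
    """Turn raw similarity scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def growth_rates(cases):
    """Day-over-day growth ratios of a cumulative case curve."""
    return [later / earlier for earlier, later in zip(cases, cases[1:])]

def attention_forecast(target, references, window=3, temperature=0.05):
    """Forecast the target's next cumulative count from similar regions.

    target: recent cumulative case counts for the region of interest.
    references: dict mapping region name -> cumulative case history.
    Returns (forecast, weight_per_region).
    """
    # The "query" is the target region's most recent growth pattern.
    query = growth_rates(target[-(window + 1):])
    regions, scores, next_rates = [], [], []
    for name, history in references.items():
        rates = growth_rates(history)
        # Each past window is a "key"; the growth rate that
        # followed it is the corresponding "value".
        for start in range(len(rates) - window):
            key = rates[start:start + window]
            distance = sum((q - k) ** 2 for q, k in zip(query, key))
            regions.append(name)
            scores.append(-distance / temperature)
            next_rates.append(rates[start + window])
    if not scores:
        raise ValueError("reference histories are too short for this window")
    weights = softmax(scores)
    # Weighted average of what happened next in the similar regions.
    predicted_rate = sum(w * r for w, r in zip(weights, next_rates))
    per_region = {}
    for name, weight in zip(regions, weights):
        per_region[name] = per_region.get(name, 0.0) + weight
    return target[-1] * predicted_rate, per_region

if __name__ == "__main__":
    # Made-up numbers, not real COVID-19 data.
    target = [100.0, 110.0, 121.0, 133.1]
    references = {
        "region_a": [10.0, 11.0, 12.1, 13.31, 14.641],
        "region_b": [10.0, 20.0, 40.0, 80.0, 160.0],
    }
    forecast, weights = attention_forecast(target, references)
    print(forecast, weights)
```

In this toy setup, the reference region whose past growth pattern most resembles the target's recent growth receives nearly all of the weight, so its subsequent growth dominates the forecast. The actual model learns these weights from data and also incorporates census features, which this sketch omits.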
In addition to COVID-19 data, the algorithm also draws information from the U.S. Census to factor in hyper-local details when calibrating the forecast for a local community.
“The census data is very informative because it implicitly captures the culture, lifestyle, demographics and types of businesses in each local community,” said Wang. “When you combine that with COVID-19 data available by region, it helps us transfer the knowledge learned from one region to another, which will be useful for communities that want data on the effectiveness of interventions in order to make informed decisions.”
The researchers’ models showed that, during the recent spike, Santa Barbara County experienced spread similar to what Mecklenburg, Wake, and Durham counties in North Carolina saw in late March and early April. Using those counties to forecast future cases in Santa Barbara County, the researchers’ attention-based model outperformed the most commonly used epidemiological models: the SIR (susceptible, infected, recovered) model, which describes the flow of individuals through three mutually exclusive stages; and the autoregressive model, which makes predictions based solely on a series of past data points recorded over time. The AI-based model had a mean absolute percentage error (MAPE) of 0.030, compared with 0.11 for the SIR model and 0.072 for the autoregressive model. The MAPE is a common measure of prediction accuracy in statistics; lower values indicate more accurate forecasts.
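For readers unfamiliar with the metric, the MAPE averages the size of each forecast error relative to the actual value. The sketch below shows the standard calculation; the function name and sample numbers are illustrative and do not come from the researchers' code or data. It returns a fraction, so if the figures reported above are also fractions (an assumption, since the article does not state the units), 0.030 would correspond to an average error of about 3 percent.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction (0.03 = 3%)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    # Each term is the absolute error relative to the true value;
    # actual values must be nonzero for the ratio to be defined.
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

if __name__ == "__main__":
    # Made-up case counts, for illustration only.
    print(mape([100, 200], [110, 180]))
```

Because each error is divided by the actual count, the MAPE lets forecasts for small and large counties be compared on the same scale.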
Yan and Wang say their model forecasts more accurately because it eliminates key weaknesses associated with current models. Census data provides fine-grained details missing in existing simulation models, while the attention mechanism leverages the substantial amounts of data now available publicly.
“Humans, even trained professionals, are not able to process the massive data as effectively as computer algorithms,” said Wang. “Our research provides tools for automatically extracting useful information from the data to simplify the picture, rather than making it more complicated.”
The project, conducted in collaboration with Dr. Richard Beswick and Dr. Lynn Fitzgibbons from Cottage Hospital in Santa Barbara, will be presented later this month during the Computing Research Association (CRA) Virtual Conference. Formed in 1972 as a forum for chairs of computer science departments across the country, the CRA has grown to include more than 200 member organizations active in computing research.
Yan and Wang’s research efforts will not stop there. They plan to make their model and forecasts available to the public via a website and to collect enough data to forecast for communities across the country. “We hope to forecast for every community in the country because we believe that when people are well informed with local data, they will make well-informed decisions,” said Yan.
They also hope their algorithm can be used to forecast what could happen if a particular intervention is implemented at a specific time.
“Because our research focuses on more fundamental aspects, the developed tools can be applied to a variety of factors,” added Yan. “Hopefully, the next time we are in such a situation, we will be better equipped to make the right decisions at the right time.”