
How to use AI: Mauricio Santillana gave an algorithm accurate models to analyze data 

Santillana, director of Northeastern’s Machine Intelligence Group for the Betterment of Health and the Environment, tested models that foretell upsurges of dengue fever in 180 locations around the world.

Mauricio Santillana, physics professor and director of Northeastern’s Machine Intelligence Group for the Betterment of Health and the Environment. Photo by Matthew Modoono/Northeastern University

To predict when and where deadly dengue fever outbreaks would occur, Mauricio Santillana had to train a new co-worker.

Santillana, a professor of physics and director of Northeastern’s Machine Intelligence Group for the Betterment of Health and the Environment, tested models that foretell upsurges of the disease in 180 locations around the world. Doing this analysis manually, however, would not have produced data as quickly as Santillana needed.

The project, which aimed to give public health officials an accurate three-month forecast of where and when dengue outbreaks would occur, was an ideal application of machine learning, Santillana says.

“It’s purely teaching the machine to act as an additional team member,” he says. “Now we have enough knowledge that we can tell the machine, ‘Do this for us and experiment massively.’”

“It’s purely teaching the machine to act as an additional team member,” says Mauricio Santillana. Photo by Matthew Modoono/Northeastern University

Santillana and his colleagues first taught the machine what an accurate prediction looks like. They gave the algorithm accurate models from a period of a few years and asked it to analyze that data. This is called the training period.

“We told the machine, ‘We would like you to produce something of this quality,’” he says. “We’re going to give you a collection of methods that you can test and, of those, choose the one that delivers what we need.”

This training and testing had to be done for each regional location, he says. Adding to the complexity of the task, different regions have different ways of reporting dengue cases. Some countries don’t even have reliable funding for tests for diagnosis confirmations.
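The per-location selection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not name the candidate methods or the error measure, so the three simple forecasters, the mean absolute error score, and the sample case counts below are all hypothetical stand-ins.

```python
from statistics import mean

HORIZON = 3  # months ahead, matching the three-month forecast in the article


def persistence(history):
    """Hypothetical candidate: assume cases stay at the latest observed count."""
    return history[-1]


def seasonal_naive(history):
    """Hypothetical candidate: reuse the count from the same month one year earlier."""
    idx = len(history) - 1 + HORIZON - 12  # target month minus 12 months
    return history[idx] if idx >= 0 else history[-1]


def moving_average(history, window=3):
    """Hypothetical candidate: average of the most recent `window` months."""
    return mean(history[-window:])


CANDIDATES = {
    "persistence": persistence,
    "seasonal_naive": seasonal_naive,
    "moving_average": moving_average,
}


def backtest_error(model, series, start):
    """Mean absolute error of `model` over every forecast origin from `start` on."""
    errors = []
    for t in range(start, len(series) - HORIZON):
        prediction = model(series[: t + 1])  # the model sees data up to month t only
        errors.append(abs(prediction - series[t + HORIZON]))
    return mean(errors)


def select_model(series, start=12):
    """Return the name of the lowest-error candidate and all candidate scores."""
    scores = {name: backtest_error(m, series, start) for name, m in CANDIDATES.items()}
    return min(scores, key=scores.get), scores


# Made-up monthly case counts for two illustrative locations.
locations = {
    "location_a": [5, 8, 20, 60, 120, 90, 40, 15, 8, 5, 4, 4] * 3,  # strongly seasonal
    "location_b": list(range(10, 46)),                              # steady upward drift
}

for name, series in locations.items():
    best, _ = select_model(series)
    print(name, "->", best)
```

Because the selection runs separately for each location, a place with a strong yearly cycle and a place with a slow drift can end up with different preferred methods, which is the point of repeating the training and testing region by region.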

For a human to process these diverse streams of data would have been “a nightmare,” Santillana says. But with machine learning on his team, Santillana could give the algorithm a task and it would be done in a few hours.

“It would take a human years of processing time,” he says. “But with the computer it happens overnight.”

Once the algorithm had correctly identified the most accurate forecast models from past outbreaks, Santillana’s team asked it to determine which model, or combination of models, was likely to be the most accurate for the following three months. The computational model they used is called an ensemble method.
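One common way to combine models in an ensemble is to weight each model by how well it performed in the past. The sketch below assumes inverse-error weighting, which the article does not confirm as the team's approach; the model names, error values and forecasts are hypothetical.

```python
def inverse_error_weights(past_errors, eps=1e-9):
    """Give each model a weight proportional to 1 / its past error.

    `eps` is a small constant (an assumption of this sketch) that avoids
    division by zero when a model had no error in the past period.
    """
    inverse = {name: 1.0 / (err + eps) for name, err in past_errors.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def ensemble_forecast(forecasts, weights):
    """Weighted average of the individual model forecasts."""
    return sum(weights[name] * forecasts[name] for name in forecasts)


# Hypothetical past errors (lower is better) and new three-month forecasts.
past_errors = {"model_a": 10.0, "model_b": 30.0}
forecasts = {"model_a": 100.0, "model_b": 200.0}

weights = inverse_error_weights(past_errors)
combined = ensemble_forecast(forecasts, weights)
print(weights)
print(round(combined, 2))
```

Here the model with one third of the error receives three times the weight, so the combined forecast sits closer to its prediction than to the other model's.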

The next step was to ask the machine to make real-time predictions of when and where dengue fever would break out, which is called a prospective study. The predictions were accurate 80% of the time.

“That’s the only way that you can build something that is trustworthy,” Santillana says.