Peering into the black box of AI to discover how collective behaviors emerge
How do the stunningly intricate patterns created by schools of fish emerge? For many scientists, this question presents an irresistible mathematical puzzle involving a substantial number of variables describing the relative speed and position of each individual fish and its many neighbors.
Various mathematical models have been proposed to tackle this question, but according to Gonzalo de Polavieja, head of the Collective Behaviour lab at the Champalimaud Centre for the Unknown in Lisbon, Portugal, they inevitably fall into one of two extremes: they are either too simple or too complex.
"The rise of the field of artificial intelligence and machine learning has provided models that are very accurate in predicting the behavior of individuals in groups," says de Polavieja. "But these models are like black boxes: The way they process the data to generate their predictions could involve thousands of parameters, many of which may not even correspond to real-world variables. Humans are unable to make sense of such complex information."
"On the other extreme," he continues, "are the simpler models, with few parameters, that allow you to identify rules associated with one main component, such as the distance between the fish, or their relative velocity. But those models are too narrow and therefore are never accurate when it comes to predicting the overall behavior of the group."
Drawing inspiration from a new type of AI model called "attention networks," de Polavieja and his team were able to identify a solution that lies between the two extremes: a model that is both insightful and predictive. They describe their results in an article published in the scientific journal PLOS Computational Biology.
Deconstructing the black box
To solve the problem, the team decided to use AI techniques with a twist: instead of constructing the standard, monolithic "black box," they organized the model into numerous interconnected modules, each simple enough to be analyzed on its own.
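The modular idea can be illustrated with a minimal sketch. The article does not describe the modules themselves, so everything below is an assumption made for illustration: the two hypothetical modules (`pair_module` and `attention_module`), their simple linear and distance-based forms, the parameters `w` and `a`, and the convention that positive x lies to the right of the focal fish. It is not the team's published model, only an example of how small, separately inspectable pieces can be combined with attention-style weights into a single left/right prediction.

```python
import math

def pair_module(dx, dy, dvx, dvy, w):
    """Hypothetical pairwise module: scores how strongly ONE neighbor
    pushes the focal fish to turn right (positive) or left (negative).
    Inputs are the neighbor's relative position and relative velocity."""
    return w[0] * dx + w[1] * dy + w[2] * dvx + w[3] * dvy

def attention_module(dx, dy, a):
    """Hypothetical attention module: an unnormalized log-weight that
    decreases with distance, so nearer neighbors count for more."""
    return -a * math.hypot(dx, dy)

def predict_turn_right(neighbors, w, a):
    """Combine the per-neighbor scores with softmax attention weights and
    return (probability of turning right, attention weights)."""
    scores = [pair_module(*n, w) for n in neighbors]
    logits = [attention_module(n[0], n[1], a) for n in neighbors]
    # Numerically stable softmax over the attention logits.
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Attention-weighted sum of the pairwise scores, squashed to (0, 1).
    combined = sum(wt * s for wt, s in zip(weights, scores))
    return 1.0 / (1.0 + math.exp(-combined)), weights

# Two neighbors on the right side: one close, one far (made-up numbers).
neighbors = [(1.0, 0.0, 0.0, 0.0), (5.0, 0.0, 0.0, 0.0)]
p_right, weights = predict_turn_right(neighbors, w=(1.0, 0.0, 0.0, 0.0), a=1.0)
```

Because each module takes only a handful of inputs, its output can be plotted over the space around the fish and read directly, which is the kind of analysis a single large network does not allow.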
When the team studied the functions generated by the individual modules, they found that the coarse rules they already knew still held, but were greatly refined. "For example, according to previous models, the space around each fish is divided into three circular concentric areas: repulsion, alignment, and attraction. We also found those same three areas, but contrary to the simple models that originally identified them, our model showed that the areas were not circular, nor concentric, and that they changed in a manner that depended on the velocity of the fish," explains Francisco Heras, the first author of the study.
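For comparison, the classic picture Heras describes, three circular and concentric zones, can be written in a few lines. This is a sketch of the simple zone model only, not of the team's refined result; the function name `classic_zone` and the three radii (in arbitrary units, such as body lengths) are hypothetical values chosen for illustration.

```python
import math

def classic_zone(dx, dy, r_repulsion=1.0, r_alignment=3.0, r_attraction=6.0):
    """Classify a neighbor at relative position (dx, dy) into one of three
    concentric circular zones around the focal fish.
    The radii are illustrative assumptions, not measured values."""
    d = math.hypot(dx, dy)  # distance from the focal fish to the neighbor
    if d < r_repulsion:
        return "repulsion"
    if d < r_alignment:
        return "alignment"
    if d < r_attraction:
        return "attraction"
    return "none"

# The zone depends on distance alone: direction and speed play no role.
zones = [classic_zone(0.5, 0.0), classic_zone(0.0, 2.0), classic_zone(3.0, 4.0)]
```

In this simple version the zone boundaries are fixed circles. The study's finding is that the real boundaries are neither circular nor concentric and shift with the fish's velocity, which fixed radii like these cannot express.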
In addition to being insightful, the model is also good at predicting the behavior of the fish. "We can tell with 90 percent accuracy whether each fish in the group will turn right or left during the following second," says Heras. "This may not seem like a long time compared with the timescale humans operate in, but zebrafish live in a faster-paced environment and can move a distance of about eight times their body length in a mere second."
The results of the model are so robust that one can't help but wonder why this approach wasn't used before. According to de Polavieja, the answer is "a bit of sociology and a bit of mathematics." As he explains, "since the two approaches dominating the field were so different, it took a while to realize that constructing a model that is both insightful and predictive was even possible." Once the team realized this possibility, they began exploring different architectures and fine-tuning their set of assumptions in a way that optimized the predictive capacity of the model while keeping it simple enough to be insightful.
Another element that made this development possible is the sophisticated open-source tracking software the lab had recently developed. "By using idtracker.ai, we were able to track groups of 100 fish simultaneously. This was crucial for obtaining the large and detailed dataset necessary for this type of research."
The team has made the code for their model freely available. According to de Polavieja, it can be a useful tool for the collective behavior community, which will now be able to recover interaction rules in a way that is automatic, predictive and insightful about the biology. "We hope that it will be used by others to study many different types of social interactions," he concludes.