Reaching an optimal shared decision using only decentralised algorithms is a key aspect of many multi-agent and swarm robotics applications. As humans, we often have to come to conclusions about the current state of the world so that we can make informed decisions and then act in a way that will achieve some desired state, given what we believe the current state to be. Should our beliefs be wrong, we may choose actions that lead to unintended consequences. Of course, expecting every person to have perfect, up-to-date knowledge about the current state of the world is unrealistic, and so we often rely on the beliefs and experiences of others to inform our own.
We see this too in nature, where honey bees must choose the best site from among a large number of potential nesting sites. When a hive grows too large, the majority of the bees must choose a new site to relocate to via a process called “swarming” – a problem that can be generalised to choosing the best of a given number of choices, referred to as the “best-of-n” problem. To do this, bees rely on a combination of their own experiences and the experiences of others in the hive in order to reach an agreement about which site is best. We can learn from solutions found in nature to develop our own models and apply these to swarms of robots so that, by having pairs of robots interact and reach agreements at an individual level, we can distribute the decision-making process across the entire swarm.
Decentralised algorithms such as these are often considered to be more robust than their centralised counterparts, but this claim is rarely put to the test. Robustness is crucial in swarm robotics, where individual robots typically possess limited hardware capabilities in order to remain cost-effective at large scales, as well as in scenarios critical to the protection or preservation of life, such as search and rescue operations. In this context we introduce an alternative model for distributed decision-making in large robot swarms and examine its robustness to the presence of malfunctioning robots, comparing it to an existing model: the Weighted Voter Model (WVM) (Valentini et al., 2014).
The best-of-n problem
In this work we study the best-of-n problem where n = 2 and compare our three-valued model with a simplified version of the WVM. In this version of the WVM, robots (specifically Kilobots) move around randomly in a 1.2 m² arena and, at any point in time, are in one of two primary states: signalling or updating. Robots in the signalling state broadcast their current belief, either “choice A is the best” or “choice B is the best”, and continue to do so for a length of time proportional to the quality of that choice. In our experiments, choice A has a quality of 9 and choice B has a quality of 7, so robots believing that choice A is the best will signal for a longer duration than those believing that choice B is the best. This naturally creates a bias in the swarm: those signalling for choice A do so for longer than those signalling for choice B, which inevitably affects the updating robots. A robot in the updating state selects a signalling robot at random from its local neighbours, provided that it is within the communication radius (a 10 cm limit for the Kilobots), and adopts that robot’s belief.
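To make these update dynamics concrete, here is a minimal well-mixed simulation sketch of the simplified WVM described above. The function and parameter names (run_wvm, q_a, q_b, n_robots, steps) and the exponential form of the signalling durations are assumptions for illustration only; the article states only that signalling time is proportional to quality, and on real Kilobots the adopted belief must come from a neighbour within the 10 cm communication radius rather than from the whole swarm.

```python
import random

def run_wvm(q_a=9, q_b=7, n_robots=400, steps=4000, seed=0):
    """Well-mixed sketch of the simplified Weighted Voter Model.

    q_a, q_b: qualities of options A and B (9 and 7 in the article).
    Returns the fraction of robots believing in option A after `steps`
    individual update events.
    """
    rng = random.Random(seed)
    quality = {"A": q_a, "B": q_b}
    beliefs = [rng.choice("AB") for _ in range(n_robots)]
    # Each robot signals for a duration whose mean is proportional to the
    # quality of its current belief (exponential form is an assumption).
    timers = [rng.expovariate(1.0 / quality[b]) for b in beliefs]
    for _ in range(steps):
        # The robot whose signalling period expires first updates next.
        i = min(range(n_robots), key=timers.__getitem__)
        elapsed = timers[i]
        timers = [t - elapsed for t in timers]
        # Adopt the belief of a randomly chosen signalling robot
        # (well-mixed approximation of picking a local neighbour).
        beliefs[i] = beliefs[rng.randrange(n_robots)]
        timers[i] = rng.expovariate(1.0 / quality[beliefs[i]])
    return beliefs.count("A") / n_robots
```

Because robots believing in the higher-quality option hold and signal their belief for longer, they are overwritten less often, and repeated updates drift the swarm towards choice A.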
In the figure, we show the results as trajectories of the number of Kilobots signalling for either choice A or choice B over time.
In this work we have proposed a three-valued model for consensus in multi-agent systems and swarm robotics. We have compared this three-valued model to the weighted voter model, focussing on both speed of convergence and robustness to individual malfunction or error. Experiments were conducted using both a simulation environment and physical Kilobot swarms of 400 robots. The results from both sets of experiments confirm that the three-valued model is more robust to the presence of malfunctioning or noisy individuals than the WVM, but that the WVM converges to the best option more quickly. Future research will consider ways to close the gap between the three-valued model and the WVM in terms of speed of convergence while maintaining the three-valued model’s improved robustness to malfunction and error. We also intend to consider different distributed decision-making algorithms which take account of the multiple beliefs being signalled within a robot’s communication radius. Both models considered in this work update on the basis of a single robot’s belief, but there exist other models, such as majority-rule models and models for opinion pooling, which take account of the beliefs of many robots.
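The post does not spell out the three-valued update rule itself. As an illustration only, the sketch below shows one plausible pairwise combination operator over the truth values 0 (“B is best”), 0.5 (undecided) and 1 (“A is best”); this specific truth table is an assumption for the sketch, not necessarily the operator used in the experiments described here.

```python
def combine(x, y):
    """Illustrative pairwise three-valued combination (assumed operator):
    agreement is kept, an undecided robot defers to a decided one, and
    direct conflict (0 vs 1) resolves to the undecided value 0.5."""
    if x == y:
        return x          # both robots already agree
    if 0.5 in (x, y):
        return x if y == 0.5 else y  # the decided belief wins
    return 0.5            # 0 meets 1: fall back to undecided
```

Under an operator of this kind, two robots that disagree outright do not simply overwrite one another, which offers some intuition for why a third, undecided state can dampen the influence of malfunctioning individuals.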
Decision-making in robot swarms has been explored by other researchers in distributed robotics, including [James Marshall], [Marco Dorigo] and [Radhika Nagpal]. More specifically, our research builds on the work of [Gabriele Valentini], whose weighted voter model is compared with our proposed three-valued model, and of [Andreagiovanni Reina], whose work also includes a third truth state although the underlying model remains very different from our own.
Michael Crosscombe is an Engineering Mathematics PhD student at the University of Bristol. Follow him on Twitter here.