Learning with Communication Barriers Due to Overconfidence. What "Model-to-Model Analysis" Can Add to the Understanding of a Problem

In this paper, we describe a validation process for an already published model, which relies on the model-to-model (M2M) paradigm. The initial model showed that over-confident agents, who refuse to communicate with other agents whose beliefs differ from theirs, disturb collective learning within a population. We produce a simplified model that we analyze using probabilistic methods. This model enables us to better explain the process that operates in our first model and demonstrates that this process does converge. To make sure that the convergence time is meaningful in the context we consider (not just for an infinite number of agents living for an infinite time), we use the analytical model to produce very simple simulations and check that the result holds in finite contexts.


Introduction
In this paper, we present research where analytical results obtained in a simplified model are used to enhance our understanding of an already existing social simulation model. This shows that two techniques, agent-based modeling and probabilistic analysis, can be used jointly in some settings to provide further insights about a conceptual model. Here the original model was built to attain a theoretical understanding of abstract stylized facts.
In the present paper, we explain the differences between the modeling approaches, and how the second model, although much simpler, draws attention to the role of an important parameter that we had not tested in the first version of the simulation model. We believe that this comparison between models can be of interest to the model-to-model research community. Indeed, we show that an analytical model can explain some properties of a simulation model and that it can also identify properties of the model that have an important impact on the results.
The first simulation model (we will refer to it as 'RT') enabled us to show that the presence of over-confident agents in a population can slow down collective learning (Rouchier & Tanimura ). The application we had in mind concerned the coordination of agents using a resource collectively. We consider a context where the diffusion of knowledge is important to reduce inefficiency and where knowledge diffusion depends on the success of the agents, which in turn depends on the quality of their beliefs. A correct belief is close to the reality of the environment and an incorrect belief is wrong in many dimensions. In the analytical model ('AM') we represent the accuracy of the beliefs in a much simpler way. Thus it is not necessarily the case that all the results obtained in this setting carry over to the simulation model.

In this new paper, we present RT briefly, but readers should refer to Rouchier & Tanimura ( ) for details. We then develop the analytical model which is shown to capture the main features of the original model but, as .
To assess the quality of simulation results, whether they are already published or not, replication is the most widespread technique, although one can still consider that it is not used enough (Wilensky & Rand ). Replication enables us to criticize the imperfect description of a model or of results in a paper (Rouchier ), whose incompleteness makes it impossible to reproduce the dynamics of the formally described model. It can also, in a more fundamental way, show that some published results just do not hold, as in Edmonds & Hales ( ). Replications have also been used to extend model validity by focusing on parameters different from those in the original papers, as has frequently been done in the case of opinion dynamics (e.g., Urbig et al. ). Failed replication and the response by the original modeler can also give rise to interesting and fundamental methodological discussions, like the "Deffuant discussion" (Meadows & Cliff ; Deffuant et al. ), which reminds us that a simulator has to be aware of the importance of time to be able to draw conclusions, or the "Macy discussion" (Will & Hegselmann ; Macy & Sato ; Will ), which reminds us all that the devil is in the details.
A more difficult research agenda is the one that proposes to identify classes of models which display similar dynamics (Cioffi-Revilla & Gotts ). This fundamental approach has the ambition to slowly build a theory of complex dynamics by identifying common processes among seemingly different models that share characteristics although, at first sight, their fields of application are very different. To sustain this research trend (and also to solve the problem of incomplete description of models), a description protocol, the ODD protocol (Polhill et al. ), has been proposed and has since gained recognition and been widely used in various applications. This protocol also enables us to compare structurally different models that deal with the same type of target system but with different modeling choices (Polhill et al. ). In general, explicitness, transparency and open access to simulation models have been considered the most important basis for sharing scientific knowledge produced by simulation (see openabm.com and Janssen et al. ( )).
Sometimes, research on the validation of generally identifiable social dynamics has also relied on the link between agent-based models and what can be described under the generic term of analytical models. The idea is to show that results that are observed through simulations can be proven to hold in a given set of situations, often described in more abstract terms (as in our case, which will be described later, with a population of agents whose size goes to infinity and who interact over a number of periods that also goes to infinity). As seen in the previously cited discussion about the Deffuant model, the convergence time can be extremely long, and the risk is to stop a simulation too early, believing that a steady state has been reached, whereas a new state could be attained if time was, for example, doubled. One answer to this very general problem may be provided by Grazzini ( ), who discusses, for cases where it is impossible to explicitly write the equations that regulate the evolution of the system, how tools from non-parametric statistics can be used to detect whether time series generated by agent-based models are stationary and stem from an ergodic process.

The issue of connections between models has been apprehended in different ways. The most straightforward approach is to compare different existing models and to look at the similarity of their results, like Klüver & Stoica ( ), who study behaviors in a network and show that results are similar. In the same spirit, Edwards et al. ( ) demonstrate that an aggregate and a distributed model of opinion diffusion converge only when there is just one attractor but diverge if there are two. Some authors decided to start from an already established simple model which can be treated analytically, and to relax the hypotheses to check if the results remain similar. This is the type of result obtained by Vila ( ), who starts out with a deterministic Bertrand competition model, and then adds assumptions about agents' behavior using a genetic algorithm. In the end, he obtains similar results with both tools, showing that loyalty should emerge in this competitive setting. An integration of statistical and agent-based models is achieved by Silverman et al. ( ), who thus succeed in producing predictions and explanations of demographic phenomena in parallel.

It is also possible to make the link between ABM and analytical work by first running simulations and then creating a simplified generalization from the model, as in Cecconi et al. ( ), who show that for the congestion model they study, it would not have been possible to construct an analytical model without having previously studied simulation dynamics. The complementarity of these approaches is such that insights from the simulations are crucial for developing the analytical model, which in turn extends certain aspects of the simulated model. Moreover, the simulations capture fluctuations over time whereas the analytical model focuses on average quantities. This paper is the one that is the closest to the research we present here, especially in the way the modeling phases are articulated.

The simulation model
The model that has already been published, RT (Rouchier & Tanimura ), which evolved from a previous model and analysis (Rouchier & Shiina ), shows that, in a particular setting, overconfident agents can prevent a whole population from learning when learning is social, in the sense that it includes influence among agents. The model is based on simulations that take place in a universe whose "objective reality", as well as the representation that agents have of this environment, takes the shape of a culture vector as in Axelrod's work ( ). The properties of our environment, as well as the agents' beliefs, are hence strings of bits taking 0 or 1 as value. Agents are boundedly rational: they have to act on this environment, and this action is also the only way for them to acquire information about its characteristics. When they succeed in their action, they can infer that their representation is correct, and when they fail they know they have some incorrect beliefs. The accuracy of an individual belief is defined as the fraction of "correct bits"; the accuracy of the collective belief is defined as the average accuracy. The actions undertaken by the agents do not have any impact on the characteristics of the environment, which stay unchanged over the time scale of the simulation.

We also add an influence mechanism to this simple feedback learning, namely that agents have to act in pairs, and thus choose who will "lead the action" and define the representation used to choose the action. Since we want to test the influence of heterogeneity of self-confidence, we build our agents accordingly: they are heterogeneous in this dimension, as well as in their representation of the environment (randomly drawn at initialization).
A time-step takes place as follows:
• randomly paired, agents first decide which belief to use for organizing their action: the agent with the highest confidence is the one who leads the action and uses his belief;
• according to the accuracy of the belief (its adequacy to real characteristics), the agents are more or less successful in their action. Note that we do not actually model the action, but a probability of success: the probability of success is linear with respect to the accuracy of the belief of the leading agent;
• when an action is successful, the agent who led the action influences the other one by transforming his belief (only one bit is transformed at a time).
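The three steps of a time-step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code of the published model: the function name `rt_time_step`, the numeric encoding of confidence, the tie-breaking rule, the slope 1 of the linear success probability, and the choice of copying one randomly chosen differing bit are all hypothetical.

```python
import random


def rt_time_step(beliefs, confidences, environment, rng):
    """Apply one RT-style time step in place (illustrative sketch only).

    beliefs:      list of bit lists, one per agent
    confidences:  list of numbers, one per agent (hypothetical encoding)
    environment:  bit list describing the "objective reality"
    rng:          random.Random instance
    """
    order = list(range(len(beliefs)))
    rng.shuffle(order)  # random pairing; with an odd population one agent stays idle
    for a, b in zip(order[0::2], order[1::2]):
        # The agent with the highest confidence leads (tie-breaking is an assumption).
        leader, follower = (a, b) if confidences[a] >= confidences[b] else (b, a)
        # Accuracy = fraction of the leader's bits that match the environment.
        matches = sum(x == y for x, y in zip(beliefs[leader], environment))
        accuracy = matches / len(environment)
        # Success probability is linear in accuracy; slope 1 is an assumption.
        if rng.random() < accuracy:
            differing = [i for i, (x, y)
                         in enumerate(zip(beliefs[leader], beliefs[follower]))
                         if x != y]
            if differing:
                i = rng.choice(differing)  # only one bit is transformed at a time
                beliefs[follower][i] = beliefs[leader][i]
```

In this sketch the action itself is never represented, in line with the description above: only its probability of success enters the dynamics.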
The main result of our study is that the presence of some very confident agents, who on average overestimate their knowledge and cannot be influenced by others, slows down, and can even completely stop, the learning of the society as a whole. One can identify thresholds in the number of overconfident agents that produce different patterns in the simulations. In particular, results depend on the probability of meetings between agents that can influence each other.
Two criticisms can be directed at this first model. First, it is rather complex: the "Axelrod-type setting" gives rise to dynamics that are hard to anticipate, as Axelrod notes himself. Perhaps due to its complexity, this type of opinion model has not produced as vast a literature as, for example, the Deffuant influence model has. Hence, the validity of our results cannot be checked for coherence against a large number of related dynamics that could otherwise have been documented.
We thus had to check if our result was robust, and to understand some of the underlying mechanisms behind it. We decided to study an analytical model that would share the main features of our model, and to test the stability asymptotically with respect to time and population size. The three features (that learning takes place through influence, that accurate knowledge increases the ability to influence, and that communication can be restricted) have hence been included in a much simpler model for which it is possible to identify steady states of the system and characterize the conditions under which they occur.

A Simplified Model of Belief Evolution
Defining the model
In this section, we study analytically a simplified version of belief evolution in the presence of confident agents. Admittedly, some of the richness and complexity of the initial model is lost due to the simplifications. However, studying the simplified model should allow us to shed some light on the mechanisms that operate in the previous one. To make the model tractable, we retain only the most basic features of our situation: there are several possible beliefs which are more or less correct. Beliefs that are more correct increase the probability of successful actions and thus of convincing one's peers. Agents' confidence in their own beliefs prevents them from engaging in joint actions with others whose beliefs are too different.
The main simplification concerns the belief environment. Beliefs about the environment take values $b \in \{0, 1, ..., K\}$. We will assume that the true state of the world is $K$, which means that the difference $K - b$ measures the degree of error in the agent's belief compared to the true state of the world. An agent whose belief is $K$ knows the true state of the world; an agent whose belief is 0 is maximally mistaken about the true state of the world. This belief space is one-dimensional, as opposed to the multi-dimensional one in the general model. The belief environment presented here is not a special case of that in Rouchier & Tanimura ( ). The difference is not just a quantitative reduction in complexity but a qualitative one. Now agents whose level of error is the same necessarily hold the same beliefs, which was not the case in the more complex belief environment: there, agents could be "equally wrong" if they had the same number of incorrect bits, without being mistaken in the same way if their errors concerned different bits.

Let us now define what is meant by the confidence of an agent in this setting. An agent is (over)confident about his own belief if he refuses to be influenced by others whose beliefs are too different from his own. For any two agents, we can define the distance between their beliefs. An agent's confidence is equal to $K$ minus the maximal distance in beliefs for which he accepts to engage in an action with (and thus possibly be influenced by) another agent. Thus, the higher an agent's confidence, the lower his tolerance of (or at least his willingness to be influenced by) those whose beliefs are different from his own. An agent with confidence $C$ will only interact with other agents whose beliefs are at most at distance $K - C$ from his own. For example, if an agent has confidence 0, he is willing to interact with anyone whose beliefs are at most $K$ steps away from his own. Since $K$ is the highest possible distance in beliefs, he interacts with everybody regardless of their beliefs. If an agent has confidence $K$, he only interacts with those whose beliefs are at distance 0 from his own; in other words, he will never change his mind. Consider an example where $K = 7$. Suppose that an agent has confidence 5 and that his own belief is 4. An agent of confidence 5 will only interact with those whose beliefs are at distance at most $7 - 5 = 2$ from his own, i.e. he will interact with agents whose beliefs are 2, 3, 4, 5, 6. We note that it is not relevant to consider levels of confidence higher than $K$, since the agent with confidence $K$ is already maximally intolerant.
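The confidence rule can be checked on the example with $K = 7$ using a minimal Python sketch. The function names are hypothetical, and the distance between two beliefs is assumed to be the absolute difference of their values, which is the natural choice in this one-dimensional belief space.

```python
def accepts_interaction(own_belief: int, other_belief: int,
                        confidence: int, K: int) -> bool:
    """True if an agent with this confidence accepts to act with the other agent."""
    if not 0 <= confidence <= K:
        raise ValueError("confidence must lie between 0 and K")
    tolerance = K - confidence  # maximal accepted distance in beliefs
    # Distance is assumed to be the absolute difference of belief values.
    return abs(own_belief - other_belief) <= tolerance


def acceptable_beliefs(own_belief: int, confidence: int, K: int) -> list:
    """All beliefs in {0, ..., K} that the agent accepts to interact with."""
    return [b for b in range(K + 1)
            if accepts_interaction(own_belief, b, confidence, K)]
```

Calling `acceptable_beliefs(4, 5, 7)` lists the beliefs accepted by the agent of the example, and `acceptable_beliefs(0, 1, 2)` shows which beliefs a level 0 agent of confidence 1 accepts when $K = 2$.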
Non-trivial cases thus require $K \geq 2$, and the simplest one that we can analyze is $K = 2$. Indeed, if $K = 1$, then either the level of confidence is 1 and no agents with different beliefs interact, or the confidence is 0 and we are in the trivial case where everyone interacts. In the case $K = 2$, the possible non-trivial levels of confidence are 0, in which case the agent is willing to be influenced by anybody, and 1, in which case the agents with beliefs 0 and 2 do not communicate.
We will analyze the long run state of the system, depending on the confidence levels of the agents. We note that the case $K = 2$ is particularly simple for the following reason: agents of confidence 1 form an isolated system in the sense that they are not influenced by the agents with confidence 0, although they influence these agents. Therefore, we will analyze a system with only agents with confidence 1. If there are agents of confidence 1 who remain in state 0, it is obvious that the agents of confidence 0 cannot affect the beliefs of these agents, but will be unilaterally affected by them. Therefore, to show that the social learning process does not converge towards perfect knowledge, it is sufficient to show that this is the case when we restrict our attention to the agents with confidence 1.
The analytical model, as well as the simulated one, is non-ergodic, meaning that the initial conditions and the realizations of random variables during the dynamics have an impact on the long run state. The choice of non-ergodicity is deliberate. Many social phenomena indeed depend in a crucial way on initial conditions.

Figure : There is no influence between type 0 and type 2 agents. Influence is more important from type 1 to type 0 than the opposite, since influence increases with the accuracy of knowledge. The same holds for influence between type 1 and type 2 agents.
Given that the long run outcome that is reached (represented by an absorbing state) depends on initial conditions and random realizations, we will restrict our attention to the case where the number of agents is large. In this case we can make some statements about long run behavior that hold with high probability. In particular, we show that the presence of confident agents qualitatively alters the long run dynamics. Without confident agents, all agents will learn the true state with high probability. This merely captures the intuition that since informed agents are more convincing, others will adopt their beliefs rather than the other way around. In the presence of confident agents, we show however that, with high probability, a non-zero fraction of the agents will continue to hold incorrect beliefs. From a quantitative point of view, whether or not this fraction is large depends on a number of model parameters, including the initial proportions of beliefs, whose role we will also establish.
We will consider a system in which all agents have confidence 1. Updating is asynchronous and interactions occur through random mixing. We note that the constraint imposed by the confidence implies that agents with beliefs 0 and 2 never influence each other even if they meet (see Figure ). Therefore any encounter that modifies beliefs necessarily involves an agent with belief 1.

Model and notation
Let $N$ be the number of agents. Because we are interested in asymptotics with respect to population size, we want to be able to compare the states of systems of different sizes. To this effect, we define a state as a triplet $(n_0, n_1, n_2)$, which represents the fractions of the population of agents who have beliefs 0, 1 and 2 respectively, which we refer to as "level 0", "level 1" and "level 2" agents.
In a population of size $N$, the number of level $i$ agents is then given by $N n_i$. We will only consider values of $N$ for which $N n_i$ is a natural number for $i = 0, 1, 2$. Note that $N n_i = N_i(0)$, the initial number of agents in level $i$.
We now need to describe how the beliefs of the agents evolve over time due to the interactions. We thus introduce random variables whose evolution coincides with that of the beliefs as long as the latter have not reached a steady state.
For $i = 0, 2$, we define the quantities $N_i(t)$ in the following way: Since there is a total of $N$ agents, we deduce that the number of agents who hold belief 1 is given by $N_1(t) = N - N_0(t) - N_2(t)$.
The factors that determine the belief evolution are as follows: The variables $(Z_s)_{s \geq 1}$ can be seen as determining whether the agent with belief 1 encounters an agent with belief 2 ($Z_s = 1$), an agent with belief 0 ($Z_s = -1$), or an agent who also has belief 1 ($Z_s = 0$) at time $s \geq 1$. The probability of encountering an agent with belief 2 or 0 depends on the proportion of agents in the population: When two agents with belief 1 meet, nobody changes beliefs.
Suppose that $Z_s = 1$, or in other words that an encounter occurs between an agent with belief 1 and an agent with belief 2. There are three possible outcomes of this encounter, which we model by a random variable $A_s$ that takes the values $1, 0, -1$ with probabilities $p_2$, $1 - p_2 - p_1$, $p_1$. In the case $A_s = 1$, the agent with belief 1 is persuaded by the agent with belief 2 and changes his belief to 2, thus becoming a level 2 agent. In the case $A_s = -1$, the agent with belief 2 is persuaded by the agent with belief 1 and changes his belief to 1. Finally, in the case $A_s = 0$, neither agent manages to convince the other one and no one changes his belief. Similarly, when $Z_s = -1$, an encounter occurs between a level 1 agent and a level 0 agent. The three possible outcomes are determined by a random variable $B_s$ which takes the values $1, 0, -1$ with probabilities $p_1$, $1 - p_1 - p_0$, $p_0$. If $B_s = 1$, the level 0 agent changes his belief to 1; if $B_s = -1$, the level 1 agent changes his belief to 0; if $B_s = 0$, no one changes his belief.
The probability $p_i$ is equal to the probability that the agent at level $i$ influences his partner times the probability that he is chosen to be the leading agent in the interaction (this probability is always 1/2 since all the agents have the same confidence).
Since we want to capture the fact that better knowledge of the environment makes an agent more convincing, we will require that $p_2 > p_1 > p_0$, or equivalently that $E(A) = p_2 - p_1 > 0$ and $E(B) = p_1 - p_0 > 0$. This corresponds to the fact that the expected evolution of beliefs favors a transition towards better knowledge.
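For completeness, the two expectations follow directly from the distributions of $A_s$ and $B_s$ given above; the short computation below uses only those distributions and introduces no further assumption.

```latex
\begin{align*}
E(A) &= (+1)\,p_2 + 0\,(1 - p_2 - p_1) + (-1)\,p_1 = p_2 - p_1, \\
E(B) &= (+1)\,p_1 + 0\,(1 - p_1 - p_0) + (-1)\,p_0 = p_1 - p_0.
\end{align*}
```

Both quantities are therefore positive exactly when $p_2 > p_1 > p_0$.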
The evolution of beliefs occurs at each step in time and is modeled with the two sequences of variables $(A_s)_{s \geq 1}$ and $(B_s)_{s \geq 1}$. The variables in the sequence $(A_s)_{s \geq 1}$ are independent and identically distributed. The variables in the sequence $(B_s)_{s \geq 1}$ are also independent and identically distributed (the laws of the variables in the two sequences are not necessarily the same). Moreover, the variables in the sequence $(A_s)_{s \geq 1}$ are independent from those in the sequence $(B_s)_{s \geq 1}$.
We note that the system $(N_0(t), N_2(t))$, which evolves as a function of the random variables $(Z_s)_{s \geq 1}$, $(A_s)_{s \geq 1}$ and $(B_s)_{s \geq 1}$, is well defined without reference to $N_1(t)$, but it only coincides with the belief evolution as long as $N_1(t) > 0$.
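As a sketch, the verbal description of $Z_s$, $A_s$ and $B_s$ can be summarized by the following sums. The indicator notation is our own shorthand and rests on the assumption that each step $s$ changes the belief of at most one agent; it is not necessarily the form of the original definition.

```latex
\begin{align*}
N_2(t) &= N_2(0) + \sum_{s=1}^{t} \mathbf{1}_{\{Z_s = 1\}}\, A_s, \\
N_0(t) &= N_0(0) - \sum_{s=1}^{t} \mathbf{1}_{\{Z_s = -1\}}\, B_s, \\
N_1(t) &= N - N_0(t) - N_2(t).
\end{align*}
```

The signs reflect the conventions above: $A_s = 1$ moves one agent from level 1 to level 2, whereas $B_s = 1$ moves one agent from level 0 to level 1 and thus decreases $N_0$.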

Having defined above the quantities $(N_0(t), N_1(t), N_2(t))$, which represent the numbers of agents with beliefs 0, 1 and 2, we will be interested in the steady states that this system can reach. A steady state is a state which is permanent in the sense that the beliefs will not evolve any more. There are four types of steady states. The first three are $N_i(t) = N$ for some $i = 0, 1, 2$, which correspond to situations where all agents hold the identical belief $i$. The fourth type of absorbing state is that where there are no longer any agents with belief 1. In this case, communication barriers prevent influence between the agents with beliefs 0 and 2. This absorbing state occurs when $N_1(t) = N - N_0(t) - N_2(t) = 0$. An absorbing state of this type can be written as $(N_0(\infty), 0, N_2(\infty))$ such that $N_0(\infty) + N_2(\infty) = N$. We note that there cannot be any other type of steady state: if there are still agents who have belief 1 and at least one agent whose belief is different from 1, an encounter can occur and, with positive probability, the agent whose belief is not 1 changes his belief to 1, in contradiction with the assumption that we were in a steady state.
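The classification of steady states can be written as a small Python helper, assuming as above that all agents have confidence 1 and $K = 2$. The function name and the returned labels are hypothetical and only serve the illustration.

```python
def classify_state(n0: int, n1: int, n2: int) -> str:
    """Classify a state (counts of level 0, 1 and 2 agents) for K = 2, confidence 1."""
    total = n0 + n1 + n2
    if total == 0 or min(n0, n1, n2) < 0:
        raise ValueError("counts must be non-negative and not all zero")
    if n0 == total:
        return "absorbing: all agents hold belief 0"
    if n1 == total:
        return "absorbing: all agents hold belief 1"
    if n2 == total:
        return "absorbing: all agents hold belief 2"
    if n1 == 0:
        # Only beliefs 0 and 2 remain, and these agents never influence each other.
        return "absorbing: no agent with belief 1 remains"
    # Some level 1 agent can still meet an agent with a different belief.
    return "transient"
```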
As said before, we assume that better knowledge of the environment translates into greater ability to convince. In other words, the expected evolution of beliefs makes it more likely to move from 0 to 1 and from 1 to 2 than in the reverse direction, that is, $E(A) > 0$ and $E(B) > 0$. For this reason, it is easy to see that when the number of agents $N$ is large, absorbing states will be either $N_2(\infty) = N$, that is, all agents have learned the true state of the world, or states of the last type where $N_1(\infty) = 0$. The other two states require flows that are in contradiction with expected behavior and are untypical in large populations.
We will begin by showing that without communication barriers due to the presence of confident agents, with high probability, all the agents in a sufficiently large population will eventually learn the correct state of the environment, provided that we start from an initial condition where a positive fraction of agents know the true state of the world (proof in appendix). This case can be seen as a benchmark with which we can compare the results we obtain when we introduce confident agents in the influence process. We will show that the presence of confident agents leads to a qualitative difference in the long run outcome. Now, starting from any initial condition where each belief is held by a positive fraction of the population, we reach a steady state where a significant proportion of the agents are permanently stuck with their incorrect beliefs.
Proposition . For any initial condition such that $n_0 > 0$, $n_2 > 0$, and for any $\delta > 0$, there exists $N_0$ such that if $N > N_0$, the probability is at least .
Moreover, we can analyze how different model parameters influence the fraction of agents who remain in a state of low knowledge:
Proposition . $N_0(\infty)/N$, the asymptotic fraction of agents that remain in level 0, is:
• increasing in $n_0$ (assuming that an increase in $n_0$ is compensated by a decrease in $n_1$, holding $n_2$ constant)
• increasing in $n_2$ (assuming that an increase in $n_2$ is compensated by a decrease in $n_1$, holding $n_0$ constant).
The proof of this proposition can be found in the section on comparative statics at the end of the paper.
The lower bound in the proposition allows us to identify factors that lead to inefficient collective learning. On the one hand, we can see that if $E(A) \gg E(B)$, the fraction of agents who are asymptotically in a state of low knowledge is close to $n_0$. In other words, most of the agents who were initially in a state of low knowledge will remain in a state of low knowledge. On the other hand, the lower bound also depends on the initial condition $(n_0, n_1, n_2)$.
We note that a higher initial level of knowledge does not necessarily lead to better long run outcomes. For example, $(n_0, n_1 - t, n_2 + t)$ can lead to worse long run learning than $(n_0, n_1, n_2)$.

Main intuitions behind results
The mechanism that accounts for the described outcome is quite intuitive. Absorbing states where $N_0(\infty) > 0$ occur because there are no longer any agents with belief 1 who ensure the communication. The fact that a positive fraction of agents remain in level 0 is explained by the encounter probabilities alone. Since $E(B) > 0$, there is an expected flow out from level 0. Since encounters occur uniformly at random, eventually the population in level 0 is very small in size whereas that in level 2 has grown considerably. The probability of encountering the remaining agents in level 0 is very low. However, there are other important factors that determine whether a larger, non-negligible fraction of the population will remain in the low state of knowledge. It is only in this case that we can really say that social learning is inefficient. When $E(A)$ is large compared to $E(B)$, it means that typically the net rate of agents who move from belief 1 to belief 2 is greater than the net rate of moves from 0 to 1. If the number of agents initially in belief 1 is not too large compared to the number of agents in level 0, it is likely that all agents with belief 1 move to belief 2 before most of the agents in belief 0 move to belief 1. Clearly, the long run outcome also depends on the relative sizes of the populations with beliefs 0, 1 and 2. However, the most interesting observation is probably that, in the simplified belief environment, the fact that agents with correct beliefs are much more convincing does not improve social learning in the population as a whole.

Table : Average number of non-knowledgeable agents at the end of a simulation (average over simulation runs), starting with agents at each level at initialization. $P_0 = 10$, $P_1 < P_2$. The higher $P_1$, the better the learning; the higher $P_2$, the worse the learning, which is counter-intuitive.

A simulation model based on the analytical model
The analytical results we provide are asymptotic, meaning that they are valid when the number of agents and the number of time-steps tend to infinity. It is natural to make such assumptions to obtain analytical results, but this is not necessarily the time scale that is relevant to the socio-environmental dynamics we are dealing with in our story. Hence, we copied the logic of the analytical model into a new simulation model, which greatly simplifies the previous one. The model was written with NetLogo and can be found with a short description at: https://www.openabm.org/model//version/ /view.
At initialization, agents are created and attributed a level of knowledge (0, 1 or 2). Whenever an encounter between two agents occurs, each of the agents is drawn to be the leader with probability 1/2. The probability (expressed in what follows as a percentage) that the leader influences the follower is then respectively $P_0$, $P_1$ and $P_2$ for agents of knowledge level 0, 1 and 2. We have $P_0/2 = p_0$, $P_1/2 = p_1$, $P_2/2 = p_2$, since the transition probabilities defined in the previous sections are given by the probability that each agent is chosen to be the influencer in the encounter times the probability that he actually influences his partner.

The value of the probability to influence is initialized for each level of knowledge, with the constraint that $P_0 < P_1 < P_2$, since the most knowledgeable agents are those who influence most, as described in both previous models. At each time-step, a level 1 agent is chosen and another agent is picked among level 0 and level 2 agents. One of these two agents is chosen randomly as the leader and we determine, using the associated probability, whether he will influence the other agent. If so, the level of knowledge of the influenced agent changes and becomes the same as that of the leader. The simulation stops when no agent of level 1 remains.
In our dynamics, we do not consider meetings between agents that cannot result in a change of opinion for the agents involved, and since we are in a setting with agents who are self-assured (with a communication barrier), a level 1 agent is always chosen first.
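The dynamics just described can be sketched in Python as follows. This is an illustrative re-implementation under stated assumptions, not the NetLogo code of the published model: the function name `simulate`, the `max_steps` safety bound, the additional stop when only level 1 agents remain, and the choice of picking the partner uniformly among all level 0 and level 2 agents are hypothetical.

```python
import random


def simulate(n0, n1, n2, P0, P1, P2, rng, max_steps=10_000_000):
    """Run the simplified process until no level 1 agent remains.

    n0, n1, n2: initial numbers of level 0, 1 and 2 agents
    P0, P1, P2: influence probabilities of a leader, in percent
    rng:        random.Random instance
    Returns the final counts (n0, n1, n2).
    """
    influence = {0: P0 / 100.0, 1: P1 / 100.0, 2: P2 / 100.0}
    counts = {0: n0, 1: n1, 2: n2}
    steps = 0
    # Also stop if only level 1 agents remain (no partner can be picked).
    while counts[1] > 0 and counts[0] + counts[2] > 0 and steps < max_steps:
        # A level 1 agent meets a partner drawn uniformly among the
        # level 0 and level 2 agents (uniform draw is an assumption).
        others = counts[0] + counts[2]
        partner = 0 if rng.random() * others < counts[0] else 2
        # Each of the two agents is the leader with probability 1/2.
        leader, follower = (1, partner) if rng.random() < 0.5 else (partner, 1)
        # The leader influences the follower with his own probability.
        if rng.random() < influence[leader]:
            counts[follower] -= 1
            counts[leader] += 1
        steps += 1
    return counts[0], counts[1], counts[2]
```

A run such as `simulate(50, 50, 50, 20, 70, 80, random.Random(1))` returns the final numbers of agents at each level; averaging the first component over many seeds gives the kind of quantity reported in the tables. The population sizes used here are arbitrary.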
What we can expect from the simulation, if we assume that it will behave like the analytical model although we now consider a finite population size, is that:
• many agents remain in the bad knowledge situation when $(P_1 - P_0) < (P_2 - P_1)$;
• if we fix the probabilities, it is better to have the initial $N_2(0)$ not too high compared to $N_1(0)$, so as to achieve complete learning.
We first ran simulations with agents at each level, varying the probability to influence. The results can be observed in figures to . In the first setting, both differences between probabilities are equal, and thus level 0 agents who do not learn are numerous (fig. ). In the case when $P_2 - P_1 = 8$ and $P_1 - P_0 = 52$, all level 0 agents get to level 2 in almost all simulations (fig. ); whereas if the difference of probability is the same, with value , there are still many level 0 agents at the end of the simulation (fig. ). Of course, if $P_2 - P_1 > P_1 - P_0$ then there are still many level 0 agents at the end (e.g. fig. ). Finally, it is only when the difference $(P_1 - P_0) - (P_2 - P_1)$ is really large that we can bring all agents to learn in most simulations. In all other cases, there are still agents of level 0 at the end (e.g. fig. , fig. and fig. ). These results are summarized in table , where an increase in $P_2$ holding other probabilities constant reduces global learning, and an increase in $P_1$ holding other probabilities constant increases global learning. What we can conclude from this set of simulations is that our first expectation is realized in the simulation model.

Hence we can see that the result holds, and that the most important element is the relative ability to convince of agents at level and .

Figure : This image shows the final setting of a simulation (in the center), with green agents being level 2 agents and red ones being level 0 agents. The dynamics of the number of agents at each level can be seen on the top right graph, where the yellow curve represents the number of level 1 agents. This case has a sufficient difference of probability for both influence interactions: during the whole simulation there are enough level 1 agents to bring all level 0 agents to knowledge, and then they also turn into level 2 agents at the end.

Table : Average number of non-knowledgeable agents at the end of a simulation (over simulation runs for each value). We keep P 0; P 1; P 2 = 20; 70; 80, which is a situation where learning is good. Increasing N 0 (0) reduces learning, but increasing N 2 (0) has an even stronger effect on the reduction of learning. This is in line with our analytical result but is rather counter-intuitive.

Initial number of agents
In the second set of simulations, we kept the probabilities unchanged, using what was previously a favorable setting with P 0 = 20%, P 1 = 70% and P 2 = 80%, and varied the initial number of agents in each group. We first kept level agents and varied the other values, and then kept the number of level agents at and varied the others, while keeping their sum constantly equal to .
These results show that the initial number of agents in each category is also important in a finite population, and that its influence follows the rule established in the analytical model, even when the number of time steps and the number of agents are finite.
Some aspects of these global results are rather counter-intuitive, such as the fact that better transmission of good information leads to worse collective learning. This can be observed in the former table where, for increasing values of P 2, we also produce an increasing final number of non-knowledgeable agents. Here, good dyadic learning does not imply good collective learning.

Figure : This image shows the final setting of a simulation (in the center), with green agents being level 2 agents and red ones being level 0 agents. The dynamics of the number of agents at each level can be seen on the top right graph, where the yellow curve represents the number of level 1 agents. This case is with the same difference in probability (P 1 − P 0 = P 2 − P 1): the number of level 1 agents drops too quickly, since they gain more knowledge, and then they can no longer influence the level 0 agents.

Discussion
The new analysis we conducted, rewriting the idea of our model with a new methodology, confirms the main result of the previously published work.
If we define an individual with high confidence as having difficulty learning from those whose beliefs are too different from his own, then the presence of overconfident agents, who believe that they are correct when they are not, does have a significant negative impact on the level of collective learning.
In this context, which gives rise to a communication barrier between agents with different beliefs, the result had been shown in a complex environment, through simulation. Here we show that it holds asymptotically (very large number of agents and repeated interactions) with a simplified representation of the environment. By using the analytical model to build new simulations with a moderate number of agents, we can also check that this simplified model can be used in contexts that can be interpreted as real-life situations, where convergence occurs in a population of reasonable size and after a reasonable number of pairwise interactions.
The mechanism we identified in this study can be explained in a simple way: agents refuse to communicate with others when beliefs are too far apart. Initially, "moderate" (moderately knowledgeable) agents ensure interactions between the informed and the ignorant. However, these agents eventually adopt the views of the better informed agents, since the latter are more convincing. A group of agents with very low knowledge is then left behind, and no intermediate agents are left to ensure communication between them and the informed agents. In some sense, all those whose initial beliefs are not too incorrect will learn the true state quickly, but the others are left behind. It is interesting to note that the rapid initial success of the informed agents may not be efficient for learning in the population as a whole in the long run. The moderately informed agents learn quickly, but this leaves the mistaken agents isolated and creates polarization.
It is interesting to note that what causes the bad learning dynamics is mainly the lack of moderately knowledgeable agents, rather than the lack of highly informed agents or a high number of agents whose initial beliefs are incorrect. It is not necessarily good for global outcomes that perfectly knowledgeable agents exert a strong influence, that is, that the power of persuasion increases too much with the quality of knowledge.
We can express sufficient conditions for bad collective learning:
• the initial fraction of agents with an intermediate level of knowledge is small;
• the likelihood of persuading one's partner is convex with respect to the level of knowledge.

The analytical study of a simplified model is what allowed us to make this mechanism visible, and thus to show that communication barriers are a major issue in the management of collective learning. However, the result about convexity, which holds in the simplified model, does not necessarily hold in a more complex representation of the belief environment. Indeed, as can be seen in Rouchier & Tanimura ( ), convexity and concavity do not produce straightforward properties: increasing the convexity by starting from a linear reaction to the environment does reduce learning, but so does moving from the linear to the concave case. In our complex setting the best learning occurs when the success of transmission of good information is linear in the quality of this information. Intuitively, in the new setting this can be seen as corresponding to a situation where (P 1 − P 0) = (P 2 − P 1), but it is rather clear that the representation in one model cannot be translated into the other so easily. Hence, the main result can be explained and proved, but the same is not true for the more detailed properties of the original model. Indeed, one has to bear in mind that the main difference between RT and this analytical model is the notion of "correct" and "incorrect" beliefs. In the complex setting, there is one way to be correct, but many ways to be incorrect.
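In the three-level setting, the convexity condition can be written out explicitly. The restatement below is a sketch that assumes "convex" is read as a positive discrete second difference of the persuasion probabilities P 0, P 1, P 2; this is our reading, chosen because it matches the condition (P 1 − P 0) < (P 2 − P 1) used for the simulations.

```latex
\[
P_2 - P_1 > P_1 - P_0
\;\iff\;
P_0 - 2P_1 + P_2 > 0
\;\iff\;
P_1 < \frac{P_0 + P_2}{2}.
\]
```

For instance, with P 0 = 20%, P 1 = 70% and P 2 = 80%, the midpoint (P 0 + P 2)/2 is 50%, which is below P 1 = 70%; under this reading the persuasion probability is concave rather than convex, in line with this being a setting where learning is good.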

As is usually the case in a model-to-model comparison process, the analytical model is very different from the initial simulation model, and requires a completely new way of phrasing the problem. This second step can be made only when the simulation model has provided us with some intuitive hypothesis that can then be verified in a much simpler setting. Developing a new model which conserves important features of the original one but is more tractable is an interesting creative challenge, which also requires finding appropriate analytical methods for studying models and problems originating in agent-based modeling. In our case, where there is randomness in the encounters that occur and in their outcomes, it was natural to take a probabilistic approach, focusing on "typical" outcomes in the long run.

No communication barrier random walk
Let us first consider the case where there are no communication barriers. In this situation, the dynamics end in one of three possible states: N 0 (∞) = N, N 1 (∞) = N or N 2 (∞) = N. Let us show that if N is sufficiently large, with high probability we reach an outcome where all agents learn the true state of the world, i.e. the last case.
We will be interested in N 2 (t), the number of agents with a correct belief. Let us consider only the transitions of agents in and out of level 2. We disregard whether the agents who are not in level 2 are in level 0 or level 1. When an encounter occurs between an agent in level 1 and an agent in level 2, the probability that the agent in level 1 moves to level 2 is p 2, the probability that the agent in level 2 moves to level 1 is p 1, and the probability that nobody moves is 1 − (p 1 + p 2). Conditioning on the event that someone moves, the probability of moving to level 2 is p 2 /(p 1 + p 2) and the probability of moving to level 1 is p 1 /(p 1 + p 2) (the steps where nobody moves affect convergence time but not the direction of the dynamics, and can be ignored). Similarly, we can consider the probability of moving from level 0 to level 2 and conversely: when a movement occurs, the probability of moving to level 2 is p 2 /(p 0 + p 2) and that of moving to level 0 is p 0 /(p 0 + p 2). The probability of moving to level

Figure : This image shows the final setting of a simulation (in the center), with green agents being level 2 agents and red ones being level 0 agents. The dynamics of the number of agents at each level can be seen on the top right graph, where the yellow curve represents the number of level 1 agents. Here the relative influence of level 2 agents is much higher than the relative influence of level 1 agents, and this results in very bad learning in the group.

Figure : This image shows the final setting of a simulation (in the center), with green agents being level 2 agents and red ones being level 0 agents. The dynamics of the number of agents at each level can be seen on the top right graph, where the yellow curve represents the number of level 1 agents. Here (P 1 − P 0) − (P 2 − P 1) is not high enough to have complete circulation of knowledge in the group.