Problem Solving: When Groups Perform Better Than Teammates

Abstract: People tend to form groups when they have to solve difficult problems because groups seem to have better problem-solving capabilities than individuals. Indeed, during their evolution, human beings learned that cooperation is frequently an optimal strategy to solve hard problems both quickly and accurately. The ability of a group to determine a solution to a given problem, when group members alone cannot, has been called "Collective Intelligence". Such an emergent property of the group as a whole is the result of a complex interaction between many factors. Here, we propose a simple and analytically solvable model disentangling the direct link between collective intelligence and the average intelligence of group members. We found that there is a non-linear relation between the collective intelligence of a group and the average intelligence quotient of its members, depending on task difficulty. We found three regimes: for simple tasks, the level of collective intelligence of a group is a decreasing function of teammates' intelligence quotient; for tasks of intermediate difficulty, the relation between collective intelligence and intelligence quotient is non-monotone; for complex tasks, the level of collective intelligence of a group monotonically increases with teammates' intelligence quotient, with phase transitions emerging when varying the latter's level. Although simple and abstract, our model paves the way for future experimental explorations of the link between task complexity, individual intelligence and group performance.


Introduction
"The whole is more than the sum of its parts" (Anderson; Baumeister et al.) is a leitmotif of complex systems research and lies at the root of the concept of emergence. This sentence has been invoked in many scenarios to explain the onset of ordered self-organised structures (Nicolis & Nicolis). In this work, we propose an application of this "motto" to the social sciences, in particular to the process of task solving. Indeed human beings, like many other social animals, can organise themselves into groups to solve tasks that a single individual is unable to complete (Smith). This can be an incremental support: by joining their strengths, agents become able to overcome a hurdle that none of them could overcome alone. It can also be a coordination step where each agent brings her own knowledge to create something new: a (social) group knowledge capable of solving higher-level tasks. The latter can be named Collective Intelligence (CI), and it is a measure of the advantage of being in a group compared to isolated individuals. In this view, Szuba defined the CI as the property of a social structure that originates when individuals interact and results in the acquisition of the ability to solve new or more complex problems.
The process of social problem-solving that the group implements in order to solve higher-level tasks is the result of group members' cooperation and competition and of their ability to share knowledge. Thus, people tend to turn to groups when they have to untangle complex problems because they believe that groups have better problem-solving skills than a single individual (Forsyth). Researchers have found that the self-organisation of human crowds, nowadays enhanced by communication technologies that simplify the spread of information, can lead to original ideas and to solutions of notoriously hard problems. These include designing RNA molecules (Lee et al.), computing crystal structures (Horowitz et al.), improving medical diagnostics (Kurvers et al.), predicting protein structures (Cooper et al.), proving mathematical theorems (Gowers & Nielsen), collaborative mathematics such as the polymath blog (Gowers), and even solving quantum mechanics problems (Sørensen et al.).
As previously stated, the CI can be defined as the difference between the rate of success of the group on a specific problem and the average rate of success of its members on the same problem. It is thus an emergent property of the group as a whole, not reducible to the simple sum of its members' individual intelligence. The available literature shows that group performance is affected by several factors, such as: the characteristics of the group members, the group structure that regulates collective behaviour (Woolley et al.), the context in which the group works (Barlow & Dennis), the cognitive processes underlying social problem-solving reasoning (Heylighen), the average of members' Intelligence Quotients (Bates & Gupta), and the structure (Credé & Howardson; Lam) and complexity of the problem to be solved (Guazzini et al.; Capraro & Cococcioni; Guazzini et al.; Moore & Tenbrunsel).
During the past few years, the CI has attracted considerable interest from the scientific community, notably on the empirical research side, as individual intelligence did in the last decades. Individual intelligence has been defined as the ability of human beings to solve a wide variety of tasks (Gardner). Adopting an analogous point of view, the CI has been defined as a general factor able to explain the "group performance on a wide variety of tasks" (Woolley et al.). According to the most recent studies, the latter is able to predict about 43% of the variance of group performance and it is strongly correlated with three different variables: the first is the variance of the conversation turnover, the second is the proportion of women in the group, and the last is the average of members' abilities in the theory of mind (Engel et al.; Woolley et al.). The same studies find that the average of the teammates' intelligence quotient (IQ) is not a significant predictor of group performance (r = 0.18) (Woolley et al.).
Although several studies have provided empirical evidence for the existence of a unique factor capable of explaining a large part of group performance, some recent works aimed at revising the dimensionality of such a model (Graf et al.; Bates & Gupta; Credé & Howardson). In particular, a recent re-analysis of the four main empirical studies in the field of CI (Barlow & Dennis; Engel et al.; Woolley et al.) does not support the hypothesis of a general factor able to explain the performance variation across a wide variety of group-based tasks (Credé & Howardson). Studies conducted in an online environment support the claim that CI would manifest itself differently depending on the context (Barlow & Dennis). Furthermore, the literature suggests the existence of different models of CI able to explain the variance of group performance for different kinds of task (Credé & Howardson; Wildman et al.). In this regard, Lam showed how the structure of the task affected the quality of group communications and decisions. Finally, a complete replication of some standard studies aimed at characterising and measuring the CI, conducted with a different sample, showed that group performance can sometimes be significantly and strongly correlated with IQ. In particular, Bates & Gupta reported that in their experiments the CI was completely indistinguishable from the members' Intelligence Quotients.

Given the different, and sometimes contradictory, empirical results available about the CI and its characterisation in terms of relevant variables, the aim of our work is to shed some light on this issue using a mathematical approach. Even though the literature has identified a relationship among CI, group structure and task complexity (Capraro & Cococcioni; Guazzini et al.; Moore & Tenbrunsel), the limited number of studies in this field leaves these dynamics still elusive. In particular, it is possible to hypothesise the existence of a non-linear interaction between the potential of the group, e.g., the average of members' intelligence, and the difficulty of the problem that the group has to solve. This interaction may explain the variance of group performance reported in the literature. Thus, it could be relevant for this research field to clarify the intertwined relationship between the two fundamental dimensions introduced above (i.e., task complexity and members' intelligence) in order to develop a model that helps to understand group performance. A possible way to achieve this goal is to analyse the cognitive processes underlying the social problem-solving dynamics developed inside the group. In this regard, Heylighen, through an interesting formal model, suggested that groups solving a task would develop a Collective Mental Map (CMM) as a product of the interaction between some psychosocial processes, such as cross-cueing (Meudell et al.) and information and knowledge sharing. Heylighen's framework allows one to study the CI dynamics by taking into account the merging of the group members' representations of the problem, labelled Member Maps (MMs), into a single representation. The MMs are usually defined as composed of a set of problem states, a set of possible steps for the solution of the task, and a preference fitness criterion for selecting the preferred actions (Heylighen).
Here, adopting the Heylighen framework, we propose a mathematical model of CI able to shed light on the process resulting from the interaction between the average of group members' intelligence and the difficulty of the task, and thus open the way to a better understanding of the CI.
The paper is organised as follows. In the next section we introduce the formal model based on the Heylighen hypothesis and an operational definition of CI. Then, we present the theoretical consequences of our model and corroborate them with some numerical results. We finally conclude our study with a discussion of the potential follow-ups of this work and of the empirical studies one could organise to test its validity.

Methods
As already stated, our work is grounded on the Heylighen theoretical model of collective mental maps. The stylised model we derive allows us to provide a possible explanation of the discordant results recently presented in the literature about the CI and its determining factors, more precisely the existence of a correlation between the CI and the teammates' IQ (Bates & Gupta) and the absence of such a correlation (Woolley et al.). Based on simple rules inspired by cognitive processes, we will build a formal model that will help us to unravel some open issues about the emergence of CI. We anticipate that our model will be deliberately abstract, in order to clearly identify the main driver of the emergence of collective intelligence, but general enough to allow for practical implementation in a real experiment.

The Heylighen hypothesis
Assuming the Heylighen framework to study the dynamics under scrutiny, it appears necessary to distinguish between the constructs of intelligence and knowledge. One of the most widely shared definitions of intelligence suggests separating fluid intelligence from crystallised intelligence. Fluid intelligence is a set of skills and abilities useful during reasoning processes and in the acquisition of new knowledge (Bates & Shieles; Stankov). Crystallised intelligence is the set of already stored knowledge needed for effective problem-solving reasoning (Bates & Shieles; Horn & Masunaga). In the light of this distinction between the two sub-components of intelligence, the crystallised and the fluid one, we can assume that the individual Intelligence Quotient is the result of a complex function of several factors, including the stored knowledge (Cattell). The latter is a factor of interest because it is both cause and effect of the IQ: the ability to acquire knowledge depends on fluid intelligence (Beier & Ackerman), and, at the same time, the ability to solve a wide variety of tasks, namely IQ (Spearman), is determined by the previously stored knowledge, or crystallised intelligence (Bates & Shieles). So, we can argue that the agents' knowledge is the best marker for their intelligence.
Adopting the Heylighen topological metaphor of human intelligence, the knowledge of the i-th agent can be described by a vector made of D entries (knowledge nodes), denoting the previously introduced set of problem states or set of possible steps for the solution of the task, $K^{(i)} = (K^{(i)}_1, \dots, K^{(i)}_D)$. In the following, we will refer to each entry as a generic agent's knowledge on a topic. To simplify the notation, we assume this vector to be normalised, such that its components take values in [0, 1]: the smaller (resp. the larger) the value, the lower (resp. the higher) the knowledge on this specific topic. The total knowledge of an agent can be captured by the sum of the entries of her knowledge vector, $\sum_j K^{(i)}_j$; the latter can thus be used as a proxy of the Intelligence Quotient (IQ) of the agent. Adopting such a metaphor, a schematic representation of the agent knowledge vector is reported in the corresponding figure. For each of the D topics, the agent has a knowledge level schematically represented as a bar of different height.
For the sake of simplicity, we assume that an agent is able to solve a task with difficulty τ, a real number in [0, 1], if all the entries in her knowledge vector are larger than τ, namely $\min_j K^{(i)}_j \geq \tau$ (see the figure on agent knowledge). Let us observe that the dimension D contributes, even if indirectly, to making a task hard or not: indeed, if D is large it can be difficult (i.e., less probable) for the agents to have all the entries of their knowledge vector larger than τ. Secondly, we remark that we can relax the definition of task difficulty by assuming the need for different levels of knowledge in each topic to achieve a task, that is, the latter would be a D-dimensional vector, $\tau = (\tau_1, \dots, \tau_D) \in [0, 1]^D$, and an agent would be able to solve this task if $K^{(i)}_j \geq \tau_j$ for all $j = 1, \dots, D$. For the sake of simplicity, we decided to adopt the former, simpler assumption and we reserve the latter for a more detailed further analysis.
Figure: Agent knowledge and agent solving a task. Each coloured bar schematically represents the agent's knowledge on a given topic, the higher the bar the better the knowledge; the set of all the bars represents the knowledge vector of the agent. τ (horizontal dashed line) is the task difficulty: an agent is able to solve the task if she exceeds τ in all the topics. The blue agent on the left is unable to solve the task, while the orange one on the right can do it.
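The solvability rule above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names and the numerical knowledge values for the "blue" and "orange" agents are hypothetical, chosen only to mimic the situation described in the figure caption.

```python
def iq_proxy(knowledge):
    """Total knowledge of an agent: the sum of the entries of her knowledge vector."""
    return sum(knowledge)


def can_solve(knowledge, tau):
    """Scalar difficulty: the agent succeeds if every entry reaches tau."""
    return min(knowledge) >= tau


def can_solve_vector(knowledge, tau_vec):
    """Relaxed definition: topic j requires its own level tau_vec[j]."""
    if len(knowledge) != len(tau_vec):
        raise ValueError("knowledge and tau_vec must have the same length")
    return all(k >= t for k, t in zip(knowledge, tau_vec))


# Hypothetical knowledge vectors with D = 4 topics (illustrative values only).
blue = [0.9, 0.4, 0.7, 0.8]      # one weak topic
orange = [0.7, 0.8, 0.65, 0.9]   # above the threshold everywhere
tau = 0.6

print(can_solve(blue, tau), can_solve(orange, tau))
print(can_solve_vector(blue, [0.6, 0.3, 0.6, 0.6]))
```

With the scalar threshold a single weak topic is enough to make the agent fail, whatever her total knowledge; the vector-valued variant lets each topic carry its own requirement.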
The last required ingredient is a set of rules driving the merging of different mental maps (agents' knowledge) into a common one (group knowledge) in order to model the group problem-solving process. Since no previous research explored the connection between task difficulty and group knowledge potential in order to explain the CI dynamics, we choose to build an abstract model under the assumption of perfect communication between group members, thus neglecting in first approximation all the biases that could affect the group discussion and decision making or problem solving. In this way, we will be able to capture the main driver of the emergence of collective intelligence. Let us observe that some of the above-mentioned biases could be easily inserted in the model (see Discussion and Model Documentation Sections); we nevertheless stick to our initial choice in this first analysis.
In particular, Heylighen suggested that groups facing a given task develop CMMs (Heylighen) that, in the absence of any communication issue and/or social hierarchy, can be obtained by agents "juxtaposing" their knowledge vectors. That is, the CMM is a D-dimensional vector, $G = (G_1, \dots, G_D)$, whose entries are the "best ones", i.e., the ones with the largest values among the agents, more precisely $G_j = \max_i K^{(i)}_j$ (see the figure on the Collective Mental Map). The total knowledge of a group can be measured by the sum of the agents' IQ, that is $IQ^{(G)} = \sum_i IQ^{(i)}$.
Based on the above, a group is able to solve a task of difficulty τ ∈ [0, 1] if $\min_j G_j \geq \tau$. Clearly, if the group contains agents capable of solving on their own a task of a given difficulty, the group would also solve it, but in this case the CI will be null because there is no added value in being together. On the other hand, a group made of agents unable to solve individually a task of a given difficulty, but excelling in sufficiently many different topics, could perform well and solve a problem where each agent would fail. In this latter case one can consider such an achievement an emergent property of the group and assign it a large CI (see the figure on the Collective Mental Map).
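As a hedged sketch of the juxtaposition rule, the snippet below builds the CMM as the topic-wise maximum and compares the group outcome with the individual ones. The three knowledge vectors are hypothetical values picked so that every agent has at least one topic below the threshold while the group has none.

```python
def collective_mental_map(agents):
    """Topic-wise maximum over the agents' knowledge vectors: G_j = max_i K_j^(i)."""
    return [max(column) for column in zip(*agents)]


def can_solve(knowledge, tau):
    """A knowledge vector solves the task if all its entries reach tau."""
    return min(knowledge) >= tau


# Hypothetical group of N = 3 agents and D = 3 topics (illustrative values).
agents = [
    [0.9, 0.3, 0.2],
    [0.2, 0.8, 0.4],
    [0.1, 0.5, 0.7],
]
tau = 0.6

cmm = collective_mental_map(agents)
individual_rate = sum(can_solve(a, tau) for a in agents) / len(agents)
group_success = can_solve(cmm, tau)

# For this single realisation, the gain of the group over the average agent.
gain = float(group_success) - individual_rate
print(cmm, individual_rate, group_success, gain)
```

Here no agent can solve the task alone, yet the merged map exceeds τ on every topic, which is the situation the text associates with a large CI.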

An operational definition of Collective Intelligence
Given a task of difficulty τ ∈ [0, 1] in a knowledge space of D dimensions, we can define the CI as the difference between the rate of success of the group and the rate of success of the average agent composing the group. This function thus depends on τ and D; it is non-negative, and positive values are associated with tasks that are too hard for the individual agent but solvable by the group.
Let us consider a group made of N agents and a task with difficulty τ in a D-dimensional knowledge space. Assume also that the knowledge of the i-th agent on the j-th topic is a stochastic variable with a probability distribution p(x). Without loss of generality, we hereby hypothesise that p(x) is the same for all topics; observe however that this working assumption will not substantially modify our conclusions. One can thus determine (see Appendix A for more details) the probability for the i-th agent to exceed a level τ of knowledge on the j-th topic. From the assumption that the entries of the knowledge vector are i.i.d. random variables, we can obtain the probability, $\pi^{(i)}(\tau, D)$, that the i-th agent is able to solve the D-dimensional task characterised by a difficulty τ, namely to exceed the level τ on all the D dimensions of the task.
Figure: Collective Mental Map and group task solving. The group is made of N agents, each one endowed with her knowledge vector, i.e. the set of coloured bars representing the level of knowledge on each topic. τ (horizontal dashed line) is the task difficulty; no single agent is able to solve the task because there are topics where she does not reach the minimum required level τ. The Collective Mental Map is obtained by taking the best level in each topic among the agents, as the result of exchanges among the agents that are assumed to be free from any "transmission errors" and biases. The group is able to solve the task because it exceeds the threshold τ in all the topics.
By its very definition, the Collective Mental Map is obtained by letting the agents interact, compare and exchange their knowledge levels, eventually determining the group knowledge vector by taking the largest values among all the agents' knowledge vectors across each dimension of the task. Hence, from the previous result concerning each single agent, one can straightforwardly compute the probability for the j-th component of the Collective Mental Map to be larger than τ and, from it, the probability, say $\Pi_N(\tau, D)$, that the group solves the task (see Appendix A).
Finally, the Collective Intelligence of a group of size N is the difference between the previous two functions: $\mathrm{CI}_N(\tau, D) = \Pi_N(\tau, D) - \pi^{(i)}(\tau, D)$.
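Appendix A is not reproduced in this section. As a hedged sketch, the quantities defined above follow from the i.i.d. assumption alone; the symbol $F$, the cumulative distribution function of $p(x)$, is introduced here only for convenience and is not part of the original notation.

```latex
\begin{align}
F(\tau) &= \int_0^{\tau} p(x)\,dx , \\
\Pr\!\big(K^{(i)}_j \geq \tau\big) &= 1 - F(\tau) , \\
\pi^{(i)}(\tau, D) &= \big(1 - F(\tau)\big)^{D} , \\
\Pr\!\big(G_j \geq \tau\big) &= 1 - F(\tau)^{N} , \\
\Pi_N(\tau, D) &= \big(1 - F(\tau)^{N}\big)^{D} , \\
\mathrm{CI}_N(\tau, D) &= \big(1 - F(\tau)^{N}\big)^{D} - \big(1 - F(\tau)\big)^{D} .
\end{align}
```

Since $F(\tau)^N \leq F(\tau)$ for $F(\tau) \in [0, 1]$, the last expression is non-negative, in agreement with the definition given in the text, and it vanishes both when $F(\tau) \to 0$ (everyone succeeds) and when $F(\tau) \to 1$ (everyone fails).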

A Simple Agent Based Model
The process described in the previous section can be schematically represented by the flow diagram reported in the corresponding figure and is described in full by the algorithm presented in the Model Description Section. To initialise the model, we have to set the number of agents, N, the size of the knowledge vector, D, and the task difficulty, τ. Then, we have to fix the probability distribution of the entries of the knowledge vector; let us observe that we also allow for the possibility of choosing the number of agents actually discussing simultaneously, 2 ≤ k ≤ N. The process then runs and we can extract the results.
As already stated, we decided to concentrate on small groups, e.g. made of N = 5 agents, to assume perfect transmission among agents and to avoid any hierarchical structure in the group. Focusing on a small number of agents allows us to transfer our conclusions to real experiments and test them there; such experiments typically involve small groups, whose dynamics are simpler than those of larger groups, being characterised by no division into subgroups, rapid communication, and absence of delay or memory effects, just to mention a few aspects. These assumptions can be implemented in a simple ABM as follows. The N agents meet and discuss one among the D topics they have in their knowledge vectors; such a topic, say j, is randomly chosen with uniform probability from the D available ones. At the end of the interaction each agent leaves the group having learnt from the teammates the highest value on the topic under discussion, i.e. each agent replaces her $K^{(i)}_j$ with $\max_i K^{(i)}_j$. One time step is fixed by performing sufficiently many meetings, in such a way that all the D topics have been discussed at least once.
Figure: The flow diagram. We schematically represent the main steps of the (social) problem-solving process. In the block named "Problem solving: Agents", each agent tries to solve the task independently from the group. The phase called "CMM construction" can be assimilated to an ABM where teammates meet, compare and exchange their knowledge values, namely they organise a meeting. The last part, "Problem solving: Group", deals with the whole group facing the task and using the knowledge merged during the previous step. See also the algorithm presented in the Model Description Section.
Of course, we could relax the assumption that all agents interact at the same time and instead hypothesise that only k agents discuss simultaneously (2 ≤ k ≤ N). In our opinion, this scheme becomes relevant when N is so large that many-body interactions are forbidden by the system size and the cognitive capacities of agents. Since we focus on small groups, we did not adopt such a scheme.
At the beginning of the process, we evaluate each individual agent against the task. Then, after the meetings, we evaluate the whole group against the task and measure the (possible) improvement of the group over the single agent, which as previously stated is a proxy for the CI.
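The steps of this section can be sketched as a short simulation. This is an illustrative reading of the description, not the algorithm of the Model Description Section: the function names, the default parameters, the number of replicas and the choice of uniformly distributed knowledge entries are all assumptions made for the example.

```python
import random


def solves(knowledge, tau):
    """A knowledge vector solves the task if all its entries reach tau."""
    return min(knowledge) >= tau


def run_replica(n_agents, dim, tau, rng, sample=None):
    """One replica: individual evaluation, meetings, then group evaluation."""
    # Assumption: entries are uniform in [0, 1] unless another sampler is given.
    draw = sample if sample is not None else rng.random
    agents = [[draw() for _ in range(dim)] for _ in range(n_agents)]

    # "Problem solving: Agents" - success rate of the isolated agents.
    agent_rate = sum(solves(a, tau) for a in agents) / n_agents

    # "CMM construction" - meetings on uniformly chosen topics until
    # every topic has been discussed at least once.
    discussed = set()
    while len(discussed) < dim:
        j = rng.randrange(dim)
        best = max(a[j] for a in agents)
        for a in agents:
            a[j] = best  # everyone learns the best level on topic j
        discussed.add(j)

    # "Problem solving: Group" - after the meetings all agents hold the CMM.
    group_success = solves(agents[0], tau)
    return float(group_success) - agent_rate


def estimate_ci(n_agents=5, dim=5, tau=0.4, replicas=2000, seed=0):
    """Average gain of the group over the average agent across replicas."""
    rng = random.Random(seed)
    total = sum(run_replica(n_agents, dim, tau, rng) for _ in range(replicas))
    return total / replicas


def uniform_ci(n_agents, dim, tau):
    """Closed form for uniform entries, where P(entry < tau) = tau."""
    return (1 - tau ** n_agents) ** dim - (1 - tau) ** dim


print(estimate_ci(), uniform_ci(5, 5, 0.4))
```

Because communication is perfect and every topic is eventually discussed, the order of the meetings does not affect the final map; the loop is kept only to mirror the described process, and the closed form gives a quick sanity check on the estimate.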

Results
The ABM described in the previous section is schematically represented by the flow diagram and presented in the algorithm in the Model Description Section. This model can be analytically solved, as we will hereby show. First, we fix the distribution of agent knowledge; we assume agent knowledge on each topic to follow a unimodal distribution with a peak at some intermediate value β ∈ (0, 1), which thus denotes the most probable level of knowledge for each topic. For the sake of simplicity, we assume each entry in the agent knowledge vector to follow a "tent distribution" with parameter β (see the corresponding figure). In Appendix C we briefly present the case of uniformly distributed agent knowledge, and we observe that the results are qualitatively similar to the ones presented here.
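Under this assumption, the probability that an agent solves the task, and hence the CI (given by the equation in the Appendix), can be computed in closed form.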
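The explicit expressions are not legible in this version of the text. As a hedged reconstruction, assume that the "tent distribution" is the triangular density on [0, 1] with its peak at β; this reading is consistent with the description above but is an assumption, and $p_\beta$, $F_\beta$ are names introduced here. The success probabilities then follow directly from the i.i.d. hypothesis.

```latex
\begin{align}
p_\beta(x) &=
\begin{cases}
\dfrac{2x}{\beta}, & 0 \leq x \leq \beta, \\[2mm]
\dfrac{2(1 - x)}{1 - \beta}, & \beta < x \leq 1,
\end{cases}
\qquad
F_\beta(\tau) =
\begin{cases}
\dfrac{\tau^{2}}{\beta}, & 0 \leq \tau \leq \beta, \\[2mm]
1 - \dfrac{(1 - \tau)^{2}}{1 - \beta}, & \beta < \tau \leq 1,
\end{cases} \\
\pi(\tau, D) &= \big(1 - F_\beta(\tau)\big)^{D}, \qquad
\Pi_N(\tau, D) = \big(1 - F_\beta(\tau)^{N}\big)^{D}, \\
\mathrm{CI}_N(\tau, D) &= \Pi_N(\tau, D) - \pi(\tau, D).
\end{align}
```

The two branches of $F_\beta$ agree at $\tau = \beta$, where both give $F_\beta(\beta) = \beta$, so the resulting CI is continuous in both τ and β.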
To check the validity of the analytical results, we performed some dedicated numerical simulations of the ABM involving a small group made of N = 5 agents, each one endowed with a D = 5, 10 and 20 dimensional knowledge vector, whose entries are drawn according to the tent distribution with parameter β. We also let the group have a sufficient number of meetings to discuss every topic (see Appendix B for more details). Results are reported in the corresponding figure; in all cases the CI has a similar behaviour: it starts very small, then increases up to a maximum at a given value of the task difficulty, and then decreases again toward zero.
Indeed, for very easy tasks, τ ≪ 1, the CI is very small: both the average agent and the group are able to solve the task and thus there is no gain in participating in a group discussion. As the difficulty increases, the average agent is less and less capable of solving the task while the group performs excellently. For much larger thresholds, neither the agent nor the group is able to solve the task and thus the CI again becomes very low.
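The bell-like dependence on τ can be reproduced with a short analytical sketch. It assumes that the tent distribution is the triangular density on [0, 1] with peak β, and it uses β = 0.5 and D = 10 purely as illustrative values; the function names are hypothetical.

```python
def tent_cdf(x, beta):
    """CDF of the triangular ("tent") density on [0, 1] with peak at beta."""
    if x <= beta:
        return x * x / beta
    return 1.0 - (1.0 - x) ** 2 / (1.0 - beta)


def collective_intelligence(tau, dim, n_agents, beta):
    """Group success probability minus single-agent success probability."""
    f = tent_cdf(tau, beta)
    group = (1.0 - f ** n_agents) ** dim
    agent = (1.0 - f) ** dim
    return group - agent


# Scan the task difficulty for N = 5, D = 10 and an assumed beta = 0.5.
taus = [i / 100 for i in range(1, 100)]
curve = [collective_intelligence(t, 10, 5, 0.5) for t in taus]
peak_tau = taus[curve.index(max(curve))]

print(f"CI at tau=0.01: {curve[0]:.4f}")
print(f"CI at tau=0.99: {curve[-1]:.4f}")
print(f"maximum CI {max(curve):.3f} reached at tau={peak_tau:.2f}")
```

The curve is close to zero at both ends and peaks at an intermediate difficulty, where the single agent almost always fails while the group almost always succeeds.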
Figure: The "bell-like shape" is universal across the different parameter ranges. For τ ≪ 1 almost all the teammates are able to solve the task and so does the group, but there is no incentive to be in the group and so CI ∼ 0. At the other extreme, τ ∼ 1, neither the agents nor the group are able to solve the too difficult task and so again CI ∼ 0. The intermediate range is the most interesting: teammates cannot solve the task while the group can, so the CI emerges and takes large values. To reduce the stochastic effects of the model, each point has been obtained as the average over 100 independent replicas.
The IQ of an agent with a D-dimensional knowledge space can be exactly computed under the assumption of values distributed according to the tent distribution. Since the IQ is the sum of the D entries of the knowledge vector, for a tent (triangular) density on [0, 1] with peak at β its average is $\langle IQ \rangle = D \, \frac{1 + \beta}{3}$. It is thus an increasing function of the parameter β. We can hence study the dependence of the CI on the average IQ, for fixed task difficulty τ and dimension D, by varying β. Results are reported in the corresponding figure for an easy task (τ = 0.1, red circles), a simple task (τ = 0.3, green diamonds) and a more difficult one (τ = 0.6, blue triangles). One can observe three clearly different behaviours. In the case of the simple task, when the average IQ is small the average agent is not able to solve the task while the group is; hence the CI is positive and increasing, because the group still performs better than the average agent even for larger IQ. Once the IQ increases even more, the average agent fills the gap and starts to perform better; the CI thus decreases because there is no longer an incentive to be in a group. On the other hand, in the case of more difficult tasks, the CI is always positive and increasing, meaning that the average agent never performs as well as the whole group. Finally, for an easy task, the CI is always positive but decreasing, meaning that the group is always able to solve the task and the average agent is very soon able to do the same.
Figure: Collective Intelligence as a function of the average agent IQ in a small group. We show the CI for a group made of N = 5 agents solving an easy task (τ = 0.1, red circles), a simple task (τ = 0.3, green diamonds) or a more difficult one (τ = 0.6, blue triangles); in all cases D = 10. One can observe three different scenarios according to the task difficulty, leading to completely different behaviours of the CI as a function of IQ.
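The three behaviours can be checked numerically with the closed-form CI. The sketch below again assumes the triangular reading of the tent distribution; the β grid and the helper names are choices made for this example, while N = 5, D = 10 and the three values of τ are those quoted in the text.

```python
def tent_cdf(x, beta):
    """CDF of the triangular ("tent") density on [0, 1] with peak at beta."""
    if x <= beta:
        return x * x / beta
    return 1.0 - (1.0 - x) ** 2 / (1.0 - beta)


def collective_intelligence(tau, dim, n_agents, beta):
    """Group success probability minus single-agent success probability."""
    f = tent_cdf(tau, beta)
    return (1.0 - f ** n_agents) ** dim - (1.0 - f) ** dim


def mean_iq(dim, beta):
    """Average IQ: D times the mean (1 + beta) / 3 of the tent density."""
    return dim * (1.0 + beta) / 3.0


def is_increasing(values):
    return all(a < b for a, b in zip(values, values[1:]))


def is_decreasing(values):
    return all(a > b for a, b in zip(values, values[1:]))


N, D = 5, 10
betas = [i / 20 for i in range(1, 20)]          # 0.05, 0.10, ..., 0.95
iqs = [mean_iq(D, b) for b in betas]            # increases with beta

easy = [collective_intelligence(0.1, D, N, b) for b in betas]
simple = [collective_intelligence(0.3, D, N, b) for b in betas]
hard = [collective_intelligence(0.6, D, N, b) for b in betas]

peak = simple.index(max(simple))
print("easy task decreasing:", is_decreasing(easy))
print("hard task increasing:", is_increasing(hard))
print(f"simple task peaks at average IQ {iqs[peak]:.2f} (beta={betas[peak]:.2f})")
```

Under these assumptions the easy-task curve falls, the hard-task curve rises, and the τ = 0.3 curve has an interior maximum, which is the non-monotone regime described above.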
To provide a global view, we report in the corresponding figure the CI, for a group made of N = 5 agents, as a function of τ and D for two values of the β-parameter of the tent distribution. One can observe that as the dimensionality D increases, the group will always perform better than the single agent even for small τ; indeed, the blue zone on the left part of each plot shrinks as D ≫ 1. Comparing both panels of the figure, one can conclude that the group performs better than the single agent, i.e., it is able to solve more difficult tasks (large τ and D), as β increases. This somewhat counter-intuitive phenomenon can be explained by the fact that even if the average agent has a large IQ for large β, the high dimensionality of the task makes it almost impossible for the agent to excel in all the topics, and thus she will be unable to solve the task. This is not the case for the group, because it gathers the best from each agent.
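A coarse version of this global view can be tabulated from the closed-form CI. The sketch assumes the triangular reading of the tent distribution; the grid of τ and D values and the function names are illustrative choices, while N = 5 and β = 2/3, 1/3 are taken from the figure caption.

```python
def tent_cdf(x, beta):
    """CDF of the triangular ("tent") density on [0, 1] with peak at beta."""
    if x <= beta:
        return x * x / beta
    return 1.0 - (1.0 - x) ** 2 / (1.0 - beta)


def collective_intelligence(tau, dim, n_agents, beta):
    """Group success probability minus single-agent success probability."""
    f = tent_cdf(tau, beta)
    return (1.0 - f ** n_agents) ** dim - (1.0 - f) ** dim


def ci_grid(beta, taus, dims, n_agents=5):
    """CI on a (tau, D) grid, returned as a dict keyed by (tau, D)."""
    return {(t, d): collective_intelligence(t, d, n_agents, beta)
            for t in taus for d in dims}


taus = [0.1, 0.3, 0.6, 0.8]
dims = [5, 20, 50]

for beta in (2 / 3, 1 / 3):
    grid = ci_grid(beta, taus, dims)
    print(f"beta = {beta:.3f}")
    for d in dims:
        row = "  ".join(f"{grid[(t, d)]:.3f}" for t in taus)
        print(f"  D = {d:2d}: {row}")
```

Reading the table row by row shows how a larger D raises the CI at small τ, and comparing the two blocks shows the effect of β on hard, high-dimensional tasks.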
This non-monotone behaviour of the CI as a function of the average IQ can explain the different experimental results available in the literature, where some authors find an increasing correlation between CI and the average IQ, such as the one reported by Bates & Gupta, while others find a weak correlation between the same variables, see for instance Woolley et al. Our model and the resulting analysis suggest that both studies are right. The point is that the chosen groups (or the tasks) were sitting in different locations of the parameter space: one where the relation is increasing (see the triangle or diamond curves in the figure showing CI as a function of IQ), the other where it is decreasing (see the circle or diamond curves in the same figure), but the underlying mechanism has always been the same. It is now possible to design new experiments where we can control and tune the above variables.
Figure: Collective Intelligence as a function of the task difficulty (τ, D). We show the dependence of the Collective Intelligence on the task difficulty, τ and D, for a group made of N = 5 agents whose knowledge vectors are distributed according to a tent distribution with parameter β = 2/3 (left panel) and β = 1/3 (right panel).

Discussion
The present study aimed to develop an abstract formal model to analyse the Collective Intelligence process, namely the ability of a group to perform better on problem solving than each isolated teammate. Our formal model was based on a minimal set of fundamental assumptions derived from the literature (Woolley et al.; Bates & Gupta; Heylighen), and analysed the (complex) interaction existing among the task difficulty, the average teammate intelligence and the CI. In particular, the model focussed on the interaction between the agent's ability to solve a task, i.e., her intelligence, and the CI, namely the group's capability to solve the task.
The results obtained from our ABM, corroborated by analytical ones, supported the hypothesis of the existence of a non-linear relation between the Collective Intelligence of a group and the average Intelligence Quotient of group members, mediated by the task difficulty under study. Indeed, for simple tasks, the CI is a decreasing function of IQ, namely the combination of agents' knowledge rapidly becomes redundant because of the "simplicity" of the problem to solve. A second and more interesting regime emerges for tasks with intermediate difficulty. Within this regime, the relation between CI and IQ exhibits a non-monotone behaviour: initially, CI increases with IQ but then a tipping point is reached, beyond which an increasing IQ produces a decreasing CI. In this regime, there is an optimal combination of task difficulty and average group IQ. A third regime shows that for tasks with a strong difficulty, the CI monotonically increases with IQ. Moreover, an (almost sharp) phase transition emerges by varying IQ: beyond this value the group would always be much better than its members, even if the rate of success would decrease because of the task complexity itself.
We deliberately built an abstract and simple ABM to unravel the role of each constituting element considered. Every model is a partial representation of reality; we thus had to leave out some factors, to be considered in future work, such as imperfect information transmission, group structure, kind of leadership and specific features of the agents (e.g., teammates' empathy, social abilities and hierarchical position in the group). We are aware that the latter represent two fundamental factors in the modelling of group dynamics. We hypothesised that the two experimental frameworks (Woolley et al.; Bates & Gupta) that motivated our research were performed on randomly assembled small groups of people unknown to each other, without giving them any set of rules regarding communication, status and role. We can thus conclude that even if, during the tasks, teammates spontaneously adopted some of such communication channels, their net effect would average out in the repeated experiments because of the random assembling of the groups and the absence of any rule, hence leaving the sole interplay between task difficulty, IQ and CI.
Let us however emphasise that the results of our model suggest a possible interpretation for the apparently contradictory results from the literature regarding the existence of a CI factor, as well as the more challenging question about its magnitude (Woolley et al.; Bates & Gupta). The correlation between CI and IQ appears not to be the right observable to answer the above questions, because of the presence of hidden variables, among which we have pointed out the main role played by the task difficulty. First of all, the relation between CI and IQ appears to be non-linear and, as a consequence, any linear statistical approach (e.g., factor analysis) would fail to capture the problem as a whole. Second, the role of the task difficulty, which is of course frequently an elusive concept to measure in ecological conditions, appears to determine non-linear effects on the relation between CI and IQ, changing as a consequence the magnitude and the sign assumed by the parameter of any linear statistic relating them. In particular, if an experiment is realised within the first scenario (i.e., simple tasks; e.g., circle curves in the figure showing CI as a function of IQ), the sign of a correlation statistic between CI and IQ would be negative, while it would be positive within the third scenario (i.e., very hard tasks; e.g., triangle curves in the same figure). Finally, within the second scenario, a small or even absent correlation between CI and IQ would appear; a small correlation would result from a positive correlation for small values of IQ, followed by a negative one for larger values (e.g., see the diamond curves in the same figure).
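The sign argument can be illustrated by computing a Pearson correlation between the average IQ and the analytical CI curve in each scenario. This is only a sketch: it assumes the triangular reading of the tent distribution and an evenly spaced β grid, which is an arbitrary sampling choice and not a model of how real experimental samples are distributed.

```python
import math


def tent_cdf(x, beta):
    """CDF of the triangular ("tent") density on [0, 1] with peak at beta."""
    if x <= beta:
        return x * x / beta
    return 1.0 - (1.0 - x) ** 2 / (1.0 - beta)


def collective_intelligence(tau, dim, n_agents, beta):
    """Group success probability minus single-agent success probability."""
    f = tent_cdf(tau, beta)
    return (1.0 - f ** n_agents) ** dim - (1.0 - f) ** dim


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


N, D = 5, 10
betas = [i / 20 for i in range(1, 20)]
iqs = [D * (1.0 + b) / 3.0 for b in betas]   # average IQ, increasing in beta


def ci_curve(tau):
    return [collective_intelligence(tau, D, N, b) for b in betas]


r_easy = pearson(iqs, ci_curve(0.1))
r_hard = pearson(iqs, ci_curve(0.6))

# Intermediate task: split the sample at beta = 0.3 (index 5 on this grid).
mid = ci_curve(0.3)
r_mid_low = pearson(iqs[:6], mid[:6])
r_mid_high = pearson(iqs[5:], mid[5:])

print(f"easy: {r_easy:+.2f}  hard: {r_hard:+.2f}")
print(f"intermediate, low IQ: {r_mid_low:+.2f}  high IQ: {r_mid_high:+.2f}")
```

The overall coefficient in the intermediate scenario depends on how the sampled IQ values are spread around the tipping point, which is why a single linear statistic is a fragile summary of this relation.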
Our model allowed us to capture an interesting non-linear interaction between the potential of a group (i.e., the average of its members' intelligence) and the difficulty of a task. In particular, our numerical results seem to capture the qualitative trends reported in the recent literature (Woolley et al.; Bates & Gupta), suggesting a good agreement between theory and experimental data. Based on our results, we propose to design experiments where the task difficulty is a controlled variable and thus check the dependence between CI and IQ.
In order to explain our results according to the Heylighen framework of the Collective Mental Map (CMM), we hypothesised that the process of merging subjects' mental maps introduces several non-linear effects that are oversimplified by the linear analyses of previous studies (e.g., Woolley et al.; Bates & Gupta). Even assuming perfect communication between subjects, who would not be affected by any bias in group interaction (e.g., hierarchies, roles, leadership, status), our implementation of the Heylighen CMM shows that the IQ of the most intelligent member of the group should be the most important factor to solve simple tasks (Bates & Gupta), but is no longer strongly related to group outcomes for intermediate task complexity. According to our results, this can happen when the task difficulty exceeds the capacity of the most intelligent member, who would still need support from other members to build a collective mental map to solve the task. As a consequence, we argue that the tasks proposed in Woolley et al. were appropriate for the level of intelligence of the sample used in these studies, thus fitting our second regime. On the other hand, the same tasks could have been too simple for the sample of the second study (Bates & Gupta), whose teammates' average IQ was too high relative to the difficulty of the tasks.
In conclusion, we proposed a simple and abstract model able to explain some relevant results in the literature about group problem solving (Woolley et al.; Bates & Gupta). We are nonetheless aware that we did not consider certain important actions that groups adopt to increase collective performance when faced with such problems, e.g., division of labour, hierarchies and cooperation devices, just to mention a few. However, we believe that the latter can be considered a sort of "second order correction", and we are thus confident that our model includes the main ingredients, with a sufficient level of abstraction, to describe interaction in small groups as found in experiments, where agents cannot explore complex behaviours. We also simplified inter-agent communication by excluding features such as communication biases or hierarchy; these features could nevertheless be easily implemented in our model in future developments.

Model Documentation
We used Matlab (MATLAB) to develop our code for both the ABM and the analytical study. In this way, we have full control over the whole framework and can adapt it at will. The core of the ABM is presented below (see the algorithm) using Matlab syntax; this main module is then used, varying the several parameters, to perform the numerical simulations presented in the work.
As already stated, our model suffers from some limitations in the way exchanges among teammates can arise in real cases, e.g., we assumed perfect information transfer. Let us however observe that such a transmission bias can be straightforwardly included in the model: it is sufficient to modify line 20 of the algorithm, where the maximum of the agents' knowledge values on the discussion topic is computed, by adding a noise term; the size of the noise will then be a proxy for the transmission bias. In conclusion, although simple and abstract, our model is flexible and can be easily adapted to include new factors.
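As an illustration, the modification described above could be sketched as follows. This is only a sketch under assumptions that are not part of the original model: the noise is zero-mean Gaussian with a hypothetical standard deviation `sigma`, it is applied to each agent's value before the maximum is taken, perceived values are clipped to [0, 1], all agents attend every meeting, and the initial knowledge is drawn uniformly (instead of from the tent distribution) to keep the example self-contained.

```matlab
% Sketch only: meetings with noisy information transfer.
% Assumptions (not in the original model): Gaussian noise of size sigma,
% clipping to [0, 1], uniform initial knowledge, all N agents attend.
D = 5;         % size of the knowledge vector
N = 5;         % group size
Niter = 500;   % number of meetings
sigma = 0.05;  % hypothetical noise level, a proxy for the transmission bias

A = rand(D, N);      % agents' knowledge vectors as columns (uniform here)
team = zeros(D, 1);  % knowledge vector of the team

for hh = 1:Niter
    jx = ceil(D * rand(1));                  % select the discussion topic
    heard = A(jx, :) + sigma * randn(1, N);  % what the group perceives of each agent
    heard = min(max(heard, 0), 1);           % keep perceived values inside [0, 1]
    maxAg = max(heard);                      % noisy counterpart of the maxAg step
    team(jx) = max(team(jx), maxAg);         % update the team knowledge on topic jx
end

disp(team');
```

Because `team` stores a running maximum, positive noise realisations accumulate over the meetings, so in this sketch `sigma` should be read as a qualitative proxy rather than a calibrated error level.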
Algorithm ABM: group interaction

     1: D = 5;       % size of the knowledge vector
        tau = 0.1;   % the task difficulty
        N = 5;       % group size
        k = 5;       % meeting size; if k < N, then agents meet in subgroups
        beta = 1/3;  % parameter of the tent distribution
     2: team = zeros(D, 1);  % this is the knowledge vector of the team
        A = zeros(D, N);     % the matrix A contains the knowledge vectors of the agents as columns
     3: for kk = 1:N
     4:     x = tentdraw(D, beta);  % draw D numbers according to the tent distribution with parameter beta
     5:     A(:, kk) = x;
     6: end
     7: % ABM meeting
     8: Niter = 500;  % number of interactions among agents
     9: for hh = 1:Niter
    10:     % create a group of k agents
    11:     if k < N
    12:         idx = ones(1, k);
    13:         while length(unique(idx)) ~= k  % redraw until the k agents are all distinct
    14:             idx = ceil(N * rand(1, k));
    15:         end
    16:     else
    17:         idx = 1:N;
    18:     end
    19:     jx = ceil(D * rand(1));  % select the discussion topic among the D available ones
    20:     maxAg = max(A(jx, idx)); % the maximum value of the jx-th entry of the knowledge vector
    21:     % is computed among the idx-interacting agents
    22:     team(jx) = max(team(jx), maxAg);  % improve the team knowledge on the jx-th entry of the knowledge vector
    23: end
    24: scoreAgent = sum(A > tau) == D;    % score of each agent, i.e., whether the agent knowledge is above the threshold on each component
    25: scoreTeam = sum(team > tau) == D;  % score of the team, i.e., whether the team knowledge is above the threshold on each component

The aim of this section is to provide all necessary mathematical details to compute the CI starting from the probability distribution of the knowledge of each agent. Consider thus a group made of N agents and assume we deal with a task of difficulty τ in a D-dimensional knowledge space.
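Assume also that the knowledge of the i-th agent on the j-th topic is a stochastic variable with probability density p(x) supported on [0, 1]; hence the probability that the i-th agent exceeds a level x of knowledge on the j-th topic is given by:

\[ P\left(x_j^{(i)} > x\right) = \int_x^1 p(y)\,dy . \]

Since the D entries of the knowledge vector are drawn independently, we can conclude that the i-th agent is able to solve the D-dimensional task characterised by a difficulty τ with probability

\[ \pi_1(\tau) = \left( \int_\tau^1 p(y)\,dy \right)^D . \]

By its very definition, the j-th component of the Collective Mental Map is the maximum of the j-th entries of the N agents, so its probability distribution is given by:

\[ p_N(x) = N\,p(x) \left( \int_0^x p(y)\,dy \right)^{N-1} , \]

and thus the probability for the group to solve the D-dimensional task with difficulty τ is

\[ \pi_N(\tau) = \left( \int_\tau^1 p_N(x)\,dx \right)^D = \left[ 1 - \left( \int_0^\tau p(y)\,dy \right)^N \right]^D . \]

Finally, the Collective Intelligence for a group of size N is the difference between the previous two functions:

\[ CI_N(\tau) = \pi_N(\tau) - \pi_1(\tau) . \]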
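The quantities described in this section can be evaluated numerically. The following is a minimal sketch, not the authors' code. It assumes that the entries of the knowledge vectors are independent and identically distributed, that the "tent distribution with parameter β" is the triangular density on [0, 1] with mode β (the original `tentdraw` routine is not shown, so this is a guess), and that agents meet often enough for the team to reach the maximum on every topic. The names `tentcdf`, `tentinv`, `piAgent`, `piTeam` and `CIsim` are hypothetical.

```matlab
% Sketch only: analytical CI versus a Monte Carlo estimate.
% Assumption: "tent" = triangular density on [0, 1] with mode beta.
D = 5; N = 5; tau = 0.1; beta = 1/3;

% CDF of the assumed tent density
tentcdf = @(x) (x <= beta) .* (x.^2 / beta) + ...
               (x > beta) .* (1 - (1 - x).^2 / (1 - beta));

F = tentcdf(tau);
piAgent = (1 - F)^D;    % one agent is above tau on all D topics
piTeam  = (1 - F^N)^D;  % the best of N agents is above tau on all D topics
CI = piTeam - piAgent;  % advantage of the group over a single agent

% Monte Carlo check: sample by inverting the CDF (inverse transform sampling)
tentinv = @(u) (u < beta) .* sqrt(u * beta) + ...
               (u >= beta) .* (1 - sqrt((1 - u) * (1 - beta)));
R = 20000;              % number of simulated groups
okAgent = 0; okTeam = 0;
for r = 1:R
    A = tentinv(rand(D, N));                      % knowledge vectors as columns
    okAgent = okAgent + all(A(:, 1) > tau);       % does agent 1 solve the task?
    okTeam  = okTeam + all(max(A, [], 2) > tau);  % does the fully merged team solve it?
end
CIsim = (okTeam - okAgent) / R;

fprintf('CI analytical = %.4f, simulated = %.4f\n', CI, CIsim);
```

The two estimates should agree up to sampling error; changing `tau` gives a quick way to explore how the group advantage depends on the task difficulty under these assumptions.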

Appendix B: More details about the ABM
As already stated, the ABM has been deliberately kept simple enough to control the impact of the different parameters and to obtain an analytical understanding of the process. This last fact relies strongly on the number of meetings that agents are allowed to have in order to discuss the topics and thus to modify their knowledge vectors. The aim of this section is thus to show some results in this direction.
Roughly speaking, the analytical solution provided in Appendix A is based on the assumption that agents have met sufficiently many times to discuss each of the D topics of their knowledge vector; indeed, if a topic has never been considered, then the group does not have any knowledge about it.
By definition, we assume that all agents attend a team meeting; hence a first bound on the previous condition can be mathematically formulated by solving the following problem: what is the probability, P_{N,M}, that drawing M numbers with replacement from the set {1, . . . , N} returns each number at least once? Requiring this probability to be larger than a given threshold, say 1 − ε, determines the number of meetings, i.e., M, once N is fixed. Mathematically one can prove that

\[ P_{N,M} = \sum_{j=0}^{N} (-1)^j \binom{N}{j} \left( 1 - \frac{j}{N} \right)^M . \]

So one can determine M such that P_{N,M} ≥ 1 − ε, thus obtaining a bound on the number of meetings needed to get an agreement between the analytical results and the ABM. In Figure we show the error, computed using the 2-norm, between the ABM and the analytical results as a function of the number of meetings when all the agents attend the exchange phase, for different lengths of the knowledge vectors, whose entries are distributed according to a tent distribution with parameter β = 1/3. As expected, for a given size of the error, the larger D, the larger the number of meetings needed.
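The bound on the number of meetings discussed above can be computed directly. The sketch below uses the standard inclusion-exclusion expression for the probability that M draws with replacement from {1, . . . , N} hit every element at least once; the function name `coverProb` and the value ε = 0.01 are illustrative assumptions.

```matlab
% Sketch only: smallest number of draws M such that all N items are
% seen at least once with probability >= 1 - epsilon.
N = 5;           % number of distinct items in the set {1, ..., N}
epsilon = 0.01;  % tolerated probability of missing at least one item (assumed value)

% Inclusion-exclusion: P(N, M) = sum_j (-1)^j * C(N, j) * (1 - j/N)^M
coverProb = @(N, M) sum((-1).^(0:N) ...
    .* arrayfun(@(j) nchoosek(N, j), 0:N) ...
    .* (1 - (0:N) / N).^M);

M = N;  % fewer than N draws can never cover all N items
while coverProb(N, M) < 1 - epsilon
    M = M + 1;
end

fprintf('N = %d: M = %d draws give P = %.4f\n', N, M, coverProb(N, M));
```

Since the required M grows with N, the same loop can be rerun for larger sets to see how quickly the number of meetings increases.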
Another quantity that affects the agreement between the ABM and the analytical solution is the number of agents that participate in the group meetings; indeed, if a number k, strictly smaller than N, of agents attend, it can happen that agents with large entries in their knowledge vectors will not share their views and thus the group will not reach the best level. Again, to circumvent this issue one has to let the agents meet sufficiently many times. In Figure we report the error, computed using the 2-norm, between the ABM and the analytical results as a function of the number of meetings when k < N agents attend the exchange phase, k = 2 (red dots) and k = 3 (blue dots), for a group made of N = 5 agents, each one endowed with a knowledge vector of size 10 whose entries are distributed according to a tent distribution with parameter β = 1/3. As expected, for a given size of the error, the smaller the size of the discussion group, the larger the number of meetings needed.
Figure : Impact of the meeting size. We compare the numerical results versus the analytical ones as a function of the number of meetings when groups of size 2 (red dots) or size 3 (blue dots) are formed from a pool of N = 5 agents.
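The effect of the meeting size can be illustrated with a small simulation. The sketch below is not the code used for the figure: it measures, for a fixed number of meetings, the fraction of topics on which the team has reached the best value available in the pool, rather than the 2-norm error. It assumes uniform knowledge entries instead of the tent distribution, uses `randperm(N, k)` in place of the redraw loop of the algorithm to select k distinct agents, and the number of repetitions `R` is an arbitrary choice.

```matlab
% Sketch only: how often the team reaches the pool maximum when only
% k of the N agents attend each meeting.
% Assumptions: uniform knowledge entries, R independent repetitions.
N = 5; D = 10;  % pool size and length of the knowledge vector
M = 60;         % number of meetings per run
R = 200;        % number of independent runs (assumed value)
ks = [2 3 5];   % meeting sizes to compare
reached = zeros(size(ks));

for c = 1:numel(ks)
    k = ks(c);
    hits = 0;
    for r = 1:R
        A = rand(D, N);      % agents' knowledge vectors as columns
        team = zeros(D, 1);  % knowledge vector of the team
        for hh = 1:M
            idx = randperm(N, k);    % k distinct agents attend the meeting
            jx = ceil(D * rand(1));  % discussion topic
            team(jx) = max(team(jx), max(A(jx, idx)));
        end
        hits = hits + sum(team == max(A, [], 2));  % topics where the best value was shared
    end
    reached(c) = hits / (R * D);
end

disp([ks; reached]);  % first row: meeting size, second row: fraction reached
```

Under these assumptions, smaller meetings leave more topics below the pool maximum for the same number of meetings, which is the qualitative trend described in the text.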