Lattice Dynamics of Inequity

We discuss a model of inequity based on iteration of the Nash multi-agent bargaining game on a lattice. Agents' choices are based on a logit function and gradual decay of memories of past profits. Numerical simulations demonstrate the stability of various dynamical regimes, such as disorder, fairness or inequity, according to parameters and initial conditions. When the game is played on a lattice, i.e. with interactions between neighbouring agents instead of random interactions among the whole agent population, one observes spatial domains and specific patterns in addition to the temporal convergence toward attractors observed when interactions involve any pair of agents. A result specific to the network topology is the co-existence of domains with different regimes, allowing the emergence of the inequity condition even in the absence of tags.


Introduction
Background on modeling inequalities and classes and the use of dynamics on lattices. The question of inequalities and classes has received a lot of attention from social scientists and modelers. Early influential precursors were Vilfredo Pareto with the wealth distribution curve ( ) and Thomas Schelling with the study of segregation ( ; ).
The framework of iterated multi-agent games is often used. In the tribute game, one agent requests tribute, and the other agent either pays the tribute or has to fight. The chance of winning the fight depends upon the relative strength of the agents, a quantity which is increased for the agent who wins the fight and receives the tribute, and decreased for the other agent. Axelrod ( b) and Bonabeau et al. ( ) have shown that, when iterated, the tribute game results in the dominance of some agents over others.

In the bargaining game introduced by Nash Jr ( ), agents have to share a pie. They independently fix the fraction of the pie that they demand. If the sum of their demands does not exceed the size of the pie, their demands are satisfied. Otherwise, they get nothing. When the game is iterated among a society of agents, Axtell et al. ( ), Bowles & Naidu ( ) and Weisbuch ( ) have shown that different outcomes can be obtained. Societies can be either egalitarian or structured in classes. In their paper entitled "Emergence of Classes in a Multi-Agent Bargaining Model", Axtell et al. ( ) highlight the role of a priori neutral tags in favouring the emergence of classes. Poza et al. ( ) and Santos et al. ( ) proposed extensions of their model to lattice interactions and small world networks respectively.
Another approach to the dynamics of the Nash bargaining game was proposed by Skyrms ( ), who calls the game "Divide The Cake". Skyrms ( ) uses replicator dynamics based on a biological evolution metaphor: agent learning is replaced by a selection process according to which the most successful agents replicate faster.

Weisbuch ( ) revisited Axtell et al. ( ) using more elaborate memorisation and decision models. His version, hereafter called BOMA (for Boltzmann Moving Average), showed that the outcome of the iterated bargaining game is strongly constrained by the initial conditions; he interpreted this result as demonstrating the persistence of inequity rather than the emergence of classes.
The purpose of the present paper is to study the dynamics of the Weisbuch ( ) version of the iterated bargaining game on a lattice. Many game theoretic models display new properties when played on a lattice. The best known example is the imitation game called, in its binary version, the voter model. Agents on a lattice take the majority opinion among their neighbours' opinions. Rather than the uniform yes or no opinion obtained when agents follow the majority of all other agents, dynamics on a lattice result in the segregation of opinions into uniform domains, as discussed in the review of Castellano et al. ( ). Epstein ( ) has shown that when agents play the prisoner's dilemma game on a lattice, domains of cooperation can emerge, which is never the case in the absence of a social network. Following the above two references, we might expect domain formation and possibly new dynamical properties or new conditions for the emergence of significant behaviours when running simulations on a lattice.
We start with a very brief exposition of the papers of Axtell et al. ( ) and Poza et al. ( ). We then introduce the BOMA model of Weisbuch ( ) and give the main results obtained in a fully connected network, i.e. when any agent can play with any other agent. Section describes the application of the BOMA model to the lattice topology. Various spatio-temporal patterns are observed depending upon parameters and initial conditions. These dynamical regimes are interpreted in terms of social norms, such as equity or inequity. We further report the transitions among dynamical regimes according to changes in parameters. The influence of the arbitrary tags introduced by Axtell et al. ( ) is discussed in Section . Section summarises possible interpretations in social and political sciences. Appendix A provides further details about the dynamics in the neighborhood of the transition. Appendix B provides programming details.

The original model of Axtell, Epstein and Young
Let us briefly recall the original hypotheses and the main results of Axtell, Epstein and Young (Axtell et al. ).
• Framework: pairs of agents play a bargaining game introduced by Nash Jr ( ) and Young ( ). During sessions of the game, each agent can, independently of his opponent, request one among three demands: L(ow) demand perc. of a pie, M(edium) perc. and H(igh) perc. As a result, the two agents get at the end of the session what they demanded when the sum of both demands does not exceed the perc. total; otherwise they don't get anything. The corresponding payoff matrix is written in Table . At each step, a random pair of agents is selected to play the bargaining game. The iterated game is played for a large number of sessions, much larger than the total number of agents, which can then learn from their experience how to improve their performance.
• Learning and memory: Agents keep records of the previous demands of their opponents, e.g. for previous moves.
• Choosing the next move: at each iteration step, pairs of agents are randomly selected to play the bargaining game. They most often choose the move that optimises their expected payoff, using the memory of previous encounters as a surrogate for the actual probability distribution of their opponent's next moves. With a small probability ε, e.g. . , they choose randomly among L, M, H.
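The payoff rule described in the framework above can be sketched in a few lines. This is only an illustration: the demand levels of 30, 50 and 70 percent are an assumption made here (the figures are not legible in the text above), and the names `DEMAND_PCT`, `payoff` and `payoff_matrix` are hypothetical.

```python
# Demands expressed in percent of the pie.
# ASSUMPTION: 30 / 50 / 70 percent for L / M / H; illustrative values only.
DEMAND_PCT = {"L": 30, "M": 50, "H": 70}
MOVES = ("L", "M", "H")


def payoff(own, other):
    """Share of the pie obtained by the player demanding `own`.

    Both demands are satisfied when they do not exceed the whole pie;
    otherwise both players get nothing.
    """
    compatible = DEMAND_PCT[own] + DEMAND_PCT[other] <= 100
    return DEMAND_PCT[own] / 100 if compatible else 0.0


def payoff_matrix():
    """Rows: move of the first player. Columns: move of her opponent."""
    return {own: {other: payoff(own, other) for other in MOVES} for own in MOVES}
```

Integer percentages are used for the compatibility test so that the borderline cases (M against M, H against L) are not affected by floating point rounding.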
The main results obtained by Axtell et al. ( ) from numerical simulations are:
• They observe different transient configurations which they interpret as "norms", e.g. the equity norm is observed when all agents play M.
• Their most fascinating result is obtained when agents are divided into two populations with arbitrary tags, e.g. one red and one blue. When agents take tags into account for playing and memorising games (in other words, when agents separately play two games, an intra-game against agents with the same tag and an inter-game against agents with a different tag), one observes configurations in the inter-game such that one population always plays H while the other population plays L; they interpret this inequity norm as the emergence of classes, the H playing population being the upper class.
) on a small world network as opposed to the full connection structure used by Axtell et al. ( ).
) with a payoff matrix written in Table , but uses a different coding of past experience (moving average of past profits) and choice function (Boltzmann function), hence the acronym BOMA for the model.

The model is derived from standard models of reinforcement learning in cognitive science, see for instance Weisbuch et al. ( ).
Rather than memorising a full sequence of previous games as in Axtell et al. ( ), agents learn and code their memories as three "preference coefficients" J_j, one for each possible move j, based on a moving average of the profits they made in the past when playing j. J_1 is the preference coefficient for playing H, J_2 for M and J_3 for L. After each transaction, i.e. following a time interval τ, the J_j are updated according to:

J_j(t + \tau) = (1 - \gamma) J_j(t) + \pi_j(t)

The decrease term in (1 − γ) corresponds to discounting the importance of past transactions, which makes sense in an environment varying with the choices of the other players. π_j(t) is the actual payoff made during the chosen transaction j; the other coefficients, corresponding to the moves that were not chosen, are simply decreased by the factor (1 − γ). We often use γ = 0.1: each time an agent is involved in a game, the previous profits are decreased by perc. This roughly corresponds to a perc. decrease of the contribution of a transaction to J_j after games in which the player was involved.
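A minimal sketch of this memory update follows, assuming the rule is J_j ← (1 − γ) J_j + π_j for the move played and J_k ← (1 − γ) J_k for the two other moves, as the description above suggests. The function names and the index convention (0 = H, 1 = M, 2 = L) are hypothetical.

```python
import math


def update_preferences(J, played, profit, gamma=0.1):
    """Return the preference coefficients after one game.

    ASSUMED rule: every coefficient decays by the factor (1 - gamma) and the
    coefficient of the move actually played is credited with the profit made.
    """
    return [
        (1.0 - gamma) * J_k + (profit if k == played else 0.0)
        for k, J_k in enumerate(J)
    ]


def games_to_decay(fraction_left, gamma=0.1):
    """Number of games after which a past profit weighs `fraction_left` in J."""
    return math.log(fraction_left) / math.log(1.0 - gamma)
```

For instance, an agent who repeatedly plays the same move with a constant profit sees the corresponding coefficient converge to profit/γ, and `games_to_decay(0.5)` indicates that with γ = 0.1 a past transaction loses half of its weight after between six and seven games.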
These preference coefficients are then used to choose the next move in the bargaining game. Agents face an exploitation/exploration dilemma: they can decide to exploit the information they gathered earlier by choosing the move with the highest preference coefficient, or they can check possible evolutions of profits by randomly trying other moves. Rather than using a constant rate of random exploration as in Axtell et al. ( ), the probability of choosing demand j is based on the Boltzmann choice, also called logit function (Anderson et al. ):

P_j = \frac{\exp(\beta J_j)}{\sum_k \exp(\beta J_k)}

where β, the discrimination rate, measures the non-linearity of the relationship between the probability P_j and the preference coefficient J_j.
• The moving average corresponds to a gradual rather than abrupt decrease of previous memories; it is based on the agent's own experience in terms of profit rather than on the observation of her opponents' moves, and it uses less memory. Coding an agent's memory by a vector, rather than by the sequence of x memory size matrix used by Axtell et al. ( ), makes it convenient to relate the observed dynamics to initial conditions, which would be quite difficult with the matrix coding.
• Boltzmann choice has a random character, like the constant probability noise introduced in Axtell et al. ( ), but in addition the choice depends upon the differences in experienced profits; we might expect agents to be less hesitant when their previous experience resulted in very different preference coefficients.
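The Boltzmann (logit) choice can be sketched as follows, assuming the standard form in which the probability of move j is proportional to exp(β J_j). The function names are hypothetical; subtracting the largest coefficient before exponentiating is a numerical precaution that does not change the probabilities.

```python
import math
import random


def choice_probabilities(J, beta):
    """Logit probabilities of each move given preference coefficients J."""
    top = max(J)  # shift by the maximum to avoid overflow in exp()
    weights = [math.exp(beta * (J_k - top)) for J_k in J]
    total = sum(weights)
    return [w / total for w in weights]


def choose_move(J, beta, rng=random):
    """Draw one move index at random according to the logit probabilities."""
    return rng.choices(range(len(J)), weights=choice_probabilities(J, beta), k=1)[0]
```

With equal coefficients the three moves are equiprobable, whereas a single large coefficient makes the choice nearly deterministic, which illustrates the bullet above: agents hesitate less when their experienced profits differ strongly.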
Main results of the Moving Average-Boltzmann choice model for the well mixed case
These are the main results of the Moving Average-Boltzmann choice model for the well mixed case:
• A single reduced parameter β/γ governs the dynamics.
• A disordered state, in which agents do not have fixed preferences, is observed when β/γ ≤ 4.
• When β/γ ≥ 4 the dynamics evolve towards well characterized attractors in which agents have strong preferences: a "fair" attractor in which all agents play M, a "timid" attractor in which all play L, and an "unfair" or "inequity" attractor in which some play H while others play L. Which attractor is reached depends upon the unique reduced parameter β/γ and upon initial conditions, characterised by the initial distribution of preference coefficients.
• When tags are introduced, agents might play M when opposed to agents with the same tag, while they play H against agents with a different tag, who themselves systematically play L, a condition called "discriminatory norm" by Axtell et al. ( ). However, the occurrence of this asymmetric attractor depends upon initial conditions, which should already be biased towards such an asymmetry. Weisbuch ( ) then interpreted this effect as a "persistence of discrimination" rather than an "emergence of classes" as Axtell et al. ( ) did.

The Moving Average-Boltzmann Model on Lattice
Each agent occupies a cell in a square lattice, one agent per cell. We used either the von Neumann neighborhood, each agent interacting with neighbors, hereafter called vN ( .
The initial conditions concern the J coefficients, which are randomly set according to uniform distributions. An initial condition noted hml means that J_1 was randomly chosen between and h, J_2 between and m and J_3 between and l. We used for instance initial conditions and .
At each iteration step, a pair of neighboring agents is randomly chosen to play the bargaining game; they choose a move H, M or L according to the logit rule (Equation ) and update their J coefficients according to their profit (Equation ). A new random pair is then selected to play the game, and so on until some kind of stability is achieved. Since the choice process is probabilistic, stability here refers to the invariance of the distributions of J_j after large iteration times. Typical iteration times to check stability are games per agent. A human agent playing one game per day during years would only play games.
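The iteration just described can be sketched as a short simulation loop. Everything specific in it is an assumption made for illustration rather than the author's implementation: demands of 30/50/70 percent, periodic boundaries, the von Neumann neighborhood, initial coefficients drawn uniformly between 0 and the hml upper limits, and the update J ← (1 − γ) J + profit for the move played. All names are hypothetical.

```python
import math
import random

# Index convention: 0 = H, 1 = M, 2 = L. ASSUMED demands in percent of the pie.
DEMAND_PCT = (70, 50, 30)
VON_NEUMANN = ((0, 1), (1, 0), (0, -1), (-1, 0))


def payoff(own, other):
    """Profit of the player choosing `own` against a player choosing `other`."""
    return DEMAND_PCT[own] / 100 if DEMAND_PCT[own] + DEMAND_PCT[other] <= 100 else 0.0


def logit_choice(J, beta, rng):
    """Draw a move with probability proportional to exp(beta * J_j)."""
    top = max(J)
    weights = [math.exp(beta * (J_k - top)) for J_k in J]
    return rng.choices((0, 1, 2), weights=weights, k=1)[0]


def simulate(size=10, beta=2.0, gamma=0.1, games_per_agent=200,
             hml=(1.0, 1.0, 1.0), seed=0):
    """Run the iterated game on a size x size lattice and return the J coefficients."""
    rng = random.Random(seed)
    # Uniform initial conditions between 0 (assumed) and the h, m, l upper limits.
    J = [[[rng.uniform(0.0, top) for top in hml] for _ in range(size)]
         for _ in range(size)]
    n_games = games_per_agent * size * size // 2  # each game involves two agents
    for _ in range(n_games):
        x, y = rng.randrange(size), rng.randrange(size)
        dx, dy = rng.choice(VON_NEUMANN)
        u, v = (x + dx) % size, (y + dy) % size  # periodic boundaries (assumed)
        a = logit_choice(J[x][y], beta, rng)
        b = logit_choice(J[u][v], beta, rng)
        for (i, j), own, other in (((x, y), a, b), ((u, v), b, a)):
            gain = payoff(own, other)
            J[i][j] = [(1.0 - gamma) * J_k + (gain if k == own else 0.0)
                       for k, J_k in enumerate(J[i][j])]
    return J
```

Calling `simulate(size=20, beta=2.0, gamma=0.1)` returns a nested list `J[x][y]` of three coefficients per cell, which can then be inspected for stability or mapped to colors.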
Simulations on lattices display common properties with those observed in the well mixed topology.
• Different dynamical regimes are observed depending upon parameter values: disordered regimes at lower β/γ values and several ordered regimes at higher β/γ values. Regimes are separated by transitions.
• Agents have strong opinions about their choices for higher β/γ values: the J's most often have only one non-zero component, which saturates at ⟨π_i⟩/γ, where ⟨π_i⟩ is the average payoff obtained by the agent.
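The saturation value can be recovered in one line, assuming the moving-average update takes the form J_j ← (1 − γ) J_j + π_j for the move played (an assumption consistent with the description of memory decay, not a formula quoted from elsewhere). For an agent who always plays the same move and earns ⟨π_i⟩ on average, the fixed point of the update gives:

```latex
\begin{align}
J^{*} &= (1-\gamma)\,J^{*} + \langle \pi_i \rangle
\quad\Longrightarrow\quad
J^{*} = \frac{\langle \pi_i \rangle}{\gamma}, \\
\beta J^{*} &= \frac{\beta}{\gamma}\,\langle \pi_i \rangle .
\end{align}
```

The second line suggests why, at saturation, the logit probabilities depend on β and γ only through the ratio β/γ.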
The new feature displayed by lattice dynamics is the spatial arrangement of choices in patterns in the ordered phases.
The results are presented on figures , , , etc. as snapshots of the lattice after long iteration times, representing the agents' J coefficients according to an rgb color code such that the r component codes for J_1, the g component for J_3 and the b component for J_2. For instance, a red cell indicates that the agent plays H, a blue cell M and a green cell L. Let us here recall two important conventions for the presentation of results:
• The initial conditions often determine the outcome of the dynamics: we use the hml notation, where the figures are the upper limits of the uniform distributions of preference coefficients.
• Rather than varying the two model parameters β and γ, we only vary β with a fixed value of γ = 0.1, because the formal analysis of the mean field approximation in Weisbuch ( ) demonstrates the existence of a single reduced parameter β/γ, which allows all results to be rescaled when γ ≠ 0.1. More precisely, the fractions of agents playing H, M or L and the positions of the transitions depend upon β/γ, while the magnitudes of the preference coefficients J are inversely proportional to γ and the relaxation time towards equilibrium is proportional to β. We checked the validity of this prediction during simulations; its limits are discussed in the appendix.
For higher β/γ values most cells display pure colors: red, blue or green. The agents have strong opinions and always play the same move, H, M or L. As in the well known Schelling ( , ) model of segregation, network interactions increase the initial differences in agent preferences.
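The color convention (red for J_1, i.e. H; green for J_3, i.e. L; blue for J_2, i.e. M) can be sketched as a small mapping. The normalisation by the largest possible saturation value, taken here as 0.7/γ, is an assumption made for the example, as is the function name.

```python
def rgb_from_preferences(J, gamma=0.1, max_payoff=0.7):
    """Map preference coefficients (J_1, J_2, J_3) = (H, M, L) to an (r, g, b) triple.

    ASSUMPTION: coefficients are rescaled by gamma / max_payoff so that a
    saturated H player maps to full intensity, then clipped to [0, 1].
    """
    j_h, j_m, j_l = J
    scale = gamma / max_payoff

    def clip(value):
        return max(0.0, min(1.0, value * scale))

    # r codes for J_1 (H), g for J_3 (L), b for J_2 (M).
    return (clip(j_h), clip(j_l), clip(j_m))
```

With this choice a saturated H player appears pure red, while saturated M and L players appear as dimmer blue and green because their saturation values are lower.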

The system self-organises at large iteration times and the lattice gets divided into spatial domains. Each domain displays one attractor type. For instance, in the "fair" domain type, all agents always play M (see Figure upper right). Other domains display "unfair" configurations in which agents playing H interact with agents playing L. Both types of domains can coexist on the lattice and remain fixed. Their respective importance depends upon initial conditions: if these favour M moves by spreading the J_2 distribution with respect to the J_1 and J_3 distributions, as for initial distribution , the fair pattern M, colored blue, occupies most of the lattice ( .
The population dynamics are displayed on the right plots of Figure . Only the populations of agents playing pure strategies are represented. One time unit represents on average . iterations per agent. The initial increase corresponds to agents increasing their preference towards a pure strategy and to the spatial re-arrangement of choices.

The simplex representation was used by Axtell et al. , Poza et al. and Weisbuch ). In the case of the full connection topology, represented on the left plot of Figure , nearly all moves of the agents' preference vectors are oriented towards the nearest vertex, corresponding to a pure strategy. On the other hand, for the lattice topology, both processes are successively observed on the right plot of Figure : an initial zigzag re-organisation to satisfy the constraints of pattern formation is followed by a straight line evolution towards pure strategies.
In fact, for a given value of β the stability of any choice depends upon the magnitude of the unique non-zero preference coefficient J_j, since the probability of choosing any other move is given by:

P \simeq \exp(-\beta J_j) = \exp(-\beta \langle \pi_i \rangle / \gamma)

where the second equality follows from the saturation value J_j = ⟨π_i⟩/γ (⟨π_i⟩ is the average payoff obtained by the agent).
In the chessboard texture, an H playing agent with neighbors playing L would get an average payoff of . .

But with neighbours she would only get . on average, since she gets payoff when playing with her H playing neighbors. In the bar configuration, by contrast, H playing agents are surrounded by agents playing L and playing H: their profit is . ; bar textures are then more favorable than chessboards in the Moore neighborhood.
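The comparison between chessboard and bar textures can be checked with a short computation of the average payoff of an H player in each pattern and neighborhood. The demand of 0.7 for H is an assumption of this sketch (the figure is not legible above), and all names are hypothetical.

```python
H_DEMAND = 0.7  # ASSUMED payoff of H against L; H against H yields nothing

VON_NEUMANN = [(0, 1), (1, 0), (0, -1), (-1, 0)]
MOORE = VON_NEUMANN + [(1, 1), (1, -1), (-1, 1), (-1, -1)]


def chessboard(x, y):
    """Alternating H and L cells in both directions."""
    return "H" if (x + y) % 2 == 0 else "L"


def bars(x, y):
    """Alternating rows of H and rows of L."""
    return "H" if y % 2 == 0 else "L"


def mean_payoff_of_h(pattern, neighborhood):
    """Average payoff of the H player at the origin, over all her neighbours."""
    assert pattern(0, 0) == "H"
    gains = [H_DEMAND if pattern(dx, dy) == "L" else 0.0 for dx, dy in neighborhood]
    return sum(gains) / len(gains)
```

Under these assumptions, `mean_payoff_of_h(chessboard, MOORE)` is half of the demand because the four diagonal neighbours also play H, whereas `mean_payoff_of_h(bars, MOORE)` is three quarters of it (six L neighbours out of eight), so bars do better than chessboards with eight neighbours. With four neighbours the ordering is reversed.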
Equations ( -) are the reason for the observed stability of patterns in the multi-phase regime. At equilibrium the probability of changing one agent's choice is of the order of 8 × 10^−7 for H, 4 × 10^−5 for M and 2 × 10^−3 for L when β = 2 and γ = 0.1. By comparison, the probability of such changes in the model of Axtell et al. ( ) is much higher, ε = 0.1, which explains why they report transients rather than attractors.
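These orders of magnitude can be reproduced with a two-line computation, under two assumptions that are labelled in the code: the saturated coefficient equals ⟨π⟩/γ while the other two are zero, and the payoffs are 0.7, 0.5 and 0.3 for H, M and L. The function name is hypothetical.

```python
import math


def switch_probability(mean_payoff, beta=2.0, gamma=0.1):
    """Approximate probability of choosing one given alternative move.

    ASSUMPTION: one saturated coefficient J = mean_payoff / gamma, the two
    others equal to zero, so that P ~ exp(-beta * J) when beta * J is large.
    """
    return math.exp(-beta * mean_payoff / gamma)


# ASSUMED average payoffs of saturated H, M and L players.
for move, mean_payoff in (("H", 0.7), ("M", 0.5), ("L", 0.3)):
    print(move, f"{switch_probability(mean_payoff):.1e}")
```

The printed values are of the order of 8e-07, 5e-05 and 2e-03, in line with the figures quoted above.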
The influence of the neighborhood also applies to random networks. H and L payoffs and patterns depend in fact upon the parity of the loops the nodes are involved in. Even loops allow the HLHL alternation as in the von Neumann neighborhood (see Figure , left). However, odd loops such as those present in the Moore neighborhood do not (see Figure , right). To predict the probability of local textures around a node one can simply check the parity of the loops around it. Odd loops decrease the average payoff and thus the probability of alternating H and L configurations.

Figure : Neighborhoods, interaction loops and patterns. In the von Neumann neighborhood, on the left, the cell at the center has four neighbours colored in blue. It is involved in four -loops, colored in red, with its interacting neighbours. Each -loop can support an HLHL configuration, thus allowing chessboard patterns. In the Moore neighborhood, on the right, the cell at the center has eight neighbours colored in blue. It is involved in eight additional -loops, colored in green, with its neighbours, which can only support LHL or LLL configurations, thus excluding chessboard patterns.
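The parity argument amounts to asking whether the interaction graph can be two-coloured with H and L so that every link joins an H to an L, which is possible exactly when the graph has no odd loop. The sketch below tests this by breadth-first search on periodic lattices; the periodic boundaries and all names are assumptions of the example.

```python
from collections import deque

VON_NEUMANN = [(0, 1), (1, 0), (0, -1), (-1, 0)]
MOORE = VON_NEUMANN + [(1, 1), (1, -1), (-1, 1), (-1, -1)]


def lattice_graph(size, offsets):
    """Adjacency lists of a periodic size x size lattice (assumed boundaries)."""
    return {
        (x, y): [((x + dx) % size, (y + dy) % size) for dx, dy in offsets]
        for x in range(size)
        for y in range(size)
    }


def supports_alternation(graph):
    """True if H and L can alternate along every link, i.e. there is no odd loop."""
    colour = {}
    for start in graph:
        if start in colour:
            continue
        colour[start] = 0
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in colour:
                    colour[neighbour] = 1 - colour[node]
                    queue.append(neighbour)
                elif colour[neighbour] == colour[node]:
                    return False  # two linked nodes forced to the same choice
    return True
```

On an even-sized lattice, `supports_alternation(lattice_graph(6, VON_NEUMANN))` holds while the Moore version does not, because of its triangles. The same function applies to any graph given as a dictionary of adjacency lists, e.g. a small world or random network.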

The analysis is easily performed in the case of small world networks (Watts & Strogatz ) with many short loops. Santos et al. ( ) studied small world nets based on one dimensional periodic structures with -loops, in other words with large clustering coefficients: they report that this network structure favours the equity regime, in accordance with the above analysis. In small world nets with many -loops, the parity of loops is expected to allow inequity attractors.

By contrast, in Erdős & Rényi ( ) random nets, which have very few short loops, parity is less important and paths with alternating H and L choices may appear. The scale free property (Albert & Barabási ) is not enough by itself to allow any prediction, since it does not say anything about the clustering coefficient, which determines the occurrence of -loops.

To summarize Section results, ordered phases display spatial domains on lattices, in fair or unfair configurations. Unfair configurations display specific textures depending upon the neighborhood. Which phases are observed depends upon the parameter β/γ, and their relative importance depends upon initial conditions.

Phase diagrams: dynamical regimes and transitions
Phase diagrams allow a clearer view of the regimes and patterns obtained according to parameter changes. We have a choice among several quantities to monitor dynamical regimes, such as preference coefficients or fractions of agents' choices. We present the results obtained by monitoring the fractions of agents playing H, L or M in the next figures , , and , but equivalent results are obtained when monitoring the evolution of preference coefficients.
This diagram is quite similar to the regime diagram observed for the fully connected networks described in Weisbuch ( ), except that the second phase transition, from the homogeneous M phase to the ordered multiphase regime, is not abrupt on the lattice. It should also be noted that the decreasing β procedure is not reversible: when β is increased from . to . the system remains in the M regime after β > 1.5, see Figure . This is because the initial conditions for the higher β values are inherited from the M regime: the system displays hysteresis. Both properties are due to the multiplicity of attractors and the metastability of the domain structure, further analysed in Appendix A.
Table : Payoff matrix rewritten with the inequity coefficient λ. The first column represents the move of the first player, L, M or H. The first row represents the move of her opponent. The figures in the matrix represent the payoff obtained by the first player.
Since we have so far used only one reduced parameter, β/γ, we can add another one and consider the figures of the payoff matrix as parameters. Keeping the idea of symmetry between demands with respect to . we introduce the extra parameter λ, which we call the inequity parameter. λ is the difference in the payoff matrix between . , the equity payoff, and what L and H players get. The previous simulations, figures -, were done for λ = 0.2. The generalised payoff matrix is rewritten in Table .
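As an illustration, the generalised payoff matrix can be generated from λ, assuming the demands are placed symmetrically around an equity share of 0.5, i.e. 0.5 − λ for L, 0.5 for M and 0.5 + λ for H. The equity share of 0.5 and the function name are assumptions of this sketch.

```python
# Rank of each demand: two demands are compatible when their ranks sum to
# zero or less, which is equivalent to their sum not exceeding the pie and
# avoids floating point comparisons at the boundary.
RANK = {"L": -1, "M": 0, "H": 1}
MOVES = ("L", "M", "H")


def payoff_matrix(lam=0.2, equity=0.5):
    """Payoff of the row player against the column player for inequity `lam`.

    ASSUMPTION: demands are equity - lam, equity and equity + lam.
    """
    demand = {move: equity + RANK[move] * lam for move in MOVES}
    return {
        own: {
            other: demand[own] if RANK[own] + RANK[other] <= 0 else 0.0
            for other in MOVES
        }
        for own in MOVES
    }
```

For λ = 0.2 this gives back demands of 0.3, 0.5 and 0.7; increasing λ lowers what an L player earns, which is the quantity that controls the stability of the L choice in the discussion that follows.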
We then obtain a regime diagram in λ (Figure ), plotted at β = 2.0, following the fractions of agents playing H, M or L versus λ. The ordered multi-phase regime is only observed for λ ≤ 0.2. Above λ = 0.2, the smaller payoff obtained when playing L is not sufficient to maintain the stability of the L choice. The fraction of L players starts collapsing, and so does the fraction of H players, who would make no profit in their absence. Only M players survive and the equity norm is observed. One can connect this feature to the observation that people refuse to accept too small a part of the pie, as studied for instance by Henrich et al. ( ), even though economists' reasoning would predict acceptance of any offer during a one shot game.

Agents with Tags Dynamics
The most significant result in Axtell et al. ( ) concerned agents with tags. Agents are divided into two populations with arbitrary tags, e.g. one orange and one black. When agents take tags into account for playing and memorising games (in other words, when each population of agents separately plays two games, an intra-game against agents with the same tag and an inter-game against agents with a different tag), one observes configurations in the inter-game such that one population always plays H while the other population plays L; they interpret this inequity norm as the emergence of classes, the H playing population being the upper class.
A first version of the bargaining game with tags on a lattice was proposed by Poza et al. ( ) with a minor change with respect to Axtell et al. ( ): agent moves were chosen to respond to the most frequent moves of their opponent rather than optimising the best expected payoff from the whole sequence of game memories. They observed the same regimes as reported by Axtell et al. ( ) or their combination.
A couple of remarks before describing our results:
• Three games occur independently with two tags A and B: A agents play an intra-game against A agents, B agents against B agents, and A and B agents play an inter-game against each other. Since the games are independent, we do not need to describe them all together. The configuration of agents' choices obtained by simulation is the union of the configurations achieved in each game.
• One can choose among different assignments, random or regular, of tagged agents to the cells of the lattice (see Figure ).
Let us start with the simplest case of regular assignment of tags to cells as obtained by alternation, as in Figure , right panel. Simulation results for the inter-game, orange versus black agents, are displayed on the two upper panels of Figure . Each agent has four neighbours with a different tag to play with. Initial conditions are for the left panel and for the right panel. The chessboard domains are clearly displayed on the right panel: in some domains "orange agents" play H and "black agents" play L, but the opposite is observed in other domains. This situation transcribes the inequity norm proposed by Axtell et al. ( ). Most agents reach their saturation values and their choices remain stable.

The intra-game possible configurations are those described in the previous section for the von Neumann neighbourhood. The lattice panels only have to be rotated by degrees since neighbours for intra-games are now along diagonals.
When one starts from initial conditions with black tag agents playing mostly H and orange tag agents playing mostly L, the initial choices are reinforced during the iteration and remain stable for time steps (Figure ). One single domain of black agents playing H and orange agents playing L covers most of the lattice, with a few exceptions of agents playing M and some black agents playing L.

Figure : Random (left pattern) and Regular (right pattern) bi-partite lattices. Orange (resp. black) tagged agents occupy orange (resp. black) squares.
We also use random assignments of tags to lattice cells with the Moore neighborhood, since the possible a priori neighbours are restricted to an average of neighbours because only pairs of agents with different tags interact in the inter-game. The same also applies in the intra-games. As compared to the previous simulations of Sections . -. and . -. with the von Neumann neighborhood without tags, the average number of neighbours is the same, but the connection structure is more random. Such a choice was already made by Poza et al. ( ).
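The two tag assignments can be compared by counting, for each agent, the neighbours carrying a different tag, i.e. her possible partners in the inter-game. The sketch assumes periodic boundaries and, for the random case, independent tags drawn with probability 1/2; both are assumptions of the example, as are the names.

```python
import random

VON_NEUMANN = [(0, 1), (1, 0), (0, -1), (-1, 0)]
MOORE = VON_NEUMANN + [(1, 1), (1, -1), (-1, 1), (-1, -1)]


def regular_tags(size):
    """Alternating tags 0 and 1, as on a chessboard (use an even size)."""
    return [[(x + y) % 2 for y in range(size)] for x in range(size)]


def random_tags(size, rng):
    """Independent tags, each equal to 0 or 1 with probability 1/2 (assumed)."""
    return [[rng.randrange(2) for _ in range(size)] for _ in range(size)]


def mean_inter_neighbours(tags, offsets):
    """Average number of neighbours with a different tag, periodic boundaries."""
    size = len(tags)
    total = 0
    for x in range(size):
        for y in range(size):
            total += sum(
                tags[(x + dx) % size][(y + dy) % size] != tags[x][y]
                for dx, dy in offsets
            )
    return total / (size * size)
```

With regular tags and the von Neumann neighborhood every agent has exactly four inter-game partners; with random tags and the Moore neighborhood the average is close to four, but individual agents have between zero and eight partners, which is what makes the connection structure more random.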

One indeed observes
Figure : Inter-dynamics on a x square lattice. γ = 0.1, β = 2, iterations per agent. M is colored blue, H is colored red and L is colored green. Orange and black characters correspond to agent tags. IC on the left panel and on the right. Upper plots were obtained with the regular alternation of orange and black agents, lower plots with random positions of tags. A few black isolated cells in the lower plots are occupied by agents surrounded by other agents with the same tag; they have no neighbor to play the inter-game with.

Figure : Inter-dynamics on a x square lattice. γ = 0.1, β = 2. M is colored blue, H is colored red and L is colored green. Orange and black characters correspond to agent tags. Initial conditions are for the black agents and for the orange agents.

Discussion and Conclusions
Let us first discuss the main differences between Axtell et al. ( ), Poza et al. ( ) and Santos et al. ( ) and the present work. Since they use different memorisation and choice functions, it is not surprising to get some differences in behaviour. They observe long transient regimes which they interpret as norms of behaviour, while we rather observe attractors. Previous papers report the observation of "a fractious regime (FR), in which agents alternate their demands between H and L" (Poza et al. ). We never observe such a regime. This discrepancy comes from the difference between the attractors observed in our BOMA dynamics and their transient regimes, which are due to their choice of a relatively large and constant probability of random transitions.
Another important difference is that once the equity norm is established at large β values it remains stable even under parameter values which would support inequity. History admittedly provides examples of transitions from more equitable situations to inequity, such as slavery: to model such transitions in a BOMA perspective, one would have to add some processes to the model, such as wars and invasions, which indeed were at the origin of slavery.
Our most significant result is the fact that the inequity norm can be observed on lattices even in the absence of tags. The restriction of game partners to a few and the short loops of interaction foster its emergence. The same reasoning applies to more random connection structures with short interaction loops. Interpreting this result in real societies means that we might expect the possibility of the inequity norm in strongly structured societies where the social network determines who interacts with whom. The fact that tags are not a necessary condition for inequity on lattices does not deny their role in human societies, as first stressed by Axtell et al. ( ). Racism and its consequences can certainly be interpreted in this framework. During the Middle Ages, some societies even imposed tags such as specific garments on some of their members to discriminate against them, e.g. against Jews and against lepers. Tags and discrimination can certainly play a role in larger and more anonymous modern societies.
The fact that large γ values drive the system towards the equity norm (see Figure ) means that equity is favoured by forgetting. This property can be related to the role of forgiveness in re-establishing cooperation in the prisoner's dilemma game, as reported in Axelrod ( a).
We have also seen (see Figure ) that too much greed can destabilize the inequity norm and bring back equity, a prediction to be compared with ethnological data (Henrich et al. ) and with the history of revolutions.
We have only scratched the surface of the problem of inequity in social networks. The lattice connection structure is only a toy model of a social network. However, it still allowed us to predict the influence of the parity of loops on possible local arrangements of H and L agents. These predictions should be tested on various versions of social nets, from trees and random graphs to small world and diverse scale free networks. Another possible continuation of the present work could be to introduce a "slow dynamics" on parameters or a small "seed" domain of "opponents" to check the spatio-temporal dynamics of "revolutions" of both kinds, either towards equity or inequity.

Figure . The 'x' green curve was obtained with Moore neighborhood. The '*' blue curve corresponds to initial conditions. The purple dotted squares correspond to γ = 0.05. The magenta filled squares were obtained for the same parameters as the reference, except that β was increased rather than decreased from . to .
The influence of these factors is checked on Figure . One factor at a time is changed. They are:
• Changing the neighborhood from von Neumann to Moore. The observed decrease of stability of the HL domain for the Moore neighborhood is due to the decrease of J_1 in bar patterns, as discussed earlier: the average payoff for H is only . instead of . for the vN neighborhood.
• Changing initial conditions from uniform to , thus lowering the early proportion of M agents, reduces the M playing domain.
• Changing γ to . instead of . while keeping the β/γ ratio constant. The stability of the HL domain is increased when γ = 0.05 instead of . since the same decrease of J_1 and J_3 needs twice as many consecutive alternative changes as when γ = 0.1.
• Increasing β from . to . rather than decreasing it, as done for all other transition diagrams, maintains the system in the fair attractor. Since the initial conditions are "fair" when β is increased above . , the system carries on following the fair attractor when one further increases β. The increasing and decreasing branches of the diagram are then identical when β < 1.2 but differ above . .
The observed changes with obvious explanations are those due to the change of initial conditions in the multiphase domains, namely when β is increased and when non-uniform initial conditions are chosen. The width of the transition and its shift for different conditions are due to the metastability of the domains in the close neighborhood of the transition, i.e. when β ≲ 1.5. In that region the M phase is a strong attractor and would predominate over the H/L phase at infinite iteration time. We monitored the slow erosion of the H/L domain and its invasion by the M domain when β = 1.5 on Figure during , time steps corresponding to , iterations per agent. One can observe during the simulations that the "blue" phase of M players gradually dissolves the H/L regions, starting with the irregularities of the chessboard pattern, until only the most regular patterns survive at large iteration times. By contrast, when β = 2, farther from the transition, the relative stability of the H/L phase is preserved.

Figure : Metastability of patterns above the β ≈ 1.5 transition. Well above the transition, when β = 2, the patterns remain stable over long time periods, more than 10^4 iterations per agent. The plots with patterns on the right and dynamics on the left correspond to initial conditions for the upper pattern and for the middle pattern. But closer to the transition, for β = 1.5, the chessboard pattern is metastable: it is slowly dissolved by the "blue" pattern corresponding to the M attractor on the lower plot, with initial conditions . γ = 0.1.
Increasing the number of games per agent from 2·10^5 to 2·10^6 and to 2·10^7 increases the size of the equity region: the transition in β from equity to mixed attractors is shifted from . to . and to . . This is one more indication that the equity attractor is the most stable and that it would be reached if the game continued forever.
The stability of the multiphase attractor for larger β values (β = 3) can be observed on the time plot of Figure  with

The above regime diagrams (fig. , , ) were measured following "adiabatic decay": initial conditions were applied to measure J at the highest β/γ value, i.e. the rightmost points, and the equilibrium values obtained were kept to initiate the computation of the next point, and so on. Another method is to compute each point from the same initial conditions.

Figure . Comparison of transition diagrams between the previously used "adiabatic decay" and those obtained with identical initial conditions, noted eq, measured by the fractions of H, L and M choices. x square lattice, von Neumann neighborhood, simulation time 10^8, γ = 0.1. Initial conditions were uniform among preference coefficients .
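The difference between the two measurement procedures can be made concrete with a small sketch. It is not the authors' NetLogo program: the demands (0.3, 0.5, 0.7), the update rule J ← (1 − γ)·J plus the profit of the chosen demand, the random initial preference coefficients, the 8 × 8 periodic lattice and the short relaxation time are all assumptions made for illustration, and the function names are hypothetical. Only the sweep logic (keeping the relaxed state from one β value to the next, or restarting from the same initial state) follows the description above.

```python
import copy
import math
import random

DEMANDS = (0.3, 0.5, 0.7)  # ASSUMED L, M, H demands on a pie of size 1


def choose(prefs, beta, rng):
    """Logit choice among the three demands from preference coefficients."""
    top = max(prefs)
    weights = [math.exp(beta * (j - top)) for j in prefs]  # shifted for stability
    return rng.choices(range(3), weights=weights)[0]


def relax(state, beta, gamma, steps, rng):
    """Play `steps` pairwise games on a periodic square lattice, in place."""
    n = len(state)
    for _ in range(steps):
        x, y = rng.randrange(n), rng.randrange(n)
        dx, dy = rng.choice(((1, 0), (-1, 0), (0, 1), (0, -1)))  # von Neumann
        a, b = state[x][y], state[(x + dx) % n][(y + dy) % n]
        ca, cb = choose(a, beta, rng), choose(b, beta, rng)
        compatible = DEMANDS[ca] + DEMANDS[cb] <= 1.0 + 1e-9
        for prefs, c in ((a, ca), (b, cb)):
            for k in range(3):
                prefs[k] *= 1.0 - gamma      # gradual forgetting (assumed form)
            if compatible:
                prefs[c] += DEMANDS[c]       # profit credited to the demand made


def fractions(state):
    """Fractions of agents whose largest coefficient is L, M or H."""
    counts = [0, 0, 0]
    for row in state:
        for prefs in row:
            counts[prefs.index(max(prefs))] += 1
    total = sum(counts)
    return [c / total for c in counts]


def sweep(betas, gamma=0.1, n=8, steps=4000, adiabatic=True, seed=0):
    """Regime diagram over `betas`, listed from the highest value down."""
    rng = random.Random(seed)
    initial = [[[rng.random() for _ in range(3)] for _ in range(n)]
               for _ in range(n)]
    state = copy.deepcopy(initial)
    diagram = []
    for beta in betas:
        if not adiabatic:
            state = copy.deepcopy(initial)   # restart from identical conditions
        relax(state, beta, gamma, steps, rng)
        diagram.append((beta, fractions(state)))
    return diagram


betas = [3.0, 2.0, 1.5, 1.0, 0.5]
adiabatic_diagram = sweep(betas, adiabatic=True)
fixed_ic_diagram = sweep(betas, adiabatic=False)
```

With so few games per agent the sketch only illustrates the bookkeeping of the two procedures; it is not expected to reproduce the transition values reported in the figures.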

Appendix B: Simulation software
We used NetLogo (Wilensky ) version . . software to run single simulations, whose results are displayed in Figures , , , , and . The program is available at https://github.com/weisbuch/inequity.f/blob/master/BMnov.nlogo. The simplex trajectories Figure

Notes
This result is rigorous only in the mean-field approximation.
The average is done at equilibrium on games against the agent's neighbours.
The approximate value of P in Equation is obtained by neglecting in Equation the other two exponentials, whose exponents are small with respect to the main term exp(β J_i). When β = 2, γ = 0.1, π = 0.7 the ratio is 1.2·10^6.
The relevant notion is the balance of a signed graph, a concept introduced by Harary ( ) and later called frustration by Toulouse ( ). Frustrated loops are unbalanced signed graphs. A loop of negatively interacting elements has no stable configuration when the number of elements is odd. This applies to H choices, which are only stable when connected to L choices.
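The parity statement about loops can be verified by brute force on small cycles. The sketch below is an illustration with hypothetical function names: it treats every bond of a cycle as negative, meaning that a bond is satisfied only when it joins an H site to an L site, and counts the minimum number of unsatisfied bonds over all H/L assignments. It also applies the standard balance criterion for a signed cycle, namely that the product of the bond signs is positive.

```python
from itertools import product
from math import prod


def cycle_is_balanced(signs):
    """A signed cycle is balanced when the product of its bond signs is positive."""
    return prod(signs) > 0


def min_frustrated_bonds(n):
    """Minimum number of H-H or L-L bonds over all H/L assignments on an n-cycle."""
    best = n
    for config in product("HL", repeat=n):
        unsatisfied = sum(config[i] == config[(i + 1) % n] for i in range(n))
        best = min(best, unsatisfied)
    return best


# Even loops admit a perfect H/L alternation, odd loops always keep one bad bond.
results = {n: min_frustrated_bonds(n) for n in range(3, 9)}
for n, bad in results.items():
    print(n, bad, cycle_is_balanced([-1] * n))
```

On a square lattice with von Neumann neighborhood all elementary loops have even length, so the chessboard arrangement satisfies every bond. The Moore neighborhood adds diagonal links and hence triangles, which are odd loops.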
The adiabatic procedure consists in following the system equilibrium when parameters such as β and γ are slowly varied. An alternative procedure is to compute equilibrium values from the same initial conditions. The two diagrams differ little; see Figure in Appendix A.
And many more, such as bilayer nets.
γ is the time rate at which agents forget their past profits.
For instance, a small domain of agents playing M within a much larger domain of H/L-playing agents, to represent revolutions.