Juliette Rouchier: Re-implementation of a multi-agent model aimed at sustaining experimental economic research

Juliette Rouchier (2003)

Re-implementation of a multi-agent model aimed at sustaining experimental economic research: The case of simulations with emerging speculation

Journal of Artificial Societies and Social Simulation vol. 6, no. 4
<https://www.jasss.org/6/4/7.html>

To cite articles published in the Journal of Artificial Societies and Social Simulation, please reference the above information and include paragraph numbers if necessary

Received: 13-Jul-2003 Accepted: 13-Jul-2003 Published: 31-Oct-2003

Abstract

The paper presents an attempt at a replication of a multi-agent model dealing with the issue of speculation. In the Journal of Economic Dynamics and Control, John Duffy presents his model and results, as a coupling between an experimental economic version and a multi-agent version, of a model by Kiyotaki and Wright (1989). This original model offers a structural setting on which to base a microeconomic view of speculation, composed of a production-exchange-consumption setting with three goods that differ by their storage costs. Here, I present my own version of the multi-agent model, which is as close as possible to John Duffy's, although I have been unable to reproduce his actual results. Most of my results are close neither to the experimental data nor to the simulation data, which leads me to discuss the model of rationality of agents itself, and the way the results were described. The replication process is all the more interesting in that it allows the redefinition of the indicators with which to analyze the model.
Keywords: Experimental Economics; Learning; Model Validation; Multi-agent Simulation; Speculation

Introduction

1.1: This paper describes the re-implementation of a multi-agent model and questions the possibility of re-creating a simulated universe like the simulations already described in a scientific paper. Here, the model is one of a society where agents are induced to speculate in their exchanges. It is copied from a work by John Duffy in the Journal of Economic Dynamics and Control, who adapted an institutional setting that was first published in 1989, by Kiyotaki and Wright^[1]. Duffy's adaptation was a very interesting attempt to link experiments with real humans and simulations with artificial agents, on an economic topic. His research is part of a growing trend in experimental economics, where researchers tend to use simulations with artificial agents for several reasons: to help find out the best values of parameters for their settings (which seems to be an extremely general goal for experimentalists), to check the coherence of assumptions they make about rationality, and to help organise new designs for experiments (such as Duffy's). The relation between both techniques in economics seems to be very relevant, since both approaches display similarities. This similarity is visible in terms of issues (studying the influence of cognition on self-organised institutions or the importance of institution on the actions of the individual, both influences considered in a feedback loop) and in terms of epistemological approach (Smith 2002 ; Gilbert and Conte 1995).
1.2: This spreading use of the technique of multi-agent simulation makes it important to agree on a common framework of description for the models, simulation protocol and results. Replication of results with a re-implementation of the same model, potentially on a new simulation platform, has been identified as one of the key verification and validation steps in simulation modelling (Edmonds 2001, Edmonds and Hales 2003). The fact that this re-implementation should be possible, based solely on the data provided by the published paper, is a constraint that has been proposed by Jim Doran to assure the evaluation of the model (Doran 2001). Although quite difficult to attain, this aim seems to be reasonable enough in the early days of a research agenda (such as the social simulation one) so as to secure communication among researchers and the possibility of checking the design of simulation models.
1.3: I chose to re-implement Duffy's model for two reasons: (1) the use he makes of modelling, even if it might be quite usual in the experimental economist community, is very interesting and has rarely been described with such detail; (2) the specific model he uses, the Kiyotaki-Wright (1989) one, is a simple and clear archetypical device for describing a setting that induces speculation. Their model has already been studied by many simulators (Basci 1999), and experimental economists (Duffy and Ochs 1999) and can thus be used as a benchmark for discussion.
1.4: In this paper, I will first describe the model by Kiyotaki and Wright. Then I will turn to the multi-agent society built by Duffy and describe his aims and the model he proposes. Next I will describe the process of replication that I undertook. During this process several problems occurred which could partly be solved with the help of John Duffy himself. My conclusion focuses on the two main issues that arose: the description of the rationality of agents is too verbal and not sufficiently algorithmic, which leaves some elements ambiguous; and the indicators that are used for observation are not precise enough, so that it is difficult to understand why my model fails at replicating his results.

A model of a market that induces speculative behaviours

2.1

The environment presented here defines a production-consumption dynamics with three different goods and an exchange institution. It was designed by Kiyotaki and Wright to induce some of the agents to store a good that is not their own consumption good, and hence use one of the products as an exchange good. Their aim is to show how money emerges in a society. Deciding to exchange a good that is expensive to store but easier to exchange afterwards is what is considered here as a speculative behaviour.

A three goods model

2.2

Kiyotaki and Wright (1989) define a market in which three different types of agents perform decentralised bilateral negotiations^[2] concerning three different goods, called good 1, good 2 and good 3. The agents need to consume a unit of good to increase their utility^[3] and they produce a unit of good each time they have consumed one. Agent 1 needs good 1 and produces good 2; agent 2 consumes good 2 and produces good 3; agent 3 consumes good 3 and produces good 1. To make it easier, Kiyotaki and Wright write that agent i consumes good i and produces good i+1, with indices taken modulo 3. As one can note, agents have to exchange when they want to consume, and not all agents can be satisfied by just one exchange. Indeed, if two agents exchange their own production goods, one can be satisfied but the other would get a useless good, good i+2, which is neither its own production good nor its consumption good. The constraint of bilateral trading thus forces agents to keep goods from one time-step to another. An agent who exchanges and stores i+2 is speculating, since it speculates on the gain it will have at the next time-step, when it might get its consumption good through exchange^[4].
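The cyclic structure of consumption and production described above can be made explicit with a few helper functions. The following Python sketch is only illustrative: the function names are hypothetical and come neither from Kiyotaki and Wright nor from Duffy, and the only assumption is that good indices wrap around from 3 back to 1.

```python
N_GOODS = 3  # goods and agent types are both numbered 1, 2, 3


def consumption_good(agent_type: int) -> int:
    """Agent i consumes good i."""
    return agent_type


def production_good(agent_type: int) -> int:
    """Agent i produces good i+1, wrapping around from 3 to 1."""
    return agent_type % N_GOODS + 1


def speculative_good(agent_type: int) -> int:
    """Good i+2: neither consumed nor produced by agent i."""
    return (agent_type + 1) % N_GOODS + 1


for i in (1, 2, 3):
    print(i, consumption_good(i), production_good(i), speculative_good(i))
```

For each type, the three functions return three distinct goods, which is why a single bilateral exchange of production goods can satisfy at most one of the two partners.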

Figure 1. A) From producer to consumer: the ideal circulation of goods (impossible to achieve on this market which is governed by bilateral trading); B) Fundamental strategies in the context of the model (neither agents 1 nor 3 are interested in trading for the good i+2); C) Ideal speculative equilibrium pattern, which can be reached for certain values of costs and utility

2.3

To keep each good from one time-step to the next, there is a storage cost, c1, c2 and c3. The costs are not equal, hence agents are not symmetrical: 0 < c1 < c2 < c3 < u, where u is the utility gained from the consumption of one unit. Each agent can keep only one unit and those units cannot be divided. An agent will never store its own consumption good: at one time-step, if it gets its good through exchange, it consumes it and produces its production good right away, which is stored until the next time-step. In order to have some risk associated with the conservation of units of a good, and to increase the decay of the value of each good, a discount factor is introduced: at the end of each time-step a number is randomly chosen and compared to this discount factor β (0 < β < 1) to decide if the market shuts or continues.
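The stopping rule can be illustrated with a short Python sketch. It assumes that, after each time-step, the market continues with probability β and shuts otherwise, so that the length of a game follows a geometric law with mean 1/(1 - β). The function names are hypothetical, and the value β = 0.9 is the one reported later in the paper for Duffy's sessions.

```python
import random


def game_length(beta: float, rng: random.Random) -> int:
    """Number of time-steps played before the market shuts."""
    steps = 1  # at least one time-step is always played
    while rng.random() < beta:  # assumption: continue with probability beta
        steps += 1
    return steps


def expected_game_length(beta: float) -> float:
    """Mean of the geometric law implied by the stopping rule."""
    return 1.0 / (1.0 - beta)


rng = random.Random(0)
lengths = [game_length(0.9, rng) for _ in range(20000)]
mean_length = sum(lengths) / len(lengths)
print(mean_length, expected_game_length(0.9))
```

Under this reading, a game with β = 0.9 lasts ten time-steps on average.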

Diverse optimal behaviours

2.4

Obviously, since not all conservation costs are alike, the interest of exchanging goods will be different for the three types of agents.

2.5

One can note

$$\gamma_{i+1} = -c_{i+1} + \beta u,$$

(1.1)

the expected gain for an agent who keeps good i+1 and sells it at the next time-step; and

$$\gamma_{i+2} = -c_{i+2} + \beta u,$$

(1.2)

the same gain for good i+2.
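Equations 1.1 and 1.2 can be evaluated for each type of agent, as in the Python sketch below. The numerical values of the storage costs and of the utility are purely hypothetical (they are not taken from Kiyotaki and Wright or from Duffy) and are chosen only to satisfy 0 < c1 < c2 < c3 < u; the function names are illustrative as well.

```python
def wrap(good: int) -> int:
    """Map any positive index onto the goods 1, 2, 3 (indices modulo 3)."""
    return (good - 1) % 3 + 1


def expected_gain(storage_cost: float, beta: float, utility: float) -> float:
    """gamma = -c + beta * u, as in equations 1.1 and 1.2."""
    return -storage_cost + beta * utility


# Hypothetical parameter values, for illustration only.
COSTS = {1: 1.0, 2: 4.0, 3: 9.0}
UTILITY = 20.0
BETA = 0.9


def gains_for_type(i: int) -> tuple:
    """Return (gamma_{i+1}, gamma_{i+2}) for an agent of type i."""
    gain_production = expected_gain(COSTS[wrap(i + 1)], BETA, UTILITY)
    gain_speculative = expected_gain(COSTS[wrap(i + 2)], BETA, UTILITY)
    return gain_production, gain_speculative


for i in (1, 2, 3):
    print(i, gains_for_type(i))
```

Whatever the values chosen, as long as the ordering of the costs is respected, only type 2 obtains a higher expected gain from holding good i+2 than from holding good i+1, since good 1 is cheaper to store than good 3.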

2.6

The only agents who are interested in exchanging their production good i+1 for i +2 are agents of type 2: the cost of keeping good 1 is lower than the cost of keeping their production good, 3. In this case, their fundamental strategy, with a short term perspective, is equivalent to speculation. For the others, the fundamental strategy is to keep their production good, the storage cost of which is lower. However, even agents 1 and 3 can be interested in performing speculation: at each time-step, the gain they can expect at the next time-step depends on the fact that they will meet an agent willing to trade and to give them their consumption good.

2.7

pi denotes the proportion of agents i that possess the good they produced (i+1) at a given time-step, and (1-pi) the proportion of agents i who have traded this good for good i+2. Kiyotaki and Wright describe different situations that can occur. A situation being defined by u, c1, c2, c3, β and p1, p2, p3, it is possible for agents 1 and 3 to decide if they will speculate or not. One can denote Si as the optimal strategy of an agent i at one time-step, where Si = 1 if i accepts to exchange good i+1 for good i+2 and Si = 0 if it refuses. One then writes the situation of a society as (S1, S2, S3), which gives a complete description of the strategies of the agents. In the centralised approach of Kiyotaki and Wright where all agents have the same knowledge, at a given time-step, all agents of the same type will make the same choice.

2.8

The values of storage costs and the distribution of goods are thus specified by the authors to discover when agents adopt a speculative strategy or a fundamental strategy. In any case, it is never interesting for agents of type 3 to speculate, and it is always interesting for agents of type 2 to do so. The only agents whose choices actually depend on the situation are agents 1, who will have no interest in playing their fundamental strategy (refusing to store good 3) if good 3 is very easy to exchange for good 1. The two situations can be read in Figure 1, which represents possible exchanges at each time-step, any other exchange being rejected by one of the agents. Figure 1B is the case where only agent 2 speculates and Figure 1C depicts the case where both 1 and 2 speculate. Another way of representing the chosen strategies is to write them as vectors: s = (1,1,0) or (0,1,0).

2.9

In this centralised approach agents possess all information about themselves and the others' situations to be able to decide on their behaviour. This assumption is the one that both the experimental approach and the multi-agent approach want to relax. The interest of these two approaches is to understand which type of information can be used by agents that are independent, whose sole common knowledge is the set of rules of the system and who only learn through their interactions with the others. In deciding whether to speculate or not, agents cannot compute an optimal behaviour; they have to acquire information about which situation is best. In the following section I describe the experiments that allow us to check whether humans show rational behaviour in such an economic environment. The evolution of their behaviour is interpreted as learning, and some indicators are chosen to show their adaptation.

Learning through experience

3.1

The first work of Duffy on this kind of topic (Duffy and Ochs, 1999) consists of a set of controlled laboratory experiments with humans who are confronted with a market setting in which they have to make choices. Like other studies on the same topic, the environment used to explore possible behaviours of humans is the one reported in Brown (1996). The aim of this type of research is to check whether the information that is given in a distributed way is enough for individuals to choose the optimal behaviour. What is shown is that humans don't perform as well as could be expected; a large proportion exhibit a behaviour that is not in their best interest. In this experiment, it is especially difficult to observe speculative behaviours among human participants of type 1, although speculation should be the dominant strategy for both agents of type 1 and 2 (see table 1).

3.2

This result shows, first, that one has to wonder about the rationality people use: to do so, Duffy makes hypotheses about learning, translates them into cognitions for artificial agents, and tests the type of collective behaviour the interactions create. The second idea is that it should be possible to build some setting that imposes more constraints on the individuals (human or artificial), so as to improve their learning during the experiments (Duffy 2001).

3.3

The use he makes of simulation is based on the following objective: to help design new experiments where the learning abilities of agents would be reinforced. Duffy formulates hypotheses about these learning processes, deduces an algorithm, and uses it to produce artificial rational agents. He then checks the coherence of the cognitive mechanisms by performing simulations with artificial agents and evaluates the results through comparison with real data. To make sure he can compare the results with human data, he builds societies that are exactly the same size as in the human experiments and that do not last for a long time. Indeed, it seems intuitively logical to give special attention to the scale of a society where the random pairing of interactions is so important.

3.4

He concludes that the performance of his artificial agents is good enough to be used as a benchmark to test new settings for the experiments. He then puts his artificial learning agents in two new settings that are supposed to improve their tendency to speculate. The first setting changes the number of agents of each type, so as to alter the probability of meetings and hence show agents of type 1 that speculation is indeed good for them. The other possibility is to mix learning agents of type 1 with automated agents that do not learn: they either always refuse to speculate (agents of type 3) or always accept (agents of type 2). Both settings were then used to elaborate new experiments with humans, and were shown to improve the emergence of speculation, in experiments as well as in simulations. The latter setting, where some automated agents are mixed with the learning agents (in simulations) or humans (in experiments), proved to be very efficient for the convergence of learning. This is the setting I have re-implemented in my work. In the following sub-section, I describe Duffy's setting as well as the re-implementation work I have done. In the rest of the paper, I will refer to "experiments" whenever humans are involved in the exchanges and to "simulations" when only artificial agents are acting.

Experimental data

3.5

In Duffy's experiments, to re-create the Kiyotaki-Wright environment, each participant is assigned a type (either 1, 2 or 3), and given all information about the rules of utility earning, storage costs and decay value. At each time-step participants are randomly paired with another agent about whom they know nothing but the good that he or she possesses and if he or she is willing to exchange. Ten successive experiments were conducted, with a decay value of 0.9, and hence there are on average 100 exchange opportunities for each session.

3.6

Duffy is interested in the evaluation of the tendency for each type of agent to speculate. The evaluation is based on a ratio defined for a given type of agents, which we will refer to as speculative ratio:

Def 1:
SR (i,t) = Speculative ratio of agents of type i at time t ;
NAS (i,t) = Number of speculative trades that are accepted by agents of type i at time t;
NPS (i,t) = Number of possible speculative trades for agents of type i at time t.

Then SR (i,t) = NAS (i,t) / NPS (i,t)

A participant of type i is said to accept an exchange for speculation if he/she proposes to give good i+1 to get good i+2.
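Definition 1 translates directly into a small function. In the Python sketch below, the input is one boolean per possible speculative trade faced by agents of type i at time t; returning `None` when no speculative trade was possible is an assumption of this sketch, since the ratio is then undefined.

```python
from typing import Optional, Sequence


def speculative_ratio(accepted: Sequence[bool]) -> Optional[float]:
    """SR(i,t) = NAS(i,t) / NPS(i,t).

    `accepted` holds one entry per possible speculative trade:
    True if the agent offered good i+1 to get good i+2, False otherwise.
    """
    nps = len(accepted)   # number of possible speculative trades
    if nps == 0:
        return None       # assumption: ratio left undefined
    nas = sum(accepted)   # number of accepted speculative trades
    return nas / nps


print(speculative_ratio([True, False, False, True]))
```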

3.7

In the experiments, agents are expected to learn about the best possible behaviour. The comparison of their behaviour over the first half and the second half of a session is then supposed to account for this learning. Duffy thus compares the average speculative ratio over the first half of the session with that over the second half.


Table 1: Offer frequencies over each half of 5 sessions with real agents. Results are given as the average speculative ratio for each type over the first half of the session and over the second half of the session

	Agents type 1 offers 2 for 3		Agents type 2 offers 3 for 1		Agents type 3 offers 1 for 2
	first half	second half	first half	second half	first half	second half
R1	0.13	0.18	0.98	0.97	0.29	0.29
R2	0.38	0.65	0.95	0.95	0.17	0.14
R3	0.48	0.57	0.96	1.00	0.13	0.14
R4	0.08	0.24	0.92	0.98	0.12	0.02
R5	0.06	0.32	0.93	0.97	0.25	0.18
Average on R1-R5	0.23	0.37	0.95	0.96	0.20	0.16

3.8

In the described experiments, the parameters are such that agents of type 1 should discover over time that their best strategy is to speculate, the same being true for type 2 agents. Conversely, agents of type 3 are induced to refuse speculation when they meet this opportunity. Duffy thus shows that humans who are engaged in the game do not learn well enough to achieve optimal rationality. This is especially true for agents of type 1, as can be seen in table 1: agents of type 1 should increase their speculation rate over time to get to 1, and agents of type 3 should decrease their speculation rate to get to 0. To try to induce participants to speculate more and thus have the system attain the equilibrium earlier, Duffy introduces two new settings in his experiments.

3.9

First, he decides to change the number of agents of each type. In other words, some of the meetings would occur more often and hopefully help agents of type 1 learn more quickly that they have to speculate. Hence, the distribution of participants would be such that 1/3 of the agents are of type 1, less than 1/3 are of type 2 and more than 1/3 are of type 3: given 18 agents altogether, there are 6, 4 and 8 of each type. I do not present any study of this setting.

3.10

The other option to test the ability of participants of type 1 to learn is to mix them with automated agents that always follow their fundamental strategy (agents of type 2 always accept to speculate and agents of type 3 always refuse to). Human participants are all of type 1, they are aware of who they are mixed with and they have to choose between speculating or not when they face the opportunity. Table 2 shows that the speculative attitude of the agents in this last case is more general and that the increase in speculative attitude, which is a sign of learning for agents 1, is observable. Comparing this with the other setting, Duffy concludes that this second setting is the one that best induces agents to speculate.


Table 2: Offer frequencies over each half of 5 sessions with real agents mixed with automated artificial agents

	Agents type 1		Agents type 2		Agents type 3
	Time-step for the first half	Time-step for the second half	Time-step for the first half	Time-step for the second half	Time-step for the first half	Time-step for the second half
R1	0.84	0.83	1.00	1.00	0.00	0.00
R2	0.52	0.53	1.00	1.00	0.00	0.00
Average on two sessions	0.69	0.71	1.00	1.00	0.00	0.00

Decentralised model

3.11

What mostly interests agent-based modellers is the building of a society with decentralised knowledge. Agents that are used in such models are autonomous, they usually have no global knowledge of their society and thus cannot calculate the optimal action to undertake. If a globally optimal situation exists, one issue that motivates the building of an artificial society is to understand what individual learning processes could lead the group to attain that equilibrium.

3.12

This point of view is quite close to the research led by experimental economists who observe the consequence of the actual circulation of information and actions they organise. In the simulations that have been run to explore the Kiyotaki-Wright environment, most researchers developed some learning algorithms and observed the appearance of speculative behaviours for the agents in societies with a large number of interacting agents (Marimon et al. 1990; Basci 1999; Staudinger 1998). To study the evolution of speculation in the group, Duffy also chooses to build agents that are not able to calculate their optimal strategy; they have no more information than their past interactions with others, not even knowing that there exist other types of agents, different rationalities or partition of goods in their environment. Moreover, he decides to stay very close to the experimental setting when he designs his experiments, so as to be able to actually compare his agents' actions to humans' behaviour in the same environment. He thus limits the number of agents (maximum 24) and only performs short simulations.

3.13

Of course, artificial agents cannot be said to be aware of the setting in which they are evolving. They are given a function to calculate their expected gain, that is a logical aggregation of the data that human subjects are given (see equations 1.1 and 1.2 for the definition of expected gains, where the decay factor and the storage costs are used). Choices for agents are defined such that:

if an agent meets another agent who owns the same good as it does, neither of them proposes an exchange;
if an agent meets an agent who possesses its consumption good, then it proposes an exchange;
in any other case, which means if the agent can trade good (i+1) for good (i+2) or the opposite, the outcome depends on its past successes in getting good i:

3.15

One defines:

$$\nu_{i+1} = \sum I^S_{i+1} \, \gamma_{i+1} - \sum I^F_{i+1} \, \gamma_{i+2}$$

(2.1)

$$\nu_{i+2} = \sum I^S_{i+2} \, \gamma_{i+2} - \sum I^F_{i+2} \, \gamma_{i+1}$$

(2.2)

where I^S_{i+1} and I^F_{i+1} are indicator functions defined on the set of time-steps at which the agent possessed good i+1:
I^S_{i+1} = 1 if the agent traded i+1 for i, and 0 if it didn't;
I^F_{i+1} = 1 if the agent failed to trade i+1 for i, and 0 otherwise.
I^S_{i+2} and I^F_{i+2} are defined in the same way for good i+2.
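Equations 2.1 and 2.2 can be sketched in Python as follows. The sketch assumes that every time-step at which the agent held the good counts either as a success or as a failure, which is only one possible reading of the definitions; the function name `attractiveness` and the numerical gains are hypothetical.

```python
from typing import Sequence


def attractiveness(outcomes: Sequence[bool],
                   gamma_held: float,
                   gamma_other: float) -> float:
    """nu for one good, following equations 2.1 and 2.2.

    `outcomes` has one entry per time-step at which the agent held the good:
    True if it traded the good for its consumption good i, False otherwise.
    `gamma_held` is the expected gain of the good held, `gamma_other` that
    of the alternative good.
    """
    successes = sum(1 for ok in outcomes if ok)
    failures = len(outcomes) - successes  # assumption: every non-success is a failure
    return successes * gamma_held - failures * gamma_other


# Hypothetical history of an agent of type 1, with illustrative gains
# gamma_2 = 14 (production good) and gamma_3 = 9 (speculative good).
nu_production = attractiveness([True, False, False], 14.0, 9.0)
nu_speculative = attractiveness([True, True], 9.0, 14.0)
print(nu_production, nu_speculative)
```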

3.16

When agent j, of type i, faces the opportunity to speculate (hence to exchange (i+1) for (i+2)), one defines:

$$x^j_i = \nu^j_{i+1} - \nu^j_{i+2}$$

(3)

and:

$$P[s=0] = \frac{\exp(x^j_i)}{1 + \exp(x^j_i)}$$

(4)

is the probability for the agent to reject the exchange. Finally:

$$P[s=1] = 1 - P[s=0]$$

(5)

is the probability for the agent to accept the exchange.
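The choice rule of equations (3) to (5) is a logistic function of the difference between the two values of ν. The Python sketch below is one possible implementation, not Duffy's code: the function names are hypothetical, and the logistic is written in a numerically stable form so that very large differences do not cause an overflow.

```python
import math
import random


def prob_reject(nu_production: float, nu_speculative: float) -> float:
    """P[s=0] = exp(x) / (1 + exp(x)), with x = nu_{i+1} - nu_{i+2}."""
    x = nu_production - nu_speculative
    if x >= 0:
        # Equivalent form 1 / (1 + exp(-x)), safe for large positive x.
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def accepts_speculation(nu_production: float,
                        nu_speculative: float,
                        rng: random.Random) -> bool:
    """Draw the decision: accept with probability P[s=1] = 1 - P[s=0]."""
    return rng.random() >= prob_reject(nu_production, nu_speculative)


rng = random.Random(42)
print(prob_reject(0.0, 0.0), accepts_speculation(-4.0, 18.0, rng))
```

When both goods have the same value of ν, the agent accepts or rejects with equal probability; the more successful speculation has been in the past, the smaller the probability of rejecting it.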

3.17

With this decision process, the knowledge the agent uses to make a choice is limited to its past actions, with no knowledge of an optimal decision that could be taken. From the paper of Kiyotaki and Wright, one can choose the values of c1, c2, c3 so that it is in the interest of agents of type 1 to speculate in the long term. Since speculating cannot be their optimal decision in the short term, the use of randomness in the choice process is necessary so that the agent might try speculation and potentially get good feedback from this action. Indeed, the probability of speculating again will increase if the result of this action proves to be successful (i.e. if they are not stuck with good i+2 for too long a period, unable to exchange it for good i). This algorithm is referred to here as that of "rational agents".

3.18

Another simulation protocol in Duffy's paper is when some of the agents act in an automated way. Since the system was originally designed to make agents of type 1 speculate, only agents of type 2 and 3 can be what we call "automated agents", who follow the dominant strategy (which coincides with their short-term fundamental strategy):

agents of type 2 always accept to get good 1 in exchange for good 3,
agents of type 3 always refuse to get good 2 in exchange for good 1.

Reproducing the model

3.19

The main elements of the model were extremely clear in the paper, and very straightforward to reproduce^[5].

3.20

However, I found two small ambiguities in the description of Duffy's implementation. For both of them, the actual choice made by John Duffy is very logical and direct, but since the choice is not mentioned in the paper, and considering my wish to reproduce his system very precisely, I couldn't make the decision myself. The resolution was quick: I asked him via email, and he answered both questions the next day. As will be mentioned later, there was also an issue in the interpretation of observation indicators, and again I checked with him and he was extremely diligent in his answer. This communication was very important in my understanding of the whole work performed. I haven't asked him for his actual code for two reasons: I wanted to reproduce the results with my own platform (which is for me the most important point: to test the transmission of a model via natural language, as in journal papers) and I wasn't sure I would be able to read his code anyway.

3.21

The two questions I asked were:

it is not said in the paper whether agents can choose to exchange good i+2 for good i+1. Maybe agents who speculate would never go back to possessing the good they have produced. Indeed, Duffy tries to mimic human behaviour, and it is quite common to note that humans rarely want to undo what they have just done. I thus asked Duffy whether the process described in (3) and (4) is symmetrical, and the answer is yes: facing the opportunity to exchange good i+2 for i+1, the process is the same as in the other case, and agents calculate the value of

$$y^j_i = \nu^j_{i+2} - \nu^j_{i+1}$$ (3')

to use the probability:

$$P[s=0] = \frac{\exp(y^j_i)}{1 + \exp(y^j_i)}$$ (4')

Hence, for Duffy, agents have no memory of how they obtained the good.
the value of I^F (i+1) is described verbally in the paper as "the number of times the agent failed to trade i+1 for i", but it is not said what is counted as such a moment: is it "any time the agent could have traded, proposed, and was rejected" (meaning any moment when it meets another agent who possesses i but refuses to trade) or is it "any round that the agent starts with possessing i+1 and ends without possessing i"? This makes a substantial difference to the probabilities. The answer by Duffy is that agents use "the "larger set" interpretation: they count the number of periods in which they were holding i+1 but could not trade for i, regardless of which good they are matched with" (John Duffy, personal communication), and they compare this with the same result with i +2.
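The difference between the two readings of "failed to trade i+1 for i" can be illustrated with a small Python sketch. The record structure and the function names are hypothetical; the sketch only assumes that, for each period in which the agent held good i+1, one knows the good held by the matched partner and whether the agent ended up with good i.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Period:
    partner_good: int      # good held by the randomly matched partner
    got_consumption: bool  # True if the agent obtained good i this period


def failures_larger_set(periods: Sequence[Period]) -> int:
    """"Larger set" reading: any period that ends without good i is a failure."""
    return sum(1 for p in periods if not p.got_consumption)


def failures_narrow(periods: Sequence[Period], consumption_good: int) -> int:
    """Narrow reading: the partner held good i but the trade did not happen."""
    return sum(1 for p in periods
               if p.partner_good == consumption_good and not p.got_consumption)


# Hypothetical history of an agent of type 1 holding its production good 2.
history = [Period(1, True), Period(1, False), Period(3, False), Period(2, False)]
print(failures_larger_set(history), failures_narrow(history, consumption_good=1))
```

On this illustrative history the larger-set reading counts three failures and the narrow reading only one, which shows how strongly the choice of interpretation can affect the values of ν and hence the probabilities.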

3.22

I designed several systems matching the above model, only changing the rationality of agents. Since I had problems in reproducing Duffy's results, I introduced two new types of learning algorithms. Both logics correspond to the above proposals. Only one change is made at a time.

The first change is that I do not allow agents to exchange good i+2 for good i+1: once they have speculated they have to keep to their choice and see the result of that action. I call these agents "stable-rational agents". In this case, agents never exchange to get the good they produce.
The second change concerns the meaning of I^F_(i+1): I use the "narrow" definition, and call the resulting functions J^S_(i+1) and J^F_(i+1). Then J^F_(i+1) = 1 if and only if an agent had the opportunity to trade i+1 for i, but the other agent rejected the offer. The idea was that the agent could thus infer that, if it had possessed the other good, the other agent would have accepted the trade. I call these agents "var-rational agents".


Table 3: The names of the agents and the rationality used by each type of agent. Agents in italic letters are the ones that are used by John Duffy in his simulations

Name of the agent	Agents of type 1	Agents of type 2	Agents of type 3
Rational agents	Compare I(2) and I(3)	Compare I(3) and I(1)	Compare I(1) and I(2)
Stable rational agents	Never exchange 3 for 2	Never exchange 1 for 3	Never exchange 2 for 1
Var-rational agents	Compare J(2) and J(3)	Compare J(3) and J(1)	Compare J(1) and J(2)
Automated agents	Compare I(2) and I(3)	Always exchange 3 for 1	Never exchange 1 for 2

Simulations and results

4.1

In this section, I give my results, which differ significantly from Duffy's, and show which elements are missing for a correct alignment of the two systems.

Different simulations

4.2

Duffy's simulations are run for 10 games. Each game is a succession of time-steps, during which agents are paired randomly once and decide to exchange or not. The discount factor he chooses is 0.9: a game thus lasts 1/(1 - 0.9) = 10 time-steps on average, so that the 10 games offer, on average, 100 time-steps of possible exchanges. At the end of one game, agents lose the opportunity to trade the good they possess at that time and have to produce their production good again. There are 3 different simulations, each with a different rationality and distribution of agents, copied from his experiments:

simulations with 8 agents of each type, all behaving with the defined rationality;
simulations with 1/3 of agents of type 1, but less than 1/3 of type 2 and more than 1/3 of type 3. He runs his tests with 18 agents altogether, meaning 6, 4 and 8 (I didn't reproduce such simulations);
simulations with 8 agents of each type, but only the agents 1 being rational, and all other agents being "automated"; i.e. agents 2 always accept to speculate, agents 3 always reject.
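The population and the random pairing performed at each time-step can be sketched as follows in Python. This is a minimal illustration under stated assumptions, not the code used by Duffy or in my own platform: the function names are hypothetical, agents are identified by integers, and pairing is done by shuffling the list of agents and matching neighbours.

```python
import random
from typing import Dict, Iterable, List, Tuple


def build_population(n_per_type: int = 8) -> Dict[int, int]:
    """Map each agent id to its type (1, 2 or 3), with n_per_type agents per type."""
    return {agent_id: agent_id // n_per_type + 1
            for agent_id in range(3 * n_per_type)}


def random_pairs(agent_ids: Iterable[int],
                 rng: random.Random) -> List[Tuple[int, int]]:
    """Pair agents at random; each agent appears in at most one pair."""
    ids = list(agent_ids)
    rng.shuffle(ids)
    # With an odd number of agents, the last one stays unmatched.
    return [(ids[k], ids[k + 1]) for k in range(0, len(ids) - 1, 2)]


population = build_population()  # 8 agents of each type, 24 in total
pairs = random_pairs(population, random.Random(1))
print(len(population), len(pairs))
```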

4.3

Duffy uses two kinds of indicators to capture the dynamics of the system:

A global indicator: the average "frequency of speculation" for each type of agent over the first half and the second half of the simulation. If A_i is the number of times a speculation is possible and accepted by one of the agents of type i (be it accepted by the other or not), and R_i is the number of times a speculation is possible and rejected by one of the agents of type i (be it accepted by the other or not), then the frequency of speculation for type i is:

$$F_i = \frac{A_i}{\text{number of possibilities to speculate}} = \frac{A_i}{A_i + R_i}$$ (5)
A series of local indicators: the actions chosen over time by each agent i who faces the opportunity to trade Good i+1 for Good i+2, which is then represented by a series of -1 and 1, where -1 is for a rejection and 1 is for an acceptance. This enables him to make a qualitative comparison of diverse individual behaviours.
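Both kinds of indicators can be computed from a log of speculative opportunities, as in the Python sketch below. The record format (time-step, decision), the numbering of time-steps from 0 and the function names are assumptions made for the illustration; the frequency is pooled over all opportunities of a half, following the definition of F_i above.

```python
from typing import List, Optional, Sequence, Tuple

Record = Tuple[int, bool]  # (time_step, accepted) for one speculative opportunity


def frequency(decisions: Sequence[bool]) -> Optional[float]:
    """F = A / (A + R); None if there was no opportunity to speculate."""
    if not decisions:
        return None
    accepted = sum(decisions)
    rejected = len(decisions) - accepted
    return accepted / (accepted + rejected)


def half_frequencies(records: Sequence[Record], total_steps: int):
    """Frequency of speculation over the first and second halves of a run."""
    midpoint = total_steps / 2
    first = [acc for t, acc in records if t < midpoint]
    second = [acc for t, acc in records if t >= midpoint]
    return frequency(first), frequency(second)


def local_series(records: Sequence[Record]) -> List[int]:
    """Series of 1 (acceptance) and -1 (rejection), ordered by time-step."""
    ordered = sorted(records, key=lambda r: r[0])
    return [1 if acc else -1 for _, acc in ordered]


log = [(0, False), (1, False), (2, True), (5, True), (6, True), (8, False)]
print(half_frequencies(log, total_steps=10), local_series(log))
```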

4.4

I studied only two types of simulations: with 18 agents that are all rational, and with a mix of rational and automated agents. Since I didn't get the same results as Duffy, I designed two other types before getting in touch with him:

simulations in which agents are all "stable-rational" (once they start speculating with one unit of good, they don't exchange it until they can get their consumption good);
simulations in which agents are all "var-rational" (where the comparison of possessing good i+2 instead of i+1 is based on the narrow set described above).


Table 4: Simulations that were run. The series of simulations reproducing Duffy's are indicated in italic, and the ones I added are in normal format

	Rational agents	Stable rational agents	Var-rational agents
Homogenous rationality	SIM 1 Duffy: 5 simulations Me: Average and MSD over 100 simulations	SIM 2 Average and MSD over 100 simulations	SIM 3 Average and MSD over 100 simulations
Heterogenous rationality	Series of 5 simulations SIM 1 - 23 Agents of type 2 and 3 are automated -	Series of 100 simulations SIM 2 - 23 Agents of type 2 and 3 are automated -	Series of 100 simulations SIM 3 - 23 Agents of type 2 and 3 are automated -

Results and comparisons

4.5

The results of my simulations are quite different from John Duffy's, whatever type of rationality I give my agents. In the following tables, my results are given as the average value and the mean square deviation (MSD) over 100 simulations in a row^[6].
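
The aggregation over runs can be sketched as follows. This is an illustration under two assumptions. First, runs shorter than 40 time-steps are dropped, as stated in note 6. Second, "MSD" is read as a population standard deviation: the text does not define it, but values such as 0.27 or 0.3 in Table 6 exceed 0.25, the largest variance possible for a quantity bounded between 0 and 1, which suggests a root rather than a squared deviation. The function name and input format are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Tuple

MIN_STEPS = 40  # runs with fewer time-steps are discarded (note 6)


def summarise(runs: List[Tuple[int, float]]) -> Tuple[float, float, int]:
    """runs holds one (n_steps, speculation_rate) pair per simulation.

    Returns (average, MSD, number of runs kept). MSD is taken here to be
    the population standard deviation, which is an assumption.
    """
    kept = [rate for n_steps, rate in runs if n_steps >= MIN_STEPS]
    return mean(kept), pstdev(kept), len(kept)


# Made-up values, only to show the call.
average, msd, n_kept = summarise([(60, 0.5), (35, 0.9), (80, 0.7), (40, 0.6)])
```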

4.6

I first checked the simulation results step by step to make sure that no mistakes had been introduced in the setting or in the learning algorithm, but I couldn't detect any difference between what the system is supposed to do and what it actually does (this check is called the verification process). I have high confidence in the match between what is expected from the code and what it actually does, although this verification has only been carried out by hand, and not with any special programming tool, which means that I can in no way exclude the possibility of a mistake on my part. The results are shown in the following tables.


Table 5: Duffy's results for each of 5 artificial sessions, average for these 5 sessions. The values are the average speculation ratio over the first half and the second half of the simulations

	Agents type 1		Agents type 2		Agents type 3
	first half	second half	first half	second half	first half	second half
A1	0.06	0.15	0.73	1.00	0.37	0.07
A2	0.23	0.31	0.88	0.98	0.20	0.07
A3	0.33	0.50	0.78	0.98	0.15	0.00
A4	0.18	0.42	0.81	1.00	0.17	0.00
A5	0.10	0.18	0.67	0.98	0.23	0.07
Average on A1-A5	0.19	0.32	0.77	0.99	0.22	0.04


Table 6: My results for simulations of all types with homogenous agents that are either rational agents, var-rational agents or stable agents. Here I give the average speculation ratio over the first half and the second half of the simulations, as well as the MSD of this value for the set of simulations

		Agents type 1		Agents type 2		Agents type 3
		first half	second half	first half	second half	first half	second half
SIM 1 Rational agents	Average speculation rate	0.74	0.68	0.80	0.93	0.73	0.81
	MSD	0.03	0.10	0.08	0.09	0.01	0.11
SIM 2 var-rational agents	Average speculation rate	0.45	0.42	0.53	0.47	0.42	0.52
	MSD	0.19	0.27	0.14	0.27	0.3	0.24
SIM 3 Stable agents	Average speculation rate	0.68	0.77	0.76	0.79	0.66	0.66
	MSD	0.07	0.12	0.01	0.09	0.04	0.12

4.7

The results that I get in my simulations are all very interesting, but unfortunately cannot be considered similar to Duffy's. At first this was a bit of a problem, since I didn't know which precise algorithm to reproduce and none of the results could help me decide on the right learning process. The second type of simulations, with var-rational agents, has to be eliminated. Indeed, in that setting, none of the agents displays the right learning on average and, more importantly in terms of reproduction of results, the final behaviour varies widely from one simulation to the next. This could neither be interpreted as a "learning process" for the agents nor did it fit Duffy's results. Actually, agents of type 1 in that type of simulation sometimes increase speculation and sometimes decrease it over time, and this is why the MSD is so high compared to the average value.

4.8

However, in both of the other settings, agents of type 3 speculate much more than they would if I had succeeded in reproducing Duffy's model. Duffy's agents of type 3 never speculate, and mine always reach a level of speculation that is equivalent to that of agents of type 1 or even higher. None of the results can be considered good here, since SIM 1 shows on average a decrease in learning to speculate, and in SIM 3 agents of type 2 learn significantly less efficiently than the others.

4.9

However, as can be seen in Table 7, which shows the results of 90 simulations^[7], this result is really due to the interaction among all the learning agents: as soon as some of these agents are automated, agents of type 1 do learn how to behave in the most efficient way. One can note that Duffy's simulation results are much closer to his experimental data than my simulation results are (in raw values and in trends).


Table 7: Duffy's results and my results for simulations of all types with agents 2 and 3 being automated and agents 1 being either rational agents, var-rational agents or stable agents. I took the average and MSD over the remaining simulations (over 90) of the speculation rate for each half of the simulation and for each type of agent

		Agents type 1		Agents type 2		Agents type 3
		first half	second half	first half	second half	first half	second half
Duffy: Average on 5 sessions	Average speculation rate	0.62	0.73	1.00	1.00	0.00	0.00
SIM 1' Rational agents	Average speculation rate	0.91	1.00	1.00	1.00	0.00	0.00
	MSD	0.04	0.01	0.00	0.00	0.00	0.00
SIM 2' var-rational agents	Average speculation rate	0.80	1.00	1.00	1.00	0.00	0.00
	MSD	0.15	0.05	0.00	0.00	0.00	0.00
SIM 3' Stable agents	Average speculation rate	0.80	0.88	1.00	1.00	0.00	0.00
	MSD	0.00	0.05	0.00	0.00	0.00	0.00

4.10

Duffy shows that the individual behaviours of his agents are pretty similar to those of real humans, in the sense that each agent develops a very stable behaviour, be it to speculate or to refuse speculation at every opportunity. In my simulations, this type of behaviour tends to appear too, although I haven't been able to test it in a systematic way. Neither have I been able to compare it with the data obtained with real humans. A higher number of individual histories in Duffy's paper would have enabled me to make a more systematic quantitative comparison.

Discussion

5.1

My interest in this research was to criticize only one point of John Duffy's work. I argue that it is not possible to reproduce his simulation setting and results by just referring to his paper. This is not a criticism of his whole paper, for two reasons: (1) his paper is very interesting from a methodological point of view, since it is rare to find in the literature a description of this use of a simulation, although it seems to be quite common among experimentalists; (2) Duffy uses his learning algorithm to calibrate a new set of experiments with humans, and does not claim to have reached a good description of the rationality that would make individuals speculate. It is just sufficiently accurate to suggest a change in his experiments. The only artificial agents that are then used in the experiments with real humans are those of type 2 and 3, and hence the results he claims are not contradicted by my failed attempt to reach the same simulation results.

Experiments

5.2

The comparison between experimental results and the building of artificial societies is an exercise that is now spreading in economics, because in both fields researchers are keen to identify, model and test the actual behaviour of individuals faced with various economic choices. Experimental economics research is based on several steps (Smith 1994). First, the production of a setting where individuals face an institution that mirrors theoretical settings: sometimes it can be the production of a controlled market, sometimes the production of a game-like situation ("game" understood as in "game theory"). It is always a very archetypical setting in which the role, ability to act, and communication rules for each actor are very clear, quite limited and very easy to observe. Second, the organisation of experiments in which researchers isolate humans to make them play the defined game, and observe their behaviours according to the limited number of actions that can be performed. Third, the interpretation, in which the actual behaviours are compared to those that would be predicted by economic theory, and some conclusions are drawn about differences in rationality between real actors and the perfectly informed economic agent.

5.3

Vernon Smith (Smith 2002) explains that no experiment can actually destroy a theory, but that it can be used to ask new questions, and more importantly to identify the limits of a theory's applicability to a phenomenon by spotting situations that are unexplained. This approach is very close to what most researchers using artificial agent-based worlds state: simulations are not used to create an alternative theory but to try to represent situations that do or do not fit, and hence enrich the expressiveness of description of the science at stake. Concerns of modellers and experimental economists are thus similar but not the same as those of theoretical economists, since the former two areas are more concerned with applicability than building positive results.

5.4

At the moment, there are two main uses of the co-development of simulations with artificial agents and experiments with real humans: (1) to help improve experimental settings and test the choice of parameters, which would be too costly to do with experiments; (2) to check the coherence of the assumptions that can be made on the rationality of humans, by reproducing the results with artificial agents. Duffy (2001) takes the first approach, but has to produce hypotheses about learning to achieve his goal. His first experimental results stand as a benchmark to test his learning algorithm, so that he can check that it is sufficiently good to sustain his building of a new setting. He then uses his artificial agents to build two new experimental settings in which he expects to force the human agents into more speculative behaviour.

5.5

My reason for writing this paper is that I believe that, whatever the goal of simulating, anyone should be able to reproduce the results that are presented in a paper^[8]. With that in mind, in the following sub-section I give the various elements that were lacking in the description of the model and the validation process of the simulations.

Comments on the difference between my simulations and Duffy's

5.6

In the previous sections, I presented the results that Duffy gives to show that his algorithm is good enough to help him design new experiments, and my own results obtained through the replication of his model. The differences concern mainly agents of type 1 and 3, which display very contradictory behaviours in my system. There are two types of explanations for my inability to replicate his results: some are of a purely technical level, some are closer to what could be a theory of modelling. At a basic level the possible explanations are:

The algorithms I use are wrong. This is the first hypothesis: I could have made mistakes in reproducing the code. My only problem with this assumption is that I carefully followed the paper that Duffy published and, when unsure, asked Duffy himself for his actual choices. Moreover, while waiting for his (quick) answer, I designed some parallel simulations "in case". After checking and counter-checking the code, and following the simulations step by step, I have not eliminated the idea of a mistake on my side, but if this were true, I would tend to be worried about the transmission of models via papers, and would strongly advocate a more general sharing of code^[9].
The system is affected by random generators. In his paper, Duffy specifies clearly that he tried several random generators to make sure that none of his results was affected, and concludes that he is only observing structural results. In my work, I used a random generator that was built by a student in my team for his master's degree in computer science, and that had been shown to follow a uniform law well. Random generators appear twice in the system: they are used to pair agents as well as to define their choice algorithms. However, I have neither enough knowledge nor enough data from Duffy's work to be able to check the importance of this element in causing differences between our results.
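
One way to probe the second explanation is to give each of the two uses its own seeded generator, so that the pairing stream and the choice stream can be varied independently. The sketch below is a toy under stated assumptions: it uses Python's standard generator rather than the one used in my simulations, and the choice rule (each side accepts with probability 0.5) is a placeholder, not the learning algorithm.

```python
import random
from typing import List, Tuple


def random_pairs(agent_ids, rng: random.Random) -> List[Tuple[int, int]]:
    """Shuffle the agents and pair them off two by two."""
    ids = list(agent_ids)
    rng.shuffle(ids)
    return [(ids[k], ids[k + 1]) for k in range(0, len(ids) - 1, 2)]


def run_once(pair_seed: int, choice_seed: int,
             n_agents: int = 18, n_steps: int = 50) -> float:
    """Toy run returning the share of meetings in which both sides accept."""
    pair_rng = random.Random(pair_seed)      # stream used only for pairing
    choice_rng = random.Random(choice_seed)  # stream used only for choices
    accepted = meetings = 0
    for _ in range(n_steps):
        for _a, _b in random_pairs(range(n_agents), pair_rng):
            meetings += 1
            # Placeholder rule (assumption): each side accepts half the time.
            a_ok = choice_rng.random() < 0.5
            b_ok = choice_rng.random() < 0.5
            if a_ok and b_ok:
                accepted += 1
    return accepted / meetings


# Same seeds give the same run; varying one seed at a time isolates
# the influence of each random stream.
baseline = run_once(pair_seed=1, choice_seed=2)
other_pairing = run_once(pair_seed=3, choice_seed=2)
```

Repeating such runs over many seeds, and over several generators, would show how much of the spread in results comes from the random streams alone.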

5.7

From the point of view of a theory of model design, one can think of two elements that make it difficult to assess how good the results of a learning algorithm are:

The issue of size: Duffy builds short simulations with a small number of agents, so as to have a common scale when he compares simulations with human experiments^[10]. Although I know this was a positive choice in his research, I still think that the properties of the system make it impossible to support solid conclusions based on so few replications and repetitions^[11]. It is known by social modellers, although not so often published, that systems where learning is based on agent interactions are "path-dependent", depending on their history (Rouchier et al. 2001; Kaniovski et al. 2000). Here we are dealing with a very quick reinforcing process that is studied in a small society for a short time. One can hence assume that the results depend strongly on the initial interactions, and therefore that the global results can vary considerably, making it difficult to draw conclusions. What is clear in Duffy's paper is that individual results are qualitatively similar to the ones in experiments. However, the variability of results, in the experiments that were quoted by Duffy as well as in the simulations, and the differences between global results (since agents, even though they are regular in their behaviour, may choose the worst behaviour) are good signs that the learning process is not necessarily well represented. All one can say is that a quick reinforcement process has been implemented in the artificial agents, but there is no way to prove it is the one that humans use.
The issue of time: One fact is a bit frustrating when it comes to these considerations. The number of exchanges in the simulations is very low (no more than 1 out of 4 meetings on average), as is the consumption rate (the maximum I ever got was 1.5 goods consumed on average at each time-step). This means that there are really few actions taking place in the system. This has two effects:
1. First, I think that this kind of information has to be displayed in a paper, since it is a very important characteristic of the artificial society. It could help in the alignment process, and hence in judging whether two societies are qualitatively equivalent.
2. The speculation rate does not have the same economic meaning in a society with very few exchanges as in one which is very active. Agents have little chance to get rid of a good they don't use, little chance to get a good they want in any case, and more generally little chance to exchange. For humans, it seems logical to imagine that the feeling of risk is higher (or at least different) regarding speculation when they observe little activity on a market. For artificial agents, the fact that the learning is based mainly on negative feedback from the environment has a big influence on the way they accumulate knowledge. Here, it certainly tends to reduce the incentive to exchange, but there is no way we can evaluate the simple learning algorithm on this basis. What the Kiyotaki-Wright model uses is the probability of meeting an agent that could exchange. Looking at the simulation results, the speculative ratio should certainly be analysed in relation to the exchange ratio if one wants to see the society in the same way. This element should make the analysis more transparent, especially for those who would like to produce a relatively good learning algorithm to mimic the learning process at stake.
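
The activity measures mentioned above could be reported next to the speculation rate with very little code. The following is a minimal sketch, assuming a per-time-step log whose fields (`meetings`, `exchanges`, `consumed`, `spec_accepted`, `spec_rejected`) are hypothetical names chosen for the example; the numbers are made up.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class StepLog:
    meetings: int       # pairs formed at this time-step
    exchanges: int      # meetings that ended in a trade
    consumed: int       # goods consumed at this time-step
    spec_accepted: int  # speculation opportunities accepted
    spec_rejected: int  # speculation opportunities rejected


def activity_summary(logs: List[StepLog]) -> Dict[str, float]:
    """Exchange rate, consumption per step and speculation rate for one run."""
    nan = float("nan")
    meetings = sum(s.meetings for s in logs)
    exchanges = sum(s.exchanges for s in logs)
    consumed = sum(s.consumed for s in logs)
    accepted = sum(s.spec_accepted for s in logs)
    opportunities = accepted + sum(s.spec_rejected for s in logs)
    return {
        "exchange_rate": exchanges / meetings if meetings else nan,
        "consumption_per_step": consumed / len(logs) if logs else nan,
        "speculation_rate": accepted / opportunities if opportunities else nan,
    }


summary = activity_summary([StepLog(9, 2, 1, 1, 1), StepLog(9, 3, 2, 2, 0)])
```

Reporting the three figures together lets a reader see at a glance whether a given speculation rate comes from an active society or from a nearly frozen one.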

5.8

Concerning the information transmitted to readers to facilitate an assessment of the learning process, and allow the accumulation of results:

The information extracted from the simulations and presented in the paper mainly stays at a macro-level. There is no quantitative micro analysis that could sustain the comparison between two artificial systems; this would be helpful to those who wish to replicate a model.
Moreover, when Duffy does his assessment of the learning process, he never compares the step-by-step micro learning. What I mean by that is: at each time-step, an individual has acquired some information and we consider that he uses this data to choose his behaviour; an artificial agent that is to be compared with this human accumulates data over time, due to the exchanges it performs; at no moment are both sets of data identical. In this case, it is not possible to know if the process is similar, or whether, facing the same type of data, agents would never make the same decision. If one could possess data about all behaviours of all agents for one simulation, it could be interesting to have each artificial agent share the "memory" of a human in experiments, and calculate what it would do in the context. Then, one might discover interesting indicators to help evaluate the learning algorithm as a result of seeing a similarity in behaviours.
It could also be interesting to perform experiments where the rationality algorithms could be evaluated at the same time as the behaviours are captured. For example, one could imagine an experiment where humans would be given an accompanying agent that is given a rational learning process. At each time-step, the agent would propose an action, and the human could choose to follow it or to do the opposite action (only two actions are possible in the system: to propose to exchange or not to propose). This evaluating process is more linked to an Artificial Intelligence approach than to a Distributed AI approach, but should help design a better algorithm for the agents, which is one of the issues at stake for most researchers (Marimon et al. 1990).
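
The "shared memory" comparison suggested above can be sketched as a replay: at each meeting the candidate rule sees only what the human had seen so far, and one counts how often it picks the same action. Everything here is hypothetical: the simplified record format (own good, partner's good, action), the function names and the two toy rules stand in for the real memory and learning algorithm.

```python
from typing import Callable, List, Tuple

# (own_good, partner_good, human_action) for one meeting, in time order.
# human_action is 1 if the human proposed the exchange, 0 otherwise.
Meeting = Tuple[int, int, int]
Rule = Callable[[List[Meeting], int, int], int]


def agreement_rate(history: List[Meeting], rule: Rule) -> float:
    """Share of meetings where the rule, fed the human's past, acts alike."""
    if not history:
        return float("nan")
    matches = 0
    for t, (own, other, human_action) in enumerate(history):
        memory = history[:t]  # only what the human had experienced so far
        if rule(memory, own, other) == human_action:
            matches += 1
    return matches / len(history)


def always_propose(memory: List[Meeting], own: int, other: int) -> int:
    """Toy rule: always propose the exchange."""
    return 1


def repeat_last(memory: List[Meeting], own: int, other: int) -> int:
    """Toy rule: repeat the last action taken in the same situation."""
    for past_own, past_other, action in reversed(memory):
        if (past_own, past_other) == (own, other):
            return action
    return 1  # no precedent: propose by default


human = [(2, 3, 1), (2, 3, 0), (2, 1, 1), (2, 3, 0)]
rate_always = agreement_rate(human, always_propose)
rate_repeat = agreement_rate(human, repeat_last)
```

On this made-up history, the second rule agrees with the human more often than the first; the same measure computed on real experimental histories would give a micro-level indicator for comparing candidate learning algorithms.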

Conclusion

5.9

In this paper, I describe the work I have done re-implementing the Kiyotaki-Wright (1989) model of a speculative market that had been adapted to a distributed learning approach by John Duffy in the form of multi-agent simulations. My interest in this research was methodological, and I have highlighted some of the elements missing from Duffy's paper that would have enabled me to reproduce the system. Indeed, my attempt at aligning my model with his was totally unsuccessful.

5.10

First I thought that the differences were due to an error in implementing the algorithm or building the program. After some serious study I concluded that this hypothesis could be dismissed, and that made me ask some questions concerning the generality of the cognitive processes that were built into the model, in particular that they might not be as stable as expected. It also made me wonder about the completeness of the observation protocol and its ability to actually compare local cognitive processes.

5.11

Duffy's aim is not to evaluate his learning algorithm and declare it a valid representation of human learning. Indeed, he only needs his algorithm to organise better experiments. In this sense, the lack of methodology in the comparison could be acceptable. However, ultimately, experimental economists themselves want to express models of cognition that fit the experimental results as well as possible. Considering that ultimate aim, it seems interesting to propose protocols that would reveal the differences, at each moment, between the action humans choose and the one artificial rational agents perform. My aim in this paper is to advocate the establishment of a common description framework and common validation protocols between the experimental economics community performing simulations and the multi-agent community (also called Agent-Based Computational Economics, ACE). For communication, the observation of behaviours has to be made at a macro level (aggregated results) but should also be made at a micro level, so as to better capture the actual learning processes of humans in economic contexts.

Acknowledgements

I wish to thank John Duffy for his help while writing the paper, all participants of the M2M workshop for their remarks, as well as Bill McKelvey for helpful comments.

Notes

¹ And maybe to understand better the emergence of money. In this paper we will not discuss the economic interpretations that can be made of the model, but only the possibility of replication.

² Meaning that they evolve in a discrete-time environment, where they are randomly paired at each time-step and judge at that time if they want to perform an exchange with the other agent.

³ In an economic paper, it is not necessary to say that agents want to increase their utility. In his paper, Duffy explains how his experiments are organised to induce humans to get as many "utility points" as possible, and that the artificial agents are built with that innate will.

⁴ As Duffy would say: "An agent speculates when he accepts a good in trade that is more costly to store than the good he is currently storing with the expectation that this more costly-to-store good will enable him to more quickly trade for the good he desires to consume." (page 3)

⁵ Just as an indication of how I built rationality: the memory of an agent is made up of the collection of its interactions over time, represented, for an agent A meeting an agent B, as: [good possessed by A; good possessed by B; proposition of exchange by A; proposition of exchange by B; time-step], with (proposition of exchange) = 1 if the agent does propose the exchange and (proposition of exchange) = 0 if the agent does not want the exchange.

⁶ I ran 100 simulations, among which some results were not suitable because the time of simulation was too short and agents did not have the opportunity to speculate: I considered that simulations with fewer than 40 time-steps were not long enough to study our topic. I then calculated the average and MSD over the remaining simulations (over 90) of the speculation rate for each half of the simulation and for each type of agent.

⁷ About 10% of the simulations gave results that were not suitable because the time of simulation was too short and agents did not have the opportunity to speculate.

⁸ This is what is stated in the requirements for publishing in JASSS <https://www.jasss.org/admin/submit.html>.

⁹ However, I don't want to get into a discussion on platforms, since advocating the sharing of code could suggest that I favour one platform over the others: this is not the case.

¹⁰ Experiments are short because humans cannot play that kind of game for a very long time, and involve few people because each participant is paid and research is poor.

¹¹ I don't condemn John Duffy's choice, since he had to build both experiments and simulations on that topic. His idea is really wise, considering that he had to find a straightforward way to build comparable societies on both sides. The idea that the combinatorial dimension of the problem could be at stake, as well as all the other ideas, only appeared to me thanks to his extremely clear work.

References

BASCI E. (1999) Learning by imitation, Journal of Economic Dynamics and Control, 23, pp 1569-1585.

BROWN P.M. (1996) Experimental evidence on money as a medium of exchange, Journal of Economic Dynamics and Control, 20, pp 583-600.

CONTE R and Gilbert N (1995) "Introduction: Computer Simulation for Social Theory", In Conte R and Gilbert N , Artificial Societies. The Computer Simulation of Social Life, UCL Press, London.

DORAN J (2001) Intervening to Achieve Co-operative Ecosystem Management: Towards an Agent Based Model. Journal of Artificial Societies and Social Simulation vol. 4, no. 2.

DUFFY J (2001) Learning to Speculate: Experiments with Artificial and Real Agents, Journal of Economic Dynamics and Control, 25, pp 295-319.

DUFFY J and Ochs J (1999) Emergence of money as a medium of exchange: An experimental study, American Economic Review, 89, pp 847-877.

EDMONDS B (2001) The Use of Models - making MABS actually work. In Moss S and Davidsson P (eds.), Multi Agent Based Simulation, Lecture Notes in Artificial Intelligence, 1979, pp 15-32.

EDMONDS B and Hales D (2003, this issue) Replication, Replication and Replication - Some Hard Lessons from Model Alignment. Journal of Artificial Societies and Social Simulation.

KANIOVSKI Y M, Kryazhimskii A V and Young P (2000) Adaptive Dynamics in Games Played by Heterogeneous Populations, Games and Economic Behavior, 31, pp 50-96

KIYOTAKI N and Wright R (1989) On money as a medium of exchange, Journal of Political Economy, 97, pp 924-954.

MARIMON R, McGrattan E R and Sargent T J (1990) Money as a medium of exchange in an economy with artificially intelligent agents, Journal of Economic Dynamics and Control, 14, 329-373.

ROUCHIER J, Bousquet F., Requier-Desjardins M. and Antona M. (2001) A Multi-Agent Model for Describing Transhumance in North Cameroon: Comparison of Different Rationality to Develop a Routine, Journal of Economic Dynamics and Control, 25, pp 527-559.

STAUDINGER S (1998) Money as a medium of exchange: An analysis with genetic algorithms, working paper, Technical University, Vienna.

SMITH V. (1994) Economics in the Laboratory, Journal of Economic Perspectives, Vol. 8, No. 1, Winter 1994, 113-131.

SMITH V (2002) Method in Experiment: Rhetoric and Reality, Experimental Economics, 5, pp 91-110.
