©Copyright JASSS


Matteo Richiardi, Roberto Leombruni, Nicole Saam and Michele Sonnessa (2006)

A Common Protocol for Agent-Based Social Simulation

Journal of Artificial Societies and Social Simulation vol. 9, no. 1


Received: 12-Dec-2005    Accepted: 15-Dec-2005    Published: 31-Jan-2006


* Abstract

Traditional (i.e. analytical) modelling practices in the social sciences rely on a very well established, although implicit, methodological protocol, both with respect to the way models are presented and to the kinds of analysis that are performed. Unfortunately, computer-simulated models often lack such a reference to an accepted methodological standard. This is one of the main reasons for the scepticism among mainstream social scientists that results in low acceptance of papers with agent-based methodology in the top journals. We identify some methodological pitfalls that, in our view, are common in papers employing agent-based simulations, and propose appropriate solutions. We discuss each issue with reference to a general characterization of dynamic micro models, which encompasses both analytical and simulation models. Along the way, we also clarify some confusing terminology. We then propose a three-stage process that could lead to the establishment of methodological standards in social and economic simulations.

Keywords: Agent-Based, Simulations, Methodology, Calibration, Validation, Sensitivity Analysis

* Introduction

Our starting point is rather disappointing evidence: despite the upsurge in agent-based research witnessed in the past 15 years (see the reviews by Tesfatsion 2001a, 2001b, 2001c and Wan et al. 2002) and despite all the expectations they have raised, agent-based simulations have not yet succeeded in finding a place in the standard social scientist's toolbox.

Many people involved in agent-based research[1] thought they should have. It is now increasingly recognised that many systems are characterized by the fact that their aggregate properties cannot be deduced simply by looking at how each component behaves, the interaction structure itself playing a crucial role. On the one hand, the traditional approach of simplifying everything may often "throw the baby out with the bath water". On the other hand, once we try to specify a more detailed interaction structure or a more realistic individual behaviour, the system easily becomes analytically intractable, or simply very difficult to manipulate algebraically. By contrast, agent-based modelling (ABM) allows a flexible design of how the individual entities behave and interact, since the results are computed and need not be solved analytically. This certainly comes at a cost (see below), but it may be the only way to proceed with certain research questions.

However, the crude numbers tell a rather different story: for instance, among the top 20 economic journals we were able to find only 7 articles based on ABM[2], among the 26,698 articles that were published since the seminal work conducted at the Santa Fe Institute (Anderson et al. 1988).[3] Looking further back in time, we can add only 2 more papers[4], plus a stream of seminal methodological works, mainly published in the American Economic Review in the year 1960, which, interestingly enough, seem to have been almost totally forgotten[5]. If we think of agent-based models that attracted the interest of a wider audience, the list shrinks to Schelling's segregation models, where the simulation is worked out on a sheet of paper, and to the El Farol bar problem by Arthur, which led to a whole stream of literature on minority games. Overall, we should then conclude that agent-based modelling accounts for less than 0.03% of the top economic research. It seems to be confined to specialized journals like the Journal of Economic Dynamics and Control[6], ranking 23rd, the Journal of Artificial Societies and Social Simulation, and Computational Economics, both of which are not ranked. A notable exception is the Journal of Economic Behavior and Organization, ranked 32nd, which sometimes publishes research in ABM.

Among the top 10 sociological journals we were able to find only 11 articles based on ABM.[7] They have been published in four journals: The American Sociological Review (ranking 1st; 4 articles), the American Journal of Sociology (ranking 2nd; 5 articles), the Annual Review of Sociology (ranking 4th, 1 article) and Sociological Methodology (ranking 10th, 1 article).

Agent-based models have solid methodological foundations[8]. However, the greater freedom they have granted to researchers (in terms of model design) has often degenerated into a sort of anarchy (in terms of design, analysis and presentation). For instance, there is no clear classification of the different ways in which agents can exchange and communicate: every model proposes its own interaction structure. Also, there is no standard way to treat the artificial data stemming from the simulation runs in order to provide a description of the dynamics of the system, and many articles seem to ignore the basics of experimental design. Often, the comparison between artificial and real data is overly naive, and the parameters' values are chosen without proper discussion. Finally, too often it is not possible to understand the details of the implementation of an agent-based simulation. This makes replication a difficult, sometimes impossible task, thus violating the basic principle of scientific practice and confining the knowledge generated by agent-based simulations to no more than anecdotal evidence.

This has to be contrasted with traditional analytical modelling, which relies on a very well established, although implicit, methodological protocol, both with respect to the way models are presented and to the kinds of analysis that are performed.

Think for example about the organization of most papers. There is generally a detailed reference to the literature; the model often adopts an existing framework and extends, or departs from, well-known models only in limited respects. This allows a concise description, and saves more space for the results, which are finally confronted with the empirical data. When estimation is involved, measures of the validity and reliability of the estimates are always presented, in a very standardized way.

Of course one reason for the lack of a standard protocol for agent-based research is the relatively young age of the methodology. Leave it on its own, one could say, and a best practice will spontaneously emerge. However, some discussion on the desirability of such a standard and on its characteristics may help. The example of the Cowles Commission suggests that this is indeed a promising direction. The Commission was founded in 1932 by the businessman and economist Alfred Cowles in Colorado Springs, moved first to Chicago in 1939 and finally to Yale in 1955, where it became established as the Cowles Foundation. As its motto ("Science is Measurement") indicates, the Cowles Commission was dedicated to the pursuit of linking economic theory to mathematics and statistics. Its main contributions to economics lie in its "creation" and consolidation of two important fields - general equilibrium theory and econometrics. The Commission focused its attention on some particular problems, namely the estimation of large, simultaneous equation models, with a strong concern for identification and hypothesis testing. Its prestige and influence set the priorities for theoretical developments elsewhere too, and its recommendations are generally followed today in economics (Klevorick 1983).

The objective of this paper is obviously less ambitious. We simply identify the need for a common protocol for agent-based simulations. We discuss some methodological pitfalls that are common in papers employing agent-based simulations, distinguishing between four different issues: link with the literature (section 2), structure of the models (section 3), analysis (section 4) and replicability (section 5). We then propose a three-stage process that could lead to the establishment of methodological standards in social and economic simulation (section 6).

* Links with the literature

As we have seen, the advantage of agent-based simulations over more traditional approaches lies in the flexibility they allow in model specification. Of course more freedom means more heterogeneity. While analytical models generally build on the work of their predecessors, agent-based simulations often depart radically from the existing literature. This is a problem in two respects. First, more space is needed to explain the model structure: since the overall length of a published paper in social science journals cannot generally exceed 25 to 30 pages, this implies that less space is available for discussing the results. Considering that the description of the model dynamics and the estimation procedure also requires more space than in traditional analytical models (see Leombruni and Richiardi 2005), this results in papers that are often either too dense or too long.

The second problem is that in departing from the existing literature, the model results become more difficult to assess.

Our position is simple: each article should include references to the theoretical background of the social or economic phenomenon that is investigated. A new model should always refer to the models, if any, with respect to which it is innovating. This holds for incremental and (even more) for radical innovations. All variations should be motivated, either in isolation or jointly. Moreover, since birthrights matter, reference should be made not only to previous agent-based models, if any, but also to the relevant non-simulation literature. After all, the mainstream is not computational, and we have to talk with the mainstream.

* Structure of the model

There are some basic features that characterize a simulation model. Some are technical: above all, the treatment of time (discrete or continuous[9]) and the treatment of fate (stochastic or deterministic), the representation of space (topology), the population evolution (birth and death processes). Some are less technical: the treatment of heterogeneity (which variables differ across individuals and how), the interaction structure (localized or non-localized), the coordination structure (centralized, decentralized[10]), the type of individual behaviour (optimising, satisficing, etc.).

Too often the reader of a paper using agent-based simulations has to work all these properties out alone. By contrast, in more traditional papers models are often immediately classified as based on "overlapping generations of intertemporally optimising individuals", "2-person Bayesian game with asymmetric information" … We believe that having all the main features of a simulation model clearly and immediately stated would greatly increase the understanding of simulation-based models, and facilitate the comparison of alternative specifications.

* Analysis

Once a model has been specified, the issue of analysing its behaviour arises. In this regard, simulation models differ in a radical sense from traditional analytical models. Simulations suffer from the problem of stating general propositions about the dynamics of the model starting only from point observations.[11] The point is that, although simulations do consist of a well-defined set of functions that unambiguously define the macro dynamics of the system, they do not offer a compact set of equations - together with their inevitable algebraic solution (Leombruni and Richiardi 2005).

Think of the following general characterization of dynamic micro models. Assume that at each time $t$ an individual $i$, $i \in \{1, \dots, n\}$, is well described by a state variable $x_{i,t} \in \mathbb{R}^k$, and let the evolution of her state variable be specified by the difference equation:

$$x_{i,t+1} = f_i(x_{i,t}, x_{-i,t}, \alpha_i) \qquad (1)$$

where $x_{-i}$ is the state of all individuals other than $i$ and the $\alpha_i$ are some structural parameters.


Now, an important decision has to be made concerning the objective itself of the analysis. Generally, we are interested in some statistics Y defined over the entire population[12]:

$$Y_t = s(x_{1,t}, \dots, x_{n,t}) \qquad (2)$$

Of course, there may be (possibly infinitely) many aggregate statistics to look at. Traditional analytical models are generally constrained in their choice of which statistics to look at by analytical tractability. Agent-based simulations are not. Thus, as a general rule full exploration should be performed. Full exploration means that the behaviour of all meaningful individual and aggregate variables is explored, with reference to the results currently available in the literature. For instance, in a model of labour participation, if firm production is defined, aggregate production (business cycles, etc.) should also be investigated. However, in many cases full exploration is not particularly meaningful. This may happen when some parts of the model (e.g. the demand side for firms' output in a model of labour participation) are only sketched. The model is then investigated only with respect to a subset of all defined variables. When such a partial exploration is performed, this should be clearly stated, and the motivations explained.

Regardless of the specification for $f_i$, we can always solve equation (2) by iteratively substituting each term $x_{i,t}$ using (1):

$$Y_t = g_t(x_{1,0}, \dots, x_{n,0}; \alpha_1, \dots, \alpha_n) \qquad (3)$$

The law of motion (3) uniquely relates the value of $Y$ at any time $t$ to the initial conditions of the system and to the values of the parameters[13]. Traditional models generally assume very simple functional forms for $f_i$, in order to have analytically tractable expressions for $g_t$. This function, which is also known as an input-output transformation function, can then be investigated by computing derivatives, etc., and its parameters estimated on real data. On the other hand, in agent-based simulations $g_t$ easily grows enormous, hindering any attempt at algebraic manipulation. In order to reconstruct it and explain the behaviour of the simulation model we must then rely on the analysis of the artificial data coming out of many different simulation runs, with different values of the parameters.
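As a minimal sketch of how equations (1) to (3) translate into a simulation loop, the following Python fragment iterates a law of motion for each agent and records the aggregate statistic at every period. The functional form of $f_i$ (partial adjustment toward the mean state of the other agents), the choice of $s$ (cross-sectional dispersion) and all names are assumptions made for illustration; they are not taken from any particular model.

```python
import random
from statistics import mean, pstdev


def step(states, alphas):
    """Apply equation (1) once to every agent, using period-t states only."""
    new_states = []
    for i, x_i in enumerate(states):
        others = states[:i] + states[i + 1:]   # x_{-i,t}
        # Hypothetical f_i: partial adjustment toward the mean state of the others.
        new_states.append(x_i + alphas[i] * (mean(others) - x_i))
    return new_states


def aggregate(states):
    """Equation (2), with s(.) assumed to be the cross-sectional dispersion."""
    return pstdev(states)


def simulate(initial_states, alphas, horizon):
    """Iterate (1) and record Y_t: a numerical evaluation of g_t in (3)."""
    states = list(initial_states)
    path = [aggregate(states)]
    for _ in range(horizon):
        states = step(states, alphas)
        path.append(aggregate(states))
    return path


rng = random.Random(0)
x0 = [rng.uniform(0.0, 10.0) for _ in range(20)]   # initial conditions
alphas = [0.3] * 20                                # structural parameters
Y = simulate(x0, alphas, horizon=50)
print(f"Y_0 = {Y[0]:.3f}, Y_50 = {Y[-1]:.2e}")
```

Each call to `simulate` yields one point observation of $g_t$ for a given set of initial conditions and parameters, which is why many runs are needed to say anything general about its shape.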


Before turning to the data another decision has to be made, and clearly stated: whether the analysis of the model is performed in equilibrium, out-of-equilibrium, or both. In this regard, a clarification on the notion itself of equilibrium is also needed. Since in every micro-model (no matter whether simulated or analytically solved) both the individual and the aggregate scale are defined, two broad definitions can in fact be used. One is a definition of equilibrium at a micro-level, as a state where individual strategies are constant.[14] The other is a definition of equilibrium at a macro-level, as a state where some relevant (aggregate) statistics of the system are stationary.

Note that we can have equilibrium at the micro-level but disequilibrium at the macro-level (think for instance of population growth in developing countries, or of periods of financial instability), or the opposite (e.g. stable evolutionary models).

Contrary to traditional microeconomic models, sociological theories and agent-based simulations generally refer to the second definition. In ABM individual behaviour is generally less sophisticated, and expectations are sometimes not even defined. Thus, the invariance of some aggregate measure is preferred as a definition of equilibrium.

Both cases can be expressed as a convergence of (3) to a function not dependent on t[15]:

$$Y_e = \lim_{t \to \infty} Y_t \equiv g(x_{1,0}, \dots, x_{n,0}; \alpha_1, \dots, \alpha_n) \qquad (4)$$

Traditional analytical models often impose equilibrium conditions from the outset, assuming that they are always met. Equation (4) is then valid right from the start: the system jumps to the equilibrium. This leads to a backward logical situation, since we need to assume the answer to the problem (which equilibrium the economy will reach) in order to analyse the problem itself (what path will the economy follow from its initial endowment to equilibrium). On the other hand, in social and economic agent-based simulations, as in much of evolutionary economics, the focus of the interest is on whether an equilibrium will eventually emerge, i.e. be selected by the dynamics of the system.

These different definitions and methods of analysis may confuse the non-practitioner. Great attention should then be paid to clearly defining which equilibrium concept has been used, and the strategy adopted to identify the equilibria (e.g. evolutionary selection).


The function g expresses the behaviour of the model with respect to the variable Y we are interested in. As we have seen, in an agent-based simulation it remains unknown. However, some intuition on its shape can be gained by running many simulations with different parameters, and analysing their relationship with the outcome of interest. There are two scales on which such an exercise can be done: a global level and a local level. In a global investigation, we are interested in how the model behaves in broad regions of the parameters' space, i.e. for general values of the initial conditions and the parameters. This is generally the case when the model is built with a theoretical perspective: the relationship between inputs and outputs has to be understood per se, without reference to the real data. On the other hand, in a local investigation we are interested in the model only in restricted regions of the parameters' space. This is generally the case when the model is built with an empirical goal: we want to replicate some empirical phenomenon of interest and thus we want to explore the dynamics of our model only around the estimated values of the parameters.

A global investigation is generally done by letting all parameters and initial conditions vary (in a random or systematic way), and then imposing a metamodel

$$Y_t = \hat{g}_t(x_{1,0}, \dots, x_{n,0}; \alpha_1, \dots, \alpha_n; \beta) \qquad (5)$$

on the artificial data, where β are some coefficients to be estimated in the artificial data. Note that this is nothing else than a sensitivity analysis on all the parameters together.

Of course, the final choice of a particular specification for the metamodel remains to a certain extent arbitrary. However, there are methodologies that help when solving this (meta)model selection problem (see Hendry and Krolzig 2001). Moreover, as long as two different specifications provide the same description of the dynamics of the model in the relevant range of the parameters and the exogenous variables, we should not bother too much about which one is closest to the 'true' form gt.[16]
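A global investigation of this kind can be sketched as follows: parameters and initial conditions are drawn at random, the model is run once per draw, and a metamodel is fitted to the resulting artificial data. The sketch assumes a linear metamodel estimated by ordinary least squares; `run_model` is a hypothetical stand-in for an agent-based simulation, and the ranges of the inputs are arbitrary.

```python
import random


def run_model(alpha, x0, seed, horizon=200):
    """Hypothetical stand-in for a simulation: returns the final-period Y."""
    rng = random.Random(seed)
    y = x0
    for _ in range(horizon):
        # Noisy partial adjustment toward a level that depends on alpha.
        y += 0.1 * (2.0 * alpha - y) + rng.gauss(0.0, 0.05)
    return y


def ols(rows, targets):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    aug = [xtx[i] + [xty[i]] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


# Sample the whole input space at random and collect the artificial data.
rng = random.Random(42)
design, outcomes = [], []
for run in range(200):
    alpha = rng.uniform(0.0, 1.0)    # structural parameter
    x0 = rng.uniform(0.0, 10.0)      # initial condition
    design.append([1.0, alpha, x0])  # regressors: constant, alpha, x0
    outcomes.append(run_model(alpha, x0, seed=run))

beta = ols(design, outcomes)
print("metamodel coefficients (const, alpha, x0):", beta)
```

The estimated coefficients summarise how the output responds to each input over the sampled region; a linear form is only one possible specification, and its adequacy should itself be checked.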

A local investigation around given values of the parameters can also be done by keeping all the parameters constant but one, which is varied. A graphical (bivariate) description of the dependency of $Y_t$ on that parameter is often reported, without resorting to a metamodel (see the section on sensitivity analysis below). The crucial point for a local investigation is of course the choice of the values of the parameters. An obvious option is to choose the values for which the behaviour of the simulated system is as close to the behaviour of the real system as possible, i.e. their estimates in the real data.
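The one-parameter-at-a-time procedure can be sketched as below. The model, the parameter names and the baseline values are hypothetical; the same seeds are reused for every value of the varied parameter (common random numbers, a design choice of this sketch), so that differences in the output reflect the parameter rather than simulation noise.

```python
import random
from statistics import mean


def run_model(params, seed):
    """Hypothetical simulation returning one value of Y for the given parameters."""
    rng = random.Random(seed)
    return params["alpha"] * params["beta"] + rng.gauss(0.0, 0.01)


def one_at_a_time(baseline, name, values, replications=20):
    """Vary a single parameter, holding all others at their baseline values."""
    profile = []
    for value in values:
        params = dict(baseline, **{name: value})
        # The same seeds are reused for every value (common random numbers).
        runs = [run_model(params, seed) for seed in range(replications)]
        profile.append((value, mean(runs)))
    return profile


baseline = {"alpha": 0.5, "beta": 2.0}   # stand-ins for estimated values
values = [0.3, 0.4, 0.5, 0.6, 0.7]
profile = one_at_a_time(baseline, "alpha", values)
for value, y_bar in profile:
    print(f"alpha = {value:.1f}   mean Y = {y_bar:.3f}")
```

The resulting pairs are exactly what a bivariate plot of the output against the varied parameter would display.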

Finally, statistical testing of the properties found in the artificial data should always be performed. For instance, the assertion that the model has reached a stationary state (macro-equilibrium) Ye, for given inputs (x0, α), must be tested for stationarity or, better, ergodicity. [17]
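As a rough illustration of such a check, the sketch below compares the mean of the output over two consecutive windows of a single run after discarding a burn-in period. It is a crude diagnostic rather than a formal stationarity test: the statistic treats observations as independent, whereas simulated series are usually autocorrelated, and the threshold of 3 is an arbitrary assumption. Probing ergodicity would additionally require repeating the comparison across runs with different seeds and initial conditions.

```python
import math
import random
from statistics import mean, variance


def window_t_statistic(series, burn_in):
    """Welch-style t statistic comparing the two halves of the post-burn-in series."""
    tail = series[burn_in:]
    half = len(tail) // 2
    first, second = tail[:half], tail[half:2 * half]
    standard_error = math.sqrt(variance(first) / half + variance(second) / half)
    if standard_error == 0.0:
        return 0.0 if mean(first) == mean(second) else math.inf
    return (mean(first) - mean(second)) / standard_error


def looks_stationary(series, burn_in, threshold=3.0):
    """True if the two window means differ by less than `threshold` standard errors."""
    return abs(window_t_statistic(series, burn_in)) < threshold


# Hypothetical output series: a transient that dies out, plus noise.
rng = random.Random(1)
series = [5.0 * 0.9 ** t + rng.gauss(0.0, 0.1) for t in range(400)]
print("without burn-in:", looks_stationary(series, burn_in=0))
print("with burn-in:   ", looks_stationary(series, burn_in=100))
```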

Estimation / Calibration

Parameter estimation can be preliminary to a local investigation (around the estimates), or can follow the global investigation of the behaviour of the simulated system. Here, we refer to estimation as the process of choosing the values of the parameters that maximise the accordance of the model's behaviour (somehow measured) with the real-world system. We thus do not distinguish between estimation and calibration. Of course there are relevant examples in the literature[18] where the two terms are given (slightly) different meanings (see for instance Kydland and Prescott 1996). However, we agree with Hansen and Heckman (1996 p.91) that
the distinction drawn between calibrating and estimating the parameters of a model is artificial at best. Moreover, the justification for what is called "calibration" is vague and confusing. In a profession that is already too segmented, the construction of such artificial distinctions is counterproductive.

While advocating convergence towards the adoption of the term "estimation", which seems best suited to foster the dialogue between agent-based simulation practitioners and econometricians, with respect to this point we advance only a weak methodological recommendation: to carefully define any terminology used.

Of course not all parameters deserve the same treatment. Some of them have very natural real counterparts, and thus their value is known: we know the concepts which these parameters represent. The concepts are operationalized. It is possible to collect empirical data on the indicators which operationalize the concepts. E.g., the preferences of parties who participate in negotiations may be measured by using questionnaires and document analysis. With respect to these parameters, the simulation is run with empirical data. Unknown parameters require a different treatment. The fact that the function $g_t$ is not known implies that it is not possible to use it directly for estimating the values of the parameters. But structural estimation is still possible via simulation-based estimation techniques (Gourieroux and Monfort 1997; Mariano et al. 2000; Train 2003). For instance, we can maximise an approximation of the likelihood instead of the likelihood (Maximum Simulated Likelihood). The same principle can be applied to the (generalised) method of moments estimation, which can be replaced by simulated approximations (Method of Simulated Moments): one simply needs to generate simulated data according to the model and choose parameters that make moments of this simulated data as close as possible to the moments of the true data. A special case of this is the Method of Simulated Scores, where the moments are based on the first order conditions of maximum likelihood. Finally, the method of Indirect Inference uses a simplified auxiliary model, and produces parameter estimates such that the estimates of the auxiliary model based upon the real data are as close as possible to those based upon simulated data from the original model. Clearly, a natural choice for the auxiliary model is our metamodel (5).
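The logic of the Method of Simulated Moments can be illustrated with a deliberately small sketch. Everything here is an assumption made for the example: the model (each agent's outcome is the parameter times a uniform draw), the two moments matched (mean and variance), the identity weighting, and the grid search. In particular, the "observed" data are synthetic, generated from the same model with a known parameter value, and stand in for real data.

```python
import random
from statistics import mean, variance


def simulate(theta, n_agents, seed):
    """Hypothetical model: each agent's outcome is theta times a uniform draw."""
    rng = random.Random(seed)
    return [theta * rng.random() for _ in range(n_agents)]


def moments(data):
    """Moments to be matched: here the mean and the variance."""
    return (mean(data), variance(data))


def msm_distance(theta, target, n_agents=1000, seed=123):
    """Squared distance between simulated and target moments (identity weights).

    The seed is held fixed across candidate values of theta, so the
    objective varies with theta and not with the simulation noise.
    """
    simulated = moments(simulate(theta, n_agents, seed))
    return sum((s - t) ** 2 for s, t in zip(simulated, target))


# Synthetic stand-in for real data, generated with a known theta of 2.0.
observed = simulate(2.0, 500, seed=7)
target = moments(observed)

grid = [0.5 + 0.05 * k for k in range(71)]   # candidate values from 0.5 to 4.0
theta_hat = min(grid, key=lambda theta: msm_distance(theta, target))
print(f"theta_hat = {theta_hat:.2f}")
```

In applied work the grid search would typically be replaced by a numerical optimiser and the identity weights by an estimate of the optimal weighting matrix.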

It is important to stress that the estimation stage is often missing in agent-based models. When the issue of parameter choice is considered, most agent-based simulations offer a rough calibration "by hand". This adds to the feeling of fuzziness that many non-practitioners have when confronted with the methodology. Conversely, we believe that rigorous estimation procedures should be used, and all relevant references provided.

Sensitivity Analysis

Sensitivity analysis does not only refer to the problem of sampling the parameters space, already described when we talk about global and local investigation of the behaviour of the model. The term "sensitivity analysis" is generally used to describe a family of methods for altering the input values of the model in various ways. Such analyses are included in the validation step of almost all technical simulations (see Law and Kelton 1991, pp. 310ff). In the natural sciences and engineering, sensitivity analysis is thus a standard method for verifying simulation models. The three major purposes of sensitivity analysis are corroborating the central results of the simulation, revealing possible variations in the results and guiding future research by highlighting the most important processes for further investigation.

A short review of simulation textbooks and other studies reveals that the term is currently used as a general catch all for diverse techniques: there is no precise definition and no special methodology currently associated with this term. We define sensitivity analysis as a collection of tools and methods used for investigating how sensitive the output values of a model are to changes in the input values (see Chattoe et al. 2000). A "good" simulation model (or a "significant" result) is believed to occur when the output values of interest remain within an interval (which has to be defined), despite "significant" changes in the input values (which also have to be defined). The development of a typology of sensitivity analyses involves a more detailed consideration of the status of "input" and "output" along with a range of possible measures of change or stability (lack of change). The following kinds of deliberate input variability can all be seen as commonly used examples of sensitivity analysis:

Most social and economic simulators still omit any form of sensitivity analysis. There is also a definite lack of methodological literature on sensitivity analysis in the social sciences (but see Kleijnen 1992, Kleijnen 1995a, 1995b and a few general methodological texts on sensitivity analysis: Deif 1986, Fiacco 1983, Fiacco 1984, Köhler 1996 and Ríos Insua 1990).

Our position is: the central results of a simulation model should be corroborated, possible variations in the results should be revealed and future research should be guided by highlighting the most important processes for further investigation. After all, only robust results are important and will be of interest to the mainstream. And, highlighting the most important processes for further investigation helps - especially, but not only - non-simulation colleagues in coping with complex simulation models.
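The working definition of robustness given above (outputs staying within a stated interval despite stated changes in the inputs) can be made operational along the following lines. The model, the size of the input change and the tolerance are all hypothetical and would have to be chosen and justified for each study.

```python
def steady_state(params):
    """Hypothetical model output: the long-run level a / (1 - g)."""
    return params["a"] / (1.0 - params["g"])


def perturbation_outputs(model, baseline, rel_change=0.10):
    """Model output after shifting each input up and down by rel_change, one at a time."""
    outputs = {}
    for name, value in baseline.items():
        for sign in (-1, +1):
            perturbed = dict(baseline, **{name: value * (1 + sign * rel_change)})
            outputs[(name, sign)] = model(perturbed)
    return outputs


def is_robust(model, baseline, rel_change, tolerance):
    """True if every perturbed output stays within +/- tolerance of the baseline output."""
    reference = model(baseline)
    outputs = perturbation_outputs(model, baseline, rel_change)
    return all(abs(y - reference) <= tolerance * abs(reference)
               for y in outputs.values())


baseline = {"a": 1.0, "g": 0.5}
for (name, sign), y in perturbation_outputs(steady_state, baseline).items():
    print(f"{name} {'+' if sign > 0 else '-'}10%  ->  Y = {y:.3f}")
print("robust at 15% tolerance:", is_robust(steady_state, baseline, 0.10, 0.15))
```

Stating both the input change and the tolerance explicitly is what makes the claim of robustness checkable by a reader.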


Even an erroneous model can be estimated. For that reason, any model has to be validated. The term "validity" can be formally defined as the degree of homomorphism between one system and a second system that it purportedly represents (Vandierendonck 1975). [19]

Stanislaw (1986) has developed a framework for understanding the concept of validity and how it applies to simulation research. He considers:

For assessing the overall validity of the simulator all three validities have to be considered. However, from an empirical science perspective this definition should also keep in mind that the real-world system is not just given by the theory. Empirical sciences, like sociology and economics, have elaborated validity concepts of their own. Traditional (i.e., not formalized) empirical sociological research has to consider theory validity, operational validity, and empirical validity. Traditional economic research additionally considers model validity. Simulation studies which are theory-based and data-based will have to consider all five types of validity.

A short review of simulation textbooks and other studies reveals that the term validation is currently used as a general catch all for diverse techniques: there is no precise definition and no special methodology currently associated with this term. Established tests for validation are the Turing test, the test of face validity, and the test of event validity. Each test is suited to measure a particular type of validity (or combination of validities). Sterman (1984: 52) has suggested heuristic questions rather than tests for validation. These questions are interpreted as tests that aid the diagnosis of errors and assist in the confidence-building process in the model. The confidence stems from an appreciation of the structure of the model, its general behaviour characteristics and its ability to generate accepted responses to set policy changes. In the following we present some of his questions. Heuristic questions that address the validity of model structure are:

Heuristic questions that address validity of model behaviour are:

This list of questions is not complete. In particular, since validation of simulation models also requires testing the program's validity, in addition to the other measures of validity necessary for traditional analytical models, further questions might be:

Only once a model has been thoroughly validated can we be confident enough to trust possibly surprising behaviours, which may point to the existence of a previously unrecognised mode of behaviour in the real system. However, most social and economic simulation studies still omit any test of validity. There is also a lack of methodological literature on validity in simulation (but see Van Dijkum et al. 1999).

Our position is: the results of a simulation model should be validated. Although there are different types of validity, each scientist knows which type of validity he/she claims for his/her model. Therefore, each simulation study should include an appropriate test of the type of validity that the scientist claims for his/her model. Moreover, validation may be seen as a social process (Sterman 1984: 51), not just as a methodological one. Therefore, a crucial element in validation is the replicability of a simulation model. We turn to this issue in the following section.

* Replicability

Many aspects of simulation models contribute to determining their degree of replicability: among them are programming language, tools, representation formalisms and development methodologies.

Since agent-based models are expressed through computer programs, the first requirement is their distribution under an open source license. But of course effective documentation, as well as the choice of a standard tool, makes the difference between a "black box" and a well-documented agent-based simulation. Model documentation should separate implementation technicalities from the conceptual description, since simulations are always a mix of conceptual model and technical choices that depend on the computer architecture and the operating system.

It's been a long time since computer scientists faced the problem of defining a formalism in order to document in a very general way any software implementation. Of course, to become useful such a formalism has also to be adopted as a standard. A promising approach has been introduced with UML. The Unified Modelling Language (UML), developed by the Object Management Group[21], is an attempt to create a formalism, independent from development methodology, that can be used to represent both the static application structure of a software implementation and different aspects of its dynamic behaviour. To use an official definition (OMG 2003), "[t]he Unified Modelling Language (UML) is a language for specifying, visualizing, constructing, and documenting the artifacts of software systems".

Even if UML is closely oriented to software design, it is generic enough to be adapted to describe any algorithmic and object-oriented artefact, like ABM. The principle of UML design is that computer programs cannot be represented with one formalism only. Not only the source code, but also graphical diagrams are necessary to give a reader the key to understand, replicate and modify a program. The OMG has defined many standard diagrams. Some of the most relevant are:

Much effort has been spent on trying to define a subset of UML, specifically suitable to represent multi-agent systems (Bauer et al. 2000; Bauer et al. 2001; Huget 2002). Even if all documents are potentially useful to improve model unambiguousness, we propose the consistent use of at least two views: a static representation, with a Class diagram, and a dynamic view, showing the sequence of events that characterizes the simulation experiment.

Class diagrams can be used for the definition of model organization, with particular interest in its static aspects and the association relationships among entities. Agents are represented by classes, their characteristics by attributes, their capabilities by methods.

Figure 1. An example of a Class diagram

In particular Class diagrams can be used to show three types of relationships:

Figure 1 shows how classes of agents are associated and which attributes and operations each agent is characterized by. The full reference to the symbols used in the diagram can be found in Si Alhir (2003).
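The mapping from a Class diagram to code can be illustrated with a minimal Python sketch. The agent types `Firm` and `Worker`, their attributes and the `hire` method are hypothetical and are not taken from Figure 1; they only show how classes, attributes, methods and an association between two classes of agents might be written down.

```python
from dataclasses import dataclass, field
from typing import List, Optional


# eq=False keeps identity-based equality: two agents are "the same"
# only if they are the same object, which suits simulated individuals.
@dataclass(eq=False)
class Agent:
    """Base class (generalization): every agent carries an identifier."""
    agent_id: int


@dataclass(eq=False)
class Worker(Agent):
    """Hypothetical agent type; attributes describe its characteristics."""
    reservation_wage: float = 1.0
    employer: Optional["Firm"] = None  # association: at most one employer

    def accepts(self, wage: float) -> bool:
        """A capability (method): decide whether to accept a wage offer."""
        return wage >= self.reservation_wage


@dataclass(eq=False)
class Firm(Agent):
    """Second hypothetical agent type, associated with many workers."""
    wage_offer: float = 1.0
    employees: List[Worker] = field(default_factory=list)

    def hire(self, worker: Worker) -> bool:
        """Make an offer; record the association on both sides if accepted."""
        if worker.employer is None and worker.accepts(self.wage_offer):
            self.employees.append(worker)
            worker.employer = self
            return True
        return False


# Usage: one firm makes an offer to two workers.
firm = Firm(agent_id=1, wage_offer=1.5)
alice = Worker(agent_id=2, reservation_wage=1.2)
bob = Worker(agent_id=3, reservation_wage=2.0)
hired = [firm.hire(w) for w in (alice, bob)]
```

In this sketch the one-to-many association between `Firm` and `Worker` is stored on both sides (`employees` and `employer`), which is one of several possible design choices.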

But a static view of the system is not enough to fully document a simulation model: a dynamic view has to be introduced. For a discrete event simulation the Sequence diagram looks best suited to show how events affect the objects during the experiment execution. However, in order to achieve an effective dynamic representation we propose a custom utilization of this diagram.

The Time-Sequence diagram (Sonnessa 2004) extends the UML Sequence diagram by showing on the left-hand side a special actor[22]: time. From the time line some single, cyclic or grouped events may be generated. The arrows show the chain of calls originating from any event. As shown in figure 2, the arrow connecting time and the object receiving the event notification is labelled with the @ symbol. It is used to specify when the event is raised and the name of the event. In the case of looped events, the @t..r notation is used, where t is the instant the event is raised for the first time and r is the loop frequency.

Figure 2. An example of a Time-Sequence diagram
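The role of the time actor can be sketched in a few lines of Python. This is an illustrative scheduler written under stated assumptions, not the implementation of any of the platforms discussed here: the class `Schedule`, its methods `at` and `run`, and the `Market` object are hypothetical names, and r is interpreted as the interval between two successive firings of a looped event.

```python
import heapq
from typing import Callable, List, Optional, Tuple


class Schedule:
    """A tiny discrete-event schedule playing the role of the time actor."""

    def __init__(self) -> None:
        self._queue: list = []
        self._counter = 0  # tie-breaker: same-time events fire in insertion order

    def at(self, t: float, name: str, action: Callable[[float], None],
           every: Optional[float] = None) -> None:
        """Register a single event (@t) or a looped event (@t..r, every=r)."""
        heapq.heappush(self._queue, (t, self._counter, name, action, every))
        self._counter += 1

    def run(self, until: float) -> List[Tuple[float, str]]:
        """Fire events in time order up to and including `until`."""
        log = []
        while self._queue and self._queue[0][0] <= until:
            t, _, name, action, every = heapq.heappop(self._queue)
            action(t)  # the event notification sent to the receiving object
            log.append((t, name))
            if every is not None:  # looped event: schedule the next firing
                self.at(t + every, name, action, every)
        return log


class Market:
    """Hypothetical object receiving a looped event."""

    def __init__(self) -> None:
        self.cleared_at: List[float] = []

    def clear(self, t: float) -> None:
        self.cleared_at.append(t)


# Usage: one single event (@0 init) and one looped event (@1..2 clear).
market = Market()
started: List[float] = []
schedule = Schedule()
schedule.at(0, "init", started.append)
schedule.at(1, "clear", market.clear, every=2)
log = schedule.run(until=6)
```

With these settings the looped event fires at times 1, 3 and 5; its next occurrence, at time 7, remains in the queue because it falls after the end of the run.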

Besides stressing the importance of source code availability, we are convinced that the choice of a standard tool, rather than the use of a general-purpose programming language, could facilitate the diffusion and the replicability of agent-based models. In the development of ABM tools two different approaches are emerging. The StarLogo/NetLogo (Resnick 1994) experience is based on the idea of an ABM-specific language, while the Swarm library (Minar et al. 1996) and some of its followers (JAS, RePast[23]) represent a protocol in the design process, implemented in standard programming languages (Java, C, etc.). These platforms also provide a set of tools, organized in libraries, with the aim of hiding common technical issues and sharing their solutions.

Our opinion is that both approaches are superior to building models from scratch every time using custom development approaches and putting together heterogeneous libraries and toolkits.

* Strategy

In order to advance from simple methodological recommendations to the development of a widely recognized common protocol, we suggest a three-step process:

First step: Creation of a working group and development of a questionnaire.

We propose that a working group composed of representatives from scientific journals and professional associations (e.g. the European Social Simulation Association) be created. A questionnaire should then be developed by the working group in order to collect data on simulation approaches as well as the model structures, methods of optimisation, estimation, validation etc. of each newly published simulation model. This questionnaire should include a mixture of standardized and non-standardized questions. Standardized questions will help in categorizing newly published simulation models. Non-standardized questions will help in collecting all sorts of data on the methods applied (e.g., the type of validity that a paper claims, the method(s) applied for testing the model's validity, a reference for each method). We provide a draft of the proposed questionnaire in the Appendix.

Second step: The questionnaire is distributed by professional simulation journals.

Professional simulation journals in sociology and economics (JASSS, Computational Economics, etc.) will be asked to send the questionnaire to each author who submits a simulation model for publication. Each author will be requested to fill in the questionnaire. However, his/her answers will have no effect on the paper being published.

Third step: The working group analyses the data and recommends a voluntary initial methodological standard for agent-based simulations.

The working group analyses the data and recommends a voluntary initial methodological standard for agent-based simulations, defining a minimum of methodological rigour for each type of simulation model. The standard may define sub-standards that depend on the type of simulation model. Finally, the standard will be published, together with a list of references for each recommendation. Professional simulation journals in sociology and economics may adopt the standard and send to their referees a checklist in order to facilitate the evaluation of newly submitted manuscripts.

* Conclusion

In this paper we argue that agent-based modelling in the social sciences needs a more widely shared common methodological protocol. Traditional analytical modelling practices rely on very well established, although implicit, methodological standards, both with respect to the way the models are presented and to the kind of analyses that are performed. These standards are useful because (1) they contribute to the creation of a common language among scientists, (2) they can be referred to without detailed discussion, (3) they force model homogeneity and hence comparability, (4) they increase methodological awareness and guide individual scientists towards better quality research.

Unfortunately, computer-simulated models often lack such a reference to accepted methodological standards. This is one of the main reasons for the scepticism among mainstream social scientists that results in the low acceptance of papers with agent-based methodology in the top journals. We identified some methodological pitfalls that, in our view, are common in papers employing agent-based simulations. They relate to the following problematic areas: links with the literature, description of the model structure, identification of the dimensions along which the model behaviour is investigated, definition of equilibrium, interpretation of the model behaviour, estimation of the parameters, sensitivity analysis, validation, description of the computer implementation of the model and replicability of the results.

Although for each issue we discussed the different options available and identified what we consider to be the best practices, we did not intend to propose such a methodological protocol ourselves. Rather, we proposed a three-stage process that could lead to the establishment of methodological standards in social and economic simulations. This process should start from the creation of a working group of representatives from scientific journals and professional associations (e.g. the European Social Simulation Association). This working group should develop a questionnaire (for which we propose a draft in the Appendix) that would be distributed by professional simulation journals to their authors. The working group should then analyse the results and publish a list of methodological recommendations, i.e. a protocol.

* Appendix: Draft Questionnaire (not active)

The objective of this questionnaire is the establishment of methodological standards in social and economic simulation. Traditional analytical modelling practices in the social sciences rely on a very well established, although implicit, methodological protocol, both with respect to the way models are presented and to the kinds of analysis that are performed. Unfortunately, computer-simulated models often lack such a reference to an accepted methodological standard. This is one of the main reasons for the scepticism among mainstream social scientists that results in the low acceptance of papers with agent-based methodology in the top journals. It is the goal of this initiative to increase the rate of acceptance of papers with agent-based methodology in the top journals.

Please respond to the following questions in order to help us to increase the methodological rigour in agent-based social and economic simulation. The first part of the questionnaire should be regarded as a sort of checklist of all the features we think are relevant in an agent-based model. Please add some notes if you think more information would be useful. The second part of the questionnaire requests more details on some specific issues.

1. Links with the literature

The model is based on some existing model in simulation literature

The model is based on some existing model in non-simulation literature

Does the paper contain a survey of the theoretical background of the phenomenon that is investigated?
Long / Brief / None

Does the paper contain a survey of the relevant simulation and non-simulation models?
Long / Brief / None

2. Structure of the model

The following points have been clarified:
the goal of the model (empirical, theoretical or both)
whether the implications are testable with real data
the evolution of the population (static or dynamic)
      …if static: the total number of agents
      …if dynamic: birth and death mechanisms
the treatment of time (discrete[24] or continuous[25])
the treatment of fate (deterministic or stochastic)

The model has been classified with respect to:
the topological space (no space, nD lattices, graphs …)
the type of agent behaviour (optimising, satisficing …)
the interaction structure (localized or non-localized)
the coordination structure (centralized[26] or decentralized[27])
how expectations are formed (rational, adaptive or other)
learning (no learning, individual learning, social learning)

3. Analysis

The following points have been clarified:
the objective of the analysis (full exploration[28] or partial exploration[29])
the focus of the analysis (equilibrium at micro-level[30], equilibrium at macro-level[31], out-of-equilibrium)

The following analyses have been performed:
statistical tests of the properties found in the artificial data
sensitivity analysis of the results
estimation / calibration of the parameters on real data
validation of the model

4. Replicability

Is the presentation detailed enough to allow the replication of the experiment/results? Yes / No

Has a simulation platform been used to implement the model? Yes / No

Can the simulation be run online? Yes / No

Graphical presentation of the model structure:
UML diagrams (specify)
Other diagrams (specify)

Code availability: Website / Upon request / None

5. Additional details (for authors only)

Please add some details concerning the following specific issues:

  1. If the exploration of the model is performed only on a subset of the parameters' space, please state why:
  2. Please mark the statistical analysis performed on the artificial data:
    descriptive statistics
    multivariate analysis (metamodelling)
    stationarity / ergodicity tests on artificial time series
    other (please specify)
    Please list the statistical methods used.
  3. Please list all meaningful parameters that had to be initialized and indicate the method(s) used for estimation or calibration. (Please indicate a reference for each method)
  4. Please mark the features tested for sensitivity.
    Random seed variation
    Variation in the level of data aggregation
    Noise type and noise level variation
    Variation in the decision processes and capabilities of the agents
    Parameter variation
    Variation of sample size (esp. small sample properties)
    Temporal model variation (discrete to continuous time or from fixed to random updating of cells)
  5. Please indicate the method(s) applied for testing the model's sensitivity on input variation (please give a reference for each method).
  6. Please state the type of validity that you claim for your model.
  7. Please indicate the method(s) you applied for testing the model's validity (please give a reference for each method).

Comments on this questionnaire

You have completed this questionnaire, whose aim is to increase the methodological rigour in agent-based social and economic simulation. Do you have any comments or recommendations for us to improve this questionnaire?

Thanks a lot for participating

* Notes

[1] and also a limited number of non-practitioners (see for instance Freeman 1998)

[2] (Arifovic 1995; Arifovic 1996; Andreoni 1995; Arthur 1991; Arthur 1994; Gode and Sunder 1993; Weisbuch 2000)

[3] We looked for journal articles containing the words "agent-based", "multi-agent", "computer simulation", "computer experiment", "microsimulation", "genetic algorithm", "complex systems", "El Farol", "evolutionary prisoner's dilemma", "prisoner's dilemma AND simulation" and variations in their title, keywords or abstract in the EconLit database, the American Economic Association electronic bibliography of world economics literature. Note however that EconLit sometimes does not report keywords and abstracts. We have thus integrated the resulting list with the references cited in the review articles cited above. The ranking is provided in Kalaitzidakis et al. (2003).

[4] Schelling 1969; Tullock and Campbell 1970.

[5] Clarkson and Simon 1960; Cohen 1960; Cohen and Cyert 1961; Orcutt 1960; Shubik 1960.

[6] JEDC has a section devoted to computational methods in economics and finance.

[7] We looked for journal articles containing the words "simulation", "agent-based", "multi-agent" and variations in their title, keywords or abstract in the Sociological Abstracts database. All abstracts have been checked for subject matter dealing with ABM. We used the 2001 Citation Impact Factors (CIF) ranking for Sociology journals (93 journals).

[8] for a brief account of the analogies and differences between agent-based simulations and traditional analytical modelling see Leombruni and Richiardi (2005)

[9] There is some confusion in the literature in this regard, and it should be an aim of the methodological clarification we are calling for to address it. By discrete-time simulation, social scientists generally mean that the state of the system is updated (i.e. observed) only at discrete (generally constant) time intervals. No reference is made to the timing of events within a period - see, for example, Allison and Leinhardt (1982). Conversely, a model is said to be continuous-time event-driven when the state of the system is updated every time a new event occurs (Lancaster 1990; Lawless 1982). In this case it is necessary to isolate all the events and define their exact timing. Note that discrete-time simulation is a natural option when continuous, flow variables are modelled, and the definition of an event becomes more arbitrary. For this reason (and mainly in the Computer Science literature) the definitions above are sometimes reversed.

[10] Examples of centralized coordination mechanisms other than the usual, unrealistic Walrasian auctioneer (the hypothetical market-maker who matches supply and demand to get a single price for a good) generally assumed by traditional analytical models include real auctions, stock exchange books, etc. Examples of decentralized coordination mechanisms include bargaining, barter, etc.

[11] Note that this is not equivalent to saying that simulations are an inductive way of doing science: induction comes at the moment of explaining the behaviour of the model (Axelrod 1997). Epstein qualifies the agent-based simulation approach as 'generative' (Epstein 1999), while the logic behind it refers to abduction (Leombruni 2002).

[12] These statistics can either be a macro aggregate, or a micro indicator, as in the case of individual strategies. In both cases, as a general rule all individual actions, which in turn depend on individual states, matter.

[13] Sometimes we are interested in the relationship between different (aggregate) statistics: e.g. the unemployment rate and the inflation rate in a model with individuals searching on the job market and firms setting prices. The analysis proposed here is still valid however: once the dynamics of each statistic is known over time, the relationship between them is univocally determined.

[14] This definition applies both to the traditional homo sociologicus and the traditional homo oeconomicus. In the first paradigm individuals follow social norms and hence never change their behaviour. In the latter, individuals with rational expectations maximize their utility.

[15] Or even not dependent on the initial conditions

[16] Here, the distinction between in-sample and out-of-sample values, and the objection that two formulations may fit the former equally well, but not the latter, is not meaningful. Any value in the relevant range can be included in the artificial experiments.

[17] Ergodicity means that a time average is indeed representative of the full ensemble. So, if the system is ergodic, each simulation run gives a good description of the overall behavior of the system.

[18] For an overview of the discussion see Dawkins et al. 2001, pp. 3661ff.

[19] Homomorphism is used as the criterion for validity rather than isomorphism, because the goal of abstraction is to map an n-dimensional system onto an m-dimensional system, where m < n. If m and n are equal, the systems are isomorphic.

[20] For a discussion on the confusion that surrounds the basic definition of validity, see Bailey (1988).

[21] The Object Management Group (OMG) is an open membership, not-for-profit consortium that produces and maintains computer industry specifications for interoperable enterprise applications. Among its members are the leading companies in the computer industry (see http://www.omg.org).

[22] For an agent-based modeller the concept of an actor may create some confusion. According to the UML symbolism, each object or class defined within the software architecture is represented by squared boxes (the class notation), while each external element (like human operators, hardware equipment) interacting with the software is represented by a stylized human symbol (the actor).

[23] JAS (http://jaslibrary.sourceforge.net); RePast (http://repast.sourceforge.net)

[24] The state of the system is updated (i.e. observed) only at discrete (generally constant) time intervals. No reference is made to the timing of events within a period.

[25] The state of the system is updated every time a new event occurs. All events are isolated and their exact timing defined.

[26] auction, book, etc.

[27] bargaining, etc.

[28] The behaviour of all meaningful individual and aggregate variables is explored, with reference to the results currently available in the literature. For instance, in a model of labour participation, if firm production is defined, aggregate production (business cycles, etc.) is also investigated.

[29] The model is investigated only with respect to the behaviour of some variables of interest

[30] defined as a state where individual strategies do not change anymore.

[31] defined as a state where some relevant (aggregate) statistic of the system becomes stationary.

* References

ALLISON, P., Leinhardt, S. (ed.) (1982) Discrete time methods for the analysis of event histories, Jossey-Bass, pp. 61-98.

ANDERSON, P.W., Arrow, K.J., Pines, D. (ed.) (1988) The Economy as an Evolving Complex System, Addison-Wesley Longman.

ANDREONI, J., Miller, J. (1995) Auctions with adaptive artificial agents, Games and Economic Behavior, 10, pp. 39-64.

ARIFOVIC, J. (1995) Genetic algorithms and inflationary economies, Journal of Monetary Economics, 36(1), pp. 219-243.

ARIFOVIC, J. (1996) The behavior of the exchange rate in the genetic algorithm and experimental economies, Journal of Political Economy, vol. 104 n. 3, pp. 510-541.

ARTHUR, B. (1991) On designing economic agents that behave like human agents: A behavioral approach to bounded rationality, American Economic Review, n. 81, pp. 353-359.

ARTHUR, B. (1994) Inductive reasoning and bounded rationality, American Economic Review, n. 84, p. 406.

ATTANASIO, O.P. and Weber, G. (1993) Consumption Growth, the Interest Rate and Aggregation, Review of Economic Studies 60, pp. 631-649.

ATTANASIO, O.P. and Weber, G. (1994) The UK Consumption Boom of the Late 1980's: Aggregate Implications of Microeconomic Evidence, The Economic Journal 104, pp. 1269-1302.

AXELROD, R.M. (1987) The Evolution of Strategies in the iterated Prisoner's Dilemma, in L.D. Davis (ed.) Genetic Algorithms and Simulated Annealing, London, Pitman, pp. 32-41.

AXELROD, R. (1997), The Complexity of Cooperation: Agent-Based Models of Competition and Collaboration, Princeton, New Jersey: Princeton University Press

BAILEY, K.D. (1988) The Conceptualization of Validity: Current Perspectives, Social Science Research, 17, pp. 117-136.

BAUER, B., Odell, J., Parunak, H. (2000) Extending UML for Agents, in G. Wagner, Y. Lesperance, E. Yu (eds.) Proceedings of the Agent-Oriented Information Systems Workshop (AOIS), Austin, pp. 3-17.

BAUER, B., Muller, J.P., Odell, J. (2001) Agent UML: a formalism for specifying multiagent software systems, International Journal on Software Engineering and Knowledge Engineering (IJSEKE), vol. 1, n. 3.

CHATTOE, E.S., Saam, N.J., Möhring, M. (2000) Sensitivity analysis in the social sciences: problems and prospects, in G.N. Gilbert, U. Mueller, R. Suleiman, K.G. Troitzsch (eds.) Social Science Microsimulation: Tools for Modeling, Parameter Optimization, and Sensitivity Analysis, Heidelberg, Physica Verlag, pp. 243-273.

CLARKSON, G.P.E., Simon, H.A. (1960), Simulation of Individual and Group Behavior, The American Economic Review, vol. 50, n. 5, pp. 920-932

COHEN, K.J. (1960), Simulation of the firm, The American Economic Review, vol. 50, n. 2, Papers and Proceedings of the Seventy-second Annual Meeting of the American Economic Association, pp. 534-540

COHEN, K.J., Cyert, R.M. (1961), Computer Models in Dynamic Economics, The Quarterly Journal of Economics, vol. 75, n. 1, pp. 112-127

DAWKINS, C., Srinivasan, T.N., Whalley, J. (2001) Calibration, in J.J. Heckman, E. Leamer (eds.) Handbook of Econometrics. Vol. 5. Elsevier, pp. 3653-3703.

DEIF, A.S. (1986) Sensitivity Analysis in Linear Systems, Berlin, Springer.

VAN DIJKUM, C., Detombe, D., van Kuijk, E. (eds.) (1999) Validation of Simulation Models, Amsterdam, SISWO.

EPSTEIN, J.M. (1999), Agent-Based Computational Models And Generative Social Science, Complexity, vol. 4, pp. 41-60

FIACCO, A.V. (1983) Introduction to Sensitivity and Stability Analysis in Non-linear Programming, Paris, Academic Press.

FIACCO, A.V. (ed.) (1984) Sensitivity, Stability and Parametric Analysis, Amsterdam, North-Holland.

FREEMAN, R. (1998) War of the models: Which labour market institutions for the 21st century?, Labour Economics, 5, pp. 1-24.

GODE, D. K., Sunder, S. (1993) Allocative efficiency of markets with zero-intelligence traders: Markets as a partial substitute for individual rationality, Journal of Political Economy 101, pp. 119-137.

GOURIEROUX, C., Monfort, A. (1997) Simulation-based econometric methods, Oxford University Press.

HANSEN, L. P., and Heckman, J. J. (1996) The Empirical Foundations of Calibration, Journal of Economic Perspectives, vol. 10(1), pages 87-104.

HENDRY, D.F., Krolzig, H.M. (2001) Automatic econometric model selection, London, Timberlake Consultants Press.

HUBERMAN, B.A., Glance, N. (1993) Evolutionary Games and Computer Simulations, Proceedings of the National Academy of Sciences of the United States of America, 90, pp. 7716-7718.

HUGET, M. (2002) Agent UML class diagrams revisited, in B. Bauer, K. Fischer, J. Muller, B. Rumpe (eds.), Proceedings of Agent Technology and Software Engineering (AgeS), Erfurt, Germany.

KALAITZIDAKIS, P., Mamuneas, T.P., Stengos, T. (2003) Rankings of academic journals and institutions in economics, Journal of the European Economic Association vol. 1, n. 6, pp. 1346-1366.

KLEIJNEN, J.P.C. (1992) Sensitivity Analysis of Simulation Experiments: Regression Analysis and Statistical Design, Mathematics and Computers in Simulation, n. 34, pp. 297-315.

KLEIJNEN, J.P.C. (1995a) Sensitivity Analysis and Optimization of System Dynamics Models: Regression Analysis and Statistical Design of Experiments, System Dynamics Review 11, pp. 275-288.

KLEIJNEN, J.P.C. (1995b) Sensitivity Analysis and Related Analyses: A Survey of Some Statistical Techniques, Journal of Statistical Computation and Simulation, 57, pp. 111-142.

KLEVORICK, A.K. (ed.) (1983) Cowles fiftieth anniversary, Cowles Foundation, New Haven, Connecticut.

KÖHLER, J. (1996) Sensitivity Analysis of Integer Linear Programming, Discussion Paper, Fachbereich Mathematik und Informatik, Universität Halle-Wittenberg.

KYDLAND, F.E., Prescott, E.C. (1996), The Computational Experiment: An Econometric Tool, Journal of Economic Perspectives, vol. 10, pp. 69-85

LANCASTER, T. (1990) The Econometric Analysis of Transition Data, Cambridge University Press.

LAW, A., Kelton, W.D. (1991) Simulation Modeling and Analysis, New York, McGraw-Hill, second edition.

LAWLESS, J. (1982) Statistical Models and Methods for Lifetime Data, John Wiley.

LEOMBRUNI, R. (2002), The methodological Status of Agent Based Simulations, LABORatorio R. Revelli Working Paper no. 19

LEOMBRUNI, R., Richiardi, M.G. (2005) Why are Economists Sceptical About Agent-Based Simulations?, Physica A, Vol. 355, No. 1, pp. 103-109.

MARIANO, B., Weeks, M., Schuermann, T. (eds.) (2000), Simulation Based Inference: Theory and Applications, Cambridge, Cambridge University Press.

MERZ, J. (1994) Microsimulation - A Survey of Methods and Applications for Analyzing Economic and Social Policy, FFB Discussion Paper, n. 9, Universität Luneburg, June.

MILLER, J.H., Rust, J., Palmer, R. (1994) Characterising Effective Trading Strategies: Insights from a Computerised Double Auction Tournament, Journal of Economic Dynamics and Control, 18, pp. 61-96.

MINAR, N., Burkhart, R., Langton, C., Askenazi, M. (1996) The Swarm Simulation System: A Toolkit for Building Multi-agent Simulations, Santa Fe Institute Working Paper, n. 96-06-042.

OMG (2003), Object Management Group, http://www.omg.org.

ORCUTT, G.H. (1960), Simulation of Economic Systems, The American Economic Review, vol. 50, n. 5, pp. 893-907

RESNICK, M. (1994) Turtles, Termites and Traffic Jams: Explorations in Massively Parallel Microworlds, Cambridge MA, MIT Press.

ROS INSUA, D. (1990) Sensitivity Analysis in Multi Objective Decision Making, Berlin, Springer.

SCHELLING, T. (1969) Models of Segregation, American Economic Review, n. 59, pp. 488-493.

SHUBIK, M. (1960), Simulation of the Industry and the Firms, The American Economic Review, vol. 50, n. 5, pp. 908-919

SI ALHIR, S. (2003), Learning UML, O'Reilly & Associates

SONNESSA, M. (2004) Modelling and simulation of complex systems, PhD Thesis, "Cultura e impresa", University of Torino, Italy.

STANISLAW, H. (1986) Tests of computer simulation validity. What do they measure?, Simulation and Games 17, pp. 173-191.

STERMAN, J.D. (1984) Appropriate summary statistics for evaluating the historical fit of system dynamics models, Dynamica 10, pp. 51-66.

TESFATSION, L. (2001a) Special issue on agent-based computational economics, Journal of Economic Dynamics and Control 25, pp. 3-4.

TESFATSION, L. (2001b) Special issue on agent-based computational economics, Computational Economics, vol. 18, n. 1.

TESFATSION, L. (2001c) Agent-based computational economics: A brief guide to the literature, in J. Michie (ed.) Reader's Guide to the Social Sciences, vol. 1, London, Fitzroy-Dearborn.

TRAIN, K. (2003), Discrete Choice Methods with Simulations, Cambridge, Cambridge University Press.

TULLOCK, G., Campbell, C.D. (1970) Computer simulation of a small voting system, Economic Journal, vol. 80 n. 317, pp. 97-104.

VANDIERENDONCK, A. (1975) Inferential simulation: hypothesis-testing by computer simulation, Nederlands Tijdschrift voor de Psychologie 30, pp. 677-700.

WAN, H.A., Hunter, A., Dunne, P. (2002) Autonomous Agent Models of Stock Markets, Artificial Intelligence Review 17, pp. 87-128.

WEIBULL, J.W. (1995) Evolutionary Game Theory, Cambridge, MA, MIT Press.

WEISBUCH, G., Kirman, A., Herreiner, D. (2000) Market organization and trading relationships, Economic Journal 110, pp. 411-436.



© Copyright Journal of Artificial Societies and Social Simulation, [2006]