Comparing Prediction Market Mechanisms: An Experiment-Based and Micro Validated Multi-Agent Simulation

Prediction markets are a promising instrument for drawing on the "wisdom of the crowds". For instance, in a corporate context they have been used successfully to forecast sales or project risks by tapping into the heterogeneous information of decentralized actors in and outside of companies. Among the main market mechanisms implemented so far in prediction markets are (1) the continuous double auction and (2) the logarithmic market scoring rule. However, it is not fully understood how this choice affects crucial variables like prediction market accuracy or price variation. Our paper uses an experiment-based and micro validated simulation model to improve the understanding of the mechanism-related effects and to inform further laboratory experiments. The results underline the impact of mechanism selection. Due to the higher number of trades and the lower standard deviation of the price, the logarithmic market scoring rule seems to have a clear advantage at first glance. This changes when the accuracy level, which is the most important criterion from a practical perspective, is used as an independent variable; the effects become less straightforward and depend on the environment and actors. Besides these contributions, this work provides an example of how experimental data can be used to validate agent strategies on the micro level using statistical methods.


Introduction
Prediction markets can be described as markets that are "designed specifically for information aggregation and revelation" (Wolfers & Zitzewitz , p. ). They are a promising example of using the "wisdom of the crowds" (Surowiecki ). The basic idea is to give individuals the possibility to trade their expectations concerning the relevant variable on a virtual market (e.g., the expected number of sales of a product in the next year). Via their trades, the local information of individual actors is aggregated in the form of the market price, which is visible to all market participants. At the same time, the information of the individual traders is not disclosed to the participants. Prediction markets can be used to reveal and aggregate the diverse information even of large and (geographically) dispersed groups. These characteristics distinguish them from traditional forecasting methods like expert forecasting or statistical methods.) and in practice (Atanasov et al. ; Chen & Plott ; Cowgill & Zitzewitz ). Hewlett Packard (HP) provides a good example of a successful application of prediction markets. HP has used them to project figures associated with the official sales forecast (Chen & Plott ). Among the predicted figures were the next month's revenues for a specific product, the next month's unit sales of another product and the next quarter's unit sales. Between and participants traded on a prediction market based on a continuous double auction. The official sales forecast was unknown to the traders. As the prediction market owner, HP was interested in an accurate prediction and offered participants monetary incentives linked to the individual's success on the market. Compared to the traditional sales forecast, the results of the prediction market were considered a substantial improvement. In six of eight cases, the prediction market outperformed the traditional sales forecast. Additionally, the prediction market correctly indicated the direction of the deviation for all eight cases.
A major design question for the set-up of a prediction market is the choice of the market mechanism (Spann & Skiera , p. ), i.e., how individuals can trade on the market. It is linked to research about efficient information aggregation in prediction markets and can be seen as a foundational aspect of prediction markets (see Klingert ). Different market mechanisms have been applied to prediction markets to coordinate the trading interactions between individual actors. Our work focuses on two of the most common ones: the continuous double auction (CDA; e.g., used in the Iowa Electronic Market by Forsythe et al. , p. ) and an automated market maker, the logarithmic market scoring rule (LMSR; Hanson ; e.g., used in the Gates Hillman Prediction Market by Othman & Sandholm a, p. ).
Present research documents the functioning of prediction markets (Atanasov et al. ), but the effects of different market mechanisms on a market's performance are often sidestepped in this context. The average performance of prediction markets has been "pretty good" (Wolfers & Zitzewitz , p. ), but some failures to adequately aggregate information have been reported (e.g., Hansen et al. ). The fundamentals of prediction markets are sufficiently understood, but the investigation of the forecasting performance is ongoing (Horn et al. , p. ). The market mechanism is one direction of this research. Field studies of prediction markets (e.g., Forsythe et al. ; Hansen et al. ; Othman & Sandholm a) have been unable to fully assess the effect of the mechanism because varying it would substantially increase the effort. Furthermore, the market mechanism is often chosen without a thorough discussion or at least without documenting it. Some software providers do not even offer alternative prediction market mechanisms; Crowdworx focuses on the LMSR, for example (Ivanov ). Nevertheless, some differences are known. From a technical perspective, the LMSR offers constant liquidity and only requires one trader to execute a transaction while the CDA requires at least two. As a higher number of trades is typically associated with an increase in accuracy, the mechanism choice might be a crucial aspect concerning the performance of a prediction market (Antweiler ).
The possible role of the choice of an appropriate market mechanism stands in contrast to the current level of understanding of the effects that different mechanisms have on prediction market outcomes (Healy et al. , p. ). While laboratory experiments have focused on a small selection of aspects such as the interplay with the information distribution among traders (Healy et al. ; Ledyard et al. ), the effects of different environments in conjunction with a specific mechanism are particularly unclear (Healy et al. , p. ). Three aspects deserve further attention. First, existing studies do not vary important aspects like the initial money endowment and the actors' trading strategies (Rothschild & Sethi ). Second, some results contradict each other. According to Healy et al. ( ), the accuracy of the LMSR is much lower than the accuracy of the CDA in a simple environment with few traders. This differs from the outcome of an experiment mentioned in a talk by Ledyard referenced in the same paper (Healy et al. , p. ) and by Ledyard et al. ( ). The available laboratory experiments (Healy et al. ; Ledyard et al. ) share the same problem as they rely on a relatively small number of traders (between three and six). An average prediction market has more traders than that. Third, other experimental (Jian & Sami ) as well as simulation (Slamka et al. ) studies compare market mechanisms without considering the widely used CDA.

This lack of knowledge concerning the performance effects of the most commonly used market mechanisms is problematic as faulty prediction market outcomes might lead to wrong decisions. Due to the increased popularity of prediction markets in the corporate context (Bray et al. , p. ), the relevance of differences between mechanisms has risen further. Corporate prediction markets might also face specific conditions that can affect the performance of the market mechanism (Cowgill & Zitzewitz ). For example, the number of traders might be limited in corporate internal markets. Consequently, the decisions of traders with limited skill in stock market trading might have more weight than in large prediction markets (Rothschild & Sethi ). Therefore, random strategies might represent some traders better than other strategies (Cowgill & Zitzewitz ), for example, a strategy based on the expected value.
Against this backdrop, our paper provides a comparative analysis of the CDA and the LMSR concerning relevant prediction market output variables like number of trades, standard deviation of the price and accuracy level. It contributes to understanding the mechanism-related effects and the dynamics of the collective information aggregation of the participants by introducing and analyzing an agent-based simulation model. It follows an iterative process to establish strong links with economic laboratory experiments (Klingert & Meyer b) and it is based on a laboratory experiment, i.e., the simulation model is constructed and micro as well as macro validated using the experimental data of Hanson et al. ( ). We also aim at providing input for future laboratory experiments by identifying aspects that warrant further investigation. Our results underline the impact of the mechanism selection. Due to the higher number of trades and the lower standard deviation of the price, the LMSR seems to have a clear advantage at first glance. However, considering the accuracy level as independent variable shows that the effects are actually more complex and depend on the environment and actors.

The paper is structured as follows. First, a review of relevant literature is provided and the hypotheses for the simulation experiments are derived. Second, the simulation model is introduced. Third, the model is micro as well as macro validated. Fourth, simulation experiments are conducted, analyzed and tested for robustness. The paper concludes with a discussion of the results and a brief outlook.

Literature Review and Hypothesis Development
This paper provides a comparative analysis of the CDA and the LMSR. The CDA has traditionally been the standard prediction market mechanism (Chen & Pennock ). As on many financial markets, two traders directly exchange stocks and monetary units, as illustrated in Figure . Asks and bids are placed in an order book and can subsequently be accepted by other traders. If an ask or bid is accepted, stocks and monetary units are exchanged and it is deleted from the order book. -) enabling the LMSR to act like an "intermediary between people who prefer to trade at different times" (Hanson , p. ). A loss of the market maker is accepted in such prediction markets, and has to be covered by the market owner, but unlimited losses are not incurred (Hanson , p. ).

Figure : The market maker as a central trader.
Ken Kittlitz, Chief Technology Officer at the software provider Consensus Point, points out important differences between the two market mechanisms: "Having run markets both with and without Hanson's automated-market maker [LMSR], we say with confidence that it makes a huge difference to the success of the market. Because it maintains buy and sell orders at a wide range of prices, it provides a steady source of liquidity that would otherwise be lacking. This allows traders to interact with the system in an easy and intuitive manner rather than having to worry about placing booked orders at certain prices and waiting for other traders to match those orders." (cited in Hanson , p. ).
In this section, we break several of his field observations down into testable hypotheses. To this end, we first define criteria for the evaluation of the market mechanisms. Then, we draw upon prior research and the technical differences between the two mechanisms to derive our hypotheses.
Concerning related research, it is important to mention the study by Brahma et al. ( ). They compare a recently suggested liquidity-sensitive variant of the LMSR (LS LMSR) with a Bayesian Market Maker. Their paper provides many useful insights into the LMSR, e.g., the amount of loss incurred in such markets, and can be seen as complementary to this study, because it does not address the CDA. Still, the methodological approaches differ slightly: our model is validated based on a laboratory experiment, aims to explore parameter values and combinations not covered in this experiment, and is ultimately meant to guide future laboratory experiments (Klingert & Meyer b). Brahma et al. ( ) use laboratory studies as an additional way to evaluate their Bayesian Market Maker in comparison to the LMSR.

The prediction market mechanisms will be assessed based on three evaluation criteria. The first criterion is the quantity of trades, because the number of trades is important for the success of a market. An increase in the number of trades offers the opportunity to add more pieces of information to the market price. The second criterion is accuracy and is measured by the accuracy level. According to Hanson et al. ( ), it represents a key figure and mainly determines the quality of a prediction market. The accuracy level is more important than the number of trades because a higher number of trades does not necessarily lead to a high predictive quality, e.g., when traders act randomly. The accuracy level is defined as the variance between the price and the correct value (Hanson et al. , p. ). The third criterion is the standard deviation of the price (e.g., used by Jian & Sami ). It measures reliability and assumes that a good average accuracy might still not be sufficient in itself. Because extreme predictions might have severe consequences, it is important to be aware of them. A substantial reduction of price variation can even make a slight reduction of the average accuracy acceptable.
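The three criteria can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the model documentation: the "variance between the price and the correct value" is read here as the mean squared deviation of the final prices from the correct value, the standard deviation is taken over the final prices of repeated runs, and the function names as well as the `trade_log` argument are hypothetical.

```python
import statistics


def number_of_trades(trade_log):
    """Criterion 1: count the executed trades of one market run."""
    return len(trade_log)


def summarize_runs(final_prices, correct_value):
    """Criteria 2 and 3 over the final prices of several runs.

    Assumption: accuracy is read as the mean squared deviation of the
    final price from the correct value (lower means more accurate).
    """
    squared_errors = [(p - correct_value) ** 2 for p in final_prices]
    return {
        "mean_squared_error": statistics.fmean(squared_errors),
        "price_std": statistics.stdev(final_prices),  # sample standard deviation
    }
```

Keeping the two measures separate reflects the argument above: two mechanisms can show a similar average error while differing clearly in how widely their final prices scatter.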
Data to assess the three criteria is collected after the prediction market closes. The accuracy level represents the most important measure of prediction market quality. Because the accuracy is expected to only partially depend on the mechanism, interaction effects with other variables are considered. Besides, the number of trades and the standard deviation of the price are explored. While we expect the insights regarding these two criteria to be less complex, their analysis further contributes to the model validation.
The first hypothesis addresses the number of trades. From a technical perspective, it seems quite straightforward that the LMSR has an advantage in achieving a higher number of trades. The CDA needs at least two traders to execute a trade whereas the LMSR provides constant liquidity and should be able to act like the second trader. Still, one can consider situations in which the LMSR has fewer trades than the CDA. The LMSR has a certain spread between the prices to sell and buy, which can lead to no trades in the presence of traders willing to trade for prices within this spread. In contrast, in the CDA the spread is defined by the orders in the order book and can be smaller. Still, the advantage of the LMSR concerning an increased number of trades has been observed in the field. Ken Kittlitz has noted that the "number of trades in a market using the market maker is at least an order of magnitude higher than in one not using it" (cited in Hanson , p. ). In contrast, prediction markets with the CDA may suffer from low liquidity, which leads to manual interventions of market owners (Antweiler ; Chen & Plott , p. ). Therefore, the first hypothesis is formulated: Hypothesis : The LMSR has a higher number of trades than the CDA.
The second criterion is accuracy. It is central to prediction markets as the accuracy level measures the quality of the prediction market result, but stating a hypothesis for or against one of the mechanisms is less intuitive. Because the LMSR was introduced after the CDA and specifically addresses some of its problems, it might be seen as favorable. The following aspect in particular could be seen as a distinct advantage of the LMSR. Trades are needed to incorporate the information of the agents and a higher number of trades is typically related to an increased accuracy. However, most properties of the mechanisms have two sides. Having a market maker (LMSR) might be beneficial if no trades are expected with very few traders. If trades do occur but the market ends early, the trade volume might be too low to achieve an accurate value when using the LMSR. The market maker requires a certain minimum volume to adjust the price. The CDA is able to change the price with the trade of a single stock. Empirical research does not give a clear direction in this regard because two parallel prediction markets to directly compare the CDA and the LMSR are rarely implemented. Currently, the LMSR is the standard market maker mechanism used at several companies including Inkling Markets, Microsoft and Yahoo (Chen & Pennock ). One may assume that the LMSR's accuracy is a driver of its success. It might come as a surprise that the experimental literature paints a different picture. According to Healy et al. ( ), the accuracy of the LMSR is much lower than the accuracy of the CDA in a simple environment with few traders. Their results are not consistent with the outcomes of other experiments, e.g., an experiment mentioned in a talk of Ledyard et al. ( ) referenced in the same paper (Healy et al. , p. ). Overall, the LMSR is supposed to be the slightly superior mechanism considering its positive reception due to the enhanced liquidity and the slight advantage for the LMSR in the experimental studies. This leads to our second hypothesis: Hypothesis : The LMSR is more accurate than the CDA.

We augment this perspective on accuracy and add two sub-hypotheses to address potential interaction effects for three reasons. First, the technical differences regarding this research question do not give a clear direction regarding the effect of the market mechanism on accuracy. Second, the literature does not suggest a clear direction either. Third, the accuracy level can be regarded as the most important criterion.
Hypothesis a addresses the presence of random strategies. We study this because some participants in a prediction market might not be trained stock traders. The LMSR is supposed to be superior in this case for several reasons. The LMSR restricts the action space more than the CDA and, thus, traders have fewer options. The LMSR also requires less sophisticated strategies, because the size of the action space is stable and the acceptance of asks and bids is more certain. Healy et al. ( , p. ) claim that confused traders could influence both the CDA and the LMSR, and do not determine a clear advantage for one of the mechanisms. We follow the theoretical considerations with our hypothesis a: Hypothesis a: The LMSR is more accurate in the presence of random strategies than the CDA.

Hypothesis b addresses "extreme information distributions". Extreme information distributions reflect a correct value at the border of the possible states. If , and are the possible states, information distributions with the correct value and are considered as "extreme". Extreme states can be very relevant in practical settings. The CDA is supposed to have an advantage over the LMSR, because it has a broader action space, which enables traders to achieve large price movements with low trade liquidity. The LMSR requires the traders to bet against the liquidity of the market maker first, which prevents them from immediately moving the price to an extreme value.

Hypothesis b: The CDA is more accurate for "extreme information distributions" than the LMSR.

The third criterion is the standard deviation of the final price. It can be an important measure for applications because it reflects the reliability of a prediction market. A high probability of a small deviation might be more acceptable than a low probability of a high deviation if the latter results in a disastrous decision. The LMSR can be expected to have a lower standard deviation because of the restricted action space for traders. The traders can only choose from three actions at any given time: they can buy from the market maker, sell to the market maker or do nothing. Furthermore, the prices of the market maker direct the trading and the liquidity prevents fast price changes. Therefore, extreme trades do not immediately result in extreme deviations from the correct value. Still, the empirical literature is contradictory in this regard. Healy et al. ( , p. ) report a higher variability in the distance between output and correct result for the LMSR. However, they do not report the exact standard deviation and their results are at odds with their theoretical considerations. Overall, we follow the theoretical considerations with our third hypothesis: Hypothesis : The LMSR has a lower standard deviation of the final price than the CDA.

Simulation Model
The purpose of the agent-based simulation model is to analyze the effect of the two mechanisms on the number of trades, the accuracy of results and their reliability. A simulation model is applied to analyze the effects of interest, because prediction markets are markets from a technical point of view and, as such, a complex system (see Tseng et al. ). The collective process of information aggregation is rather difficult to predict because the price is as much influenced by traders as they might be influenced by the price themselves. Furthermore, changing the sequence of otherwise identical trades might yield different outcomes. We use an experiment-based simulation to analyze the influence of different factors, including relationships that can hardly be controlled and disentangled in reality, such as the heterogeneous strategies of actors which have been observed in real prediction markets (Rothschild & Sethi ).
Using a simulation model to complement existing economic laboratory experiments has at least two advantages (for a more detailed comparison of simulation and laboratory experiments see Klingert & Meyer ( b)). First, actor strategies can be controlled and, therefore, intentionally manipulated. Second, simulation experiments can be executed more efficiently, which enables a much broader experimental design including up to factors and simulation runs for each factor combination. The simulation itself benefits from the strong link to laboratory experiments because outcomes of the default model are validated and strategies are chosen by classification based on experimental data. In this section, only an excerpt of the model description and model design considerations can be given due to space limitations. A more detailed documentation based on the ODD protocol (Grimm et al. ) can be found in Klingert ( ).
We have chosen the well-documented and influential laboratory experiment of Hanson et al. ( ) as the starting point. The authors gave us access to their experimental data, which allowed us to micro and macro validate the model. Consequently, the default setting of our simulation model strongly resembles their experiment. Twelve agents are initially presented with monetary units and stocks. Both stocks grant the right to receive a payoff of , or with equal probability at the end of the trading period. The agents know individually that one value can be excluded with certainty from the possible outcomes. If the true value is , half of the agents know it is not and half of the agents know it is not , for example. Therefore, an agent that can dismiss as the true value knows that the true value is either or and that the stock has an expected value of .
The simulation and the experiment differ only slightly in their procedure. One of the differences is that the simulation model is executed stepwise instead of the continuous flow of time in the experiment of Hanson et al. ( ). Every CDA-based step allows the agents to place a bid or ask limited to the natural numbers between [ , ]. Alternatively, they can accept an order from the order book. If an offer is accepted, the trade is executed and money and stocks are exchanged. The agent order is determined randomly to align the simulation with the laboratory experiment. After the simulation ends with completing step , it shows a comparable number of trades ( -trades instead of -) to the laboratory experiment which lasts minutes (Hanson et al. , p. ). Consistent with other market simulations (Gode & Sunder , p. ), the agents are limited to trade one stock per step and the unmatched offers are deleted after a trade to simplify the decision-making; the agents are allowed to place the same offer in the next step.
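The stepwise CDA procedure described above can be sketched as a one-unit order book. This is an illustrative reading rather than the implementation of the simulation model: the class and method names are hypothetical, accepting an order is modeled as submitting a crossing bid or ask, the trade is assumed to execute at the price of the resting order, and self-trades are not filtered out.

```python
from dataclasses import dataclass, field


@dataclass
class OrderBook:
    """One-unit continuous double auction book (hypothetical sketch)."""
    bids: list = field(default_factory=list)  # entries are (price, agent_id)
    asks: list = field(default_factory=list)  # entries are (price, agent_id)

    def submit(self, side, price, agent_id):
        """Place a one-unit order; return (buyer, seller, price) on a trade, else None."""
        if side == "bid":
            if self.asks:
                best_price, seller = min(self.asks)  # lowest resting ask
                if price >= best_price:
                    return self._execute(agent_id, seller, best_price)
            self.bids.append((price, agent_id))
        elif side == "ask":
            if self.bids:
                best_price, buyer = max(self.bids)  # highest resting bid
                if price <= best_price:
                    return self._execute(buyer, agent_id, best_price)
            self.asks.append((price, agent_id))
        else:
            raise ValueError("side must be 'bid' or 'ask'")
        return None

    def _execute(self, buyer, seller, price):
        # Unmatched offers are deleted after a trade, as described in the text.
        self.bids.clear()
        self.asks.clear()
        return (buyer, seller, price)
```

Clearing the book after each trade keeps the agents' decision problem small, at the cost of discarding orders that agents may simply place again in the next step.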

To compare market mechanisms, the simulation model goes beyond the laboratory experiment and is varied along the environment and the agents. The market institution is defined by its rules, i.e., the market mechanism. The CDA and the LMSR are used in the simulation experiments. The implementation of the LMSR demands a concrete b-value which determines the maximum loss of the market maker. To achieve a comparable result for the CDA and the LMSR, the b-value of the LMSR should be linked to the CDA setup. The maximum costs of the CDA equal the initial endowment of money and the value of the initial stocks per agent. Given a certain correct value in the CDA, these costs are always constant, and therefore the maximum costs equal the minimum costs independent of the market outcome. In the LMSR, the agents are endowed with monetary units as well. Contrary to the CDA, they are not holding any stocks at the beginning. This is unnecessary because each trade is executed with the market maker as a trading partner. Instead, the b-value is chosen in a way that the maximum loss of the market maker is comparable to the costs of a market based on a CDA. This ensures a fair comparison of both mechanisms.
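The role of the b-value can be made concrete with the textbook form of Hanson's LMSR for n mutually exclusive outcomes, where each share pays one unit if its outcome occurs. The following sketch is an assumption-laden illustration, not the implementation used in the simulation model, whose stock pays one of three values and may therefore be set up differently. The function names, including the hypothetical helper `b_for_budget`, are introduced here for illustration only.

```python
import math


def lmsr_cost(quantities, b):
    """Cost function C(q) = b * ln(sum_i exp(q_i / b))."""
    m = max(q / b for q in quantities)  # shift exponents for numerical stability
    return b * (m + math.log(sum(math.exp(q / b - m) for q in quantities)))


def lmsr_prices(quantities, b):
    """Instantaneous prices p_i = exp(q_i / b) / sum_j exp(q_j / b)."""
    m = max(q / b for q in quantities)
    weights = [math.exp(q / b - m) for q in quantities]
    total = sum(weights)
    return [w / total for w in weights]


def trade_cost(quantities, outcome, shares, b):
    """Amount a trader pays the market maker for buying `shares` of `outcome`."""
    after = list(quantities)
    after[outcome] += shares
    return lmsr_cost(after, b) - lmsr_cost(quantities, b)


def max_loss(n_outcomes, b):
    """Worst-case market maker loss when starting from uniform prices: b * ln(n)."""
    return b * math.log(n_outcomes)


def b_for_budget(budget, n_outcomes):
    """Hypothetical helper: choose b so that the worst-case loss equals `budget`."""
    return budget / math.log(n_outcomes)
```

Under these assumptions, linking the LMSR to the CDA setup amounts to passing the cost of the CDA market as `budget`; a larger b means deeper liquidity and slower price movement per traded share.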

The environment is mainly defined by the information distribution. The standard task is to predict a state out of the three possible states , and (as in Hanson et al. ). In each of the states, % of the agents know that one of the wrong values is not the true state and the other % know that the other wrong value is not the true state. The information distributed among all agents can be considered as complete and certain, as the exact value could be easily determined if the information of all agents was publicly available. Each individual agent has incomplete, but partly certain information because one of the states can be excluded with a certainty of %. In addition, other information distributions are chosen to test the robustness of the results.
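The information structure can be sketched as follows. The state values in the usage note are placeholders chosen for illustration and are not the values of the experiment; the function names are hypothetical, and the even split of excluded states across agents is an assumption based on the description above.

```python
import random


def assign_information(states, true_state, n_agents, rng):
    """Give each agent one wrong state to rule out, split evenly across agents."""
    wrong = [s for s in states if s != true_state]
    excluded = [wrong[i % len(wrong)] for i in range(n_agents)]
    rng.shuffle(excluded)  # which agent holds which signal is random
    return excluded


def expected_value(states, excluded_state):
    """Expected payoff for an agent that rules out one equally likely state."""
    remaining = [s for s in states if s != excluded_state]
    return sum(remaining) / len(remaining)


def pooled_information(states, excluded_states):
    """States that survive once all agents' exclusions are combined."""
    ruled_out = set(excluded_states)
    return [s for s in states if s not in ruled_out]
```

With placeholder states `(10, 20, 30)` and true state `20`, an agent excluding `10` expects a payoff of `25.0`, while pooling all exclusions leaves only `20`. This mirrors the point that the market as a whole holds complete information although each agent does not.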
Finally, the agents trade based on simple rules. The "family" of zero intelligence (ZI) traders is used because these strategies fulfill four criteria. First, the zero intelligence agents have been used in a large number of market simulations (Chen ), which also allows for relating our results to previous research. Second, the simplicity of the zero intelligence traders allows for focusing on the influence of the market mechanism in the model. Third, the zero intelligence agents have already been validated at the macro level by prior research that has recognized similar efficiency levels for markets with ZI traders and markets with human traders (Gode & Sunder , p. ). Therefore, zero intelligence agents seem to be appropriate to direct subsequent experiments. Fourth, strategies that can be validated at the micro level are desirable. The zero intelligence strategies have not been micro validated statistically in the original papers (e.g., Gode & Sunder ). However, the simplicity of the zero intelligence strategies allows for statistical micro validation (see Appendix).

The verification (cf. Gilbert & Troitzsch , p. ) of the model relies on three basic procedures: the model is based on (semi-)formalized models, it was tested several times, and the code was inspected by step-by-step debugging.

The validity of our model is ensured in three ways with a particular focus on the strategies of the agents.) provide the starting point for selecting the agents' strategies. In this work, the micro validation does not aim at showing that human traders behave exactly like ZI traders. Instead, the ZI strategies are only tested for a best fit at the micro level. While comparisons between the ZI agents and macro data from laboratory experiments exist (e.g., in Gode & Sunder ), a micro validation has not been performed yet.
In general, strategies can be validated at the micro level with a variety of methods like classification, calibration or clustering. Classification starts with a pre-selection of certain agent strategies on a theoretical basis and assigns the actors to the strategies (see Kantardzic , p. ). Calibration goes beyond classification by adapting the strategies, e.g., by choosing certain parameter values, to better represent the actions of the actors (see Boero & Squazzoni ). Clustering does not start with classes, but tries to identify groups in the behavior of the actors instead and defines strategies with the best fit for these groups (see Kantardzic , p. ).
The purpose of the simulation determined our choice of classification as validation method. We aim neither at prediction nor at pure replication, but at explaining and going beyond the setup of the laboratory experiment. Furthermore, the mechanism rather than the agent behavior represents the research focus. Classification does not adapt strategies or develop new strategies and, thus, offers at least three main advantages over the other two validation methods. First, the attributes of classification potentially reduce the problem of overfitting (Kantardzic , p. ) present in calibration techniques. Second, it allows for selecting from a pool of strategies that were applied and tested by prior research. Third, keeping the strategies simple from a theoretical perspective allows for focusing on the mechanisms. Overall, these advantages not only ensure that the strategies are suitable for the experimental setup they are validated for, but also enable simulation experiments to go beyond the laboratory experiment.
Overall, as a result of our micro validation, homogeneous and heterogeneous models are selected (see Appendix for details). Two homogeneous models, i.e., models where all traders follow the same strategy, are suggested based on the micro validation. First, fundamental trading is represented by ZI.EV traders. The ZI.EV traders (derived from ZIC, see Gode & Sunder ) and N-ZI (Duffy & Ünver ) sell above and buy below the expected value (the exact price is chosen randomly with a uniform distribution like in Gode & Sunder ( )). Second, a mix between fundamental and trend trading is implemented via N-ZI traders (Duffy & Ünver ). These traders weight the expected value and the last price to select when to buy and to sell.
The list is extended by a learning model. These zero intelligence plus traders (ZIP; Cliff & Bruten ) adapt a profit margin based on the success of the last trades and only have a small price span in which they trade. This model has not been micro validated due to the complexity which results from the learning model's memory and its ability to adapt to its environment. The ZIP agents are used because they are an instance of the zero intelligence family as well as a learning strategy. For micro validation, zero intelligence unconstrained traders (ZIU) are used as a benchmark model. ZIU traders (Gode & Sunder ) randomly decide to buy or sell and at what price. This is the trading strategy from the original paper by Gode & Sunder ( ).
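The non-learning strategies can be sketched in a few lines. The sketch is a simplified reading of the verbal descriptions above, not the validated strategies themselves: the price bounds `low` and `high`, the linear `weight` used for N-ZI and all function names are hypothetical, and the side (bid or ask) is assumed to be drawn with equal probability. ZIP is omitted because its learning rule is not specified in this section.

```python
import random


def ziu_order(low, high, rng):
    """ZIU: random side and a uniformly random integer price in [low, high]."""
    side = rng.choice(["bid", "ask"])
    return side, rng.randint(low, high)


def zi_ev_order(expected_value, low, high, rng):
    """ZI.EV: bid at or below, ask at or above the expected value."""
    reference = min(max(round(expected_value), low), high)  # clamp to price range
    side = rng.choice(["bid", "ask"])
    if side == "bid":
        return side, rng.randint(low, reference)
    return side, rng.randint(reference, high)


def n_zi_order(expected_value, last_price, weight, low, high, rng):
    """N-ZI (assumed form): ZI.EV around a blend of expected value and last price."""
    reference = weight * expected_value + (1 - weight) * last_price
    return zi_ev_order(reference, low, high, rng)
```

In this reading, `weight = 1` reduces N-ZI to pure fundamental trading, while smaller weights let the last price pull the reference point toward the recent trend.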
Finally, heterogeneous models containing agents with different strategies are added to the list in Table . While both the default and the alternative model assume one strategy for all agents, heterogeneous models combine two pure strategies. Furthermore, heterogeneous models consider information distribution and contain at least two agents for each strategy. Mix considers the equal division of the actors and is restricted by a balanced information distribution to outweigh the ZI.EV strategy. It comprises the two strategies derived from the validation without the ZIU strategy. Therefore, the heterogeneous model Mix can be seen as a "lower boundary" of random traders. The heterogeneous model Mix contains the two strategies derived from the validation as well as the ZIU strategy, given that the ZIU strategy could not be rejected for actors in the combinatorial case. Therefore, it represents the idea of an "upper border" of random traders.
Note: The learning model, zero intelligence plus (ZIP), has not been micro validated because of its higher complexity.

Table : Strategy combinations used in the simulation model (columns: No., Model name).
The model is also macro validated (see Klingert ), even though market outcomes with zero intelligence agents have already been compared with macro data in the past (Gode & Sunder ). Comparing the stylized model results from different strategy combinations with the empirical results of Hanson et al. ( ) leads to similar patterns or stylized facts at the macro level (see also Grimm et al. ; Heine et al. ). For example, a similar average accuracy for ZI.EV, N-ZI traders and the Mix model can be observed. Mix with random traders and the learning model ZIP each yield a lower average accuracy. However, the Mix model with random traders is superior in reproducing the distribution of the accuracy level and can be seen as an upper boundary of randomness. As the learning model, ZIP represents a different group of strategies. Therefore, all strategy combinations (besides the model where all traders follow the fully random ZIU strategy) are included in the model.

Experimental Design, Results and Robustness Tests
In this section, the experimental design and results as well as tests for robustness are described. Using an experimental design for multi-agent models has several advantages (see Lorscheid et al. for more detailed explanations and further literature). First, it provides a very economical and effective way to communicate how we analyzed the behavior of our model and the results of these experiments. Looking at the factors, one can see which model parameters are varied during our simulation experiments and with which values. Table shows the factors (see second column "factors") that are varied in a 3^k-experimental design, i.e. an experimental design with three factor levels per ratio-scaled variable (see the columns "low", "default" and "high"). Second, it avoids the problem of sensitivity analyses, which vary only one parameter at a time, and allows for the systematic detection of interaction effects. The 3^k-design is chosen over a 2^k-experimental design because it allows for choosing the low and high values linearly around the default value and, thus, systematizes the analysis of results. Furthermore, it allows for the identification of possible non-linear effects (Law ; Lorscheid et al. ).

... ), the default factor levels are selected accordingly. The simulation ends after steps to represent the minutes of the experiment. Like in the experiment, agents receive an initial endowment of stocks and monetary units. The CDA represents the default mechanism both in the simulation and in the experiment. The default strategy is the ZI.EV strategy as it best represents the actors in the micro validation when comparing pure strategies. The information distribution is aligned with Hanson et al. ( ) as well. In assigning the same probability to the possible values , and , % ( %) of the agents know that the true value is not ( ).

The analysis in the following section is carried out from different perspectives. After introducing the simulation results by means of an exemplary simulation, the results are documented along the three evaluation criteria (cf. Kleijnen ; Lorscheid et al. ). First, each hypothesis is tested by comparing the averages of runs that are based on the default setting. Second, further tests are performed, including the comparison of the averages over all factor combinations and for each of the factor combinations in the experimental design. Third, an overview of the effect sizes of main and -way interaction effects is given. The presentation of the ANOVA results focuses on effect sizes rather than on statistical significance, as for most factors the latter is much less important in simulation experiments (Troitzsch ). This is driven by very small p-values for the majority of factors that result from the experimental design, which involves simulation runs per factor combination (Secchi & Seri ). Partial η² is chosen as the effect size measure to balance the influence of the factors and the size of the error. This measure thus documents the effect of a certain treatment. If a treatment has a higher effect size compared to other treatments, its influence on the dependent variable can be considered more important. Finally, the results are tested for robustness from two further perspectives: environments assuming extreme factor levels and environments assuming different information distributions.
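For reference, the effect size measure named above follows the standard textbook definition of partial η² for an ANOVA effect; the notation below is the usual one and is not taken from the paper:

```latex
\eta_p^2 \;=\; \frac{SS_{\text{effect}}}{SS_{\text{effect}} + SS_{\text{error}}}
```

Here $SS_{\text{effect}}$ is the sum of squares attributed to the main or interaction effect under consideration and $SS_{\text{error}}$ is the residual sum of squares. The measure lies between 0 and 1 and relates each effect only to the unexplained variance, which is why effects can be compared even when many factors are varied simultaneously.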

Exemplary simulation runs
Figure shows one exemplary simulation run for each mechanism. It depicts the prices for both mechanisms for all steps within the default setup.

Number of trades
Results displayed in Table offer strong support for hypothesis . The first column gives the averages of runs per mechanism in the default setting. The LMSR has a clear advantage and achieves an average of more than three times the trades compared to the CDA. This result in the default setting is highly statistically significant. The second column shows the averages of , runs per mechanism over all possible combinations. Again, the advantage of the LMSR is obvious. The CDA achieves more trades than in the default setting on average, but the LMSR still results in nearly twice the number of trades. The third column compares the averages of runs per mechanism in each of the , factorial combinations. If the CDA has more trades than the LMSR for a certain setting, it is counted as a winning setting for the CDA. Settings with equal averages are counted against the hypothesis. The LMSR is superior to the CDA once more. In , factorial combinations ( .%) the LMSR has more trades than the CDA.

Table : Means and number of winning settings with independent variable "number of trades".
Note: Welch two-sided t-test for means and exact binomial test assuming a probability of % for winning settings with independent variable number of trades (*** p < .001).
Interestingly, the LMSR does not perform better in every setting. The majority of the cases against the hypothesis, which signal an advantage of the CDA, are driven by the ZIP strategy ( of factorial combinations are ZIP settings). Agents applying the ZIP strategy learn a price range. This range might be too small to move the given prices of the LMSR, but it can be sufficient to exchange stocks at a constant price over a longer time period. Thus, the CDA has an advantage in these cases. A reason for the higher number of trades in the LMSR in most of the other cases could be that the LMSR provides steady liquidity and, therefore, an execution of a trade is always possible. Assuming there is only one ask and one bid, the CDA results in a maximum of one trade while the maximum in the LMSR is two.
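The steady liquidity of the LMSR can be made concrete with a short sketch of the textbook logarithmic market scoring rule (Hanson's cost function C(q) = b · ln Σ exp(q_i / b)). This is a generic formulation, not the simulation model of the paper: the liquidity parameter `b`, the three-outcome setup in the usage note and all function names are assumptions for illustration.

```python
import math


def lmsr_cost(quantities, b):
    """LMSR cost function C(q) = b * ln(sum_i exp(q_i / b)), computed stably."""
    m = max(quantities)
    return m + b * math.log(sum(math.exp((q - m) / b) for q in quantities))


def lmsr_prices(quantities, b):
    """Instantaneous prices p_i = exp(q_i / b) / sum_j exp(q_j / b); they sum to 1."""
    m = max(quantities)
    weights = [math.exp((q - m) / b) for q in quantities]
    total = sum(weights)
    return [w / total for w in weights]


def lmsr_trade_cost(quantities, outcome, shares, b):
    """Cost charged by the market maker for buying (shares > 0) or selling (shares < 0).

    Returns the cost and the new vector of outstanding shares. The market maker
    always quotes a price, so no matching counter-offer is needed.
    """
    after = list(quantities)
    after[outcome] += shares
    return lmsr_cost(after, b) - lmsr_cost(quantities, b), after
```

For instance, starting from `[0.0, 0.0, 0.0]` with a hypothetical `b = 100.0`, a buyer and a seller can each trade against the market maker independently, whereas in a CDA the same two offers can produce at most one trade between them. Because the price depends on all outstanding shares, a single trade moves it only gradually.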

Overall, the mechanism-induced effects on the number of trades are strong and the LMSR is superior in achieving a high number of trades.
To understand the importance of the influence of the market mechanism compared to other factors varied in the experiment, an analysis of variance (ANOVA) is conducted over all settings. The main effect "market mechanism" emerges as the third largest effect (. ) in an ANOVA including all main and interaction effects. Only the main effects "agent strategies" (. ) and "number of agents" (. ) have a higher effect size. This shows that changing the market mechanism is a decision that considerably affects the number of trades. The agent strategies and the number of agents result in an even higher effect size due to the increased number of trades in the presence of certain traders, for example, random traders. However, the higher number of trades triggered by random traders might not necessarily improve accuracy. Finally, the R² is relatively high at . and indicates that the influence of random effects is rather low and that it is easy to influence the number of trades by intentionally adapting the identified factors, for example, by choosing the LMSR instead of the CDA.

Accuracy level
In this subsection, the effects on the accuracy level (hypothesis incl. sub-hypotheses) are analyzed. Hypothesis has to be rejected because the expected advantage of the LMSR is not consistent (see Table ). Like in Table , the first column shows the averages of runs per mechanism in the default setting. The default setting replicated from the laboratory experiment results in an advantage for the LMSR with a mean accuracy level of . vs. . for the CDA and, thus, supports the hypothesis. The second column gives the averages of , runs per mechanism over all possible combinations and suggests a different outcome. The mean over all settings documents a slight advantage for the CDA. The high number of simulation runs makes this result statistically significant too, but the difference of less than % can be considered small. The third column compares the averages of runs per mechanism in each of the , factorial combinations. This comparison further underlines the lack of a clear direction because both mechanisms appear superior in about the same number of settings. An exact binomial test shows that the difference is not statistically significant assuming each of the mechanisms to be superior with a probability of %. The results indicate the need for a more detailed analysis.

Table : Mean and winning settings with fixed factor levels and independent variable "accuracy level".
Note: Welch two-sided t-test for means and exact binomial test assuming a probability of % for winning settings (* p < .05, ** p < .01, *** p < .001).
To guide this analysis we again conducted an ANOVA, which helped us to identify important interaction effects (Lorscheid et al. ). The "market mechanism" is part of some of the most important interaction effects denoted in Table . Therefore, the interaction effects of the mechanism on the accuracy level have to be analyzed too in order to gain a comprehensive understanding. Interestingly, the main effect is relatively small and the influence of the market mechanism is only observed in relation with other effects. The interaction effect with the information distribution is much higher, for example. Other effects result from an interaction with the initial endowment of money and the simulation step. However, the information distribution is the most important effect and more than ten times larger than the highest effect size including the "market mechanism". The main effect "agent strategy" represents the second most important effect after the information distribution. While the N-ZI traders achieve the best results with an average accuracy level of . over all settings, the prediction market with the Mix model is significantly worse with an average accuracy level of . . ZI.EV traders (average accuracy level of .) and the Mix model ( .) are close to the result of the N-ZI traders. ZIP traders range in between ( .). This also documents the relevance of a non-arbitrary selection of strategies, for example, by comparing them with experimental data.
Table : The main (diagonal) and interaction effects with independent variable "accuracy level".

With the independent variable "accuracy level", the R² is lower than the R² regarding the number of trades, indicating a higher impact of randomness on the accuracy level. Therefore, influencing accuracy seems to be less straightforward than influencing the number of trades. The information distribution is not only an important but also a less complex decisive factor. One could argue that the information distribution of Hanson et al. ( ) is extreme and that other information distributions might impact the result less. Nevertheless, the main effect caused by the information distribution can be seen as dominating the other effects. Therefore, the hypothesis will be further tested for other information distributions in the robustness subsection.

The same three comparisons are made again to analyze the sub-hypotheses and, thereby, some of the interaction effects in more detail. The mean accuracy level in the default setting is compared between the two mechanisms. Both the mean accuracy level over all settings and the mean accuracy level for each setting of the full factorial design are assessed (see Table ). One of the values is fixed in ...

Table : Mean and winning settings with fixed factor levels and independent variable "accuracy level".
Hypothesis a, which states that the LMSR is superior in the presence of fully random traders (ZIU), is supported. Table shows that the mean accuracy level both in the default and over all settings is significantly better for the LMSR assuming a strategy distribution Mix (with ZI.EV, N-ZI and ZIU traders). This result holds for .% of the settings. The stability of the LMSR could explain the improved accuracy level because ZIU traders potentially lead to bigger price changes in the CDA. The LMSR does not prevent wrong trades, but lowers the maximum influence of each trade on the price.
Hypothesis b, which claims that the CDA is superior for extreme information distributions, i.e. information distributions at the border of the possible states, is also supported. As documented in Table , the CDA is superior to the LMSR for a correct value of and respectively. Over all settings, the CDA achieves a better result in more than % of the cases. The importance of the initial liquidity in the LMSR can be seen as a reason for this. The money is always owned by the traders in the CDA while both the market maker and the traders own money in the LMSR. For example, the average amount of money held by the traders is always equal to the initial endowment of with a correct value of after steps in the default model based on the CDA. This money is not distributed equally between the traders; however, most traders are still able to participate in trading. The traders hold on average . monetary units (information: correct value is not ) and . respectively (information: correct value is not ) after steps. Therefore, both groups have on average more than monetary units per trader and are theoretically able to push the price up to . The LMSR yields different outcomes. The market maker is holding part of the money while the traders hold additional stocks. Therefore, the average amount of money is reduced to . (information: correct value is not ) and . respectively (information: correct value is not ). Having an average of . monetary units already prevents the average trader in the second group from increasing the price to . Taking into account that the minimum final endowment of monetary units in the LMSR mostly prevents the agent from trading, this shows that the CDA has the advantage of keeping the money in the market under extreme information distributions. With an initial endowment of , the disadvantage of the LMSR is reduced. The CDA only slightly profits from more liquidity (CV : . and CV : . ), but the average accuracy level with the LMSR is clearly improved (CV : . and CV : . ) compared to the lower initial endowment as presented in Table . The better accuracy can be explained by an increased average money endowment at the end of the trading. While the traders in the CDA have an average amount of monetary units, it is again lower in the LMSR ( .). However, even the traders who can exclude the value still have an average endowment of ., which is sufficient to push the price up to . Therefore, the disadvantage of the LMSR compared to the CDA with extreme information distributions decreases with more money but still persists.

Standard deviation of the price
The effects of the market mechanism on the standard deviation of the price (hypothesis ) are analyzed next. There is strong support for hypothesis . The first column of Table reports the standard deviation of runs per mechanism in the default setting. It documents that the standard deviation of the price in the default setting is lower in the LMSR. The second column shows the standard deviation of , runs per mechanism over all possible combinations and presents data favoring the LMSR. The third column compares the averages of runs per mechanism in each of the , factorial combinations. The hypothesis holds for , out of , ( .%) cases, which is the highest value of the three comparisons. All cases showing an advantage for the CDA belong to the ZIP cases. Therefore, the learning capabilities of the ZIP traders seem to lower the standard deviation within the CDA. However, .% of the ZIP cases still have a lower standard deviation with the LMSR and, thus, the LMSR is superior with ZIP traders too. This is caused by the LMSR's ability to reduce the size of the price changes (maximum only instead of ) by determining the price based on all past trades instead of only the last one.

Table : Mean and winning settings with independent variable "standard deviation of price".
Note: Exact binomial test assuming a probability of % for winning settings (*** p < .001).

Robustness tests
Two further robustness tests seem appropriate to support our results. First, we execute a robustness test with different information distributions, i.e. less extreme distributions than in the original laboratory experiment. The main effect caused by the information distribution was the biggest effect in the prior analysis. Therefore, a robustness test is required to investigate the reliability of the presented simulation results. Second, extreme factor levels are considered. The factor levels have been selected linearly and, consequently, extreme factor levels at the border of the possibilities have not been tested. Because the results could differ in this context, a robustness test is performed.
Four alternative information distributions are chosen for the first robustness test. The first scenario reflects a higher diversity of information among the traders and is adapted from an information distribution used by Smith ( ). Each agent has a unique expected value. The expected values of the six agents are as follows: agent : , agent : , agent : , agent : , agent : , agent : . The correct value is assumed to be . The second information distribution mirrors the first one with an expected value of . The third (fourth) is adapted from Oprea et al. ( ) and provides % of the agents with an expected value of ( ) and the other % of the agents with an expected value of ( ). The correct value is assumed to be ( ).

The second robustness test targets extreme factor levels as an alternative to the linearly chosen ratio-scaled factor levels. Here, very low factor levels are selected, for example, concerning the number of traders ( instead of agents) and the trading duration ( instead of steps).
As Table shows, the robustness tests provide further support for the findings presented in previous subsections: Hypotheses and are supported again as both robustness tests point in the same direction. The results regarding hypothesis are ambiguous. With alternative information distributions, the LMSR leads to an improved accuracy, while with extreme factor level values the result is the opposite. Consistent with the results over all settings, the extreme information distributions decrease accuracy with the LMSR. Therefore, as in the main analysis, the results are again not robust. The results regarding the interaction effect with the Mix strategy combination cannot be tested with two agents, as it is impossible to assign the three different strategies of the Mix model in the proportion needed. However, this interaction effect finds further support with different information distributions because the direction is consistent with the prior tests.

Discussion and Conclusion
This paper analyzes the effect of two important market mechanisms on several outcome variables of a prediction market. These are the number of trades, the accuracy level and the standard deviation.
We find clear support for the first and third hypotheses that the LMSR results in a higher number of trades and higher reliability. These results are relevant because trades contain the information of the actors on a prediction market and the reliability is important for the use of prediction market results. Before the simulation analysis, it was unclear whether the LMSR achieves more trades from the offerings of a market maker or whether the CDA is advantageous. The simulation shows that the LMSR leads to a higher number of trades. This observation stems from the fact that the existence of a market maker typically increases the number of trades. For example, two offers can result in a maximum of one trade in the CDA, while they allow for two trades in the LMSR. Consequently, the LMSR only shows fewer trades under exceptional circumstances. It could happen if the extent of the price movements permitted by the LMSR was too big to cover all possible offers, for example. This might be the case for ZIP learning agents that negotiate a very small price corridor. However, these cases do not outweigh the effect of the market maker. The fact that the third hypothesis is supported helps to resolve the tension between the observations of Healy et al. ( , p. ) and contradicting theoretical considerations. The lower volatility observed in the simulation is caused by the fact that all past trades inform the price calculation in the LMSR. The price changes in the LMSR are less volatile and, therefore, less probable due to noise trading. The maximum price change in the LMSR standard setting is compared to in the CDA. Consequently, the LMSR is able to address problems that are highly relevant in the corporate context. It lowers the liquidity problem and increases the reliability of a prediction market.

Interestingly, the accuracy level, the most important evaluation criterion, is influenced in a much less straightforward way. Existing laboratory experiments have also been unclear in this regard (e.g., Healy et al. ; Ledyard et al. ). In line with empirical evidence, the simulation does not give a clear direction regarding the main effect at first glance. However, a more detailed analysis considering interaction effects with several factors shows that these are actually more important than the main effect of the choice of the market mechanism. In addition, these interaction effects move the accuracy level in different directions. An important one concerns the strategy of the agent. While adding random traders to a prediction market can generally decrease the accuracy, it decreases less when using the LMSR. The information distribution poses another important interaction effect. The LMSR's advantages for the default distribution and a correct value in the middle of the possible values are linked to problems with the extreme values. Liquidity issues of traders represent one reason preventing traders from further participation. Consequently, if extreme information distributions are not expected, the LMSR is superior in most of the cases.

Overall, the mechanism choice clearly matters when drawing on the "wisdom of the crowds" (Surowiecki ) in a prediction market, and tradeoffs have to be considered when setting up such a market. There is not "the best market mechanism" for all analyzed settings, but each mechanism has shown specific advantages in a selection of different settings. Based on our simulations, the LMSR seems to be advantageous in many cases and limiting the traders' options in the LMSR appears to enhance the results. The advantage of the CDA's flexibility remains in the presence of more sophisticated traders and extreme information distributions, where the prediction market owner has the option to select the CDA to offer more freedom to the traders at the risk of "no trades". In most cases, especially in the presence of traders not trained in market trading, which is often the case in corporate prediction markets, this full flexibility might turn out to be a disadvantage compared to the steady liquidity and cumulative price building process of the LMSR. Excluding prediction markets with well-trained traders, e.g., in the financial industry, the LMSR appears to be an appropriate choice for most corporate prediction markets.

Our results extend prior research in several ways. Our simulation builds on the laboratory experiment of Hanson et al. ( ), but we analyze different mechanisms instead of manipulation. Existing laboratory experiments (Healy et al. ; Ledyard et al. ) consider only a relatively small selection of factors. They focus on the information distribution and the mechanism, for example. This work assesses a broader number of factors and tests the results for robustness. Furthermore, it is not restricted to a small number of agents ( agents) and can control their strategies. Existing findings regarding the main effect of the market mechanism are partly contradictory (Healy et al. ; Ledyard et al. ). Our analysis shows that the main effect of the market mechanism is rather low compared to the interaction effects, which might explain the different results of prior studies. Slamka et al. ( ) have analyzed different market maker mechanisms by simulation. They conclude that the LMSR is more accurate but less robust than other market maker mechanisms. This paper compares it with the CDA and shows that the CDA is even less robust and in some settings more accurate than the LMSR. So, the LMSR balances the advantages of the CDA and other market maker mechanisms, for example, the one of the Hollywood Stock Exchange covered in Slamka et al. ( ), and could be seen as a compromise between them.
However, open questions remain which might guide further laboratory experiments. In particular, the influence of interaction effects on the quality of prediction market results could be further investigated due to its high relevance. Existing experimental research is contradictory (Healy et al. ; Ledyard et al. ). The simulation comes to the result that the main effect of the market mechanism does not have a consistent effect on the outcome of prediction markets. Instead, the interaction effects are more important. This leads to two possible directions in subsequent laboratory research. First, the simulation has shown that random traders in the CDA can be a major influence, which makes appropriate training for participants especially valuable. However, a simulation cannot analyze the exact influence of appropriate training on the behavior of the actors. For example, training the different strategies might lead to participants selecting the most promising strategy. However, they might also try to adapt the strategies with potentially negative effects. It would be interesting to investigate this possible effect in future laboratory experiments and incorporate findings in the presented simulation model. Second, extreme information distributions could be tested with the LMSR. In this case, more than one outcome is possible (Klingert & Meyer b, p. ). If the actors constantly followed their trading strategy like the agents in the simulation, the results might support the findings of the simulation. Then prediction market owners should be cautious about using the LMSR when extreme outcomes are expected or the available information often changes significantly, for example, at the beginning of a financial crisis. If the actors adapted their strategy to jointly move the price upwards, the results might differ and the disadvantage of the LMSR would be lower than observed in this study.
The LMSR offers advantages over the CDA for practical applications within corporations similar to the ones described for HP (Chen & Plott ). For example, HP actually involved five subjects from HP Labs (Chen & Plott , p. ) to overcome the liquidity issue of small corporate internal markets before the invention of the LMSR. An automated LMSR would not have required such a time-consuming commitment. We outlined that participants without training in market trading might pose an issue. That is why HP considered individual instruction sessions lasting between and minutes (Chen & Plott , p. ). Apart from that, new participants might act randomly until they have learned and become accustomed to the market. Our simulation showed that the LMSR is less affected by isolated random traders. While the CDA has shown more flexibility regarding extreme outcomes, this seems a minor issue in the case of HP because prediction markets can be well designed and calibrated with the last actual value for the short-term forecast of months and quarters. Only very few exceptions, like the first period of a major financial crisis, exist that would see the CDA at an advantage.

Besides prediction markets, this work also contributes to simulation research in general by providing a concrete example of using experimental data to validate simulation models on the micro level. Validation is a major problem of simulation and simulation models are often at most partially validated (Heath et al. ). When it comes to statistical validation, a complete validation is even rarer (Heath et al. ). In both field experiments and laboratory experiments, we cannot investigate the minds of the participants. However, the information known in laboratory experiments, which is given by the experimenter, makes it easier to draw conclusions about which strategies best fit the observed actions. Using this advantage of a combination with laboratory experiments, the described simulation model is statistically validated on the micro level and tested on the macro level. Thus, it also differs from prior prediction market simulations (e.g., Oprea et ... (Hanson et al. ) as well as analyzing prediction markets with only a short duration represented by only few simulation steps. Another area for further research arises from the fact that the mechanisms are compared under the assumption of constant agent strategies. Choosing the best mechanism should also be based on how easy it is to comprehend because a lack of understanding by human traders might decrease the quality of the strategies. Market design research highlights simplicity as a major lesson learned, which is also recognized by Roth ( , p. ). The LMSR has fewer degrees of freedom in how to trade because only the market maker is allowed to place offers, and is, thus, less complex than the CDA. Simplifying the options and strategies could lead to a simplified strategy choice. This effect might ultimately result in better strategies, which should be explored further in subsequent laboratory and simulation experiments.
Strategies that are independent of the state of the simulation can be described as random. The literature has introduced zero intelligence unconstrained traders (Gode & Sunder , p. ), which sell and buy at any possible value determined by a uniform distribution. This strategy defines the validation baseline because it includes every action and is therefore minimally restrictive. In addition, random trading might pose a valid strategy for laboratory experiments and might even be meaningful in some cases (Fama , p. ). However, random trading seems to be the most basic strategy of the ZI strategy family. It is also supposed to be less successful in most environments compared to other zero intelligence strategies.
Strategies using the state of the agent are categorized as fundamental trading in the literature (e.g., Alfarano et al. , p. ; Farmer & Joshi , p. ). In our simulation, a simple strategy is an adaptation of ZIC traders, i.e. zero intelligence expected value traders (ZI.EV) selling above and buying below the expected value. The adaptation of ZIC is necessary because the true ZIC traders can only either buy or sell (Gode & Sunder , p. ). The expected value is calculated as the sum of all possible payoffs per state weighted with the probability known to the actors.
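The verbal definition of the expected value can be written compactly as follows; the symbols $S_i$, $v_s$ and $\pi_{i,s}$ are notation introduced here for illustration and are not taken from the paper:

```latex
EV_i \;=\; \sum_{s \in S_i} \pi_{i,s}\, v_s , \qquad \sum_{s \in S_i} \pi_{i,s} = 1
```

Here $S_i$ is the set of states that actor $i$ still considers possible given his or her private information, $v_s$ is the payoff of the stock in state $s$, and $\pi_{i,s}$ is the probability actor $i$ assigns to state $s$. A ZI.EV trader then buys below and sells above $EV_i$.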
Strategies using the state of the market can be further broken down into (a) strategies trying to extrapolate past states and (b) strategies basing predictions on previous prices. (a) The former is described as trend trading (e.g., Farmer & Joshi , p. ) or herd trading (e.g., Alfarano et al. , p. ). The simplest strategy is a zero intelligence trend (ZI.trend) strategy buying below and selling above the trend price. It arrives at the trend price by extrapolating the last two prices and results in a predicted trend value that is the last price plus the difference between the last two prices.
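The extrapolation described for the ZI.trend strategy can be stated as a one-line formula; $p_t$ denotes the last price and $p_{t-1}$ the price before it, which is notation assumed here rather than taken from the paper:

```latex
\hat{p}^{\,\text{trend}}_{t+1} \;=\; p_t + (p_t - p_{t-1}) \;=\; 2\,p_t - p_{t-1}
```

A ZI.trend trader buys below and sells above $\hat{p}^{\,\text{trend}}_{t+1}$. If the last two prices are equal, the trend price coincides with the last price.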
(b) Finally, a strategy based on the market price is derived. Examples in the literature include trading based on the last price(s). Duffy & Ünver ( ) develop a strategy that contains such a component (besides a fundamental trading component). The simplest strategy would be a zero intelligence last value (ZI.last) strategy, which buys below and sells above the last price. This strategy is also consistent with random walk theory (Fama , p. ), according to which the best prediction of the next price is the last price. Therefore, an actor might consider trading on the last price as a strategy.
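The two market-state strategies, ZI.trend and ZI.last, differ only in the reference price they compare an offer against. The following Python sketch shows both reference prices and a shared compliance check. The function names and the example prices are hypothetical, and the strict inequalities are an assumption.

```python
def trend_value(prices):
    """Last price plus the difference between the last two prices."""
    if len(prices) < 2:
        raise ValueError("at least two past prices are needed for a trend")
    return prices[-1] + (prices[-1] - prices[-2])


def last_value(prices):
    """Reference price of ZI.last: the most recent price."""
    if not prices:
        raise ValueError("at least one past price is needed")
    return prices[-1]


def compliant(side, price, reference):
    """Buy below and sell above the reference price (strictness is assumed)."""
    if side == "buy":
        return price < reference
    if side == "sell":
        return price > reference
    raise ValueError("side must be 'buy' or 'sell'")


history = [50, 60]  # hypothetical price history
print(trend_value(history), last_value(history))
print(compliant("buy", 65, trend_value(history)), compliant("buy", 65, last_value(history)))
```

In this example a purchase at 65 conforms to ZI.trend, whose reference price is 70, but not to ZI.last, whose reference price is 60. The same action can therefore be compliant with one strategy and non-compliant with another.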
Next, the actors are assigned to the predefined classes in three steps. First, every action of each agent can be classified as compliant or not compliant with a certain strategy. For instance, if a trader is selling for a price of monetary units and the actor's expected value is , the action is not compliant with the ZI.EV strategy. However, it might be compliant with other strategies. Unlike other empirical data, laboratory data allow for this classification because the information of the actors is controlled by the experimenter. Second, the actor is classified into the strategy with which most of the actor's actions conform. Third, the classification is tested against the null hypothesis. If the null hypothesis can be rejected, the actor remains in this class. Otherwise, the actor is re-classified into the class linked to the validation null hypothesis.
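The three classification steps can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the paper: the text does not name the statistical test, so a one-sided exact binomial test is swapped in here, and the baseline compliance rate `p0`, the significance level `alpha` and the example data are hypothetical.

```python
from math import comb


def binomial_tail(k, n, p0):
    """P(X >= k) for X ~ Binomial(n, p0)."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))


def classify_actor(compliance, p0, alpha):
    """Assign an actor to a strategy class.

    compliance: dict mapping strategy name -> list of booleans, one per action
                (step 1: each action is marked compliant or not compliant).
    p0:         assumed compliance rate of a random (ZIU) trader with a rule.
    alpha:      assumed significance level.
    """
    # Step 2: pick the strategy with which most actions conform.
    best = max(compliance, key=lambda s: sum(compliance[s]))
    flags = compliance[best]
    # Step 3: test against the random-trading null hypothesis.
    p_value = binomial_tail(sum(flags), len(flags), p0)
    return best if p_value < alpha else "ZIU"


# Hypothetical actor with ten observed actions.
actor = {
    "ZI.EV": [True] * 9 + [False],
    "ZI.trend": [True] * 5 + [False] * 5,
    "ZI.last": [True] * 4 + [False] * 6,
}
print(classify_actor(actor, p0=0.5, alpha=0.05))
```

An actor whose best compliance rate is no better than what random trading would produce falls back into the ZIU class, which mirrors the re-classification rule in the third step.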
The validation null hypothesis of the classification is the ZIU strategy, which leads to completely random actions by agents. The ZIU agents can sell and buy within a range from to . Such an agent has an average compliance rate of about .% with each rule. On this basis, the validation hypotheses are tested against the validation null hypothesis.
The actors in the experiment of Hanson et al. ( ) are assigned to the pure strategies documented in Table . With -traders the majority is classified as ZI.EV traders; only -actors belong to ZI.trend or ZI.last. The validation baseline is not rejected against any of the validation hypotheses in cases. Low correlation coefficients for the strategy compliance underline the diversity of the non-combinatorial strategies. Furthermore, it has to be noted that the traders classified into the ZIU strategy are not necessarily trading randomly. In the case of a significance level of < ., only traders would not have been classified into the classes ZI.EV, ZI.trend or ZI.last. These results lead to using the ZI.EV strategy as the default model for the agents in our simulation. In addition, ZIU will serve as a benchmark model. The other two strategies are not used on their own, but are subsequently combined with the ZI.EV strategy for the combinatorial strategies.

strategy. Therefore, the heterogeneous model Mix can be seen as a "lower boundary" of random traders. The heterogeneous model Mix contains the two strategies derived from the validation and the ZIU strategy, given that the ZIU strategy could not be rejected for actors in the combinatorial case. Therefore, it represents the idea of an "upper border" of random traders.
This work is a revised and substantially extended version of Klingert & Meyer ( a) presented at ECMS in Koblenz and elaborates on the PhD thesis of one of the authors (Klingert ). In addition to the ECMS contribution, it provides an extensive description of the micro validation and investigates further hypotheses. Furthermore, additional feedback has been incorporated, among others from the participants at ECMS .
The reason why companies set up such markets is that typically no usable market exists for the information they want to generate. This is also often the case for prediction markets used in other contexts. Still, the ability of markets to aggregate dispersed information described by Hayek can sometimes be used by looking at already existing (future) markets (for examples see Surowiecki ). For example, the expected outcome of elections and possible changes in this respect might also be reflected in the stock market of a country. A practical challenge of prediction markets is their possible susceptibility to manipulation (Hansen et al. ; Hanson et al. ).
A notable exception in this respect is the work by Brahma et al. .
The talk by Ledyard is not published, but as empirical insight regarding the difference between CDA and LMSR is rare, we cite here the paper in which his talk is mentioned.
Even the number for the average prediction market in a corporate context is higher despite the fact that it generally involves fewer traders than other prediction markets.
The variance of the price from the correct value is used as a measure for the accuracy level (Hanson et al. , p. ). Another related criterion is the time a mechanism needs to incorporate new information. This is not the focus of this paper. Concerning this and related issues, the paper by Brahma et al. ( ) can be seen as a very useful complementary contribution comparing a refined version of the LMSR and their Bayesian Market Maker. See also the next endnote for the other metrics used in their paper.
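To make the accuracy measure concrete, the sketch below reads "variance of the price from the correct value" as the mean squared deviation of the observed prices from the correct value, and adds the root mean square deviation for comparison. This reading, the function names and the example prices are assumptions for illustration.

```python
import math


def price_variance_from_correct(prices, correct_value):
    """Mean squared deviation of the prices from the correct value.

    Lower values indicate a more accurate market.
    """
    if not prices:
        raise ValueError("at least one price is needed")
    return sum((p - correct_value) ** 2 for p in prices) / len(prices)


def rmsd(prices, correct_value):
    """Root mean square deviation: the square root of the measure above."""
    return math.sqrt(price_variance_from_correct(prices, correct_value))


# Hypothetical final prices of a market whose correct value is 100.
prices = [98, 102, 100, 104]
print(price_variance_from_correct(prices, 100), rmsd(prices, 100))
```

Because the second measure is a monotone transformation of the first, both rank markets in the same order. They differ only in scale, with the root mean square deviation expressed in monetary units.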

Brahma et al. ( ) use different metrics due to the different simulation targets and the resulting research questions, which are partially overlapping. Both papers measure the accuracy of the prediction market and therefore employ very similar measures: we follow Hanson et al. ( ) and measure accuracy as the variance between the price and the correct value, whereas Brahma et al. ( ) use the RMSD, which is the root mean square deviation. Both papers also address the reliability of the market towards the end of trading. We use the standard deviation of the last prices, while Brahma et al. ( ) look at RMSDeq. This difference is again driven by the different foci of research. We want to assess the risk of making wrong decisions based on "extreme" predictions.

Brahma et al. ( ) use the RMSDeq to address a specific shortcoming of the LMSR, i.e., that it tends to fluctuate even after reaching the true value. This property of the LMSR is not the focus of our research, as the CDA is known to be much more volatile than the LMSR. We measure the number of trades, as this gives an indication of the liquidity in the market and its related ability to aggregate information. Brahma et al. ( ) measure the bid-ask spread. This also gives an indication of the liquidity in the market, as a higher spread implies a lower liquidity. At the same time, the authors consider this a measure for the convergence of the respective market maker's beliefs. Given our differing research focus, adding these metrics would not add enough value compared to the cost of increasing the complexity of this paper. For a discussion of their results related to the ones of our paper see Section .
The program code and explanations concerning it will be provided by the authors upon request.

The knowledge distribution chosen by Hanson et al. ( ) is partially certain and might appear rather specific. Nevertheless, there are circumstances where similar information distributions could exist in reality. Consider an IT consulting company predicting its revenue figures with a corporate prediction market. A division of this company may depend mainly on two projects: a blueprint project with a volume of Mio. Euros and a dependent realization project with a volume of Mio. Euros. The possible outcomes in this case would be Euros if no project is executed, Mio. Euros if the first one is executed and Mio. Euros if both projects are executed. In this case, an employee might have certain knowledge about an approved blueprint project, which will be executed for sure, without knowing more details about the second project.
It is assumed that the actors in the laboratory experiment have comparable reaction times. This is an important difference to other studies also referring to experiments. For example, the focus of Hommes ( ) is to combine experiments and simulation to show that their models can be fitted to their experimental data and help to explain some observed stylized facts. Still, there are also other studies which try to go beyond the original settings (see Bravo et al. ). A crucial issue when doing so is to consider whether the behavioral strategies derived from the original laboratory setting still hold for the new settings derived.
The parameter b bounds the loss of the market maker, but also influences how adaptive the market maker is and how liquid the market is. A very adaptive market maker results in large bid-ask spreads. This reduces the liquidity in the market (Brahma et al.
Besides the initial endowment of money, the CDA may have additional costs due to the initial endowment of stocks if the correct value is or . Then each stock pays the trader or monetary units. In the LMSR, the traders do not receive an initial endowment of stocks and do not get this additional payoff. However, they receive an additional payoff through the loss of the market maker. The b-value regulates this loss. When choosing a level for b, a tradeoff exists between not allowing too much possible loss (which in our setting should be comparable to the costs of the CDA) and a liquid market with low fluctuations around the equilibrium (Brahma et al. ). Therefore, when setting the parameter b, we have experimented with b and fixed it at a sufficiently high level. Consistent with these sensitivity analyses, we fixed the value for b as . in our experiments as a control variable.
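The role of b can be illustrated with the textbook form of the logarithmic market scoring rule, in which the cost function is C(q) = b ln(Σ exp(q_i / b)) and the worst-case loss of the market maker is b ln(n) for n outcomes. The Python sketch below uses this standard formulation and is not the authors' program code; the function names and the values chosen for b and for the trade size are hypothetical.

```python
import math


def lmsr_cost(q, b):
    """LMSR cost function C(q) = b * ln(sum_i exp(q_i / b)), computed stably."""
    m = max(x / b for x in q)
    return b * (m + math.log(sum(math.exp(x / b - m) for x in q)))


def lmsr_prices(q, b):
    """Instantaneous prices: exp(q_i / b) / sum_j exp(q_j / b)."""
    m = max(x / b for x in q)
    weights = [math.exp(x / b - m) for x in q]
    total = sum(weights)
    return [w / total for w in weights]


def trade_cost(q, outcome, shares, b):
    """Amount a trader pays the market maker for buying `shares` of `outcome`."""
    after = list(q)
    after[outcome] += shares
    return lmsr_cost(after, b) - lmsr_cost(q, b)


def max_loss(n_outcomes, b):
    """Worst-case loss of the market maker when starting from uniform prices."""
    return b * math.log(n_outcomes)


start = [0.0, 0.0, 0.0]  # three outcomes, no shares sold yet
for b in (10.0, 100.0):  # hypothetical values of b
    moved = list(start)
    moved[0] += 10  # a trader buys 10 shares of the first outcome
    print(b, round(max_loss(3, b), 2), [round(p, 3) for p in lmsr_prices(moved, b)])
```

The same purchase moves the price far less under the larger b, which corresponds to a more liquid market with lower fluctuations, while the bound on the market maker's loss grows in proportion to b. This is the tradeoff described in the paragraph above.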
As zero intelligence traders are bounded by simple rules, they should not be considered as completely random.The only exception is the strategy ZIU.
The micro level refers to the actions of the agents in the simulation and the actors in the laboratory experiment, for example.
To the knowledge of the authors, they have not been statistically validated in subsequent papers either.This is not very surprising because zero intelligence agents were originally not designed to represent human behavior.They have been designed to show that even very simple strategies at the micro level can result in patterns typically observed at markets at the macro level.

Brahma et al. ( , pp. -) also investigate the effects of different trading strategies in their comparative analysis. They also distinguish between several types of technical and fundamental traders, but do not provide an empirical validation of the trading strategies. Hommes ( ) provides an interesting example of empirically validated behavior in asset markets, making a claim for the heterogeneous expectation hypothesis. In some of the settings, he also includes fundamental traders, but overall he suggests a behavioral model that is able to switch between different forecasting heuristics. It should be mentioned that in his settings agents only forecast prices and do not buy stocks, and that his study has a different scope.
The problem of overfitting simulation models to historical data has already been discussed in the literature (Law ).
Each strategy should be represented by an even number because the agents can have two different knowledge components, i.e., one agent has an expected value above and one below the true value. An even number of agents ensures an appropriate separation between knowledge and strategy to analyze the interaction effect. Two more restrictions have to be taken into account. Only a natural number of agents is possible, and the distribution should be valid for agents because the experimental design starts with agents and tries to test a linear relationship of , and agents in a k-design of experiment.
For example, CV means that % ( %) of the agents know that the correct value is not ( ).
Simulating the default setting , times, the variance increasingly stabilizes (see Lorscheid et al. , pp. -). Due to computing time restrictions and a clear decrease of instability after the first runs, runs per factor combination are performed. The high significance levels of our statistical analyses indicate that the simulation model behavior is sufficiently stable.

The average number of trades based on the ZIP strategy in the default setting is . (CDA) and . (LMSR) respectively. Over all settings the difference is even smaller, with an average of . (CDA) and . (LMSR).

Brahma et al. ( , , p. ) also investigated the effects of trading strategies, and their results supported their relevance too. The overall trend in their simulation experiments is that accuracy improves with more fundamental traders (which makes sense as the noise in the market is reduced).
The maximum deviation of the median is only .monetary units.
Beyond that, we checked several other settings not described here. For example, the main results of the simulation are relatively stable when CDA and LMSR are compared in the standard setting with a higher number of agents than simulated in our experiments (e.g., or number of agents).
The term ZI agents stems from the fact that the rules constraining an agent's budget were not set by the agent but by the market in the original paper (Gode & Sunder ). The agent behaves randomly within this constraint.

Last value = $price_t$

Note: Accepting an ask for a price of x is considered as buying at a price of x. Accepting a bid for a price of x is considered as selling at a price of x.
The agents in the four experiments with replication treatment are not analyzed separately in order to reduce the complexity of the validation. When the experiments are separated, the ZI.EV strategy is still the dominant strategy with at least actors in the ZI.EV class in each of the experiments.

Prediction markets have been applied successfully both in research (Camerer et al. ; Dreber et al. ; Forsythe et al.

Figure : Exchange of stocks and monetary units between two traders with CDA.
First, the model setting is almost identical to the laboratory experiment of Hanson et al. ( ) as discussed in the previous section. Second, the experimental micro data from Hanson et al. ( ) informed the strategy selection. Third, a macro validation is performed. Gode & Sunder ( ) already compared the results of ZI-agents with data from laboratory experiments at the macro level in their foundational publication. Adding to that comparison, a macro validation based on the experimental data from Hanson et al. ( ) will be discussed briefly.

The experimental micro data from the laboratory experiment of Hanson et al. (

Figure : Exemplary simulation based on default model, for example, a correct value of .

Figure : Box plots of exemplary simulation runs, with CDA and with LMSR, which are based on the default model with a correct value of .

Partial $\eta^2 = \frac{SS_{\text{treatment}}}{SS_{\text{treatment}} + SS_{\text{error}}}$ (Pierce et al.

Strategy: State of the simulation [x Random] → Action

(Woolridge , p. -)

This strategy follows the spirit of ZI-agents. For example, the component of N-ZI traders (Duffy & Ünver ) which represents fundamental trading is similar to the ZI.EV traders.

Expected value = $\sum_{s=1}^{\text{Number of states}} payoff_s \cdot probability_s$

Trend value = $price_t + (price_t - price_{t-1})$

Table : Factor levels in experimental design. Note: Default values taken from the laboratory experiment are shown in bold.

As the model is predominantly validated based on the experiment of Hanson et al. (

Table to cover the sub-hypothesis, and the analysis thus differs from the analysis of the default model. For example, the fixed value "Strategy: Mix " means that the Mix strategy combination is considered instead of the ZI.EV traders to analyze hypothesis a.

Column headings: Default; All settings; Winning settings; Alt. Knowl.; Extr. values
Note: - contrary to hypothesis; o neutral; + support for hypothesis; NA not applicable.
Table : Summary of all results including robustness tests based on alternative information distributions and extreme values.
Finally, several limitations of our research have to be considered. The validity of the simulation presents a limitation. Therefore, it would be worthwhile to further validate the simulation results with subsequent laboratory and field experiments. The strong link of our simulation to the experiment of Hanson et al. ( ) makes additional variations beyond the robustness tests desirable. Examples include the introduction of manipulating agents as a major concern in prediction market research