* Abstract

Massively Multiplayer Online Games (MMOGs), in their aspect as online communities, represent an exciting opportunity for studying social and behavioral models. For that purpose we have developed Cosmopolis, an MMOG designed to appeal to a wide variety of player types, and containing several key research-oriented features. The course of development has revealed several challenges in integrating behavioral models with an MMOG test bed. However, the Human Social, Cultural, and Behavioral (HSCB) research value of Cosmopolis has been demonstrated with a number of prototype studies, and based on these studies and challenges we propose an ongoing experimental plan largely driven by collaboration with HSCB researchers.

Social, Behavioral, Modeling, Game, Multiplayer

* Motivation and Overview

A 2008 study by the National Research Council entitled "Behavioral Modeling and Simulation - from Individuals to Societies" (NRC 2008) discusses how we need to expand research in modeling and simulation to include models of individual and societal behaviors. In the study, it is pointed out that a technological infrastructure needs to be developed for behavioral modeling such that we can properly develop, test and then deploy such models. The study, in fact, suggests the development of a massively multiplayer online game (MMOG) for that infrastructure. Such an MMOG can be utilized as a test bed for models of individual and group phenomena.

Cosmopolis is an MMOG we have developed for this purpose. In designing the game, we have been motivated by the need to balance the diverse interests of players and researchers: players want an engaging game experience, while Human Social, Cultural, and Behavioral (HSCB) researchers need the flexibility to perform various experiments of their own design. We have designed the game's features to serve both of these goals.

For players, comprising the general online gaming public, Cosmopolis is an MMOG built around an outer game world and a collection of sub-games of potentially any genre (action, puzzles, sports, etc.). For researchers, Cosmopolis is a unique test bed and data source for studying social and behavioral models, particularly via custom-designed experimental subgames. The models can be of individual players or multiple players over time, as well as of non-player (AI/Artificial Intelligence) characters (NPCs), or combinations of the above. Cosmopolis provides various and flexible methods to facilitate these needs. Cosmopolis also has a novel approach to information channels, with multiple real-world and game world sources being combined to create effects on the game AI, and customized output for players. See Appendix F for online Cosmopolis video links; annotated code samples are available on request from the authors.

Section 2 of this paper will discuss the theoretical framework of Cosmopolis from game design and social/behavioral modeling perspectives. Section 3 will cover the approach to development of the game engine, in terms of both design and engineering. In Sections 4, 5, and 6, the first set of experiments in Cosmopolis will be described and their results analyzed. Section 7 will largely be devoted to future work on Cosmopolis, particularly relating to challenges discovered during development, and a formal plan for upcoming social/behavioral modeling experiments. Section 8 will summarize and draw conclusions from our findings during the Cosmopolis project.

* Theoretical Framework

Research into video games and their scientific uses has taken several paths. Work from a Human-Computer Interaction (HCI) perspective has looked at issues of real world reaction to virtual appearance (Yee and Bailenson 2007). Other researchers have begun investigating the broader issue of how real world social phenomena translate into virtual spaces. Castronova specifically proposed that virtual worlds might serve as ideal platforms for experimenting with a wide variety of individual, organizational, and societal (IOS) models (Castronova 2005). To demonstrate this point, he carried out a small scale experiment demonstrating that the real world concepts of supply and demand mapped reasonably to a virtual space (Castronova 2008).

A variety of researchers have also looked at the social structures that form in games and their relative strengths and weaknesses. Such analyses have been derived from qualitative and ethnographic observation of player interactions, from surveys of player opinions, and from social network analyses of the strength of ties between different players. One notable ethnography-based analysis is Pearce's long term study of "Uruvian expatriates" (players of Myst: Uru Online who migrated to There.com after the first game's abrupt closure), and the roles that emerged among them (Pearce and Artemesia 2009). Social network analyses have been specifically conducted using both Everquest and World of Warcraft (Ducheneaut et al. 2006, Huang et al. 2009, Huffaker et al. 2009, Williams et al. 2006), demonstrating the relatively small levels of interaction among players within the same guild structures, while Johnson et al. have developed a model of guild formation patterns that also helps to explain the formation patterns of offline gangs (Johnson et al. 2009).

All of these efforts fall into the general category of mapping, as described by Williams: researchers want to know how virtual actions and representations serve as analogs of real world actions and representations (Williams 2009). By implementing a new MMOG, as opposed to relying on working in the diversity of extant MMOGs and virtual spaces, we can establish a unified mapping environment in which a variety of different phenomena can be explored, linked across a single space. Then, as our understanding of mapping principles develops, we will be able to implement and test different mappings with autonomy unavailable to a corporation beholden to a much more fixed game structure.

Mapping is a difficult phenomenon to deal with in MMOG development because it is difficult to predict exactly how different experiences will map for different individuals. That said, game designers have done considerable work to try to understand the differing natures of play styles practiced by different individuals in online environments. Designers have long been aware of the emergent values and behaviors of different MMOG communities and attempted to foster a broader awareness of this fact in the community at large. Raph Koster, one of the designers of Ultima Online and lead designer of Star Wars Galaxies, famously formulated that "[An MMOG is] a community. Not a game. Anyone who says, 'it's just a game' is missing the point" (Koster 2010). Morningstar and Farmer, developers of LucasArts's social game Habitat, encountered the same phenomenon and noted that "a cyberspace is defined more by the interactions among the actors within it than by the technology with which it is implemented" and that from a design standpoint "detailed central planning is impossible" (Farmer and Morningstar 1990). While the rules and incentive structures for certain behaviors can be incorporated into MMOGs, players will be driven by their own motivations as well. Instead of looking at player growth as a process opposed to these rule structures, however, community development should be considered in tandem with them. Players' reactions to the IOS models as implemented in the game environments will help to evolve our understanding of these models themselves.

That said, to apply these ideas to our specific development of an MMOG for the study of individual, organizational, and societal (IOS) models, it is still necessary to develop both an understanding of the community that will play the game and a method for allowing investigators to translate the salient features of IOS models into game dynamics. While measuring players' reactions to model implementations is essential, it is impossible to engage in accurate study without any theory of the base population. Bartle notably broke down players into four types based on discussion within a game's forum about what people want out of a multi-user dungeon or MUD (a precursor of modern MMOGs): Achievers, Explorers, Socializers, and Killers. Bartle posited that these player groups can exist in various stable states of flux, determined by the type of MUD that had been created (Bartle 1996). Yee later followed up this work with an attempt at a multi-factor analysis of the survey results from players of different MMOGs, identifying three salient factors in players' motivations for play: Achievement, Socialization, and Immersion (into a virtual environment). Yee also noted that these factors did not suppress each other, but might actually coexist within an individual at equal intensities, bolstering each other (Yee 2006). As noted earlier, significant research has already been done to determine the demographics of several conventional MMOGs, though Aschbacher's report on Whyville demonstrates the possibility for considerable demographic variability based on design, a phenomenon also at the heart of Pearce's study (Aschbacher 2003, Pearce and Artemesia 2009). Game designers can appeal to all or some subset of these perspectives via design choices, and in creating Cosmopolis we have sought to provide a framework that would support multiple combinations of desires.
Additionally, given our expected ability to segment players based on their play habits and associations, we can hope to provide a more refined breakdown of play habits than has been previously found.

Another important aspect of creating and instrumenting a game, particularly an MMOG, as a tool for social research is that of using competition among players as a means to reduce search complexity over a large-scale problem space. Our "human heuristic hypothesis" is the assumption that competing human players in increasing numbers and at growing levels of expertise will be able to find better solutions in large-scale social scenarios than would brute-force AI methods assigned the same problem. This is particularly true if the conditions discussed above hold in order to keep the players engaged and on task.

* Approach

The first subsection below discusses Cosmopolis in terms of its game design features for players, as influenced by the MMOG design guidelines mentioned in the background section. Key features covered include the outer game / subgame structure, and the game world's information channel system with its novel inclusion of real-world newsfeeds. The second subsection describes the design features of Cosmopolis as a research testbed. Both subsections incorporate examination of the engineering criteria used to build Cosmopolis.

Game design

As a game, Cosmopolis has a two-level structure: outer game and subgames, all free of charge. The outer game is a present-day city- and world-building simulation, including player-level and guild-level conflicts. The various subgames may be housed in any buildings or areas in the world. This two-level format supports our aim of attracting as many players and player types as possible, consequently yielding data about a wide variety of individuals in distinct populations and about the relations among them.

Figure 1. A part of a city in the Cosmopolis outer-world

To attract a wide player base, we aim to appeal to several archetypes of players as described by Bartle (Bartle 1996) and supported by Yee's later re-analysis (Yee 2006): Socializers, Achievers, Explorers, and Killers. Socializers can freely interact in peaceful venues (i.e., outer game and some subgames), Achievers can compete in subgames or strive for leadership in the outer game, and Explorers can discover new quests and subgames in various distinctive locales. Even Killers can have at each other in combat-oriented subgames. One group we do not wish to cater to is the "Griefers" community, whose primary goal is to cause other players to have a miserable experience.

Besides providing a variety of gameplay genres for the different player archetypes, the subgames have leaderboards and unlockable achievements persistent in the outer game. This is so that outer-game oriented players will be engaged when playing subgames, and vice versa, tying the virtual community closer together.

In terms of engineering, Cosmopolis is designed to be an engaging game, with state-of-the-art graphics and effects, in an easily customizable world model. The game engine has been built from the ground up with the primary objective of supporting a massive, changing world seamlessly for thousands of simultaneous players. By design, Cosmopolis needs to be able to support large cities and wilderness and also to provide a coherent experience across the subgames and the outer game. For example, the current game world has a total area of 8 km² and was designed with our world editor toolset. It also supports importing real world height data to recreate small islands, etc.

Cosmopolis, being a persistent world, also needs to retain the changes players make to the world. For instance, the engine supports dynamic terrain deformations; a grenade blast that deforms the land will permanently deform the land (unless fixed up by a player). This characteristic of the world gives players an immersive experience when they come back and find continuity from their previous exploits.

All of the action in Cosmopolis is handled by an event-based networking model. Each event has certain properties that define how it will be perceived in the subgame and the outer game. This approach lets us adjust the level of interconnectedness between the subgame and the outer game as desired by the researchers and game designers. Subgames could be designed to be completely isolated sandboxes like MMOG instances, as Tribes is, or be seamlessly integrated with the outer world, as Skirmish is.
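The idea of per-event properties controlling visibility can be sketched as follows. This is a minimal illustration only, not the Cosmopolis engine's event format: the `Scope` enum, the `GameEvent` fields, and the `visible_in` helper are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    """Hypothetical visibility levels for an event (an assumption)."""
    SUBGAME_ONLY = "subgame_only"  # isolated sandbox, e.g. a Tribes-like subgame
    OUTER_ONLY = "outer_only"      # perceived only in the outer game
    BOTH = "both"                  # integrated, e.g. a Skirmish-like subgame


@dataclass(frozen=True)
class GameEvent:
    name: str
    source: str   # name of the originating subgame, or "outer"
    scope: Scope


def visible_in(event: GameEvent, context: str) -> bool:
    """Return True if the event should be perceived in `context`.

    `context` is either "outer" or the name of a subgame.
    """
    if event.scope is Scope.BOTH:
        return context == "outer" or context == event.source
    if event.scope is Scope.SUBGAME_ONLY:
        return context == event.source and context != "outer"
    return context == "outer"
```

With this shape, changing a subgame from sandboxed to integrated is a matter of changing the scope attached to its events rather than rewriting the subgame itself.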

Figure 2. One of Cosmopolis' world editing tools - the terrain editor

Cosmopolis is increasingly researcher-driven, and designers can tweak various game parameters without having to ask the engineers to rebuild the game for each change. The extremely modular design of our engine also reduces the burden on the subgame engineers as they need not worry about the integration with the outer world for components like particle systems, sound, character movement and animation, etc.

In Cosmopolis, the in-game information system is a collection of channels through which messages flow. Channels may display news feeds from the real world or commercial advertisements; channels may publish in-game announcements publicly or regionally; channels may be configured as special chat lines between players. We aim to present messages efficiently and effectively to and between players, as well as to support the framework for the study of information spread and analysis.

For example, we maintain an in-game virtual economy system that has a commodity market and currency exchange market. All the commodity prices and currency exchange rates are synchronized periodically with incoming real-world rates. Data extracted from information channels may also change the behaviors of NPCs (implemented as artificial intelligence-driven software agents). Warnings of "terrorist attack" or "earthquake danger" may cause NPCs to flee an area. Rumors of "unrest" may coincide with NPCs behaving in a less friendly manner towards players or each other. Stock market gradients can also change the personalities of the NPCs, e.g., a rising market makes them happier and a falling market makes them nervous.
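A minimal sketch of how channel data might drive NPC state is given below. The warning phrases are taken from the description above; the reaction labels, the `NPC` class, and both functions are illustrative assumptions rather than the game's actual AI interface.

```python
from dataclasses import dataclass

# Phrases come from the text; the reaction labels are hypothetical.
REACTIONS = {
    "terrorist attack": "flee",
    "earthquake danger": "flee",
    "unrest": "unfriendly",
}


@dataclass
class NPC:
    name: str
    behavior: str = "neutral"
    mood: str = "calm"


def apply_message(npc: NPC, message: str) -> None:
    """Update NPC behavior if the message contains a known warning phrase."""
    text = message.lower()
    for phrase, reaction in REACTIONS.items():
        if phrase in text:
            npc.behavior = reaction
            return


def apply_market_gradient(npc: NPC, previous: float, current: float) -> None:
    """A rising index makes the NPC happier; a falling one makes it nervous."""
    if current > previous:
        npc.mood = "happy"
    elif current < previous:
        npc.mood = "nervous"
    # An unchanged index leaves the mood as it was.
```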

Information channels can also be customized for specific players and player groups. For instance, a billboard's text may appear in English to some players, but in Arabic to others, depending on the player's selection of primary language.

As Cosmopolis's information system is relatively untested, it is an open question as to all the ways it will affect players. However, some speculation is possible. Players may be influenced by the game world's or NPCs' responses to certain events as described above. For example, NPCs are programmed to respond to earthquake predictions in a certain manner, and the players can learn from NPCs' actions. Another example is that players may choose to move to a region where commodities are becoming relatively more valuable. Also, there may be information channels that show players certain data (such as social networks) from the game, to see if providing such information alters player behavior.

Figure 3. Demining in the Operation: Peace subgame

Currently, there are five completed 3D subgames in Cosmopolis: WarPipe, Operation: Peace, Skirmish, Tribes, and TeamIt. WarPipe is a multiplayer action/shooter game for individuals or teams, featuring a detailed urban battleground and 1st/3rd person camera perspectives. Operation: Peace, a simulation designed for the United Nations Millennium Challenge, involves the detection and clearing of mines from a demilitarized zone. Skirmish is a jetpack-based air combat game that takes place in the skies of the outer world. A player may target other players and enemy AI fighters while avoiding buildings and enemy fire. Tribes, commissioned by social scientists at CMU, is a computational model of intertribal relations in pre-referendum (united) Sudan. The goal of the game is to lower national hostility levels while committing the fewest interventions. TeamIt is a location-based game of team cooperation and negotiation, exploring the differences between real and virtual world interactions. All of these subgames are available for the running of further experiments.

Research test bed design

As a research test bed, Cosmopolis offers a critical degree of experimental flexibility beyond the data-logging capability of the standard MMOG. Our overall design comprises a federated model architecture: each subgame is a potential lab for a different social and behavioral model, maintaining interoperability with the outer game world model. Subgames may be added, and gameplay of the outer world can be tweaked, all to meet the needs of researchers who use our game to validate or collect data for their models. While all in-game events will be logged, we will be specifically providing appropriate data export, reporting, and visualization capabilities so that researchers can easily analyze the experiments that they design and conduct in the game environment. Exported data would include player characteristics and activities, relational information such as who played with whom, performance outcomes, geo-temporal activity sequences and so on.

From an engineering standpoint, the subgames' content and logic are completely isolated from the outer game except for a controlled data access pipeline. This enables administrators and researchers to determine exactly what parts of the outer world (or other subgames) a subgame can modify, to prevent any unexpected behavior. This also means that in the event of bugs or design inconsistencies in a subgame, it can safely be taken down without affecting the rest of the game.

The event-based networking model enables efficient logging management, which is vital for researchers using Cosmopolis as a data source. The logging parameters can vary from player to player, subgame to subgame, based on the needs of various researchers. The current networking model has a separate gameplay and analysis server. The analysis server can be tasked to perform near real-time processing in addition to logging the data.
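One way to express logging parameters that vary by player and by subgame is a small rule table in which the most specific rule wins. The sketch below is an assumption about how such a configuration could look; the field names, the `Overrides` type, and `fields_to_log` are hypothetical and do not describe the actual analysis server.

```python
from typing import Dict, FrozenSet, Optional, Tuple

# Hypothetical baseline fields recorded for every event.
BASE_FIELDS: FrozenSet[str] = frozenset({"timestamp", "player_id", "event"})

# Keys are (player_id, subgame); None acts as a wildcard.
Overrides = Dict[Tuple[Optional[str], Optional[str]], FrozenSet[str]]


def fields_to_log(player_id: str, subgame: str,
                  overrides: Overrides) -> FrozenSet[str]:
    """Return the fields to record, letting the most specific rule win."""
    for key in ((player_id, subgame), (None, subgame), (player_id, None)):
        if key in overrides:
            return BASE_FIELDS | overrides[key]
    return BASE_FIELDS
```

For instance, a researcher could log chat text for every WarPipe player while additionally recording position for one player of interest, without changing what is logged in other subgames.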

To separate the account management from gameplay logic, Cosmopolis uses web services to perform authentication and initiate game connection. This also paves the way for enabling researchers to control different parameters of the game from the browser, anything from adding characters to changing the weather.

* WarPipe Experiment Synopsis

The first research experiment on Cosmopolis was performed using the WarPipe action (shooter) subgame. The design team ran the experiment, administered surveys, and logged information about player activities. The planned experiment for WarPipe was intended to look at two research questions: how a particular pre-defined communication pattern alters the ability of players to work effectively as a team in a third-person virtual environment, and how this effectiveness compares to similar constraints when applied in the real-world. The experimental design was to run three variations of WarPipe and study the behavioral norms of individuals who are only able to play one of the three variants, as expressed by the players' actions in the outer game and in their responses to surveys.

Figure 4. A sniper attempting to take out a charging target in the WarPipe subgame

This concept was based on prior military and other investigations into communication, particularly radio communication in urban environments (Bavelas 1950, Christ and Evans 2002, Redden and Blackwell 2001). Relevant game-oriented research was conducted on the America's Army game (Carley et al. 2005, Moon et al. 2005, Schneider et al. 2005), as well as on other paradigms showing people's reactions in virtual spaces (Slater and Sadagic 2000, Yee and Bailenson 2007). Further game research studied how latency impacts playability of first-person shooters (Claypool et al. 2006), how the transmission of sound within first-person shooters affects gameplay experience (Gibbs et al. 2006), and the modeling of creative actions and communication as expressiveness in first-person shooters (Wright et al. 2002). Our hypothesis was that in a third-person, player vs. player shooter game, team effectiveness as a function of communication structure would closely resemble that seen in previous real-world experiments (Redden and Blackwell 2001).

To test this, we set up three different versions of WarPipe, identical save for the constraints placed on players' abilities to communicate. In the first variant, only top-down communication was permitted save in emergency situations (e.g., when under heavy fire), in which case players could talk to those above or adjacent to themselves. In the second variant, players could communicate freely to people adjacent to and below themselves; in an emergency they could again communicate up the tree. In the third variant, player communications were entirely unconstrained.

Our dependent variables were team efficiency at completing the various tasks, sense of effectiveness at completing tasks, actual effectiveness of the communication pattern, and sense of effectiveness of the communication pattern. These variables were measured with player surveys (see Appendix C) and data logs.

There were several confounding variables associated with the study. Test subjects were not controlled for prior experience or lack of experience with third- or first-person shooters or for prior relationships outside of the game. Unlike in the live simulation against which we were comparing our work, the encounters with the OpFor were not planned in detail.

The game setup was a map within the WarPipe third-person shooter. The map was relatively convoluted and calibrated for 10-versus-10 gameplay. Players were grouped in teams of seven and were able to communicate through text chat. Text chat was proximity based, with users given the option of broadcasting to other players outside of earshot under certain communication configurations. We assembled three different play groups of seven players each; these were handled in sequence, so that only one team was present in the experiment area at a time, and players played in as close to complete isolation as possible.

Initially, team members were administered a survey (see Appendix B); then players were given five minutes to acclimate themselves to the norms of the WarPipe environment, using a map different from the actual gameplay map. Next, players were told that there were between one and five OpFor (Opposing Force) on the map, who should be located and eliminated, and between one and ten "McGuffins" on the map, which should be located. The completion of any of these tasks was reported back to the WarPipe staff running the experiment by the designated "Communicator" for the team. Players had a total of 30 minutes to finish the course, but the Communicator could report that the team had completed the course prior to this point. As an incentive to efficiency, players were compensated with $1 for each minute under 30 that their team took.

Players were next shown their positions on the communication graph, had their communication mode explained to them (when and to whom they should communicate, and their ability to use both radio and local area chat), and met the other members of their team. Players were given to understand the criticality of the "Communicator" to their success, and made aware of the tree-like organizational hierarchy of their groups. That is, the Communicator directed the two individuals below her, and those two individuals directed the two individuals below them. When playing the level, players were isolated so that they could not see each other; communication without using in-game methods was prohibited.
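The three communication variants over the seven-player tree can be sketched as a permission check. This is an illustration under stated assumptions, not the rule set used in the experiment: the tree is laid out as a binary heap with the Communicator at index 0, "below" and "above" are read as direct children and direct parent, and "adjacent" is read as any player at the same level of the tree. The function names are hypothetical.

```python
from typing import List, Optional

TEAM_SIZE = 7  # Communicator at index 0, two players below each non-leaf


def parent(i: int) -> Optional[int]:
    """Index of the player directly above player i (None for the root)."""
    return (i - 1) // 2 if i > 0 else None


def children(i: int) -> List[int]:
    """Indices of the players directly below player i."""
    return [c for c in (2 * i + 1, 2 * i + 2) if c < TEAM_SIZE]


def level(i: int) -> int:
    """Depth of player i in the tree (0 for the Communicator)."""
    return (i + 1).bit_length() - 1


def may_message(sender: int, receiver: int, variant: int,
                emergency: bool = False) -> bool:
    """Return True if `sender` may message `receiver` under `variant`."""
    if not (0 <= sender < TEAM_SIZE and 0 <= receiver < TEAM_SIZE):
        raise ValueError("player index out of range")
    if sender == receiver:
        return False
    if variant == 3:                      # unconstrained
        return True
    down = receiver in children(sender)
    up = receiver == parent(sender)
    adjacent = level(sender) == level(receiver)   # assumed reading
    if variant == 1:                      # top-down only, relaxed in emergencies
        return down or (emergency and (up or adjacent))
    if variant == 2:                      # down and adjacent, up in emergencies
        return down or adjacent or (emergency and up)
    raise ValueError("variant must be 1, 2, or 3")
```

A check of this kind could gate the radio broadcast option, while proximity-based local chat would need a separate distance test.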

The results of the WarPipe experiment were of some value, although the hypothesis of mapping from the initial military experiment to the virtual world game was neither proven nor disproven. The problem with provability stemmed from the different communities used as subjects in the respective experiments. The real-world experiment was undertaken by trained military personnel who were facile with the use and rationale of the various communication strategies and employed them in a rigorous manner. The WarPipe game, on the other hand, was played by USC students whose expertise was largely in the realm of playing shooter-style video games. This group typically uses communication channels for rather different purposes: status updates with friends, "trash-talking" to opponents, and other casual uses. Communication as a tool for problem-solving (i.e., squad triangulation, parallel task assignment, etc.) was somewhat beyond their scope. In the future, mapping or lack thereof could be more tightly observed by having groups of students do a real-world laser-tag version of the WarPipe game. See the "USC Campus Experiment" section below for a planned mapping experiment of this nature.

* Tribes Experiment Synopsis

One of the key ideas in the development of Cosmopolis was to have an AI system underlying the world and providing feedback for player actions. Such an AI would open the world for possible large-scale experiments involving the game's entire population. As a first step towards this, we began work on a basic experiment using player interaction with a computational model to investigate the possibility of crowdsourcing policy development.

The research question underlying Tribes was whether individual players, when presented with a computational model with too many configurations to be completely solved by a computer, could find optimal configurations through reasoning. In the same way that researchers have built custom scenarios in SimCity, accepted the model's assumptions, and analyzed the results (Peschon et al. 1996), we sought to build a model, have individuals play it, and then analyze their results.

In the case of this game, we developed a model of inter-tribal relationships in Sudan. Our model was coded using Construct (Schreiber et al. 2004), a simulation engine that has been used for a variety of different social simulations. Construct posits that agents interact and exchange facts; these facts have positive or negative associations with particular beliefs. Correspondingly, agents choose to interact with each other based in part on defined affinities and in part based on overall similarity of belief.

Figure 5. Player comparing tribal beliefs in the Sudan Tribes subgame

We modeled Sudan as a set of 14 prominent tribes, each with a different degree of affinity to the others, and each tribe with a leader. The tribal prominence and affinity data is based on a semi-automated analysis of the corpus of the Sudan Tribune newspaper over the period of 2003-2010. The newspaper transcripts are reduced to a pre-defined list of relevant entities that are then linked together based on their relative proximity. Our data thus assumes closer relationships between tribes discussed in the same context and more distant relationships between those that are not mentioned in the same context. We correspondingly assume that the most important Sudanese tribes are those mentioned most often in the paper. The process of vetting and analyzing this data was not part of our own research and is more thoroughly covered in previous work (VanHolt and Johnson 2011). Tables describing the key components of the model can all be found in Appendix E at the end of the paper, as can the pseudocode governing the model's processes. Model code and run scripts are also linked here.

The Sudanese are not only defined in terms of tribe membership but also as possessing a set of eight beliefs, each consisting of ten facts - five positive and five negative. The set of beliefs and their distribution across the different tribes were derived from selected readings on Sudan and consultation with a Subject Matter Expert (SME) on Sudan (Carley 2010, ICG Sudan 2007, Johnson 2006). We posit that a central cause of tension in Sudan is that different tribes possess markedly different beliefs, and that to rectify this it is important to bring the tribes' beliefs into alignment. We measure hostility by considering the net difference in average beliefs between each tribe pair. If at least four of the beliefs of the tribes in a pair differ by more than a particular threshold, we consider the tribes to be hostile. All of these hostilities are then normalized across the entire set of possible pairs, providing an overall hostility indicator.
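The hostility measure described above can be sketched as follows. This is a minimal reading of the text, not the Construct model code: each tribe is assumed to be summarized by a vector of eight average belief values, and "normalized across the entire set of possible pairs" is assumed to mean the fraction of tribe pairs that are hostile. The function names and the `threshold` parameter are hypothetical.

```python
from itertools import combinations
from typing import Dict, Sequence


def pair_is_hostile(a: Sequence[float], b: Sequence[float],
                    threshold: float, min_beliefs: int = 4) -> bool:
    """A pair is hostile if at least `min_beliefs` beliefs differ by more
    than `threshold` (four of eight in the description above)."""
    differing = sum(1 for x, y in zip(a, b) if abs(x - y) > threshold)
    return differing >= min_beliefs


def hostility_index(beliefs: Dict[str, Sequence[float]],
                    threshold: float) -> float:
    """Fraction of tribe pairs that are hostile (assumed normalization)."""
    pairs = list(combinations(sorted(beliefs), 2))
    if not pairs:
        return 0.0
    hostile = sum(
        pair_is_hostile(beliefs[p], beliefs[q], threshold) for p, q in pairs
    )
    return hostile / len(pairs)
```

Under this reading the indicator lies between 0 and 1, which is consistent with a winning threshold expressed as a value such as 0.2.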

To support the propagation of beliefs and the corresponding reduction of hostility, our model contains two different interventions: one where tribal or national leaders give speeches to their particular regions and one where leaders meet in conference to increase their own knowledge, approximating the Tamazuj forums which occurred in Sudan before the referendum (ICG Sudan 2010). To "win" the model, a sequence of interventions needs to be chosen that will successfully reduce hostility below a particular threshold. Our theoretical minimum is 0.2, but this is not based on a fully play-tested implementation of the game. While it may seem that by incorporating only two forms of intervention our model keeps the complexity limited, this is not the case. Given that our model incorporates 14 tribal leaders and six national leaders, eight beliefs that can be construed either positively or negatively, and the ability of leaders to meet in groups of two, three, or four, there are:


possible interventions at any time period. This number grows exponentially over time, making complete computation impossible.
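As a rough illustration of the scale involved (not the authors' own expression), the count can be estimated under explicit assumptions: any one of the 20 leaders may give a speech on any of the eight beliefs with either a positive or a negative framing, and any unordered group of two, three, or four leaders may meet in conference. The constants and function names below are hypothetical.

```python
from math import comb

TRIBAL_LEADERS = 14
NATIONAL_LEADERS = 6
BELIEFS = 8
FRAMINGS = 2                 # positive or negative
CONFERENCE_SIZES = (2, 3, 4)


def interventions_per_period() -> int:
    """Count speeches plus conferences available in one time period."""
    leaders = TRIBAL_LEADERS + NATIONAL_LEADERS
    speeches = leaders * BELIEFS * FRAMINGS
    conferences = sum(comb(leaders, k) for k in CONFERENCE_SIZES)
    return speeches + conferences


def intervention_sequences(periods: int) -> int:
    """Number of ordered intervention sequences over `periods` periods."""
    return interventions_per_period() ** periods
```

Under these assumptions there are 6,495 choices per period, and the number of possible sequences is that figure raised to the number of periods, which is why exhaustive search quickly becomes impractical.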

Successful solutions to the Sudan game can be considered high-level policy solutions for Sudan - an ordered mixture of conferences and interventions with specific tribes. While one player's successful game should not be a sole determiner of policy, a winning combination of choices that is validated with multiple simulation runs outside of the game could be used to make a recommendation about an appropriate course of action. Players thus serve as individual policy analysts, their opinions and choices informed by the information that is included within the scope of the simulation and the game that is wrapped around it.

The Tribes game is a graphic interface to this model. A player takes on a role as a member of the UN, talking to the different tribal leaders mentioned above, and choosing interventions based on their understanding of the current situation in Sudan. This understanding is cultivated in several ways. First, all leaders provide a modicum of contextual text explaining their relationship to their tribe or the nation as a whole. The player can also ask the leaders about the precise effects of their interventions on Sudan. Secondly, and more importantly, the player can access visual information about the beliefs of the different Sudanese tribes, the beliefs about which a leader is informed (that is, about which they could give an effective speech), and the current hostility in the country.

Beyond tracking players' solutions and their relative effectiveness, we plan to examine players' dialogue in Cosmopolis relating to Tribes, as well as their interactions in the section of Cosmopolis's online forum dedicated to Tribes. By doing so, we hope to better understand why players make the choices that they do. Players will be developing their solutions based on their understanding of Sudan as presented in the game. It is not essential that they develop a deeper understanding of the policies of the country (the game is based on the assumption that such an understanding should not be required of the players), but it would make little sense to have no insight into their reasoning. As such, we will log and analyze these conversations in order to better understand not simply players' decisions but also why they make them.

Because Tribes has not been launched, no formal results currently exist for it. However, we have tested the model in simulation and as such have some understanding of the game's difficulty. To prepare, we have carried out two alternate "greedy" test simulations of the game. By "greedy", we mean that in analyzing the simulation we commit to the best choice for a particular time period and then determine the best choice for the next time period, thus building a complete set of best choices. We do not, however, limit our determination of the best possible option by only looking at the current time period. Rather, we look at the long-term consequences of an intervention (as if no future interventions occurred) and thus choose interventions based on net performance over time. If several interventions tie for best, we choose randomly among them.
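The greedy procedure described above can be sketched as follows. This is a minimal illustration rather than our simulation code: `long_term_hostility` is a hypothetical callback standing in for a model run that projects hostility when a candidate is applied after the committed plan and no further interventions occur, and the tolerance `tol` used to detect ties is also an assumption.

```python
import random

def greedy_plan(candidates, long_term_hostility, periods, rng=None, tol=1e-12):
    """Build a plan by committing to the best intervention one period at a time.

    candidates: list of possible interventions (any objects).
    long_term_hostility(plan, candidate): hypothetical scoring function; lower is better.
    """
    rng = rng or random.Random()
    plan = []
    for _ in range(periods):
        # Score every candidate by its projected long-term outcome.
        scored = [(long_term_hostility(plan, c), c) for c in candidates]
        best = min(score for score, _ in scored)
        # Collect all candidates that tie for best and pick one at random.
        tied = [c for score, c in scored if score - best <= tol]
        plan.append(rng.choice(tied))
    return plan

if __name__ == "__main__":
    # Toy scoring function: "b" and "c" always tie for best.
    toy = lambda plan, c: {"a": 0.5, "b": 0.3, "c": 0.3}[c] + 0.01 * len(plan)
    print(greedy_plan(["a", "b", "c"], toy, periods=4, rng=random.Random(0)))
```

Because ties are broken at random, repeated runs can commit to different paths, which is why a single greedy run does not guarantee long-term success.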

A downside of the greedy approach is that it makes it impossible to study the impact of conferences; a conference between leaders will have no immediate positive impact on the beliefs of the Sudanese. To compensate for this, we have run two alternate sets of simulations, one in which all of the leaders have their natural knowledge, and one in which all leaders have perfect knowledge - the maximum possible outcome from a set of conferences. These two sets of simulations thus provide bounds for the problem space - the maximum and minimum impact that can be had by the best intervention at a particular time period.

Our results from these tests are shown below. They suggest several key elements that will need to be taken into account when actually releasing the game. The first of these is that the difficulty level of the game needs to be significantly refined. Our initial plan was to ask the player to reduce hostility to a level of 0.2; this was not possible in even our optimal case, where the lowest hostility value seen was 0.46153885 and occurred because of the first intervention. This suggests that it may be impossible to dramatically alter the hostility levels in the country. Any model that is this difficult to manipulate needs to be reworked to be made amenable to gameplay. The player needs to think that they are making headway on a problem, not running into a brick wall.

Secondly, we need to develop other methods of probing the model. This initial map provides us with some boundaries for the space, but a better map could be made by using AI agents to make greedy decisions on the fly as opposed to by locking in a particular solution at each time period. Our current method works best to find a single, optimal solution for each time period. Given that in both the imperfect and perfect knowledge cases we ran into a situation where more than 200 possible choices performed equally well at one time period, randomly choosing a particular path will not necessarily guarantee any long term measure of success. Looking at the output of a host of AI agents will certainly not yield a single best path, but will better bound the problem space.

Table 1: Impact of interventions in Tribes game

Intervention | Imperfect Knowledge: Minimum Hostility Seen | Imperfect Knowledge: # of Minima | Imperfect Knowledge: Max Hostility Seen | Perfect Knowledge: Minimum Hostility Seen | Perfect Knowledge: # of Minima | Perfect Knowledge: Max Hostility Seen

* TeamIt Experiment Synopsis

The overview of TeamIt (Chang and Maheswaran 2012, partially reprinted here for clarity) from the game perspective is that teams of players must work together to solve problems, and negotiate with other teams to achieve goals. Under different social and cultural contexts, this dynamic collaboration and negotiation can take many different forms. From a scientific viewpoint, data gathered from specific cultural groups within an immersive 3D environment can provide the needed insight to develop and tune HSCB models for these groups, which can then be used to drive training simulations. We also need to validate whether insights gained from game data remain valid when mapped to situations in the real world. The game features a 3-D virtual USC campus created in Cosmopolis.


Figure 6. TeamIt teams in Cosmopolis and on the real USC campus

Meanwhile, a real-world counterpart to the Cosmopolis subgame was set up around the physical USC campus. Figure 6 shows a side-by-side comparison. The researchers recruited 30-50 subjects for both the Cosmopolis and real-world components of the experiment via in-game notices, and by emails and posters at USC. In both scenarios, each player on the competing teams begins the game with a discrete set of skills. There were various locations with "challenges" set up around campus. Each challenge requires a certain set of skills to complete, often requiring players from both teams to cooperate.
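The skill-based challenge mechanic can be illustrated with a small sketch. The data layout, function name, and the example skill and player names are hypothetical; the only rule taken from the description above is that a challenge is completed when the players present jointly hold every required skill.

```python
def can_complete(required_skills, present_players):
    """Return True if the players at a challenge jointly cover its required skills.

    required_skills: iterable of skill names the challenge needs.
    present_players: dict mapping player name -> set of that player's skills.
    """
    pooled = set()
    for skills in present_players.values():
        pooled |= set(skills)
    return set(required_skills) <= pooled

if __name__ == "__main__":
    challenge = {"climb", "decode"}
    # One player from each team: neither can finish alone, together they can.
    print(can_complete(challenge, {"red_1": {"climb"}}))
    print(can_complete(challenge, {"red_1": {"climb"}, "blue_3": {"decode"}}))
```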

TeamIt investigated the differences between game play of the same location-based game in a real and a virtual environment. Two teams of five players competed with each other in the same scenario. The winning team for each game, i.e., the one with the higher score for that game, would receive a prize of movie tickets. The games were run for 30 minutes each, after some very basic introduction on game mechanics in each environment.

Due to learning effects and the small sample size, we concentrate on the strong within-group differences exhibited in the data. First, we discuss the results of the mapping experiments between the real and the virtual versions of the game. The hypothesis that the activity levels in the virtual world would likely be higher is borne out. The number of tasks completed is much higher for both teams, as is the corresponding score. The distance traveled is also higher in the virtual game than in the real-world game.

By observing the players during the game, we are also able to suggest qualitative explanations for these differences. Players in the virtual Cosmopolis game clearly took advantage of the "run" action much more often than players in the real world actually ran; players in the real world became tired and walked more frequently. Perhaps an even bigger difference is due to the physical limitations of the real-world infrastructure. In Cosmopolis, all game communications worked flawlessly, in terms of recording task completion between the desktop clients and the central game server. In the real world, the iPad game clients required wireless communication with the game server, and it was often spotty. This caused players in the real world to spend much more time completing tasks, because tasks can only be completed when there is a connection to the server, and they often contended with poor wireless signals.

The TeamIt experiment demonstrates the difficulty in designing virtual world games and training simulators that can accurately mimic real world situations. Player behavior is markedly different in the virtual version of the game.

This paired virtual and real-world game framework promises to enable many more future investigations into the mapping problem. TeamIt was an engaging challenge for explorer and achiever type players, as well as a stimulating social interaction framework. Now that the prototype USC experiment has been run, we plan to run TeamIt-based experiments in other cultural paradigms to explore how differences in culture and reward structure change the dynamics of team cooperation and competition.

* Challenges and Future Work

Cosmopolis has developed multiple base functions for the running of HSCB experiments. However, we have also identified several important challenges to building and deploying the Cosmopolis framework, which to greater or lesser degree are due to the experimental purpose of the MMOG.
  1. The mapping issue between real and virtual worlds, as described in the WarPipe and TeamIt sections above.
  2. For any subgame of Cosmopolis to work as designed, it will need to balance between being an engaging game and a scientifically viable experimental testbed. This became an issue with the Tribes experiment.
  3. In its current incarnation the Cosmopolis MMOG is very dependent on its core team of engineers, designers, and architects to create, run, and facilitate the experiments that use its game world and infrastructure.

The third challenge in particular leads us to believe that Cosmopolis should be made available and marketed to social scientists as a generalized and flexible framework and toolkit for the creation of specific experimental games, with many researcher-customizable parameters. The initial goal for social scientists would be to work with the USC design team to specify the following aspects of a proposed experiment:
  1. Class of theory or problem to be addressed. Problems of interest should involve, but are not limited to, multiplayer game interactions and/or game-based experiments based in a virtual 3-D world, as opposed to general social science experiments.
  2. Model of data and its derivation.
  3. Computational model to underlie gameplay.
  4. Customization level necessary for HSCB researchers to run, analyze, and evaluate their own experiments without further intervention by the Cosmopolis core team.

Conception-to-publication running of experiments would be a joint project between USC and HSCB scientists, featuring researcher-customizable specifications of the generalized template. Some of these experiment concepts, as proposed by computational HSCB researchers, are outlined below. These experiments may employ a public and/or private base of users, and would be facilitated on a contractual basis with USC.

There are many identified HSCB-related research problems to be explored in the Cosmopolis framework, and many others as yet to be specified by social scientists and other investigators.

* Conclusion

MMOGs are widespread and popular online communities; World of Warcraft alone boasts millions of player characters. The significance of Cosmopolis is its uniqueness as an MMOG designed specifically as a research testbed for social and behavioral models, with a correspondingly high degree of researcher control over experiments performed in and data gleaned from the game world. A few of the key features that Cosmopolis incorporates are a multi-genre system of subgames, a dynamically modifiable outer world, and a channel-based information system featuring real-world feeds and game-world effects. While these features help to make the game novel and engaging, they also have specific applications for scientists opting to use Cosmopolis as a research platform: subgames are a way for researchers to conduct isolated experiments; the modifiable nature of the game world allows for events to occur that may dramatically alter the main game environment, providing fodder for scientists interested in the evolution of online communities; and the information broadcasting systems will allow different messages to be broadcast to different portions of the community to help manage experiments conducted on the entire player community. Also, any and all Cosmopolis actions and interactions (including the internal processes of AI-based non-player characters) may be logged into our databases, and may be used to explore the mappings between game world and real world societies. Ready access to a high-fidelity data set means that researchers will have an easier time determining the impacts of different experiments on the community in Cosmopolis than do researchers of more closed gaming environments. It is impossible for one MMOG to be considered the definitive online game, and Cosmopolis is not intended to be that. But it is an important step in opening up game environments for use by researchers, and one that can help support the work of scientists interested in studying game environments and how different social and behavioral phenomena manifest within them. We hope that the public presence of Cosmopolis will encourage other researchers to look to our game environment as an avenue for research into human behavior.

* Appendix

A: WarPipe Communications Script for Experimental Subjects

  • The map will contain XX OpFor and YY MacGuffins.
  • No respawns will be allowed for either players or OpFor, though players should have more health/damage resistance than OpFor.
  • The OpFor will be played by members of the WarPipe design team. OpFor should attempt to kill players, but not "with prejudice" (e.g. one-shot kills) and not by actively seeking out trouble. Rather, they should confine their movements to the areas near where they have been spawned, as if they are attempting to maintain a defensive location. They may withdraw after confrontation to some other nearby area, but should not try to pursue the players.
  • For reference, see the Every Soldier Is A Sensor Training in America's Army 2, which is set on the MOUT McKenna map.
  • We need a map of the area
  • We need a distribution pattern for OpFor and <mcguffin>s across the map.

B: Pre-WarPipe Communications Experiment Survey

  1. What is your gender?
  2. What is your age?
  3. Have you played a first/third person shooter before? If "yes", please answer the sub-questions. If "no", skip to question 4.
    1. Approximately how often do you play them?
      • Less than 1 hour per month.
      • 1 - 4 hours per month.
      • 1 - 4 hours per week.
      • 5 - 8 hours per week.
      • 9 - 12 hours per week.
      • 13 - 15 hours per week.
      • More than 15 hours per week.
    2. Did you play them in the past significantly more intensely than you do now?
    3. Have you ever played a first/third person shooter online, with other players? If "yes", please answer the sub-questions. If "no", skip to question 4.
      1. How much time do you spend playing first/third person shooters online?
        • Less than 1 hour per month.
        • 1 - 4 hours per month.
        • 1 - 4 hours per week.
        • 5 - 8 hours per week.
        • 9 - 12 hours per week.
        • 13 - 15 hours per week.
        • More than 15 hours per week.
      2. Consider the four types of online first/third person shooters below.
        • Role-based teams with character respawning (Team Fortress 2, Day of Defeat, Battlefield 1942)
        • Generic teams with character respawning (Unreal Tournament, Halo)
        • Role-based teams with no character respawning (Rainbow Six)
        • Generic teams with no character respawning (Counter Strike)
        Based on this division, on a scale from "1 to 7", where 1 indicates "complete dislike", 3 "no preferences", and 7 "exclusively prefer", please rank your preferences for the different game types as follows:
        1. Role-based teams with character respawning over generic teams with character respawning?
        2. Role-based teams with character respawning over role-based teams with no character respawning?
        3. Role-based teams with character respawning over generic teams with no character respawning?
        4. Generic teams with character respawning over role-based teams with no character respawning?
        5. Generic teams with character respawning over generic teams with no character respawning?
        6. Role-based teams with no character respawning over generic teams with no character respawning?
      3. When do you use text chat to communicate with the other players, and why?
      4. When do you use voice chat to communicate with other players?
  4. Did you know any of the other people on your team?
    1. If yes, who do you know and how do you know them?
      1. On a scale from 1 to 7, where 1 is "never" and 7 is "every day", how often do you play first/third-person shooters with them?

C: Post-WarPipe Communications Experiment Survey

  1. How many OpFor were located on the map?
  2. How many <mcguffins> were located on the map?
  3. On a scale of 1 to 7, where 1 means never and 7 means constantly, approximately how often did you use the game's chat system?
  4. On a scale from 1 to 7, where 1 indicates that it worked against you and 7 indicates that it strongly helped your efforts, how useful were the game's constraints on inter-player communication?
  5. On a scale from 1 to 7, where 1 indicates that they were completely disorganized and 7 indicates that they were entirely organized, how organized were your team's efforts at finding the opposition forces?

D: Game Data To Be Used in WarPipe Communications Experiment

  • A recording of the game such that the entire thing can be replayed on an overhead map of the level. This might require:
    • A CSV file containing identified player location snapshots
    • A CSV file containing identified OpFor location snapshots
    • A CSV file containing <mcguffin> locations
    • A CSV file identifying when players kill other players
    • A CSV file identifying the weapons used by different players
    • A CSV map of the level with salient building locations demarcated
  • A communications transcript for each game
  • A CSV file containing: Speaker, type of communication, length of communication, time of communication. Communication by the "communicator" that acknowledges the finding of <mcguffin>s, the killing of OpFor, or that the players want to end the experiment should be indicated.
    • Types of communication: spoken, broadcast to <subset> of team, broadcast to WarPipe administration.
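As a minimal sketch of the communications file described above, the following writes one row per utterance with the four listed fields plus a flag for communicator messages that should be indicated. The column names, the type labels, and the use of seconds as the unit are assumptions; the actual log format has not been fixed.

```python
import csv
import io
from dataclasses import dataclass, astuple, fields

@dataclass
class CommRecord:
    speaker: str
    comm_type: str        # assumed labels: "spoken", "subset_broadcast", "admin_broadcast"
    length_s: float       # length of communication, in seconds (assumed unit)
    time_s: float         # game time at which the communication started (assumed unit)
    flagged: bool = False  # communicator acknowledgements that should be indicated

def write_comm_log(records, stream):
    """Write a header row followed by one CSV row per communication record."""
    writer = csv.writer(stream)
    writer.writerow([f.name for f in fields(CommRecord)])
    for record in records:
        writer.writerow(astuple(record))

if __name__ == "__main__":
    buffer = io.StringIO()
    write_comm_log([CommRecord("player_1", "spoken", 2.5, 10.0, True)], buffer)
    print(buffer.getvalue())
```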

E: Dimensions of the Construct Model Underlying Tribes

The model includes 14 tribes made up of 390 constituent agents.

Southern Tribes | Agents | Northern Tribes | Agents
Bor Dinka | 27 | |
Gok Dinka | 27 | |
Malual Dinka | 27 | |
Ngok Dinka | 27 | |

Each tribesperson can know up to five positive and five negative facts about each of eight central beliefs. (Complete knowledge of all facts would result in a neutral attitude towards a particular belief.)

Modeled Beliefs
Belief that Sudan should remain unified
Belief that Sharia law should be uniformly applied across the country
Belief in respecting Human Rights
Belief in respecting Women's Rights
Belief in the National Congress Party's Platform
Belief in the Sudan Peoples' Liberation Movement Platform
Belief that the different intertribal disputes can be resolved peacefully
Belief that the different regional border disputes can be resolved peacefully

At the start of the simulation, a given agent has an even chance of knowing a particular fact relating to any belief, but is predefined with a belief value drawn from a distribution that is more specifically tied to the tribe. The difference between the assigned belief value and the belief value derived from the facts is considered to be that tribe member's bias towards a particular belief. Stabilizing hostility values is the process of overcoming this bias with additional facts.
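The notion of bias can be made concrete with a short sketch. The mapping from facts to a belief value used here (net positive facts scaled to the range -1.0 to 1.0) is an assumption chosen only because it is consistent with the statement that complete knowledge yields a neutral attitude; the function names are likewise hypothetical.

```python
MAX_FACTS = 5  # facts of each polarity per belief, as stated in the text

def belief_from_facts(n_positive, n_negative):
    """Assumed mapping: net positive facts scaled to [-1.0, 1.0]."""
    for n in (n_positive, n_negative):
        if not 0 <= n <= MAX_FACTS:
            raise ValueError("fact counts must lie between 0 and MAX_FACTS")
    return (n_positive - n_negative) / MAX_FACTS

def bias(assigned_belief, n_positive, n_negative):
    """Bias = tribe-assigned belief value minus the value implied by known facts."""
    return assigned_belief - belief_from_facts(n_positive, n_negative)

if __name__ == "__main__":
    # An agent who knows every fact is neutral, so any assigned belief is pure bias.
    print(belief_from_facts(5, 5))
    print(bias(0.8, 3, 1))
```

Under this reading, an intervention reduces bias by supplying facts that move the fact-derived value toward the assigned one.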

The model supports interventions by two different types of leaders: those at the tribal level and those at the national level. Every tribe has its own leader, who can only influence that tribe. The six national leaders, in contrast, have different amounts of influence on different tribes depending on the part of the country in which they are based.

National Leader | Tie to Southern Tribes (0.0 - 1.0) | Tie to Northern Tribes (0.0 - 1.0)
Omar Bashir, President of Sudan | |
Ali Osman Taha, Vice President of Sudan | |
Salva Kiir Mayardit, President of South Sudan | |
Hassan al-Turabi, Hard-line Islamist | 1.0 to 3% of agents, 0.0 to the rest | 1.0 to 97% of tribes, 0.0 to the rest
High Level SPLM Leader (generic) | Normal distr., µ=0.5, sd=1.0 (value in [0.0, 1.0]) | Normal distr., µ=-2.5, sd=0.75 (value in [0.0, 1.0])
High Level NCP Leader (generic) | Normal distr., µ=-2.5, sd=0.75 (value in [0.0, 1.0]) | Normal distr., µ=1.5, sd=0.75 (value in [0.0, 1.0])
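For the generic SPLM and NCP leaders, tie strengths are drawn from a normal distribution and constrained to [0.0, 1.0]. A minimal sketch of such a draw is given below; clamping out-of-range draws to the nearest bound is an assumption (rejection sampling would be another option), and `sample_tie` is a hypothetical name.

```python
import random

def sample_tie(mu, sd, rng):
    """Draw a tie strength from Normal(mu, sd), clamped into [0.0, 1.0] (assumed)."""
    return min(1.0, max(0.0, rng.gauss(mu, sd)))

if __name__ == "__main__":
    rng = random.Random(42)
    # Parameters taken from the generic NCP leader row.
    to_south = [sample_tie(-2.5, 0.75, rng) for _ in range(1000)]
    to_north = [sample_tie(1.5, 0.75, rng) for _ in range(1000)]
    print("mean tie to southern tribes:", sum(to_south) / len(to_south))
    print("mean tie to northern tribes:", sum(to_north) / len(to_north))
```

With clamping, a mean far below zero yields ties that are almost always 0.0, and a mean above one yields ties that are mostly 1.0.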

Besides giving speeches, tribal leaders can also meet in conferences to learn new facts. Initially, tribal leaders know five facts per belief, and national leaders have perfect knowledge of all facts. Two, three, or four leaders can attend a conference, and each leader will learn four, three, or two facts, respectively, known by the other leaders that relate to a specified belief. The number of facts learned is thus contingent on the number of attendees.
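The conference rule can be sketched as follows. The mapping from attendee count to facts learned is read from the sentence above in order (two attendees learn four facts, three learn three, four learn two), which is an interpretation; the random choice of which unknown facts are learned, the data layout, and the function name are assumptions.

```python
import random

# Attendees -> facts learned per leader (interpretation of the text, read in order).
FACTS_LEARNED = {2: 4, 3: 3, 4: 2}

def hold_conference(knowledge, attendees, rng):
    """Update each attendee's fact set for the belief under discussion.

    knowledge: dict mapping leader -> set of fact ids known for that belief.
    attendees: list of two, three, or four leaders.
    """
    quota = FACTS_LEARNED[len(attendees)]
    # Snapshot so every leader learns from what the others knew before the meeting.
    before = {a: set(knowledge[a]) for a in attendees}
    for leader in attendees:
        others = set().union(*(before[o] for o in attendees if o != leader))
        new_facts = sorted(others - before[leader])
        learned = rng.sample(new_facts, min(quota, len(new_facts)))
        knowledge[leader] |= set(learned)
    return knowledge

if __name__ == "__main__":
    state = {"a": {1, 2, 3, 4, 5}, "b": {6, 7, 8, 9, 10}}
    hold_conference(state, ["a", "b"], random.Random(0))
    print({leader: len(facts) for leader, facts in state.items()})
```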

Below is pseudocode that outlines the high-level process of the Tribes model.

def initializeTribes() {
        for (tribe in tribes) {
                for (tribeMember in size(tribe)) {
                        ...
                }
        }
}

def carryOutIntervention(
        bool firstRun,
        bool speechInterventionByLeader,
        bool meetingBetweenTribalLeaders,
        list leaderList) {
        if (firstRun) { initializeTribes(); }
        else { loadTribeDataFromOldOutputs(); }

        if (speechInterventionByLeader) {
                for (timePeriod in runDuration) {
                        if (timePeriod < interventionDurationForLeader(leaderList[0])) {
                                ...
                        }
                }
        }
        else if (meetingBetweenTribalLeaders) {
                for (timePeriod in runDuration) {
                        ...
                }
        }
}

F: Links to Cosmopolis Videos

World Building Tools
Multiplayer Beta (Summer 2010)
Teaser (Demo Day 2010)
Gameplay (Demo Day 2010)
Annotated Gameplay
United Nations Millennium Challenge

* References

ASCHBACHER, P. (2003). Gender Differences in the Perceptions and Use of an Informal Science Learning Web Site. National Science Foundation, Arlington, VA.

BARTLE, R.A. (1996). Players Who Suit MUDs. http://www.mud.co.uk/richard/hcds.htm.

BAVELAS, A. (1950). Communication Patterns in Task-Oriented Groups. The Journal of the Acoustical Society of America, vol. 22, pp. 725-730. [doi:10.1121/1.1906679]

CARLEY, K.C. (2010). Answers on Sudan, Pittsburgh, PA: Carnegie Mellon University.

CARLEY, K., Moon, I., Schneider, M., and Shigiltchoff, O. (2005). Detailed Analysis of Factors Affecting Team Success and Failure in the America's Army Game. Pittsburgh, PA: Carnegie Mellon University.

CASTRONOVA, E. (2005). Synthetic Worlds. University of Chicago.

CASTRONOVA, E. (2008). A Test of the Law of Demand in a Virtual World: Exploring the Petri Dish Approach to Social Science. Indiana University.

CHANG, Y., and Maheswaran, R. (2012). Team-It: Location-Based Gaming in Real and Virtual Environments. Proceedings of AIIDE-2012 (in press).

CHRIST, R.E., and Evans, K.L. (2002). Radio Communications and Situation Awareness of Infantry Squads During Urban Operations. United States Army Research Institute.

CLAYPOOL, M., Claypool, K., and Damaa, F. (2006). The Effect of Frame Rate and Resolution on Users Playing First Person Shooter Games. Proceedings of SPIE-2006.

DUCHENEAUT, N., Yee, N., Nickell, E., and Moore, R.J. (2006). "Alone Together?": Exploring the Social Dynamics of Massively Multiplayer Online Games. Proceedings of the SIGCHI-06 Conference on Human Factors in Computing Systems (ACM), 407-416. [doi:10.1145/1124772.1124834]

FARMER, F.R. and Morningstar, C. (1990). The Lessons of Lucasfilm's Habitat. Cyberspace: First Steps. MIT Press, Cambridge, MA, ed. Benedikt, M.

GIBBS, M., Wadley, G., and Benda, P. (2006). Proximity-Based Chat in a First Person Shooter: Using a Novel Voice Communication System for Online Play. Proceedings of the 3rd Australasian Conference on Interactive Entertainment, Perth, Australia: Murdoch University, pp. 96-102.

HUANG, Y., Shen, C., and Contractor, N. (2009). Virtually There: Exploring Proximity and Homophily in a Virtual World. Proceedings of IEEE SocialCom-09. [doi:10.1109/cse.2009.471]

HUFFAKER, D., Wang, J., Treem, J., et al. (2009). The Social Behaviors of Experts in Massive Multiplayer Online Role-playing Games. Proceedings of IEEE SocialCom-09. [doi:10.1109/cse.2009.13]

INTERNATIONAL CRISIS GROUP SUDAN. (2007). Breaking The Abyei Deadlock. ICG Report: Juba / Khartoum / Nairobi / Brussels.

INTERNATIONAL CRISIS GROUP SUDAN. (2010). Defining the North-South Border. ICG Report: Juba / Khartoum / Nairobi / Brussels.

JOHNSON, D.H. (2006). The Root Causes of Sudan's Civil Wars. 3rd ed. Bloomington, Indiana: Indiana University Press.

JOHNSON, N.F., Xu, C., Zhenyuan, Z., et al. (2009). Human Group Formation in Online Guilds and Offline Gangs Driven by a Common Team Dynamic. Physical Review E 79, 6, 1-11. [doi:10.1103/physreve.79.066117]

KOSTER, R. (2010). The Laws of Online World Design. Raph Koster's Home Page. http://www.raphkoster.com/gaming/laws.shtml.

MOON, I., Carley, K., Schneider, M., and Shigiltchoff, O. (2005). Detailed Analysis of Team Movement and Communication in the America's Army Game. Pittsburgh, PA: Carnegie Mellon University.

NATIONAL RESEARCH COUNCIL. (2008). Behavioral Modeling and Simulation: From Individuals to Societies. Committee on Human Factors, Division of Behavioral and Social Sciences and Education, National Research Council, National Academies Press, Washington, DC, ISBN 0-309-11862-X.

PEARCE, C. and Artemesia. (2009). Communities of Play: Emergent Cultures in Multiplayer Games and Virtual Worlds. MIT Press.

PESCHON, J., Isaksen, L., and Tyler, B. (1996). The Growth, Accretion, and Decay of Cities. Proceedings of 1996 International Symposium on Technology and Society Technical Expertise and Public Decisions, pp. 301-310. [doi:10.1109/ISTAS.1996.541167]

REDDEN, E.S., and Blackwell, C.L. (2001). Situational Awareness and Communication Experiment for Military Operations in Urban Terrain: Experiment 1. Aberdeen Proving Ground, MD: Army Research Laboratory.

SCHNEIDER, M., Carley, K., and Moon, I. (2005). Detailed Comparison of America's Army Game and Unit of Action Experiments. Pittsburgh, PA: Carnegie Mellon University.

SCHREIBER, C., Singh, S., and Carley, K. (2004). Construct - A Multiagent Network Model for the Co-evolution of Agents and Sociocultural Environments. Pittsburgh, PA: CASOS, Carnegie Mellon University.

SLATER, M., and Sadagic, A. (2000). Small-Group Behavior in a Virtual and Real Environment: A Comparative Study. Presence, vol. 9, Feb. 2000, pp. 38-51. [doi:10.1162/105474600566600]

VAN-HOLT, T. and Johnson, J.C. (2011). A Text and Network Analysis of Natural Resource Conflict in Sudan. Proceedings of Sunbelt XXXI, St. Petersburg Beach, Florida, USA.

WILLIAMS, D. (2009). The Mapping Principle, and a Research Framework for Virtual Worlds. University of Southern California, Los Angeles, CA.

WILLIAMS, D., Ducheneaut, N., Xiong, L., Zhang, Y., Yee, N., and Nickell, E. (2006). From Tree House to Barracks: The Social Life of Guilds in World of Warcraft. Games and Culture 1, 4, 338-361. [doi:10.1177/1555412006292616]

WRIGHT, T., Boria, E., and Breidenbach, P. (2002). Creative Player Actions in FPS Online Video Games. Game Studies, vol. 2, Dec. 2002.

YEE, N. (2006). Motivations for Play in Online Games. CyberPsychology & Behavior 9, 6, 772-775. [doi:10.1089/cpb.2006.9.772]

YEE, N. and Bailenson, J. (2007). The Proteus Effect: Implications of Transformed Digital Self-Representation on Online and Offline Behavior. Human Communication Research 33, 3, 271-290. [doi:10.1111/j.1468-2958.2007.00299.x]